cassandra 3.11.4 deadlock

2020-06-23 Thread John Doe
We are running apache-cassandra-3.11.4, 10 node cluster with -Xms32G -Xmx32G 
-Xmn8G using CMS.
after running couple of days one of the node become unresponsive and threaddump 
(jstack -F) shows deadlock.

Found one Java-level deadlock:
=

"Native-Transport-Requests-144":
  waiting to lock Monitor@0x7cd5142e4d08 (Object@0x7f6e00348268, a 
java/io/ExpiringCache),
  which is held by "CompactionExecutor:115134"
"CompactionExecutor:115134":
  waiting to lock Monitor@0x7f6bcaf130f8 (Object@0x7f6dff31faa0, a 
ch/qos/logback/core/joran/spi/ConfigurationWatchList),
  which is held by "Native-Transport-Requests-144"

Found a total of 1 deadlock.

I have seen this couple of time now with different nodes with following in 
system.log

 IndexSummaryRedistribution.java:77 - Redistributing index summaries
 NoSpamLogger.java:91 - Maximum memory usage reached (536870912), cannot 
allocate chunk of 1048576

also lookin in gc log there has not been a ParNew collection for last 10hrs, 
only CMS collections.

1739842.375: [GC (CMS Final Remark) [YG occupancy: 2712269 K (7549760 K)]
1739842.375: [Rescan (parallel) , 0.0614157 secs]
1739842.437: [weak refs processing, 0.994 secs]
1739842.437: [class unloading, 0.0231076 secs]
1739842.460: [scrub symbol table, 0.0061049 secs]
1739842.466: [scrub string table, 0.0043847 secs][1 CMS-remark: 
17696837K(25165824K)] 20409107K(32715584K), 0.0953750 secs] [Times: user=2.95 
sys=0.00, real=0.09 secs]
1739842.471: [CMS-concurrent-sweep-start]
1739848.572: [CMS-concurrent-sweep: 6.101/6.101 secs] [Times: user=6.13 
sys=0.00, real=6.10 secs]
1739848.573: [CMS-concurrent-reset-start]
1739848.645: [CMS-concurrent-reset: 0.072/0.072 secs] [Times: user=0.08 
sys=0.00, real=0.08 secs]
1739858.653: [GC (CMS Initial Mark) [1 CMS-initial-mark: 17696837K(25165824K)] 
20409111K(32715584K), 0.0584838 secs] [Times: user=2.68 sys=0.00, real=0.06 
secs]
1739858.713: [CMS-concurrent-mark-start]
1739860.496: [CMS-concurrent-mark: 1.784/1.784 secs] [Times: user=84.77 
sys=0.00, real=1.79 secs]
1739860.497: [CMS-concurrent-preclean-start]
1739860.566: [CMS-concurrent-preclean: 0.070/0.070 secs] [Times: user=0.07 
sys=0.00, real=0.07 secs]
1739860.567: [CMS-concurrent-abortable-preclean-start]
CMS: abort preclean due to time
1739866.333: [CMS-concurrent-abortable-preclean: 5.766/5.766 secs] [Times: 
user=5.80 sys=0.00, real=5.76 secs]

Java HotSpot(TM) 64-Bit Server VM (25.162-b12) for linux-amd64 JRE 
(1.8.0_162-b12)
Memory: 4k page, physical 792290076k(2780032k free), swap 16777212k(16693756k 
free)

CommandLine flags:
-XX:+AlwaysPreTouch
-XX:CICompilerCount=15
-XX:+CMSClassUnloadingEnabled
-XX:+CMSEdenChunksRecordAlways
-XX:CMSInitiatingOccupancyFraction=40
-XX:+CMSParallelInitialMarkEnabled
-XX:+CMSParallelRemarkEnabled
-XX:CMSWaitDuration=1
-XX:ConcGCThreads=50
-XX:+CrashOnOutOfMemoryError
-XX:GCLogFileSize=10485760
-XX:+HeapDumpOnOutOfMemoryError
-XX:InitialHeapSize=34359738368
-XX:InitialTenuringThreshold=1
-XX:+ManagementServer
-XX:MaxHeapSize=34359738368
-XX:MaxNewSize=8589934592
-XX:MaxTenuringThreshold=1
-XX:MinHeapDeltaBytes=196608
-XX:NewSize=8589934592
-XX:NumberOfGCLogFiles=10
-XX:OldPLABSize=16
-XX:OldSize=25769803776
-XX:OnOutOfMemoryError=kill -9 %p
-XX:ParallelGCThreads=50
-XX:+PerfDisableSharedMem
-XX:+PrintGC
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+ResizeTLAB
-XX:StringTableSize=103
-XX:SurvivorRatio=8
-XX:ThreadPriorityPolicy=42
-XX:ThreadStackSize=256
-XX:-UseBiasedLocking
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseConcMarkSweepGC
-XX:+UseCondCardMark
-XX:+UseFastUnorderedTimeStamps
-XX:+UseGCLogFileRotation
-XX:+UseNUMA
-XX:+UseNUMAInterleaving
-XX:+UseParNewGC
-XX:+UseTLAB
-XX:+UseThreadPriorities

--Thanks

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: cassandra 3.11.4 deadlock

2020-06-23 Thread Jeff Jirsa
Can you open a JIRA?

https://issues.apache.org/jira/projects/CASSANDRA/issues

On Tue, Jun 23, 2020 at 11:56 AM John Doe  wrote:

> We are running apache-cassandra-3.11.4, 10 node cluster with -Xms32G
> -Xmx32G -Xmn8G using CMS.
> after running couple of days one of the node become unresponsive and
> threaddump (jstack -F) shows deadlock.
>
> Found one Java-level deadlock:
> =
>
> "Native-Transport-Requests-144":
>   waiting to lock Monitor@0x7cd5142e4d08 (Object@0x7f6e00348268,
> a java/io/ExpiringCache),
>   which is held by "CompactionExecutor:115134"
> "CompactionExecutor:115134":
>   waiting to lock Monitor@0x7f6bcaf130f8 (Object@0x7f6dff31faa0,
> a ch/qos/logback/core/joran/spi/ConfigurationWatchList),
>   which is held by "Native-Transport-Requests-144"
>
> Found a total of 1 deadlock.
>
> I have seen this couple of time now with different nodes with following in
> system.log
>
>  IndexSummaryRedistribution.java:77 - Redistributing index summaries
>  NoSpamLogger.java:91 - Maximum memory usage reached (536870912), cannot
> allocate chunk of 1048576
>
> also lookin in gc log there has not been a ParNew collection for last
> 10hrs, only CMS collections.
>
> 1739842.375: [GC (CMS Final Remark) [YG occupancy: 2712269 K (7549760 K)]
> 1739842.375: [Rescan (parallel) , 0.0614157 secs]
> 1739842.437: [weak refs processing, 0.994 secs]
> 1739842.437: [class unloading, 0.0231076 secs]
> 1739842.460: [scrub symbol table, 0.0061049 secs]
> 1739842.466: [scrub string table, 0.0043847 secs][1 CMS-remark:
> 17696837K(25165824K)] 20409107K(32715584K), 0.0953750 secs] [Times:
> user=2.95 sys=0.00, real=0.09 secs]
> 1739842.471: [CMS-concurrent-sweep-start]
> 1739848.572: [CMS-concurrent-sweep: 6.101/6.101 secs] [Times: user=6.13
> sys=0.00, real=6.10 secs]
> 1739848.573: [CMS-concurrent-reset-start]
> 1739848.645: [CMS-concurrent-reset: 0.072/0.072 secs] [Times: user=0.08
> sys=0.00, real=0.08 secs]
> 1739858.653: [GC (CMS Initial Mark) [1 CMS-initial-mark:
> 17696837K(25165824K)] 20409111K(32715584K), 0.0584838 secs] [Times:
> user=2.68 sys=0.00, real=0.06 secs]
> 1739858.713: [CMS-concurrent-mark-start]
> 1739860.496: [CMS-concurrent-mark: 1.784/1.784 secs] [Times: user=84.77
> sys=0.00, real=1.79 secs]
> 1739860.497: [CMS-concurrent-preclean-start]
> 1739860.566: [CMS-concurrent-preclean: 0.070/0.070 secs] [Times: user=0.07
> sys=0.00, real=0.07 secs]
> 1739860.567: [CMS-concurrent-abortable-preclean-start]
> CMS: abort preclean due to time
> 1739866.333: [CMS-concurrent-abortable-preclean: 5.766/5.766 secs] [Times:
> user=5.80 sys=0.00, real=5.76 secs]
>
> Java HotSpot(TM) 64-Bit Server VM (25.162-b12) for linux-amd64 JRE
> (1.8.0_162-b12)
> Memory: 4k page, physical 792290076k(2780032k free), swap
> 16777212k(16693756k free)
>
> CommandLine flags:
> -XX:+AlwaysPreTouch
> -XX:CICompilerCount=15
> -XX:+CMSClassUnloadingEnabled
> -XX:+CMSEdenChunksRecordAlways
> -XX:CMSInitiatingOccupancyFraction=40
> -XX:+CMSParallelInitialMarkEnabled
> -XX:+CMSParallelRemarkEnabled
> -XX:CMSWaitDuration=1
> -XX:ConcGCThreads=50
> -XX:+CrashOnOutOfMemoryError
> -XX:GCLogFileSize=10485760
> -XX:+HeapDumpOnOutOfMemoryError
> -XX:InitialHeapSize=34359738368
> -XX:InitialTenuringThreshold=1
> -XX:+ManagementServer
> -XX:MaxHeapSize=34359738368
> -XX:MaxNewSize=8589934592
> -XX:MaxTenuringThreshold=1
> -XX:MinHeapDeltaBytes=196608
> -XX:NewSize=8589934592
> -XX:NumberOfGCLogFiles=10
> -XX:OldPLABSize=16
> -XX:OldSize=25769803776
> -XX:OnOutOfMemoryError=kill -9 %p
> -XX:ParallelGCThreads=50
> -XX:+PerfDisableSharedMem
> -XX:+PrintGC
> -XX:+PrintGCDetails
> -XX:+PrintGCTimeStamps
> -XX:+ResizeTLAB
> -XX:StringTableSize=103
> -XX:SurvivorRatio=8
> -XX:ThreadPriorityPolicy=42
> -XX:ThreadStackSize=256
> -XX:-UseBiasedLocking
> -XX:+UseCMSInitiatingOccupancyOnly
> -XX:+UseConcMarkSweepGC
> -XX:+UseCondCardMark
> -XX:+UseFastUnorderedTimeStamps
> -XX:+UseGCLogFileRotation
> -XX:+UseNUMA
> -XX:+UseNUMAInterleaving
> -XX:+UseParNewGC
> -XX:+UseTLAB
> -XX:+UseThreadPriorities
>
> --Thanks
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: cassandra 3.11.4 deadlock

2020-06-23 Thread John Doe


Yes
here is one more example, happended on another node
 
using TWCS
compaction = {'class': 
'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
  'compaction_window_size': '30',
  'compaction_window_unit': 'MINUTES',
  'max_threshold': '32',
  'min_threshold': '4'}

##
"CompactionExecutor:244228":
  waiting to lock Monitor@0x7bb425df7ee8 (Object@0x7fd71f91e2d8, a 
org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy),
  which is held by "CompactionExecutor:244235"
"CompactionExecutor:244235":
  waiting to lock Monitor@0x7bb3d83ea188 (Object@0x7fd487f937b0, a 
java/io/ExpiringCache),
  which is held by "CompactionExecutor:244228"


I didnt know which thread is it, so I got all CompactionExecutor and 
SizeTieredCompactionStrategy threads.


### CompactionExecutor Threads

Thread 111461: (state = BLOCKED)
 - java.io.ExpiringCache.get(java.lang.String) @bci=0, line=75 (Compiled frame)
 - java.io.UnixFileSystem.canonicalize(java.lang.String) @bci=17, line=152 
(Compiled frame)
 - java.io.File.getCanonicalPath() @bci=27, line=618 (Compiled frame)
 - java.io.FilePermission$1.run() @bci=103, line=215 (Compiled frame)
 - java.io.FilePermission$1.run() @bci=1, line=203 (Compiled frame)
 - java.security.AccessController.doPrivileged(java.security.PrivilegedAction) 
@bci=0 (Compiled frame)
 - java.io.FilePermission.init(int) @bci=97, line=203 (Compiled frame)
 - java.io.FilePermission.(java.lang.String, java.lang.String) @bci=10, 
line=277 (Compiled frame)
 - java.lang.SecurityManager.checkRead(java.lang.String) @bci=8, line=888 
(Compiled frame)
 - java.io.File.length() @bci=13, line=969 (Compiled frame)
 - org.apache.cassandra.io.sstable.format.big.BigTableWriter.openEarly() 
@bci=87, line=284 (Compiled frame)
 - 
org.apache.cassandra.io.sstable.SSTableRewriter.maybeReopenEarly(org.apache.cassandra.db.DecoratedKey)
 @bci=111, line=179 (Compiled frame)
 - 
org.apache.cassandra.io.sstable.SSTableRewriter.append(org.apache.cassandra.db.rows.UnfilteredRowIterator)
 @bci=9, line=134 (Compiled frame)
 - 
org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(org.apache.cassandra.db.rows.UnfilteredRowIterator)
 @bci=5, line=65 (Compiled frame)
 - 
org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(org.apache.cassandra.db.rows.UnfilteredRowIterator)
 @bci=12, line=142 (Compiled frame)
 - org.apache.cassandra.db.compaction.CompactionTask.runMayThrow() @bci=490, 
line=201 (Compiled frame)
 - org.apache.cassandra.utils.WrappedRunnable.run() @bci=1, line=28 (Compiled 
frame)
 - 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector)
 @bci=6, line=85 (Compiled frame)
 - 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector)
 @bci=2, line=61 (Compiled frame)
 - 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run()
 @bci=128, line=266 (Compiled frame)
 - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 
(Compiled frame)
 - java.util.concurrent.FutureTask.run() @bci=42, line=266 (Compiled frame)
 - 
java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker)
 @bci=95, line=1149 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=624 
(Compiled frame)
 - 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(java.lang.Runnable)
 @bci=1, line=81 (Compiled frame)
 - org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$6.run() @bci=4 
(Compiled frame)
 - java.lang.Thread.run() @bci=11, line=748 (Compiled frame)
Thread 111416: (state = BLOCKED)
 - java.io.ExpiringCache.get(java.lang.String) @bci=0, line=75 (Compiled frame)
 - java.io.UnixFileSystem.canonicalize(java.lang.String) @bci=17, line=152 
(Compiled frame)
 - java.io.File.getCanonicalPath() @bci=27, line=618 (Compiled frame)
 - java.io.FilePermission$1.run() @bci=103, line=215 (Compiled frame)
 - java.io.FilePermission$1.run() @bci=1, line=203 (Compiled frame)
 - java.security.AccessController.doPrivileged(java.security.PrivilegedAction) 
@bci=0 (Compiled frame)
 - java.io.FilePermission.init(int) @bci=97, line=203 (Compiled frame)
 - java.io.FilePermission.(java.lang.String, java.lang.String) @bci=10, 
line=277 (Compiled frame)
 - java.lang.SecurityManager.checkRead(java.lang.String) @bci=8, line=888 
(Compiled frame)
 - java.io.File.length() @bci=13, line=969 (Compiled frame)
 - org.apache.cassandra.io.sstable.format.big.BigTableWriter.openEarly() 
@bci=87, line=284 (Compiled frame)
 - 
org.apache.cassandra.io.sstable.SSTableRewriter.maybeReo

Re: cassandra 3.11.4 deadlock

2020-06-23 Thread John Doe
https://issues.apache.org/jira/browse/CASSANDRA-15898


> Sent: Tuesday, June 23, 2020 at 4:16 PM
> From: "John Doe" 
> To: user@cassandra.apache.org
> Subject: Re: cassandra 3.11.4 deadlock
>
> 
> Yes
> here is one more example, happended on another node
>  
> using TWCS
> compaction = {'class': 
> 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
>   'compaction_window_size': '30',
>   'compaction_window_unit': 'MINUTES',
>   'max_threshold': '32',
>   'min_threshold': '4'}
> 
> ##
> "CompactionExecutor:244228":
>   waiting to lock Monitor@0x7bb425df7ee8 (Object@0x7fd71f91e2d8, a 
> org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy),
>   which is held by "CompactionExecutor:244235"
> "CompactionExecutor:244235":
>   waiting to lock Monitor@0x7bb3d83ea188 (Object@0x7fd487f937b0, a 
> java/io/ExpiringCache),
>   which is held by "CompactionExecutor:244228"
> 
> 
> I didnt know which thread is it, so I got all CompactionExecutor and 
> SizeTieredCompactionStrategy threads.
> 
> 
> ### CompactionExecutor Threads
> 
> Thread 111461: (state = BLOCKED)
>  - java.io.ExpiringCache.get(java.lang.String) @bci=0, line=75 (Compiled 
> frame)
>  - java.io.UnixFileSystem.canonicalize(java.lang.String) @bci=17, line=152 
> (Compiled frame)
>  - java.io.File.getCanonicalPath() @bci=27, line=618 (Compiled frame)
>  - java.io.FilePermission$1.run() @bci=103, line=215 (Compiled frame)
>  - java.io.FilePermission$1.run() @bci=1, line=203 (Compiled frame)
>  - 
> java.security.AccessController.doPrivileged(java.security.PrivilegedAction) 
> @bci=0 (Compiled frame)
>  - java.io.FilePermission.init(int) @bci=97, line=203 (Compiled frame)
>  - java.io.FilePermission.(java.lang.String, java.lang.String) @bci=10, 
> line=277 (Compiled frame)
>  - java.lang.SecurityManager.checkRead(java.lang.String) @bci=8, line=888 
> (Compiled frame)
>  - java.io.File.length() @bci=13, line=969 (Compiled frame)
>  - org.apache.cassandra.io.sstable.format.big.BigTableWriter.openEarly() 
> @bci=87, line=284 (Compiled frame)
>  - 
> org.apache.cassandra.io.sstable.SSTableRewriter.maybeReopenEarly(org.apache.cassandra.db.DecoratedKey)
>  @bci=111, line=179 (Compiled frame)
>  - 
> org.apache.cassandra.io.sstable.SSTableRewriter.append(org.apache.cassandra.db.rows.UnfilteredRowIterator)
>  @bci=9, line=134 (Compiled frame)
>  - 
> org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(org.apache.cassandra.db.rows.UnfilteredRowIterator)
>  @bci=5, line=65 (Compiled frame)
>  - 
> org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(org.apache.cassandra.db.rows.UnfilteredRowIterator)
>  @bci=12, line=142 (Compiled frame)
>  - org.apache.cassandra.db.compaction.CompactionTask.runMayThrow() @bci=490, 
> line=201 (Compiled frame)
>  - org.apache.cassandra.utils.WrappedRunnable.run() @bci=1, line=28 (Compiled 
> frame)
>  - 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector)
>  @bci=6, line=85 (Compiled frame)
>  - 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector)
>  @bci=2, line=61 (Compiled frame)
>  - 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run()
>  @bci=128, line=266 (Compiled frame)
>  - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 
> (Compiled frame)
>  - java.util.concurrent.FutureTask.run() @bci=42, line=266 (Compiled frame)
>  - 
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker)
>  @bci=95, line=1149 (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=624 
> (Compiled frame)
>  - 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(java.lang.Runnable)
>  @bci=1, line=81 (Compiled frame)
>  - org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$6.run() @bci=4 
> (Compiled frame)
>  - java.lang.Thread.run() @bci=11, line=748 (Compiled frame)
> Thread 111416: (state = BLOCKED)
>  - java.io.ExpiringCache.get(java.lang.String) @bci=0, line=75 (Compiled 
> frame)
>  - java.io.UnixFileSystem.canonicalize(java.lang.String) @bci=17, line=152 
> (Compiled frame)
>  - java.io.File.getCanonicalPath() @bci=27, line=618 (Compiled frame)
>  - java.io.FilePermission$1.run() @bci=103, line=215 (Compiled frame)
>  - java.io.FilePermission$1.run() @bci=1, line=203 (Compiled frame)
>  - 
> java.security.AccessController.doPrivileged(java.security.PrivilegedAction) 
> @bci=0 (Compiled frame)
>  - java.io.FilePermission.init(int) @bci=97, line=203 (Compiled frame)
>  - java.io.FilePermissio

Cassandra upgrade from 3.11.3 -> 3.11.6

2020-06-23 Thread Jai Bheemsen Rao Dhanwada
Hello,

I am trying to upgrade from 3.11.3 to 3.11.6.
Can I add new nodes with the 3.11.6  version to the cluster running with
3.11.3?
Also, I see the SSTable format changed from mc-* to md-*, does this cause
any issues?


Re: Cassandra upgrade from 3.11.3 -> 3.11.6

2020-06-23 Thread Surbhi Gupta
Hi ,

We have recently upgraded from 3.11.0 to 3.11.5 . There is a sstable format
change from 3.11.4 .
We also had to expand the cluster and we also discussed about expansion
first and than upgrade. But finally we upgraded and than expanded.
As per our experience what I could tell you is, it is not advisable to add
new nodes on higher version.
There are many bugs which got fixed from 3.11.3 to 3.11.6.

Thanks
Surbhi

On Tue, Jun 23, 2020 at 5:04 PM Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> Hello,
>
> I am trying to upgrade from 3.11.3 to 3.11.6.
> Can I add new nodes with the 3.11.6  version to the cluster running with
> 3.11.3?
> Also, I see the SSTable format changed from mc-* to md-*, does this cause
> any issues?
>
>


Re: Cassandra upgrade from 3.11.3 -> 3.11.6

2020-06-23 Thread Jürgen Albersdorfer
Hi, I would say „It depends“ - as it always does. I have had a 21 Node Cluster 
running in Production in one DC with versions ranging from 3.11.1 to 3.11.6 
without having had any single issue for over a year. I just upgraded all nodes 
to 3.11.6 for the sake of consistency.

Von meinem iPhone gesendet

> Am 24.06.2020 um 02:56 schrieb Surbhi Gupta :
> 
> 
> 
> Hi ,
> 
> We have recently upgraded from 3.11.0 to 3.11.5 . There is a sstable format 
> change from 3.11.4 . 
> We also had to expand the cluster and we also discussed about expansion first 
> and than upgrade. But finally we upgraded and than expanded. 
> As per our experience what I could tell you is, it is not advisable to add 
> new nodes on higher version. 
> There are many bugs which got fixed from 3.11.3 to 3.11.6. 
> 
> Thanks 
> Surbhi
> 
>> On Tue, Jun 23, 2020 at 5:04 PM Jai Bheemsen Rao Dhanwada 
>>  wrote:
>> Hello,
>> 
>> I am trying to upgrade from 3.11.3 to 3.11.6.
>> Can I add new nodes with the 3.11.6  version to the cluster running with 
>> 3.11.3?
>> Also, I see the SSTable format changed from mc-* to md-*, does this cause 
>> any issues?
>> 


Re: Cassandra upgrade from 3.11.3 -> 3.11.6

2020-06-23 Thread Surbhi Gupta
In case of any issue, it gets very difficult to debug when we have multiple
versions.

On Tue, 23 Jun 2020 at 22:23, Jürgen Albersdorfer 
wrote:

> Hi, I would say „It depends“ - as it always does. I have had a 21 Node
> Cluster running in Production in one DC with versions ranging from 3.11.1
> to 3.11.6 without having had any single issue for over a year. I just
> upgraded all nodes to 3.11.6 for the sake of consistency.
>
> Von meinem iPhone gesendet
>
> Am 24.06.2020 um 02:56 schrieb Surbhi Gupta :
>
> 
>
> Hi ,
>
> We have recently upgraded from 3.11.0 to 3.11.5 . There is a sstable
> format change from 3.11.4 .
> We also had to expand the cluster and we also discussed about expansion
> first and than upgrade. But finally we upgraded and than expanded.
> As per our experience what I could tell you is, it is not advisable to add
> new nodes on higher version.
> There are many bugs which got fixed from 3.11.3 to 3.11.6.
>
> Thanks
> Surbhi
>
> On Tue, Jun 23, 2020 at 5:04 PM Jai Bheemsen Rao Dhanwada <
> jaibheem...@gmail.com> wrote:
>
>> Hello,
>>
>> I am trying to upgrade from 3.11.3 to 3.11.6.
>> Can I add new nodes with the 3.11.6  version to the cluster running with
>> 3.11.3?
>> Also, I see the SSTable format changed from mc-* to md-*, does this cause
>> any issues?
>>
>>


Cassandra container, Google Cloud and Kubernetes

2020-06-23 Thread Manu Chadha
Hi

I want to run Cassandra container in GCP and want to orchestrate the containers 
using Kubernetes.

I have a Cassandra image in my Docker Registry. Following the steps from the 
following tutorial, I have created a cluster and have Cassandra running on it. 
https://cloud.google.com/kubernetes-engine/docs/tutorials/hello-app#step_4_create_a_cluster

These are the steps I executed

  *   get cassandra image - docker pull manuchadha25/codingjedi:3.11.4
  *   check that the images exists locally- docker images. Also check that the 
image runs - docker run manuchadha25/cassandra:3.11.4
  *   run on GCP console - export PROJECT_ID=project-id of google project
  *   run on GCP console - gcloud auth configure-docker - maybe not required as 
I want to skip putting the image in container registry.
  *   gcloud config set project $PROJECT_ID
  *   gcloud config set compute/zone compute-zone --num-nodes 2
  *   gcloud container clusters create codingjedi-cassandra-cluster 
--num-nodes=2
  *   check cluster is up - gcloud compute instances list
  *   kubectl create deployment codingjedi-cassandra-app 
--image=docker.io/manuchadha25/cassandra:3.11.4
  *   kubectl scale deployment codingjedi-cassandra-app --replicas=3
  *   kubectl autoscale deployment codingjedi-cassandra-app --cpu-percent=80 
--min=3 --max=5
  *   check all is fine- kubectl get pods
  *   kubectl expose deployment codingjedi-cassandra-app 
--name=codingjedi-cassandra-app-service --type=LoadBalancer --port 9042 
--target-port 9042
  *   check service is running - kubectl get service
  *   copy external ip address and from laptop run cqlsh external-ip 9042. This 
should start cqlsh. I am able to connect with the cluster from my laptop using 
the external IP. – cqlsh external-ip 9042

My concern is that I have not provided any configuration anywhere which would 
make the different Cassandra nodes work together. So while I have created a 
Kubernetes cluster, I have not created a Cassandra cluster. Am I correct? I 
suppose I need to do more to make the nodes work as Cassandra cluster. How do I 
do that? I mean specify SEED_ADDRESS, LISTENING address, set an external data 
volume so that the data persists etc.?

Regards
Manu


Sent from Mail for Windows 10