Re: Nodes unresponsive after upgrade 3.9 -> 3.11.2
Martin,

Would you please share the settings you had before and what you changed? We have a similar issue.

> On Mar 23, 2018, at 8:47 AM, Martin Mačura wrote:
>
> Nevermind, we resolved the issue. JVM heap settings were misconfigured.
>
> Martin
Re: Nodes unresponsive after upgrade 3.9 -> 3.11.2
Nevermind, we resolved the issue. JVM heap settings were misconfigured.

Martin
Nodes unresponsive after upgrade 3.9 -> 3.11.2
Hi all,

We have a cluster of 3 nodes with RF 3 that ran fine until we upgraded it to 3.11.2.

Each node has 32 GB RAM, 8 GB Cassandra heap size.

After the upgrade, clients started reporting connection issues:

cassandra | [ERROR] Closing established connection pool to host because of the following error: Read error 'connection reset by peer' (src/pool.cpp:384)
cassandra | [ERROR] Unable to establish a control connection to host because of the following error: Error: 'Request timed out' (0x010E) (src/control_connection.cpp:263)

Cassandra logs are full of garbage collection warnings:

WARN [Service Thread] 2018-03-23 05:04:17,780 GCInspector.java:282 - ConcurrentMarkSweep GC in 7858ms. Par Eden Space: 6871908352 -> 1774446288; Par Survivor Space: 858980344 -> 0
INFO [Service Thread] 2018-03-23 05:04:17,780 StatusLogger.java:47 - Pool Name Active Pending Completed Blocked All Time Blocked
INFO [Service Thread] 2018-03-23 05:04:17,784 StatusLogger.java:51 - MutationStage10 92526002 0 0
INFO [Service Thread] 2018-03-23 05:04:17,784 StatusLogger.java:51 - ViewMutationStage 0 0 0 0 0
INFO [Service Thread] 2018-03-23 05:04:17,784 StatusLogger.java:51 - ReadStage 2 2 943544 0 0
INFO [Service Thread] 2018-03-23 05:04:17,784 StatusLogger.java:51 - RequestResponseStage 0 0 1666876 0 0
INFO [Service Thread] 2018-03-23 05:04:17,785 StatusLogger.java:51 - ReadRepairStage 0 0 10362 0 0
INFO [Service Thread] 2018-03-23 05:04:17,785 StatusLogger.java:51 - CounterMutationStage 0 0 0 0 0
INFO [Service Thread] 2018-03-23 05:04:17,785 StatusLogger.java:51 - MiscStage 0 0 0 0 0
INFO [Service Thread] 2018-03-23 05:04:17,785 StatusLogger.java:51 - CompactionExecutor 0 0 3076 0 0
INFO [Service Thread] 2018-03-23 05:04:17,785 StatusLogger.java:51 - MemtableReclaimMemory 0 0 44 0 0
INFO [Service Thread] 2018-03-23 05:04:17,786 StatusLogger.java:51 - PendingRangeCalculator 0 0 4 0 0
INFO [Service Thread] 2018-03-23 05:04:17,786 StatusLogger.java:51 - GossipStage 0 0 14287 0 0
INFO [Service Thread] 2018-03-23 05:04:17,786 StatusLogger.java:51 - SecondaryIndexManagement 0 0 0 0 0
INFO [Service Thread] 2018-03-23 05:04:17,786 StatusLogger.java:51 - HintsDispatcher 0 0 1 0 0
INFO [Service Thread] 2018-03-23 05:04:17,804 StatusLogger.java:51 - PerDiskMemtableFlushWriter_1 0 0 37 0 0
INFO [Service Thread] 2018-03-23 05:04:17,805 StatusLogger.java:51 - PerDiskMemtableFlushWriter_2 0 0 37 0 0
INFO [Service Thread] 2018-03-23 05:04:17,805 StatusLogger.java:51 - MigrationStage 0 0 2 0 0
INFO [Service Thread] 2018-03-23 05:04:17,806 StatusLogger.java:51 - MemtablePostFlush 0 0 72 0 0
INFO [Service Thread] 2018-03-23 05:04:17,806 StatusLogger.java:51 - PerDiskMemtableFlushWriter_0 0 0 44 0 0
INFO [Service Thread] 2018-03-23 05:04:17,806 StatusLogger.java:51 - ValidationExecutor 0 0 0 0 0
INFO [Service Thread] 2018-03-23 05:04:17,806 StatusLogger.java:51 - Sampler 0 0 0 0 0
INFO [Service Thread] 2018-03-23 05:04:17,807 StatusLogger.java:51 - MemtableFlushWriter 0 0 44 0 0
INFO [Service Thread] 2018-03-23 05:04:17,807 StatusLogger.java:51 - PerDiskMemtableFlushWriter_5 0 0 37 0 0
INFO [Service Thread] 2018-03-23 05:04:17,807 StatusLogger.java:51 - InternalResponseStage 0 0 0 0 0
INFO [Service Thread] 2018-03-23 05:04:17,819 StatusLogger.java:51 - PerDiskMemtableFlushWriter_3 0 0 37 0 0
INFO [Service Thread] 2018-03-23 05:04:17,819 StatusLogger.java:51 - PerDiskMemtableFlushWriter_4 0 0 37 0 0
INFO [Service Thread] 2018-03-23 05:04:17,820 StatusLogger.java:51 - AntiEntropyStage 0
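
For anyone reading a GCInspector warning like the one above, the following is a rough Python sketch of how the pause time and space sizes could be pulled out and compared with the configured heap. The function name parse_gc_warning and the constant HEAP_BYTES are made up for illustration, and reading "8 GB heap" as 8 GiB is an assumption. The thread never says which heap setting was wrong, so this only shows the arithmetic on the logged numbers.

```python
import re

# The GCInspector warning quoted in the original message, joined onto one line.
GC_LINE = (
    "WARN [Service Thread] 2018-03-23 05:04:17,780 GCInspector.java:282 - "
    "ConcurrentMarkSweep GC in 7858ms. Par Eden Space: 6871908352 -> "
    "1774446288; Par Survivor Space: 858980344 -> 0"
)

# Assumption: the "8 GB Cassandra heap size" from the message means 8 GiB.
HEAP_BYTES = 8 * 2**30


def parse_gc_warning(line):
    """Extract collector name, pause in ms, and before/after bytes per space.

    Hypothetical helper; returns None if the line is not a GC pause message.
    """
    pause = re.search(r"(\w+) GC in (\d+)ms", line)
    if pause is None:
        return None
    spaces = {}
    # Each space is logged as "<name>: <bytes before> -> <bytes after>".
    for name, before, after in re.findall(r"([A-Za-z ]+?): (\d+) -> (\d+)", line):
        spaces[name.strip()] = (int(before), int(after))
    return {
        "collector": pause.group(1),
        "pause_ms": int(pause.group(2)),
        "spaces": spaces,
    }


result = parse_gc_warning(GC_LINE)
eden_before, eden_after = result["spaces"]["Par Eden Space"]
eden_fraction = eden_before / HEAP_BYTES

print(f"{result['collector']} pause: {result['pause_ms'] / 1000:.1f} s")
print(f"Eden before GC: {eden_before / 2**30:.2f} GiB "
      f"({eden_fraction:.0%} of the assumed heap)")
```

Under these assumptions, eden alone was holding roughly four fifths of the heap before the collection. That may be worth checking against the intended new-generation size, although it does not establish what the actual misconfiguration was.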