Hi,
We have a cluster where, if reads suddenly increase 2-3x, Cassandra CPU
usage goes to around 100% on a few nodes (the machines have 48 CPUs and
128 GB RAM) and Cassandra becomes unresponsive.
We are on 3.11.5, using G1GC with a 16 GB heap.
Going through system.log and gc.log, I see that system.log just prints
messages like the ones below every 5 seconds (I have removed the lines for
many keyspaces to reduce the size of the text), and a lot of messages are
being printed in gc.log. I feel that I may need to increase the heap size
on these nodes, but I wanted to understand how to determine whether the
heap size should be increased or not. The nodes are not dying due to OOMs.
When we have OOMs, we know for sure that we need to increase the heap size,
but *what should we look for in gc.log, system.log and debug.log to
determine whether we have to increase the heap size?*
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,368 MessagingService.java:1246 - READ messages were dropped in last 5000 ms: 199 internal and 232 cross node. Mean internal dropped latency: 10443 ms and Mean cross-node dropped latency: 10402 ms
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,369 StatusLogger.java:47 - Pool Name                    Active   Pending      Completed   Blocked  All Time Blocked
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,377 StatusLogger.java:51 - MutationStage                     0         0       80051890         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,378 StatusLogger.java:51 - ViewMutationStage                 0         0              0         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,378 StatusLogger.java:51 - ReadStage                       192      1331      152624049         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,378 StatusLogger.java:51 - RequestResponseStage              0         0      172822890         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,378 StatusLogger.java:51 - ReadRepairStage                   0         0        1545869         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,379 StatusLogger.java:51 - CounterMutationStage              0         0              0         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,379 StatusLogger.java:51 - MiscStage                         0         0              0         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,379 StatusLogger.java:51 - CompactionExecutor                0         0         623536         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,379 StatusLogger.java:51 - MemtableReclaimMemory             0         0           6700         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,380 StatusLogger.java:51 - PendingRangeCalculator            0         0             18         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,380 StatusLogger.java:51 - GossipStage                       0         0        1613366         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,380 StatusLogger.java:51 - SecondaryIndexManagement          0         0              0         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,380 StatusLogger.java:51 - HintsDispatcher                   0         0              5         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,381 StatusLogger.java:51 - MigrationStage                    0         0              1         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,381 StatusLogger.java:51 - MemtablePostFlush                 0         0          14830         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,381 StatusLogger.java:51 - PerDiskMemtableFlushWriter_0         0         0           6700         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,381 StatusLogger.java:51 - ValidationExecutor                0         0              0         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,382 StatusLogger.java:51 - Sampler                           0         0              0         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,382 StatusLogger.java:51 - MemtableFlushWriter               0         0           6700         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,382 StatusLogger.java:51 - InternalResponseStage             0         0          33229         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,383 StatusLogger.java:51 - AntiEntropyStage                  0         0              0         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,383 StatusLogger.java:51 - CacheCleanupExecutor              0         0              0         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,383 StatusLogger.java:51 - Native-Transport-Requests       661         0       84577742         0                 0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,383 StatusLogger.java:61 - CompactionManager                 0         0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,384 StatusLogger.java:73 - MessagingService                n/a       0/0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,384 StatusLogger.java:83 - Cache Type                     Size                 Capacity               KeysToSave
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,384 StatusLogger.java:85 - KeyCache                  104857576                104857600                      all
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,384 StatusLogger.java:91 - RowCache                          0                        0                      all
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,384 StatusLogger.java:98 - Table                                      Memtable ops,data
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,429 StatusLogger.java:101 - system_distributed.parent_repair_history  0,0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,429 StatusLogger.java:101 - system_distributed.repair_history         0,0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,430 StatusLogger.java:101 - system_distributed.view_build_status      0,0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,430 StatusLogger.java:101 - system.compaction_history                 12,3327
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,430 StatusLogger.java:101 - system.schema_aggregates                  0,0
INFO [ScheduledTasks:1] 2020-08-19 08:13:12,430 StatusLogger.java:101 - system.schema_triggers                    0,0
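
To make the excerpt above easier to scan, here is a minimal Python sketch that pulls out the thread pools with pending tasks and totals the dropped messages per verb. It is not part of Cassandra: the function name summarize, the two regexes, and the pending_threshold parameter are my own assumptions based on the layout of the lines shown here, and it expects each log entry on a single line (i.e. with the mail wrapping undone). It only surfaces where the backlog is; it does not by itself say whether the heap is too small.

```python
import re

# Thread-pool row from StatusLogger: pool name followed by five integer
# columns (Active, Pending, Completed, Blocked, All Time Blocked).
# Assumption: the layout matches the 3.11.5 excerpt above.
POOL_ROW = re.compile(
    r"StatusLogger\.java:\d+ - (?P<pool>\S+)\s+(?P<active>\d+)\s+(?P<pending>\d+)"
    r"\s+(?P<completed>\d+)\s+(?P<blocked>\d+)\s+(?P<all_blocked>\d+)\s*$"
)

# Dropped-message summary line from MessagingService.
DROPPED = re.compile(
    r"(?P<verb>\w+) messages were dropped in last (?P<window>\d+) ms: "
    r"(?P<internal>\d+) internal and (?P<cross>\d+) cross node"
)


def summarize(lines, pending_threshold=0):
    """Return (backlog, dropped) for an iterable of system.log lines.

    backlog maps pool name -> (active, pending) for pools whose pending
    count exceeds pending_threshold; dropped maps verb -> total dropped
    (internal + cross node) over all matching lines.
    """
    backlog = {}
    dropped = {}
    for line in lines:
        m = POOL_ROW.search(line)
        if m:
            if int(m["pending"]) > pending_threshold:
                backlog[m["pool"]] = (int(m["active"]), int(m["pending"]))
            continue
        m = DROPPED.search(line)
        if m:
            total = int(m["internal"]) + int(m["cross"])
            dropped[m["verb"]] = dropped.get(m["verb"], 0) + total
    return backlog, dropped


# A few lines taken from the excerpt above, rejoined onto one line each.
sample = [
    "INFO [ScheduledTasks:1] 2020-08-19 08:13:12,368 MessagingService.java:1246 - "
    "READ messages were dropped in last 5000 ms: 199 internal and 232 cross node. "
    "Mean internal dropped latency: 10443 ms and Mean cross-node dropped latency: 10402 ms",
    "INFO [ScheduledTasks:1] 2020-08-19 08:13:12,377 StatusLogger.java:51 - "
    "MutationStage 0 0 80051890 0 0",
    "INFO [ScheduledTasks:1] 2020-08-19 08:13:12,378 StatusLogger.java:51 - "
    "ReadStage 192 1331 152624049 0 0",
    "INFO [ScheduledTasks:1] 2020-08-19 08:13:12,383 StatusLogger.java:51 - "
    "Native-Transport-Requests 661 0 84577742 0 0",
    "INFO [ScheduledTasks:1] 2020-08-19 08:13:12,384 StatusLogger.java:85 - "
    "KeyCache 104857576 104857600 all",
]

backlog, dropped = summarize(sample)
print(backlog)
print(dropped)
```

On the sample lines this reports only ReadStage as backed up (192 active, 1331 pending) and 431 dropped READ messages, which matches what can be read off the excerpt by eye.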
Thanks