Hello,

It could happen if your GC pauses are too long and/or too frequent, for example if your heap is not large enough. During a long GC pause, a Cassandra node effectively behaves like a dead (unresponsive) node, and other nodes start collecting hints for it, etc. You may want to look into your logs to see whether GC pauses are happening too often: grep for GCInspector in system.log. It could be a possibility.
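As a rough sketch of that log check, the Python snippet below pulls the pause durations out of GCInspector lines so that long or frequent pauses stand out. It assumes the line format shown in the quoted log further down ("... GCInspector.java:258 - G1 Young Generation GC in 222ms. ..."); other Cassandra versions may log a different format. The names parse_gc_pauses and summarize and the 1000 ms threshold are made up for illustration, not taken from Cassandra.

```python
import re

# Assumed line shape, based on the quoted log:
# "... 2017-10-26 21:45:46,796 GCInspector.java:258 - G1 Young Generation GC in 222ms. ..."
GC_LINE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} "
    r"GCInspector\.java:\d+ - (?P<collector>.+?) GC in (?P<ms>\d+)ms"
)


def parse_gc_pauses(lines):
    """Return (timestamp, collector, pause_ms) for each GCInspector line."""
    pauses = []
    for line in lines:
        match = GC_LINE.search(line)
        if match:
            pauses.append(
                (match.group("ts"), match.group("collector"), int(match.group("ms")))
            )
    return pauses


def summarize(pauses, threshold_ms=1000):
    """Summarize pauses; threshold_ms is an arbitrary illustrative cutoff."""
    durations = [ms for _, _, ms in pauses]
    return {
        "count": len(durations),
        "total_ms": sum(durations),
        "max_ms": max(durations, default=0),
        # Pauses at or above the cutoff, in log order.
        "long_pauses": [p for p in pauses if p[2] >= threshold_ms],
    }
```

To use it, open system.log (its location depends on the installation), pass the file object to parse_gc_pauses, and print the result of summarize. Many entries in long_pauses, or a large total_ms over a short time window, would point toward GC as a likely cause; a few short young-generation pauses like the 222ms one in the quoted log would not by themselves explain a 2 minute hang.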
Meg Mara

From: Peng Xiao [mailto:2535...@qq.com]
Sent: Thursday, October 26, 2017 9:24 AM
To: user <user@cassandra.apache.org>
Subject: how to identify the root cause of cassandra hang

Hi,

We have a cluster with 48 nodes configured with RACK. Sometimes it hangs for as long as 2 minutes, and the response time jumps from 300ms to 15s. Could anyone please advise how to identify the root cause?

The following is from the system log:

INFO [Service Thread] 2017-10-26 21:45:46,796 GCInspector.java:258 - G1 Young Generation GC in 222ms. G1 Eden Space: 939524096 -> 0; G1 Old Gen: 6652738584 -> 6662878232; G1 Survivor Space: 134217728 -> 109051904;
INFO [Service Thread] 2017-10-26 21:45:46,796 StatusLogger.java:51 - Pool Name              Active  Pending   Completed  Blocked  All Time Blocked
INFO [Service Thread] 2017-10-26 21:45:46,796 StatusLogger.java:66 - MutationStage               0        3  3612475121        0                 0
INFO [Service Thread] 2017-10-26 21:45:46,796 StatusLogger.java:66 - RequestResponseStage        0        0  6333593550        0                 0
INFO [Service Thread] 2017-10-26 21:45:46,796 StatusLogger.java:66 - ReadRepairStage             0        0     2773154        0                 0
INFO [Service Thread] 2017-10-26 21:45:46,796 StatusLogger.java:66 - CounterMutationStage        0        0           0        0                 0
INFO [Service Thread] 2017-10-26 21:45:46,796 StatusLogger.java:66 - ReadStage                   0        4   417419357        0                 0

Thanks.