[ 
https://issues.apache.org/jira/browse/CASSANDRA-15898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-15898:
---------------------------------------
    Fix Version/s: 3.11.x

> cassandra 3.11.4 deadlock
> -------------------------
>
>                 Key: CASSANDRA-15898
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15898
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: john doe
>            Priority: Normal
>             Fix For: 3.11.x
>
>
> We are running apache-cassandra-3.11.4 on a 10-node cluster with -Xms32G -Xmx32G 
> -Xmn8G using CMS.
> After running for a couple of days, one of the nodes became unresponsive, and a 
> thread dump (jstack -F) shows a deadlock.
> Found one Java-level deadlock:
> =============================
> "Native-Transport-Requests-144": waiting to lock Monitor@0x00007cd5142e4d08 
> (Object@0x00007f6e00348268, a java/io/ExpiringCache),
>  which is held by "CompactionExecutor:115134"
> "CompactionExecutor:115134": waiting to lock Monitor@0x00007f6bcaf130f8 
> (Object@0x00007f6dff31faa0, a 
> ch/qos/logback/core/joran/spi/ConfigurationWatchList),
>  which is held by "Native-Transport-Requests-144"
> Found a total of 1 deadlock.
> I have seen this a couple of times now on different nodes, with the following in 
> system.log:
> IndexSummaryRedistribution.java:77 - Redistributing index summaries
>  NoSpamLogger.java:91 - Maximum memory usage reached (536870912), cannot 
> allocate chunk of 1048576
> Also, looking at the GC log, there has not been a ParNew collection for the last 
> 10 hours, only CMS collections.
> 1739842.375: [GC (CMS Final Remark) [YG occupancy: 2712269 K (7549760 K)]
> 1739842.375: [Rescan (parallel) , 0.0614157 secs]
> 1739842.437: [weak refs processing, 0.0000994 secs]
> 1739842.437: [class unloading, 0.0231076 secs]
> 1739842.460: [scrub symbol table, 0.0061049 secs]
> 1739842.466: [scrub string table, 0.0043847 secs][1 CMS-remark: 
> 17696837K(25165824K)] 20409107K(32715584K), 0.0953750 secs] [Times: user=2.95 
> sys=0.00, real=0.09 secs]
> 1739842.471: [CMS-concurrent-sweep-start]
> 1739848.572: [CMS-concurrent-sweep: 6.101/6.101 secs] [Times: user=6.13 
> sys=0.00, real=6.10 secs]
> 1739848.573: [CMS-concurrent-reset-start]
> 1739848.645: [CMS-concurrent-reset: 0.072/0.072 secs] [Times: user=0.08 
> sys=0.00, real=0.08 secs]
> 1739858.653: [GC (CMS Initial Mark) [1 CMS-initial-mark: 
> 17696837K(25165824K)] 
> 20409111K(32715584K), 0.0584838 secs] [Times: user=2.68 sys=0.00, real=0.06 
> secs]
> 1739858.713: [CMS-concurrent-mark-start]
> 1739860.496: [CMS-concurrent-mark: 1.784/1.784 secs] [Times: user=84.77 
> sys=0.00, real=1.79 secs]
> 1739860.497: [CMS-concurrent-preclean-start]
> 1739860.566: [CMS-concurrent-preclean: 0.070/0.070 secs] [Times: user=0.07 
> sys=0.00, real=0.07 secs]
> 1739860.567: [CMS-concurrent-abortable-preclean-start]CMS: abort preclean due 
> to time
> 1739866.333: [CMS-concurrent-abortable-preclean: 5.766/5.766 secs] [Times: 
> user=5.80 sys=0.00, real=5.76 secs]
> Java HotSpot(TM) 64-Bit Server VM (25.162-b12) for linux-amd64 JRE 
> (1.8.0_162-b12)
> Memory: 4k page, physical 792290076k(2780032k free), swap 16777212k(16693756k 
> free)
> CommandLine flags:
> -XX:+AlwaysPreTouch
> -XX:CICompilerCount=15
> -XX:+CMSClassUnloadingEnabled
> -XX:+CMSEdenChunksRecordAlways
> -XX:CMSInitiatingOccupancyFraction=40
> -XX:+CMSParallelInitialMarkEnabled
> -XX:+CMSParallelRemarkEnabled
> -XX:CMSWaitDuration=10000
> -XX:ConcGCThreads=50
> -XX:+CrashOnOutOfMemoryError
> -XX:GCLogFileSize=10485760
> -XX:+HeapDumpOnOutOfMemoryError
> -XX:InitialHeapSize=34359738368
> -XX:InitialTenuringThreshold=1
> -XX:+ManagementServer
> -XX:MaxHeapSize=34359738368
> -XX:MaxNewSize=8589934592
> -XX:MaxTenuringThreshold=1
> -XX:MinHeapDeltaBytes=196608
> -XX:NewSize=8589934592
> -XX:NumberOfGCLogFiles=10
> -XX:OldPLABSize=16
> -XX:OldSize=25769803776
> -XX:OnOutOfMemoryError=kill -9 %p
> -XX:ParallelGCThreads=50
> -XX:+PerfDisableSharedMem
> -XX:+PrintGC
> -XX:+PrintGCDetails
> -XX:+PrintGCTimeStamps
> -XX:+ResizeTLAB
> -XX:StringTableSize=1000003
> -XX:SurvivorRatio=8
> -XX:ThreadPriorityPolicy=42
> -XX:ThreadStackSize=256
> -XX:-UseBiasedLocking
> -XX:+UseCMSInitiatingOccupancyOnly
> -XX:+UseConcMarkSweepGC
> -XX:+UseCondCardMark
> -XX:+UseFastUnorderedTimeStamps
> -XX:+UseGCLogFileRotation
> -XX:+UseNUMA
> -XX:+UseNUMAInterleaving
> -XX:+UseParNewGC
> -XX:+UseTLAB
> -XX:+UseThreadPriorities
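
For anyone triaging similar thread dumps, here is a minimal sketch of how the
"Found one Java-level deadlock" section quoted above could be turned into a
wait-for graph and checked for a cycle. It is an illustration only, not part
of Cassandra or the JDK: the names REPORT, ENTRY, wait_for_graph and
find_cycle are hypothetical, and the regular expression assumes the jstack -F
layout exactly as it appears in this report.

```python
import re

# Deadlock section copied from the report above (mail line wrapping kept).
REPORT = '''\
"Native-Transport-Requests-144": waiting to lock Monitor@0x00007cd5142e4d08
(Object@0x00007f6e00348268, a java/io/ExpiringCache),
 which is held by "CompactionExecutor:115134"
"CompactionExecutor:115134": waiting to lock Monitor@0x00007f6bcaf130f8
(Object@0x00007f6dff31faa0, a
ch/qos/logback/core/joran/spi/ConfigurationWatchList),
 which is held by "Native-Transport-Requests-144"
'''

# Hypothetical pattern: one entry per blocked thread, tolerant of line wraps.
ENTRY = re.compile(
    r'"(?P<waiter>[^"]+)":\s+waiting to lock\s+\S+\s+'
    r'\(\S+,\s+a\s+(?P<lock>[^)]+?)\s*\),\s+'
    r'which is held by\s+"(?P<holder>[^"]+)"'
)


def wait_for_graph(report):
    """Map each blocked thread to (thread holding the lock, lock class)."""
    return {
        m["waiter"]: (m["holder"], m["lock"])
        for m in ENTRY.finditer(report)
    }


def find_cycle(graph):
    """Return the threads forming a wait-for cycle, or None if there is none."""
    for start in graph:
        seen = [start]
        current = graph[start][0]
        # Follow "is waiting on" edges until we leave the graph or loop.
        while current in graph and current not in seen:
            seen.append(current)
            current = graph[current][0]
        if current == start:
            return seen
    return None


if __name__ == "__main__":
    graph = wait_for_graph(REPORT)
    for waiter, (holder, lock) in graph.items():
        print(f"{waiter} waits for {lock} held by {holder}")
    print("cycle:", find_cycle(graph))
```

Run against the dump above, the graph has two edges that point at each other:
the request thread waits on the java/io/ExpiringCache monitor held by the
compaction thread, which in turn waits on the logback ConfigurationWatchList
monitor held by the request thread.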


