[ https://issues.apache.org/jira/browse/CASSANDRA-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16435829#comment-16435829 ]
Jeff Jirsa edited comment on CASSANDRA-12882 at 4/12/18 3:57 PM:
-----------------------------------------------------------------

Have seen something similar when either a flush thread or read thread is holding the memtable oporder barrier and the slab allocator can't allocate because it's at its max allocation size.

I see a blocked flush thread in your stack, and it's waiting on:

{code}
"CompactionExecutor:87214" #790268 daemon prio=1 os_prio=4 tid=0x00007f8f60346800 nid=0x2d3e runnable [0x00007fd8acf9b000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.cassandra.dht.Bounds.contains(Bounds.java:52)
        at org.apache.cassandra.dht.Bounds.intersects(Bounds.java:80)
        at org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:534)
        at org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:520)
        at org.apache.cassandra.db.compaction.LeveledManifest.getCandidatesFor(LeveledManifest.java:651)
        at org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates(LeveledManifest.java:330)
        - locked <0x00000005580ed570> (a org.apache.cassandra.db.compaction.LeveledManifest)
        at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:98)
        - locked <0x0000000558020740> (a org.apache.cassandra.db.compaction.LeveledCompactionStrategy)
        at org.apache.cassandra.db.compaction.CompactionStrategyManager.getNextBackgroundTask(CompactionStrategyManager.java:108)
        - locked <0x0000000558056bd0> (a org.apache.cassandra.db.compaction.CompactionStrategyManager)
        at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:258)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{code}

Is that thread making progress? Do you have some incredible number of sstables? How often are you trying to flush (what are your memtable settings)?


was (Author: jjirsa):
Have seen something similar when either a flush thread or read thread is holding the memtable oporder barrier and the slab allocator can't allocate because it's at its max allocation size.

I don't see a blocked flush thread in your stack, and it's waiting on:

{code}
"CompactionExecutor:87214" #790268 daemon prio=1 os_prio=4 tid=0x00007f8f60346800 nid=0x2d3e runnable [0x00007fd8acf9b000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.cassandra.dht.Bounds.contains(Bounds.java:52)
        at org.apache.cassandra.dht.Bounds.intersects(Bounds.java:80)
        at org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:534)
        at org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:520)
        at org.apache.cassandra.db.compaction.LeveledManifest.getCandidatesFor(LeveledManifest.java:651)
        at org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates(LeveledManifest.java:330)
        - locked <0x00000005580ed570> (a org.apache.cassandra.db.compaction.LeveledManifest)
        at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:98)
        - locked <0x0000000558020740> (a org.apache.cassandra.db.compaction.LeveledCompactionStrategy)
        at org.apache.cassandra.db.compaction.CompactionStrategyManager.getNextBackgroundTask(CompactionStrategyManager.java:108)
        - locked <0x0000000558056bd0> (a org.apache.cassandra.db.compaction.CompactionStrategyManager)
        at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:258)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{code}

Is that thread making progress? Do you have some incredible number of sstables? How often are you trying to flush (what are your memtable settings)?


> Deadlock in MemtableAllocator
> -----------------------------
>
>                 Key: CASSANDRA-12882
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12882
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Ubuntu 14.40
>                      Cassandra 3.5
>            Reporter: Nimi Wariboko Jr.
>            Priority: Major
>             Fix For: 3.11.x
>
>         Attachments: cassandra.yaml, threaddump.txt
>
>
> I'm seeing an issue where a node will eventually lock up and their thread
> pools - I looked into jstack, and a lot of threads are stuck in the Memtable
> Allocator
> {code}
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>       at org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUninterruptibly(WaitQueue.java:279)
>       at org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator.allocate(MemtableAllocator.java:198)
>       at org.apache.cassandra.utils.memory.SlabAllocator.allocate(SlabAllocator.java:89)
>       at org.apache.cassandra.utils.memory.ContextAllocator.allocate(ContextAllocator.java:57)
>       at org.apache.cassandra.utils.memory.ContextAllocator.clone(ContextAllocator.java:47)
>       at org.apache.cassandra.utils.memory.MemtableBufferAllocator.clone(MemtableBufferAllocator.java:41)
> {code}
> I looked into the code, and it's not immediately apparent to me what thread
> might hold the relevant lock.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)