[ 
https://issues.apache.org/jira/browse/CASSANDRA-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16435829#comment-16435829
 ] 

Jeff Jirsa edited comment on CASSANDRA-12882 at 4/12/18 3:57 PM:
-----------------------------------------------------------------

I have seen something similar when either a flush thread or a read thread is 
holding the memtable OpOrder barrier and the slab allocator can't allocate 
because it's at its max allocation size.
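
As a rough illustration of that stall, here is a toy model and not Cassandra's allocator: `ToyPool`, its permit-based accounting, and the timeout are all assumptions made for the sketch. Writers wait for space in a bounded pool, and space only comes back when a flush releases it. The allocator in the reported trace parks without a timeout, so a flush that never completes leaves writers waiting indefinitely.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class PoolStallSketch {
    // Hypothetical stand-in for a bounded memtable memory pool; one permit models one unit of memory.
    static final class ToyPool {
        private final Semaphore free;

        ToyPool(int limit) {
            free = new Semaphore(limit);
        }

        // Wait up to timeoutMs for space. Returns false if no space was released in time.
        boolean allocate(int size, long timeoutMs) {
            try {
                return free.tryAcquire(size, timeoutMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }

        // Models a completed flush handing memory back to the pool.
        void release(int size) {
            free.release(size);
        }
    }

    static boolean[] demo() {
        ToyPool pool = new ToyPool(4);
        boolean first = pool.allocate(4, 50);   // fills the pool to its limit
        boolean stalled = pool.allocate(1, 50); // nothing has been released, so this times out
        pool.release(4);                        // the "flush" completes
        boolean after = pool.allocate(1, 50);   // space is available again
        return new boolean[] { first, stalled, after };
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

In this model the second allocation fails only because nothing released memory in time; whichever thread is responsible for releasing it is the one worth finding in the dump.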

I see a blocked flush thread in your stack, and it's waiting on:

{code}
"CompactionExecutor:87214" #790268 daemon prio=1 os_prio=4 tid=0x00007f8f60346800 nid=0x2d3e runnable [0x00007fd8acf9b000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.cassandra.dht.Bounds.contains(Bounds.java:52)
        at org.apache.cassandra.dht.Bounds.intersects(Bounds.java:80)
        at org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:534)
        at org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:520)
        at org.apache.cassandra.db.compaction.LeveledManifest.getCandidatesFor(LeveledManifest.java:651)
        at org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates(LeveledManifest.java:330)
        - locked <0x00000005580ed570> (a org.apache.cassandra.db.compaction.LeveledManifest)
        at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:98)
        - locked <0x0000000558020740> (a org.apache.cassandra.db.compaction.LeveledCompactionStrategy)
        at org.apache.cassandra.db.compaction.CompactionStrategyManager.getNextBackgroundTask(CompactionStrategyManager.java:108)
        - locked <0x0000000558056bd0> (a org.apache.cassandra.db.compaction.CompactionStrategyManager)
        at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:258)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{code}

Is that thread making progress? Do you have some incredible number of sstables? 
How often are you trying to flush (what are your memtable settings)? 
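
One way to check the first question is to take two thread dumps a few seconds apart and compare the top frames of the compaction thread. The following is a minimal sketch, assuming the dumps are available as jstack-style text; the class and method names (`ThreadProgressSketch`, `topFrames`, `framesChanged`) are hypothetical helpers, not part of Cassandra or the JDK.

```java
import java.util.ArrayList;
import java.util.List;

public class ThreadProgressSketch {
    // Collect up to maxFrames "at ..." lines for the named thread from a jstack-style dump.
    static List<String> topFrames(String dump, String threadName, int maxFrames) {
        List<String> frames = new ArrayList<>();
        boolean inThread = false;
        for (String line : dump.split("\n")) {
            String t = line.trim();
            if (t.startsWith("\"")) {
                // A line starting with a quoted name opens a new thread section.
                inThread = t.startsWith("\"" + threadName + "\"");
                continue;
            }
            if (inThread && t.startsWith("at ") && frames.size() < maxFrames) {
                frames.add(t);
            }
        }
        return frames;
    }

    // True when the thread's top frames differ between the two dumps.
    static boolean framesChanged(String dumpA, String dumpB, String threadName, int maxFrames) {
        return !topFrames(dumpA, threadName, maxFrames)
                .equals(topFrames(dumpB, threadName, maxFrames));
    }

    public static void main(String[] args) {
        String first = "\"CompactionExecutor:1\" #1 daemon\n"
                + "   java.lang.Thread.State: RUNNABLE\n"
                + "\tat a.B.c(B.java:1)\n"
                + "\tat a.D.e(D.java:2)\n";
        String second = "\"CompactionExecutor:1\" #1 daemon\n"
                + "   java.lang.Thread.State: RUNNABLE\n"
                + "\tat a.B.x(B.java:9)\n"
                + "\tat a.D.e(D.java:2)\n";
        System.out.println(framesChanged(first, second, "CompactionExecutor:1", 3));
    }
}
```

Changed frames only suggest movement, since a thread can cycle through the same loop without finishing. Identical frames across several dumps are a stronger hint that it is stuck or grinding through a very large candidate set.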



was (Author: jjirsa):
I have seen something similar when either a flush thread or a read thread is 
holding the memtable OpOrder barrier and the slab allocator can't allocate 
because it's at its max allocation size.

I don't see a blocked flush thread in your stack, and it's waiting on:

{code}
"CompactionExecutor:87214" #790268 daemon prio=1 os_prio=4 tid=0x00007f8f60346800 nid=0x2d3e runnable [0x00007fd8acf9b000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.cassandra.dht.Bounds.contains(Bounds.java:52)
        at org.apache.cassandra.dht.Bounds.intersects(Bounds.java:80)
        at org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:534)
        at org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:520)
        at org.apache.cassandra.db.compaction.LeveledManifest.getCandidatesFor(LeveledManifest.java:651)
        at org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates(LeveledManifest.java:330)
        - locked <0x00000005580ed570> (a org.apache.cassandra.db.compaction.LeveledManifest)
        at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:98)
        - locked <0x0000000558020740> (a org.apache.cassandra.db.compaction.LeveledCompactionStrategy)
        at org.apache.cassandra.db.compaction.CompactionStrategyManager.getNextBackgroundTask(CompactionStrategyManager.java:108)
        - locked <0x0000000558056bd0> (a org.apache.cassandra.db.compaction.CompactionStrategyManager)
        at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:258)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{code}

Is that thread making progress? Do you have some incredible number of sstables? 
How often are you trying to flush (what are your memtable settings)? 


> Deadlock in MemtableAllocator
> -----------------------------
>
>                 Key: CASSANDRA-12882
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12882
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Ubuntu 14.04
> Cassandra 3.5
>            Reporter: Nimi Wariboko Jr.
>            Priority: Major
>             Fix For: 3.11.x
>
>         Attachments: cassandra.yaml, threaddump.txt
>
>
> I'm seeing an issue where a node will eventually lock up, along with its 
> thread pools. I looked at the jstack output, and a lot of threads are stuck 
> in the Memtable Allocator:
> {code}
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>       at org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUninterruptibly(WaitQueue.java:279)
>       at org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator.allocate(MemtableAllocator.java:198)
>       at org.apache.cassandra.utils.memory.SlabAllocator.allocate(SlabAllocator.java:89)
>       at org.apache.cassandra.utils.memory.ContextAllocator.allocate(ContextAllocator.java:57)
>       at org.apache.cassandra.utils.memory.ContextAllocator.clone(ContextAllocator.java:47)
>       at org.apache.cassandra.utils.memory.MemtableBufferAllocator.clone(MemtableBufferAllocator.java:41)
> {code}
> I looked into the code, and it's not immediately apparent to me which thread 
> might hold the relevant lock.


