[ 
https://issues.apache.org/jira/browse/CASSANDRA-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16931810#comment-16931810
 ] 

Jordan West edited comment on CASSANDRA-15295 at 9/18/19 5:10 PM:
------------------------------------------------------------------

Happy to [~gzh1992n]. A few comments:
 
 * I verified that it does not affect the 3.0 branch, because 
{{CommitLogSegmentManager#start}} exits immediately after starting 
{{managerThread}} instead of calling {{advanceAllocatingFrom(null)}};
 * The database doesn’t start because no default commit log segment manager 
factory is set, which also causes many tests to fail. Test runs: 
https://circleci.com/gh/jrwest/cassandra/tree/bug-commitlog-deadlock
 * CommitLogInitWithExceptionTest#L63 - the test should check that 
initThread is not null before making this call

Minor naming nits (do with them what you please):
 * Rename KillerHook => OnKillHook, and onKill => execute
 * Drop the “I*” naming for the CommitLogSegmentMgrFactoryInterface. Consider 
renaming it CommitLogSegmentManagerFactory
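
To make the missing-default-factory point and the naming nits concrete, here is a minimal sketch of how the renamed types could look with a default factory installed up front. Only the names OnKillHook, execute and CommitLogSegmentManagerFactory come from the comments above; the method signatures, SegmentManager and SegmentManagerFactories are hypothetical stand-ins and are not taken from the patch.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical shape of the renamed hook (KillerHook#onKill -> OnKillHook#execute).
@FunctionalInterface
interface OnKillHook {
    void execute(Throwable cause);
}

// Hypothetical stand-in for the real segment manager type.
interface SegmentManager {
    String name();
}

// Renamed from CommitLogSegmentMgrFactoryInterface; the create() signature is an assumption.
@FunctionalInterface
interface CommitLogSegmentManagerFactory {
    SegmentManager create();
}

// Hypothetical holder: a default factory is always present, so startup
// never observes "no factory set". Tests may swap in their own factory.
final class SegmentManagerFactories {
    private static final CommitLogSegmentManagerFactory DEFAULT = () -> () -> "default";

    private static final AtomicReference<CommitLogSegmentManagerFactory> current =
            new AtomicReference<>(DEFAULT);

    private SegmentManagerFactories() { }

    static void set(CommitLogSegmentManagerFactory factory) {
        // Reject null so the "no factory" state cannot be reintroduced.
        current.set(Objects.requireNonNull(factory, "factory"));
    }

    static CommitLogSegmentManagerFactory get() {
        return current.get();
    }
}
```

With something along these lines, a test can call SegmentManagerFactories.set(...) to inject a failing manager, while normal startup falls back to the default without any extra configuration.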


> Running into deadlock when do CommitLog initialization
> ------------------------------------------------------
>
>                 Key: CASSANDRA-15295
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15295
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Commit Log
>            Reporter: Zephyr Guo
>            Assignee: Zephyr Guo
>            Priority: Normal
>         Attachments: jstack.log, pstack.log, screenshot-1.png, 
> screenshot-2.png, screenshot-3.png
>
>
> Recently, I found a Cassandra (3.11.4) node stuck in STARTING status for a 
> long time.
>  I used jstack to see what was happening. The main thread was stuck in 
> *AbstractCommitLogSegmentManager.awaitAvailableSegment*
>  !screenshot-1.png! 
> Strangely, the COMMIT-LOG-ALLOCATOR thread was reported as runnable, but it 
> was not actually running.  
>  !screenshot-2.png! 
> I then used pstack to troubleshoot and found that COMMIT-LOG-ALLOCATOR was 
> blocked on Java class initialization.
>   !screenshot-3.png! 
> This is clearly a deadlock. CommitLog waits for a CommitLogSegment while it 
> is being initialized. At that moment, the CommitLog class is not yet 
> initialized and the main thread holds the class initialization lock. 
> Meanwhile, COMMIT-LOG-ALLOCATOR hits an exception while creating a 
> CommitLogSegment and calls *CommitLog.handleCommitError* (a static method). 
> COMMIT-LOG-ALLOCATOR blocks on that call because the CommitLog class is 
> still initializing.
>  
>  
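
The class-initialization deadlock described above can be reproduced in isolation. The sketch below is not Cassandra code: FakeCommitLog, handleError and the FAKE-ALLOCATOR thread are hypothetical stand-ins for CommitLog, CommitLog.handleCommitError and COMMIT-LOG-ALLOCATOR. It also differs from the real bug in one respect: the static initializer waits with a timeout so that the demo terminates, whereas the reported hang waits indefinitely.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for CommitLog: its static initializer waits on a helper thread.
final class FakeCommitLog {
    // True only if the helper thread finished while the static initializer was still running.
    static final boolean helperFinishedDuringInit;

    static {
        CountDownLatch done = new CountDownLatch(1);
        Thread allocator = new Thread(() -> {
            // Any static call into FakeCommitLog from another thread has to wait
            // until the thread running this static initializer has finished.
            FakeCommitLog.handleError();
            done.countDown();
        }, "FAKE-ALLOCATOR");
        allocator.setDaemon(true);
        allocator.start();

        boolean finished;
        try {
            // The real bug waits without a timeout, so both threads hang forever.
            finished = done.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            finished = false;
        }
        helperFinishedDuringInit = finished;
    }

    private FakeCommitLog() { }

    // Stand-in for the static error handler called from the allocator thread.
    static void handleError() { }
}

final class ClassInitDeadlockDemo {
    public static void main(String[] args) {
        // Reading the field triggers FakeCommitLog's static initializer on this thread.
        System.out.println("helper finished during init: "
                + FakeCommitLog.helperFinishedDuringInit);
    }
}
```

On a standard JVM the helper thread cannot get past the initialization barrier until the static block returns, so the flag should be false. A thread dump taken during the wait would show the helper in a runnable-looking state while it is in fact parked inside class initialization, which matches the jstack and pstack observations above.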



