Zephyr Guo commented on CASSANDRA-15295:

Hi [~djoshi], thanks for the review. I agree with most of what you said, and 
your change is a very minimal fix for this issue. But there are a few things I 
still have to point out.

1. Most of the changes in my patch go toward building a unit test that covers 
the exception case for CommitLog. It's worth it for Cassandra. We not only 
have to fix this problem, we also need to understand its root cause (the lack 
of exception tests).

2. Your change introduces the risk of starting twice. CommitLog was designed to 
be a singleton, and it manages its lifecycle by itself. When other modules call 
CommitLog.instance, they expect an initialized CommitLog. Your change alters 
the original initialization process.

3. The major change in my patch (moving code to a different class) is very 
simple. It DOES NOT change the original initialization process.

4. I agree that "I think it is important to get the correctness issue 
resolved first". Don't you think that moving the code to another class is the 

I respect your decision; please incorporate my patch to get a better one. 
What's next?
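
As a rough illustration of point 3, and not the actual patch: if a static error handler lives in its own class, a helper thread can call it even while the singleton's class is still being initialized. Every name below (MovedHandlerSketch, FakeCommitLog, FakeErrorHandler, FakeAllocator) is a hypothetical stand-in, and which code the real patch moves is an assumption made only for this sketch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-ins; none of these are real Cassandra classes.
class MovedHandlerSketch {
    static final CountDownLatch handlerReturned = new CountDownLatch(1);

    // The static handler lives in its own class, so calling it does not
    // require FakeCommitLog to have finished initializing.
    static class FakeErrorHandler {
        static void handleCommitError() { }
    }

    static class FakeAllocator implements Runnable {
        @Override
        public void run() {
            FakeErrorHandler.handleCommitError();
            handlerReturned.countDown();
        }
    }

    static class FakeCommitLog {
        static final boolean handlerRanDuringInit;

        static {
            // Start a helper thread during class initialization and wait for it.
            Thread allocator = new Thread(new FakeAllocator(), "FAKE-ALLOCATOR");
            allocator.setDaemon(true);
            allocator.start();
            boolean ran;
            try {
                ran = handlerReturned.await(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                ran = false;
            }
            handlerRanDuringInit = ran;
        }
    }

    // Triggers FakeCommitLog initialization and reports what it observed.
    static boolean handlerRanDuringInit() {
        return FakeCommitLog.handlerRanDuringInit;
    }

    public static void main(String[] args) {
        System.out.println("handler ran during init: " + handlerRanDuringInit());
    }
}
```

Here the helper thread never touches FakeCommitLog, so it has no reason to wait for that class's static initializer to finish.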

> Running into deadlock when do CommitLog initialization
> ------------------------------------------------------
>                 Key: CASSANDRA-15295
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15295
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Commit Log
>            Reporter: Zephyr Guo
>            Assignee: Zephyr Guo
>            Priority: Normal
>         Attachments: jstack.log, pstack.log, screenshot-1.png, 
> screenshot-2.png, screenshot-3.png
> Recently, I found a Cassandra (3.11.4) node stuck in STARTING status for a 
> long time.
> I used jstack to see what happened. The main thread was stuck in 
> *AbstractCommitLogSegmentManager.awaitAvailableSegment*
>  !screenshot-1.png! 
> The strange thing is that the COMMIT-LOG-ALLOCATOR thread state was RUNNABLE, 
> but it was not actually running.
>  !screenshot-2.png! 
> I then used pstack to troubleshoot and found COMMIT-LOG-ALLOCATOR blocked on 
> Java class initialization.
>   !screenshot-3.png! 
> This is clearly a deadlock. CommitLog waits for a CommitLogSegment while 
> initializing. At this moment, the CommitLog class is not yet initialized and 
> the main thread holds the class initialization lock. Meanwhile, 
> COMMIT-LOG-ALLOCATOR hits an exception while creating a CommitLogSegment and 
> calls *CommitLog.handleCommitError* (a static method). COMMIT-LOG-ALLOCATOR 
> blocks on this call because the CommitLog class is still initializing.
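
The class-initialization deadlock described above can be reproduced in miniature. The sketch below is not Cassandra code: ClassInitDeadlockSketch, FakeCommitLog and FakeAllocator are hypothetical stand-ins, and the wait inside the static initializer is deliberately bounded (300 ms, an arbitrary choice) so that the program terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-ins; none of these are real Cassandra classes.
class ClassInitDeadlockSketch {
    static final CountDownLatch handlerReturned = new CountDownLatch(1);

    static class FakeCommitLog {
        static final boolean handlerRanDuringInit;

        static {
            // Like the main thread in the report: start a helper thread during
            // class initialization, then wait for it.
            Thread allocator = new Thread(new FakeAllocator(), "FAKE-ALLOCATOR");
            allocator.setDaemon(true);
            allocator.start();
            boolean ran;
            try {
                // Bounded wait so the sketch terminates; an unbounded wait
                // here would never return.
                ran = handlerReturned.await(300, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                ran = false;
            }
            handlerRanDuringInit = ran;
        }

        static void handleCommitError() { }
    }

    static class FakeAllocator implements Runnable {
        @Override
        public void run() {
            // Calling a static method of a class that another thread is still
            // initializing blocks until that initialization completes.
            FakeCommitLog.handleCommitError();
            handlerReturned.countDown();
        }
    }

    // Returns {handler ran during init, handler ran after init}.
    static boolean[] runDemo() {
        boolean during = FakeCommitLog.handlerRanDuringInit;
        boolean after;
        try {
            after = handlerReturned.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            after = false;
        }
        return new boolean[] { during, after };
    }

    public static void main(String[] args) {
        boolean[] r = runDemo();
        System.out.println("handler ran during init: " + r[0]);
        System.out.println("handler ran after init:  " + r[1]);
    }
}
```

With the bounded wait, the helper thread should only get through handleCommitError after the static initializer has given up waiting. With an unbounded wait, each thread would wait on the other indefinitely, which matches the hang in the report.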
