[ https://issues.apache.org/jira/browse/CASSANDRA-3882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281373#comment-13281373 ]
Peter Schuller commented on CASSANDRA-3882:
-------------------------------------------

Yes, that is the issue I'm seeing. And I completely agree about synchronous (with respect to other nodes) operations on stages. I also agree that I can't think of why specifically this would be unsafe to do asynchronously; I was just taking the ultra-conservative approach.

Looked through the patch very briefly and it seems reasonable, but I haven't looked at it carefully or tested it. One thing I'm concerned with for large clusters is the total number of messages back and forth, since I believe it ends up being quadratic in cluster size. That said, maybe that's okay up to at least several hundred nodes.

> avoid distributed deadlock in migration stage
> ---------------------------------------------
>
>                 Key: CASSANDRA-3882
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3882
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1.0
>            Reporter: Peter Schuller
>            Assignee: Richard Low
>             Fix For: 1.1.1
>
>         Attachments: CASSANDRA-3882-async.patch, CASSANDRA-3882-hack.txt
>
>
> This is follow-up work for the remainders of CASSANDRA-3832 which was only a
> partial fix. The deadlock in the migration stage needs to be fixed, as it can
> cause bootstrap (at least) to take potentially a very very long time to
> complete, and might also cause a lack of schema propagation until otherwise
> "poked".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira