[jira] [Commented] (CASSANDRA-13603) Change repair midpoint logging from CASSANDRA-13052
[ https://issues.apache.org/jira/browse/CASSANDRA-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16147829#comment-16147829 ] Blake Eggleston commented on CASSANDRA-13603: - opened CASSANDRA-13830 > Change repair midpoint logging from CASSANDRA-13052 > > > Key: CASSANDRA-13603 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13603 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging >Reporter: Jeff Jirsa >Assignee: Jeff Jirsa >Priority: Trivial > Fix For: 3.0.x, 3.11.x, 4.x > > > In CASSANDRA-13052 , we changed the way we handle repairs on small ranges to > make them more sane in general, but {{MerkleTree.differenceHelper}} now > erroneously logs at error. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13603) Change repair midpoint logging from CASSANDRA-13052
[ https://issues.apache.org/jira/browse/CASSANDRA-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16147802#comment-16147802 ] Blake Eggleston commented on CASSANDRA-13603: - I think the current state of {{MerkleTree.differenceHelper}} is sort of the wrong approach for what it's trying to do in general. The current implementation has weird edge cases like this, and is too complex and difficult to follow. Also, it shares some of it's responsibilities with {{difference}}. Really, we're just trying to recursively compare these trees and record the largest contiguous out of sync ranges. The only complication here is that a given invocation of the method doesn't know if it's adjacent range is also out of sync, so it has to defer to the caller to do something if it's range is {{FULLY_INCONSISTENT}}. I put together an alternate implementation [here|https://github.com/bdeggleston/cassandra/tree/differencer-cleanup]. Having said all that, those changes are out of scope for this ticket and 3.0, so I'm +1 on the changes as proposed, and will open another ticket to refactor differece/differenceHelper. > Change repair midpoint logging from CASSANDRA-13052 > > > Key: CASSANDRA-13603 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13603 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging >Reporter: Jeff Jirsa >Assignee: Jeff Jirsa >Priority: Trivial > Fix For: 3.0.x, 3.11.x, 4.x > > > In CASSANDRA-13052 , we changed the way we handle repairs on small ranges to > make them more sane in general, but {{MerkleTree.differenceHelper}} now > erroneously logs at error. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13603) Change repair midpoint logging from CASSANDRA-13052
[ https://issues.apache.org/jira/browse/CASSANDRA-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146346#comment-16146346 ] Jeff Jirsa commented on CASSANDRA-13603: Would love to revisit this. [~spo...@gmail.com] / [~bdeggleston] , pushed rebased versions, and here's shortcut links to combined diffs for the three branches: https://github.com/apache/cassandra/compare/cassandra-3.0...jeffjirsa:cassandra-3.0-13603?ws=1 https://github.com/apache/cassandra/compare/cassandra-3.11...jeffjirsa:cassandra-3.11-13603?ws=1 https://github.com/apache/cassandra/compare/trunk...jeffjirsa:cassandra-13603?ws=1 Either of you see anything else you'd like to change that I didn't catch already? Either of you up to mark yourselves as formal reviewer? > Change repair midpoint logging from CASSANDRA-13052 > > > Key: CASSANDRA-13603 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13603 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging >Reporter: Jeff Jirsa >Assignee: Jeff Jirsa >Priority: Trivial > Fix For: 3.0.x, 3.11.x, 4.x > > > In CASSANDRA-13052 , we changed the way we handle repairs on small ranges to > make them more sane in general, but {{MerkleTree.differenceHelper}} now > erroneously logs at error. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13603) Change repair midpoint logging from CASSANDRA-13052
[ https://issues.apache.org/jira/browse/CASSANDRA-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057292#comment-16057292 ] Stefan Podkowinski commented on CASSANDRA-13603: We should continue to check the midpoint return value. Even if it's just for troubleshooting similar problems in the future. Throwing an AssertionError instead of just logging an error message is probably not a good idea in 2.2. See my comment from before the patch: {quote}Unfortunately we can't throw here to abort the validation process, as the code is executed in it's own thread with the caller waiting for a condition to be signaled after completion and without an option to indicate an error (2.x only). {quote} But 3.0+ should be able to deal with failed SyncTasks, as those are now handled as futures. SyncTask.run is currently missing any exception handling, but that could be changed. > Change repair midpoint logging from CASSANDRA-13052 > > > Key: CASSANDRA-13603 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13603 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging >Reporter: Jeff Jirsa >Assignee: Jeff Jirsa >Priority: Trivial > Fix For: 3.0.x, 3.11.x, 4.x > > > In CASSANDRA-13052 , we changed the way we handle repairs on small ranges to > make them more sane in general, but {{MerkleTree.differenceHelper}} now > erroneously logs at error. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13603) Change repair midpoint logging from CASSANDRA-13052
[ https://issues.apache.org/jira/browse/CASSANDRA-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16056539#comment-16056539 ] Jeff Jirsa commented on CASSANDRA-13603: {quote} We already check node instanceof Leaf before recursively calling differenceHelper() with a sub-range. I'd suggest to do the same in difference(). It just doesn't make sense to call differenceHelper() with two leaf nodes, doesn't it? {quote} I think you're right - pushed a new commit to each branch that checks to see if either node is {{instanceof Leaf}} and avoids that tree traversal. Also marked {{MerkleTree.differenceHelper}} as {{@VisibleForTesting}} I think this makes the midpoint check below irrelevant, but may protect us from another bug later? On the fence with removing it - seems like we shouldn't hit us, but maybe is saves us in a future release? {code} if (midpoint.equals(active.left) || midpoint.equals(active.right)) {code} > Change repair midpoint logging from CASSANDRA-13052 > > > Key: CASSANDRA-13603 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13603 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging >Reporter: Jeff Jirsa >Assignee: Jeff Jirsa >Priority: Trivial > Fix For: 3.0.x, 3.11.x, 4.x > > > In CASSANDRA-13052 , we changed the way we handle repairs on small ranges to > make them more sane in general, but {{MerkleTree.differenceHelper}} now > erroneously logs at error. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13603) Change repair midpoint logging from CASSANDRA-13052
[ https://issues.apache.org/jira/browse/CASSANDRA-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053833#comment-16053833 ] Stefan Podkowinski commented on CASSANDRA-13603: I agree that it's easy to misuse MerkleTree.differenceHelper(). It should only really be called from MerkleTree.difference() and should have a {{@VisibleForTesting}} annotation to indicate that it's internal use only. As for the suggested changes around midpoint result handling, I'm not really convinced that we should use this value to check if we should stop tree traversal. Although we should be at a leaf nodes already when getting an invalid midpoint value, we wouldn't be able to tell otherwise without any kind of error message, blindly following this assumption. This may cover other potential errors. We already check {{node instanceof Leaf}} before recursively calling differenceHelper() with a sub-range. I'd suggest to do the same in difference(). It just doesn't make sense to call differenceHelper() with two leaf nodes, doesn't it? > Change repair midpoint logging from CASSANDRA-13052 > > > Key: CASSANDRA-13603 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13603 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging >Reporter: Jeff Jirsa >Assignee: Jeff Jirsa >Priority: Trivial > Fix For: 3.0.x, 3.11.x, 4.x > > > In CASSANDRA-13052 , we changed the way we handle repairs on small ranges to > make them more sane in general, but {{MerkleTree.differenceHelper}} now > erroneously logs at error. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13603) Change repair midpoint logging from CASSANDRA-13052
[ https://issues.apache.org/jira/browse/CASSANDRA-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049666#comment-16049666 ] Jeff Jirsa commented on CASSANDRA-13603: || Branch || Testall || Dtest || | [3.0|https://github.com/jeffjirsa/cassandra/tree/cassandra-3.0-13603] | [circle|https://circleci.com/gh/jeffjirsa/cassandra/tree/cassandra-3.0-13603] | [asf dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/88/] | | [3.11|https://github.com/jeffjirsa/cassandra/tree/cassandra-3.11-13603] | [circle|https://circleci.com/gh/jeffjirsa/cassandra/tree/cassandra-3.11-13603] | [asf dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/89/] | | [trunk|https://github.com/jeffjirsa/cassandra/tree/cassandra-13603] | [circle|https://circleci.com/gh/jeffjirsa/cassandra/tree/cassandra-13603] | [asf dtest|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/90/] | For the reviewer: I really have mixed feelings about {{MerkleTree.differenceHelper}} in general. In particular, it presumes that there are differences in the trees, which is always true in the existing codepath, but may not be true later if someone misuses it in the future. It seems [fairly straightforward|https://github.com/jeffjirsa/cassandra/commit/cee5a2d75e8e867dc20f3d94f0fe5df360210d46] to make it properly handle the case where the trees are consistent for a given range, though I'm not sure it's necessary. Thoughts [~spo...@gmail.com] / [~bdeggleston] ? > Change repair midpoint logging from CASSANDRA-13052 > > > Key: CASSANDRA-13603 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13603 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging >Reporter: Jeff Jirsa >Assignee: Jeff Jirsa >Priority: Trivial > Fix For: 3.0.x, 3.11.x, 4.x > > > In CASSANDRA-13052 , we changed the way we handle repairs on small ranges to > make them more sane in general, but {{MerkleTree.differenceHelper}} now > erroneously logs at error. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org