[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815251#comment-17815251 ]

Jacek Lewandowski commented on CASSANDRA-18824:
-----------------------------------------------

ok, merging

> Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
> -------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18824
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18824
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Bootstrap and Decommission
>            Reporter: Szymon Miezal
>            Assignee: Szymon Miezal
>            Priority: Normal
>             Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.0.x, 5.x
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Node decommission triggers data transfer to other nodes. While this transfer
> is in progress, receiving nodes temporarily hold token ranges in a pending
> state. However, the cleanup process currently doesn't consider these pending
> ranges when calculating token ownership.
> As a consequence, data that is already stored in sstables gets inadvertently
> cleaned up.
>
> STR:
> * Create a two-node cluster
> * Create a keyspace with RF=1
> * Insert sample data (assert data is available when querying both nodes)
> * Start the decommission process on node 1
> * Run cleanup in a loop on node 2 until the decommission of node 1 finishes
> * Verify that all rows are in the cluster - this fails because the previous
>   step removed some of the rows
>
> It seems that the cleanup process does not take the pending ranges into
> account; it uses only the local ranges -
> [https://github.com/apache/cassandra/blob/caad2f24f95b494d05c6b5d86a8d25fbee58d7c2/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L466].
>
> There are two solutions to the problem.
> One would be to change the cleanup process so that it starts taking pending
> ranges into account. Even though it might sound tempting at first, it will
> require involved changes and a lot of testing effort.
> Alternatively, we could interrupt/prevent the cleanup process from running
> when any pending range on a node is detected. That sounds like a reasonable
> alternative and something that is relatively easy to implement.
>
> The bug has already been fixed in 4.x with CASSANDRA-16418; the goal of this
> ticket is to backport it to 3.x.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
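
The failure mode and the second proposed solution in the ticket description (refuse cleanup while any pending range exists) can be illustrated with a toy model. This is only a sketch under stated assumptions: tokens are plain ints, and `CleanupGuardSketch`, `TokenRange`, `cleanupIgnoringPending` and `cleanupWithGuard` are hypothetical names made up for the illustration. It is not the code from CASSANDRA-16418 or from `CompactionManager`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model only: tokens are ints and none of these names come from Cassandra. */
final class CleanupGuardSketch {

    /** Half-open token range [start, end). Hypothetical stand-in for a real token range. */
    static final class TokenRange {
        final int start;
        final int end;

        TokenRange(int start, int end) {
            this.start = start;
            this.end = end;
        }

        boolean contains(int token) {
            return token >= start && token < end;
        }
    }

    /** True if any of the given ranges contains the token. */
    static boolean covered(List<TokenRange> ranges, int token) {
        for (TokenRange range : ranges) {
            if (range.contains(token)) {
                return true;
            }
        }
        return false;
    }

    /** Behaviour reported in the ticket: only locally owned ranges decide what survives. */
    static Set<Integer> cleanupIgnoringPending(Set<Integer> storedTokens, List<TokenRange> localRanges) {
        Set<Integer> kept = new TreeSet<>();
        for (int token : storedTokens) {
            if (covered(localRanges, token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    /** Guarded variant: refuse to run at all while the node has any pending range. */
    static Set<Integer> cleanupWithGuard(Set<Integer> storedTokens,
                                         List<TokenRange> localRanges,
                                         List<TokenRange> pendingRanges) {
        if (!pendingRanges.isEmpty()) {
            throw new IllegalStateException("pending ranges present, cleanup refused");
        }
        return cleanupIgnoringPending(storedTokens, localRanges);
    }

    public static void main(String[] args) {
        // Node 2 owns [0, 50); [50, 100) is pending while node 1 decommissions.
        List<TokenRange> local = Collections.singletonList(new TokenRange(0, 50));
        List<TokenRange> pending = Collections.singletonList(new TokenRange(50, 100));
        // Tokens 60 and 70 were already streamed to node 2 and written to disk.
        Set<Integer> stored = new TreeSet<>(Arrays.asList(10, 20, 60, 70));

        System.out.println("unguarded cleanup keeps: " + cleanupIgnoringPending(stored, local));
        try {
            cleanupWithGuard(stored, local, pending);
        } catch (IllegalStateException e) {
            System.out.println("guarded cleanup: " + e.getMessage());
        }
    }
}
```

In this model the unguarded cleanup drops tokens 60 and 70 even though node 2 is about to become their only owner (RF=1), which mirrors the missing rows in the STR. The guard trades availability of cleanup for safety: the operator has to retry once the pending ranges are gone.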
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814339#comment-17814339 ]

Brandon Williams commented on CASSANDRA-18824:
----------------------------------------------

I think that is a good plan and I am +1 on it, and +1 on this ticket also.
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814322#comment-17814322 ]

Jacek Lewandowski commented on CASSANDRA-18824:
-----------------------------------------------

I've created https://issues.apache.org/jira/browse/CASSANDRA-19363 and https://issues.apache.org/jira/browse/CASSANDRA-19364 as a result of investigating the flakiness.

The fact that it didn't fail in 5k runs can be misleading, assuming all of those runs were executed under very similar cluster conditions. Adding a slight delay in the async code of the pending ranges calculator leads to consistent test failures even on 4.0. This is not related to this issue though - it is only that the test added here can accidentally detect the problem.

Since those separate tickets are now created, I think we can merge this ticket. However, those who asked for this fix should be notified about those possible issues.
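
The comment above about a delay in the asynchronous pending ranges calculator suggests a timing window. The sketch below shows one way such a window could arise. It is an assumption made for illustration, not a description of what CASSANDRA-19363 or CASSANDRA-19364 actually found. All names (`PendingRangeRaceSketch`, `onPeerLeaving`, `cleanupAllowed`) are hypothetical, and a `CountDownLatch` stands in for the "slight delay" so the outcome is deterministic.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of a guard that reads state published by an asynchronous calculator. */
final class PendingRangeRaceSketch {

    // Number of pending ranges as last published by the hypothetical async calculator.
    private final AtomicInteger pendingRangeCount = new AtomicInteger(0);
    private final ExecutorService calculator = Executors.newSingleThreadExecutor();

    /** A peer starts leaving; the recalculation runs off-thread once the gate opens. */
    Future<Object> onPeerLeaving(CountDownLatch gate) {
        return calculator.submit(() -> {
            gate.await();              // stands in for the "slight delay"
            pendingRangeCount.set(1);  // the pending range becomes visible only now
            return null;
        });
    }

    /** The guard: cleanup may run only if no pending range is currently visible. */
    boolean cleanupAllowed() {
        return pendingRangeCount.get() == 0;
    }

    void shutdown() {
        calculator.shutdown();
    }

    /** Returns {allowed before the calculation finished, allowed after it finished}. */
    static boolean[] demo() {
        PendingRangeRaceSketch node = new PendingRangeRaceSketch();
        CountDownLatch gate = new CountDownLatch(1);
        try {
            Future<Object> calculation = node.onPeerLeaving(gate);
            boolean before = node.cleanupAllowed(); // stale view: the guard lets cleanup through
            gate.countDown();
            calculation.get(5, TimeUnit.SECONDS);
            boolean after = node.cleanupAllowed();  // up-to-date view: the guard blocks cleanup
            return new boolean[] { before, after };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            node.shutdown();
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("cleanup allowed before calculation finished: " + result[0]);
        System.out.println("cleanup allowed after calculation finished:  " + result[1]);
    }
}
```

Under this assumption a guard is only as good as the freshness of the state it reads: while the calculation is still queued, the check passes even though a range is about to become pending.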
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17806019#comment-17806019 ]

Brandon Williams commented on CASSANDRA-18824:
----------------------------------------------

I did a run of [5k iterations|https://app.circleci.com/pipelines/github/driftx/cassandra/1457/workflows/990588d3-c789-4d16-b426-19f4dc2d1642/jobs/71261] on 4.0 and it passed. I think that's good enough for me: if it's not failing on 4.0 in that many iterations, maybe it's only slightly flaky in 3.11, and that will probably be EOL before it surfaces again.
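
The "5k iterations passed" argument can be put in rough numbers. If each run failed independently with probability p, the chance of n clean runs in a row would be (1 - p)^n. The sketch below evaluates that for a few per-run failure rates; the rates are made up for illustration and are not measurements from this ticket, and `FlakyRunOdds` is a hypothetical name.

```java
/** Back-of-the-envelope arithmetic; the failure rates below are made up for illustration. */
final class FlakyRunOdds {

    /** Probability that `runs` independent runs all pass if each fails with the given probability. */
    static double probabilityAllPass(double failureProbability, int runs) {
        return Math.pow(1.0 - failureProbability, runs);
    }

    public static void main(String[] args) {
        int runs = 5000;
        // Hypothetical per-run failure rates, not observed values.
        double[] hypotheticalRates = { 1.0 / 1_000, 1.0 / 5_000, 1.0 / 50_000 };
        for (double rate : hypotheticalRates) {
            System.out.printf("per-run failure rate %.5f -> P(%d clean runs) = %.4f%n",
                              rate, runs, probabilityAllPass(rate, runs));
        }
    }
}
```

With a rate of 1 in 1,000, 5,000 clean runs would be a less than 1% event, so a clean 5k run argues against flakiness of that magnitude. At 1 in 5,000 a clean run is still roughly a one-in-three outcome. The calculation rests on the runs being independent and the rate being fixed, which is the assumption that another comment in this thread questions when it points at very similar cluster conditions.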
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17805741#comment-17805741 ]

Szymon Miezal commented on CASSANDRA-18824:
-------------------------------------------

Yes, it failed in the final assertion, which is odd. The reason to start the 1k run for 3.11 again was that I am curious whether the failure occurs more frequently on 3.11 than on other versions. I would think that if there is any flakiness then it exists in all versions and those failures will surface again. I also wonder whether any failures of that test were recorded on 4.x in the past.

We have two options:
* Try to track down why this failure happens - is the test case itself imperfect, or is there still a race in the code?
* Merge what we have, considering it's still an improvement compared to not guarding against the unsafe cleanup at all.
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17804407#comment-17804407 ]

Jacek Lewandowski commented on CASSANDRA-18824:
-----------------------------------------------

[~brandon.williams] thank you for running the tests. I didn't run them; I was waiting for feedback.
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17804392#comment-17804392 ]

Brandon Williams commented on CASSANDRA-18824:
----------------------------------------------

Given that it failed an assertion and wasn't a timeout, I think it shows flakiness regardless of whether or not 1k iterations is enough to reproduce it, but since [it's not in 3.11|https://app.circleci.com/pipelines/github/driftx/cassandra/1456/workflows/40a5e77e-5637-4010-96bc-55fce00eec2f/jobs/71133] nor in [4.0|https://app.circleci.com/pipelines/github/driftx/cassandra/1455/workflows/3dadc395-5c98-4670-9647-324a845b96a5/jobs/71101], I guess if there's a problem we'll deal with it when we get there.
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17804342#comment-17804342 ]

Szymon Miezal commented on CASSANDRA-18824:
-------------------------------------------

It seems the failure happened only on the 3.11 branch, which is suspicious. Can we run a batch of 1k executions again to verify whether it persists?
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17803329#comment-17803329 ]

Brandon Williams commented on CASSANDRA-18824:
----------------------------------------------

There was an unused import in 5.0 I removed. Unfortunately however, it looks like cleanupDuringDecommissionTest is [flaky|https://app.circleci.com/pipelines/github/driftx/cassandra/1442/workflows/d06f2dcf-1382-4d3a-8b02-0b40934c6329/jobs/69573/tests].
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17803115#comment-17803115 ]

Brandon Williams commented on CASSANDRA-18824:
----------------------------------------------

[~jlewandowski] sure, I see you committed circle configs, has CI been run?
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17802896#comment-17802896 ]

Jacek Lewandowski commented on CASSANDRA-18824:
-----------------------------------------------

[~brandon.williams] - would you review my PRs?
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17802776#comment-17802776 ] Jacek Lewandowski commented on CASSANDRA-18824: --- np, I'll handle that in the PRs

> Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
> ---
>
> Key: CASSANDRA-18824
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18824
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Szymon Miezal
> Assignee: Szymon Miezal
> Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> Node decommission triggers data transfer to other nodes. While this transfer is in progress, receiving nodes temporarily hold token ranges in a pending state. However, the cleanup process currently doesn't consider these pending ranges when calculating token ownership.
> As a consequence, data that is already stored in sstables gets inadvertently cleaned up.
> STR:
> * Create a two-node cluster
> * Create a keyspace with RF=1
> * Insert sample data (assert data is available when querying both nodes)
> * Start the decommission process of node 1
> * Start running cleanup in a loop on node 2 until decommission on node 1 finishes
> * Verify that all rows are in the cluster - it will fail as the previous step removed some of the rows
> It seems that the cleanup process does not take the pending ranges into account; it uses only the local ranges - [https://github.com/apache/cassandra/blob/caad2f24f95b494d05c6b5d86a8d25fbee58d7c2/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L466].
> There are two solutions to the problem.
> One would be to change the cleanup process so that it starts taking pending ranges into account. Even though it might sound tempting at first, it will require involved changes and a lot of testing effort.
> Alternatively, we could interrupt/prevent the cleanup process from running when any pending range on a node is detected. That sounds like a reasonable alternative and something that is relatively easy to implement.
> The bug has already been fixed in 4.x with CASSANDRA-16418; the goal of this ticket is to backport it to 3.x.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
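The second option in the ticket description (refuse to run cleanup while a node has pending ranges) can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the Cassandra implementation: `CleanupGuardSketch`, `Range`, `cleanup` and `guardedCleanup` are hypothetical names, tokens are plain `long` values, and ranges never wrap around the ring.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model, NOT Cassandra code: all names here are hypothetical. */
class CleanupGuardSketch {

    /** A (left, right] token range without wrap-around, for simplicity. */
    public static final class Range {
        final long left;
        final long right;

        public Range(long left, long right) {
            this.left = left;
            this.right = right;
        }

        boolean contains(long token) {
            return token > left && token <= right;
        }
    }

    /** Naive cleanup: keeps only tokens covered by the local (owned) ranges. */
    public static List<Long> cleanup(List<Long> storedTokens, List<Range> localRanges) {
        List<Long> kept = new ArrayList<>();
        for (long token : storedTokens) {
            for (Range range : localRanges) {
                if (range.contains(token)) {
                    kept.add(token);
                    break;
                }
            }
        }
        return kept;
    }

    /** Guarded cleanup: refuses to run while the node has any pending range. */
    public static List<Long> guardedCleanup(List<Long> storedTokens,
                                            List<Range> localRanges,
                                            List<Range> pendingRanges) {
        if (!pendingRanges.isEmpty())
            throw new IllegalStateException("Node has pending ranges; cleanup is not safe right now");
        return cleanup(storedTokens, localRanges);
    }

    public static void main(String[] args) {
        // Token 15 was streamed in for a range that is still pending on this node.
        List<Long> stored = Arrays.asList(5L, 15L);
        List<Range> local = Collections.singletonList(new Range(0, 10));
        List<Range> pending = Collections.singletonList(new Range(10, 20));

        // The naive variant drops token 15 because it only looks at local ranges.
        System.out.println("naive cleanup keeps: " + cleanup(stored, local));

        try {
            guardedCleanup(stored, local, pending);
        } catch (IllegalStateException e) {
            System.out.println("guarded cleanup refused: " + e.getMessage());
        }
    }
}
```

The naive variant mirrors the failure mode in the STR: data streamed in for a range that is still pending is not covered by the local ranges, so it gets removed. The guard avoids reasoning about pending ranges inside cleanup itself, which is why the description calls it the easier of the two options.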
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17802730#comment-17802730 ] Szymon Miezal commented on CASSANDRA-18824: --- It seems to be failing on 4.x branches. I think I hadn't cleaned the directory properly when checking out and compiling it locally, hence I missed this before sending the patches.
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17802659#comment-17802659 ] Jacek Lewandowski commented on CASSANDRA-18824: --- Compilation fails on some branches. I created PRs yesterday; they are attached in the links section. I'm applying some fixes to each of them. When ready, I'll rerun the CI.
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17799089#comment-17799089 ] Szymon Miezal commented on CASSANDRA-18824: --- I have prepared the following patches:
* 3.0 - [https://github.com/szymon-miezal/cassandra/commit/68625daa5f55dbb0873ac4603fa05f47853cbeff]; this one also has a PR - [https://github.com/apache/cassandra/pull/2921]
* 3.11 - [https://github.com/szymon-miezal/cassandra/commit/f21777525862b8da3345e21faac40a16631b8194]
* 4.0 - [https://github.com/szymon-miezal/cassandra/commit/9652bf53d09d66609356f2110ee9110f6c8d9eb2]
* 4.1 - [https://github.com/szymon-miezal/cassandra/commit/51624a811449b988d16efd396187e4825b0cc5ce]
* 5.0 - [https://github.com/szymon-miezal/cassandra/commit/c7dd3bfad97b18d834f89a04a3076f9e8f9a353c]
* trunk - [https://github.com/szymon-miezal/cassandra/commit/096805afc658e80a8265bae3911ad3834331d325] (this patch intentionally contains only the test refactoring)

The patches differ between major versions, so I suspect merging will require a bit of effort.
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790663#comment-17790663 ] Brandon Williams commented on CASSANDRA-18824: -- I don't see any related failures, +1 from me.
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790621#comment-17790621 ] Szymon Miezal commented on CASSANDRA-18824: --- [https://github.com/szymon-miezal/cassandra/commit/47245b247f500a35ebc779ff8f8f81b7e9ae0d78] has a patch that has been rebased on current trunk.
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784430#comment-17784430 ] Szymon Miezal commented on CASSANDRA-18824: --- Going back to it after a while, I have prepared new branches. The 3.0 - 3.11 commits contain the backport plus a test adjustment; the 4.0 - trunk commits contain only the test adjustment, which makes the cleanup test independent from the other test.
* [https://github.com/szymon-miezal/cassandra/commit/a56ad6916e85a6956ec71fd3d85ed685c053d087] (3.0)
* [https://github.com/szymon-miezal/cassandra/commit/456863025285c76f37d1e219e3688aaee33f0269] (3.11)
* [https://github.com/szymon-miezal/cassandra/commit/7d4b2b4e9b2648ad1948650a803a221fe395a61c] (4.0)
* [https://github.com/szymon-miezal/cassandra/commit/cc704cd3654e6f5db1ba6f3fa4c7e8e49a51a783] (4.1)
* [https://github.com/szymon-miezal/cassandra/commit/017997b7778cfd4381477bed5c4df1aa16ef1cab] (5.0)
* [https://github.com/szymon-miezal/cassandra/commit/035d8fa5af05467f5e519731a4d45d74b5d4738] (trunk)

I haven't modified the _CHANGES.txt_ file, as IIUC that should be done during merging.
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784390#comment-17784390 ] Szymon Miezal commented on CASSANDRA-18824: --- Going back to it after a while, I have prepared new branches. The 3.0 - 3.11 commits contain the backport plus a test adjustment; the 4.0 - trunk commits contain only the test adjustment, which makes the cleanup test independent from the other test.
* [https://github.com/szymon-miezal/cassandra/commit/e3b6bb2761dceeb09b2a4142486cdc4dfbb79edf] (3.0)
* [https://github.com/szymon-miezal/cassandra/commit/bde29b98a96dd6e7923ad76d4f2421cfc3f9435f] (3.11)
* [https://github.com/szymon-miezal/cassandra/commit/131cd3180da8668168bfcea0bcbec39627fef0ab] (4.0)
* [https://github.com/szymon-miezal/cassandra/commit/5db9873af77ba1669af43ac9f6d1e4d3a5cbffc5] (4.1)
* [https://github.com/szymon-miezal/cassandra/commit/e34e95011265788b9d61786ee4c54cd71f86b4b9] (5.0)
* [https://github.com/szymon-miezal/cassandra/commit/4c41cb72f6917df00d17436b7342e21a5ec8766e] (trunk)

I am not sure whether the _CHANGES.txt_ file should be modified at this stage or during merging.
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763190#comment-17763190 ] Stefan Miklosovic commented on CASSANDRA-18824: --- It can all be done in one merge up to trunk.
* In 3.0 - add the feature, add the test, and make the cleanup test independent from the bootstrap one (by copying the population method)
* In 3.11 - same as 3.0
* In 4.0 -> trunk - since the cleanup test is already there, just make it independent from the bootstrap one
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763173#comment-17763173 ] Szymon Miezal commented on CASSANDRA-18824: --- Well, it has been shipped like that already with https://issues.apache.org/jira/browse/CASSANDRA-15935 ;) but I get the point that we should strive to apply the scout rule. What I think makes sense is to add another commit where we either:
* extract the "population" logic to a common place, OR
* implement something that does the same job locally in CleanupFailureTest (this approach has the disadvantage that it will probably require reinventing the wheel),

and then we merge this particular commit only to versions >= 4.x. Having said that, I think I will create a PR, as it is a much better place to discuss implementation details.
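The first option in the comment above, extracting the "population" logic to a common place, could look roughly like the sketch below. It is an illustration only: `PopulationSketch`, its `populate` signature, and the `pk`/`ck`/`v` columns are hypothetical and are not taken from the Cassandra test tree. The helper takes the statement executor as a parameter, so neither test class would need to reference the other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical shared test utility; names are illustrative, not from the Cassandra test tree. */
final class PopulationSketch {

    private PopulationSketch() {
    }

    /**
     * Builds one INSERT statement per key in [from, to) and hands it to the executor.
     * Returns the number of statements issued.
     */
    public static int populate(Consumer<String> executor, String table, int from, int to) {
        int written = 0;
        for (int pk = from; pk < to; pk++) {
            executor.accept("INSERT INTO " + table + " (pk, ck, v) VALUES (" + pk + ", " + pk + ", " + pk + ")");
            written++;
        }
        return written;
    }

    public static void main(String[] args) {
        // A real test would pass something that runs the statement against a cluster;
        // here the statements are only collected.
        List<String> statements = new ArrayList<>();
        int count = populate(statements::add, "ks.tbl", 0, 3);
        System.out.println(count + " statements, first: " + statements.get(0));
    }
}
```

With a helper like this, both the bootstrap test and the cleanup test would depend on the utility rather than on each other.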
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763166#comment-17763166 ] Stefan Miklosovic commented on CASSANDRA-18824: --- Yeah, I know it is like that there as well. So this ticket should cover both adding the patch as such for 3.0 and 3.11 _as well as_ cleaning up the tests from 4.0 to trunk. I mean ... we are not going to ship it like that if we clearly see that the test depends on another one.
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763163#comment-17763163 ] Szymon Miezal commented on CASSANDRA-18824: --- The proper approach to the problem would be to extract the common test utility logic into a third entity, that's for sure. The reason it isn't like that in the current patch is that I followed the changes from 4.0 and did not want to diverge too much with additional refactoring. You might be surprised that it's written in exactly the same way there - [https://github.com/apache/cassandra/blob/2a5e1b77c9f8a205dbec1afdea3f4ed1eaf6a4eb/test/distributed/org/apache/cassandra/distributed/test/ring/BootstrapTest.java#L307] - and then used in the Cleanup test.
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763155#comment-17763155 ]

Stefan Miklosovic commented on CASSANDRA-18824:
---

wow this is messy :) So ... why do we actually deal with changes in BootstrapTest in the first place? Because we are using the populate method in CleanupFailureTest? So one test depends on some methods from another one? This is just wrong. If you need these methods, either find some common place to put them and reference them from both places, or just implement a population method only for CleanupFailureTest.
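The first suggestion in the comment above (a common place referenced from both tests) might look roughly like the sketch below. Everything in it is hypothetical: the class name `TestDataPopulator`, the `(pk, ck, v)` table layout and the use of a `Consumer<String>` in place of a real cluster coordinator are assumptions made to keep the example self-contained, not the in-JVM dtest API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical shared helper owned by neither BootstrapTest nor CleanupFailureTest. */
final class TestDataPopulator {
    private TestDataPopulator() {}

    /**
     * Issues one INSERT per key in [from, to) through the supplied executor.
     * The (pk, ck, v) columns are an assumed schema for illustration only.
     */
    static void populate(Consumer<String> executor, String table, int from, int to) {
        for (int pk = from; pk < to; pk++) {
            executor.accept(String.format(
                    "INSERT INTO %s (pk, ck, v) VALUES (%d, %d, %d)", table, pk, pk, pk));
        }
    }

    public static void main(String[] args) {
        // A real test would pass something that executes CQL; here we just collect it.
        List<String> issued = new ArrayList<>();
        populate(issued::add, "ks.tbl", 0, 3);
        issued.forEach(System.out::println);
    }
}
```

Both tests would then call the helper directly, so neither depends on the other's internals.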
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763123#comment-17763123 ]

Szymon Miezal commented on CASSANDRA-18824:
---

It looks like the flakiness has been reported before (https://issues.apache.org/jira/browse/CASSANDRA-17139). I think it would be valuable to take a closer look at it, but I wonder whether it would be cleaner to tackle that in a separate ticket.
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763121#comment-17763121 ]

Stefan Miklosovic commented on CASSANDRA-18824:
---

It was flaky before.

https://app.circleci.com/pipelines/github/instaclustr/cassandra?branch=CASSANDRA-18824-3.0-test-flakiness

Do you think you could take a look at the source of this flakiness? It would be cool if we left that in better shape than we found it.
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763110#comment-17763110 ]

Szymon Miezal commented on CASSANDRA-18824:
---

That's a good catch, thank you. I have updated the branches. Maybe I should have opened a regular PR instead of linking commits.

Re the flaky BootstrapTest, the question is: did it become flaky after porting the changes, or was it flaky in the first place? Looking at https://butler.cassandra.apache.org/#/ci/upstream/workflow/Cassandra-3.11/failure/org.apache.cassandra.distributed.test/BootstrapTest/readWriteDuringBootstrapTest, this test does not seem to have a stable history.
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763090#comment-17763090 ]

Stefan Miklosovic commented on CASSANDRA-18824:
---

Your 3.0 and 3.11 branches have the wrong package name on the added test. It ends with "ring", but the test is not in that package. Since neither 3.0 nor 3.11 has a "ring" package in dtests, the solution is to just remove the "ring" string from there.

Secondly, BootstrapTest is flaky (1) when run 500x (2).

(1) https://app.circleci.com/pipelines/github/instaclustr/cassandra?branch=CASSANDRA-18824-3.0
(2) https://app.circleci.com/pipelines/github/instaclustr/cassandra/3127/workflows/0b3af9e9-b941-40f3-937a-dac94b9a468a/jobs/115586/tests#failed-test-0
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763003#comment-17763003 ]

Szymon Miezal commented on CASSANDRA-18824:
---

The 3.11 branch: [szymon-miezal/CASSANDRA-18824-3.11|https://github.com/szymon-miezal/cassandra/commits/CASSANDRA-18824-3.11].

I think I haven't set up CircleCI properly yet; the build got canceled: [https://app.circleci.com/pipelines/github/szymon-miezal/cassandra/2/workflows/c914be88-f1df-4fb5-93d7-997f09b13972].
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762983#comment-17762983 ]

Stefan Miklosovic commented on CASSANDRA-18824:
---

Could you just prepare the branch for 3.11 without merging 3.0 into it? I will do the merging on my end.
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762976#comment-17762976 ]

Szymon Miezal commented on CASSANDRA-18824:
---

I was thinking that we might still end up with a similar situation: a conflict between cleanup and the streaming that results in pending ranges. Namely, when a cleanup has already started and passed the safeguard on, let's say, node A, and node B then concurrently starts its decommission process.

Initially I thought it might be valuable to interrupt the running cleanup (which might take a long time for sizeable data) by throwing an exception inside the method that processes a single SSTable: https://github.com/apache/cassandra/blob/65ee0d082caac70de704852deed52b9dd52120e6/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L907

But then I realized that the list of SSTables to process is generated before the first SSTable is touched: https://github.com/apache/cassandra/blob/65ee0d082caac70de704852deed52b9dd52120e6/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L299

This means that new SSTables generated by streaming after the cleanup started are not going to be touched by that cleanup, so interrupting the cleanup midway would not have much value.
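The reasoning in the comment above (the SSTable list is fixed before the first SSTable is processed, so files that arrive later are never visited) can be shown with a minimal sketch. It does not use any Cassandra classes: SSTables are modelled as plain strings, and `runCleanup` and its callback are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CleanupSnapshotSketch {
    /**
     * Copies the live SSTable list once, then processes only that copy.
     * The callback runs after each SSTable and stands in for concurrent streaming.
     */
    static List<String> runCleanup(List<String> liveSSTables, Runnable afterEach) {
        // Snapshot taken before the first SSTable is touched.
        List<String> snapshot = new ArrayList<>(liveSSTables);
        List<String> processed = new ArrayList<>();
        for (String sstable : snapshot) {
            processed.add(sstable);
            afterEach.run();
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> live = new ArrayList<>(Arrays.asList("sstable-1", "sstable-2"));

        // Each callback adds one "streamed" SSTable to the live set mid-cleanup.
        List<String> processed = runCleanup(live, () -> live.add("streamed-" + live.size()));

        System.out.println("processed by cleanup: " + processed);
        System.out.println("live after cleanup:   " + live);
    }
}
```

Under this model the streamed files end up in the live set but never in the processed list, which is why aborting the cleanup partway through would not protect any additional data.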
[jira] [Commented] (CASSANDRA-18824) Backport CASSANDRA-16418: Cleanup behaviour during node decommission caused missing replica
[ https://issues.apache.org/jira/browse/CASSANDRA-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762971#comment-17762971 ]

Szymon Miezal commented on CASSANDRA-18824:
---

The patch from 3.0 applied easily to 3.11 and merged cleanly: [szymon-miezal/CASSANDRA-18824-3.11|https://github.com/szymon-miezal/cassandra/commits/CASSANDRA-18824-3.11]