[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17756779#comment-17756779 ]

Bartlomiej commented on CASSANDRA-16290:

I tested it "manually" with two nodes running locally (setting up this scenario took most of the time), so yes, that change clears the transferred ranges table (both legacy and v2) after startup. I will write tests (unit or dtest) and will be back :)

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Assignee: Bartlomiej
> Priority: Normal
> Labels: lhf
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Since CASSANDRA-12008, successfully transferred ranges during decommission
> are saved on the {{system.transferred_ranges}} table. This allows skipping
> ranges already transferred when a failed decommission is retried with
> {{nodetool decommission}}.
>
> If instead of resuming the decommission, an operator restarts the node, waits
> N minutes and then performs a new decommission, the previously transferred
> ranges will be skipped during streaming, and any writes received by the
> decommissioned node during these N minutes will not be replicated to the new
> range owner, which violates consistency.
>
> This issue is analogous to the issue mentioned [on this
> comment|https://issues.apache.org/jira/browse/CASSANDRA-8838?focusedCommentId=16900234=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16900234]
> for resumable bootstrap (CASSANDRA-8838).
>
> In order to prevent consistency violations we should clear the
> {{system.transferred_ranges}} state during node restart, and maybe add a system
> property to disable it.
>
> While we're at it, we should change the default of
> {{-Dcassandra.reset_bootstrap_progress}} to {{true}} to clear the
> {{system.available_ranges}} state by default when a bootstrapping node is
> restarted.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
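The failure mode described in the issue can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `Node` class, its methods, and `lost_writes` are hypothetical names invented for illustration and are not Cassandra code. The `transferred` set stands in for the persisted {{system.transferred_ranges}} table, and `clear_transferred` models the proposed clearing on restart.

```python
class Node:
    """Toy stand-in for a decommissioning node; not Cassandra's real classes."""

    def __init__(self, ranges):
        self.data = {r: set() for r in ranges}  # range -> writes held locally
        self.transferred = set()  # models the persisted system.transferred_ranges

    def write(self, rng, value):
        self.data[rng].add(value)

    def restart(self, clear_transferred):
        # Local data and the persisted table survive a restart.
        # The proposed fix corresponds to clearing the table here.
        if clear_transferred:
            self.transferred.clear()

    def decommission(self, new_owner, fail_after=None):
        """Stream every range not marked as transferred; optionally fail part-way."""
        streamed = 0
        for rng in sorted(self.data):
            if rng in self.transferred:
                continue  # skipped: assumed to already be on the new owner
            if fail_after is not None and streamed == fail_after:
                return False  # simulated decommission failure
            new_owner.setdefault(rng, set()).update(self.data[rng])
            self.transferred.add(rng)
            streamed += 1
        return True


def lost_writes(clear_on_restart):
    """Return the writes that never reach the new owner in the scenario above."""
    node, new_owner = Node(["r1", "r2"]), {}
    node.write("r1", "a")
    node.write("r2", "b")
    node.decommission(new_owner, fail_after=1)  # r1 is streamed, then it fails
    node.restart(clear_transferred=clear_on_restart)
    node.write("r1", "late")  # write received during the "N minutes"
    node.decommission(new_owner)  # a new decommission, not a resume
    return {v for r in node.data for v in node.data[r] - new_owner.get(r, set())}
```

In this model, `lost_writes(False)` leaves the late write to `r1` behind because `r1` is skipped by the second decommission, while `lost_writes(True)` streams every range again and loses nothing.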
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17756777#comment-17756777 ]

Stefan Miklosovic commented on CASSANDRA-16290:

Well, that is hard to say yet. If you test that it supports your case, you may find out that it is not doing what you want.

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Assignee: Bartlomiej
> Priority: Normal
> Labels: lhf
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
> Time Spent: 10m
> Remaining Estimate: 0h
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17756771#comment-17756771 ]

Bartlomiej commented on CASSANDRA-16290:

Sure [~smiklosovic], I will try to test it the way other properties are tested. I just wanted to make sure I am going in the right direction :)

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Assignee: Bartlomiej
> Priority: Normal
> Labels: lhf
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
> Time Spent: 10m
> Remaining Estimate: 0h
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17756770#comment-17756770 ]

Stefan Miklosovic commented on CASSANDRA-16290:

Seems correct at face value, but I think you also need to write a proper test for this, probably an in-jvm dtest.

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Assignee: Bartlomiej
> Priority: Normal
> Labels: lhf
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
> Time Spent: 10m
> Remaining Estimate: 0h
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17756713#comment-17756713 ]

Bartlomiej commented on CASSANDRA-16290:

Hi, I played around with this task and wanted to ask if the direction is correct (PR: [https://github.com/apache/cassandra/pull/2614/files]).

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Assignee: Bartlomiej
> Priority: Normal
> Labels: lhf
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
> Time Spent: 10m
> Remaining Estimate: 0h
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748694#comment-17748694 ]

David Paulk commented on CASSANDRA-16290:

Sorry for the delay! Thanks for picking this up Brandon.

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Assignee: Bartlomiej
> Priority: Normal
> Labels: lhf
> Fix For: 3.0.x, 3.11.x, 4.0.x
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748303#comment-17748303 ]

Brandon Williams commented on CASSANDRA-16290:

Assigned to you, go for it! :)

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Assignee: Bartlomiej
> Priority: Normal
> Labels: lhf
> Fix For: 3.0.x, 3.11.x, 4.0.x
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748301#comment-17748301 ]

Bartlomiej commented on CASSANDRA-16290:

Hi, I would like to try to implement this (hope it will not overwhelm me :D ). Can I assign it to myself?

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Priority: Normal
> Labels: lhf
> Fix For: 3.0.x, 3.11.x, 4.0.x
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17680790#comment-17680790 ]

Paulo Motta commented on CASSANDRA-16290:

Unassigning since I don't think this is being actively worked on.

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Priority: Normal
> Labels: lhf
> Fix For: 3.0.x, 3.11.x, 4.0.x
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17679102#comment-17679102 ]

Stefan Miklosovic commented on CASSANDRA-16290:

[~davidpaulk] any progress? If you have an in-progress branch available, it would be great to share it!

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Assignee: David Paulk
> Priority: Normal
> Labels: lhf
> Fix For: 3.0.x, 3.11.x, 4.0.x
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17483954#comment-17483954 ]

David Paulk commented on CASSANDRA-16290:

Synced with [~tejavadali] and [~paulo] on this ticket today - re-assigning to myself as I will be working on it in the upcoming month.

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Assignee: David Paulk
> Priority: Normal
> Labels: lhf
> Fix For: 3.0.x, 3.11.x, 4.0.x
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17469510#comment-17469510 ]

Krishna Vadali commented on CASSANDRA-16290:

I am taking this.

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Assignee: Krishna Vadali
> Priority: Normal
> Labels: lhf
> Fix For: 3.0.x, 3.11.x, 4.0.x
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17264363#comment-17264363 ]

Brandon Williams commented on CASSANDRA-16290:

+1 to that, fixvers updated.

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Assignee: Paulo Motta
> Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17264328#comment-17264328 ]

Jon Meredith commented on CASSANDRA-16290:

Given this is a longstanding issue, I don't think it should be a blocker for 4.0 release. Could we move it out to 4.0.x and not consider it a blocker for getting to a 4.0 RC?

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Assignee: Paulo Motta
> Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-beta
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17235772#comment-17235772 ]

Stefan Miklosovic commented on CASSANDRA-16290:

I'll take this out of bravery and drop it if I feel like it's too much.

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Assignee: Stefan Miklosovic
> Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-beta
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17235735#comment-17235735 ]

Paulo Motta commented on CASSANDRA-16290:

bq. we should also fix it in 3.0 and 3.11

+1

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-beta
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17235734#comment-17235734 ]

Marcus Eriksson commented on CASSANDRA-16290:

ok, duped here, and agree on 4.0-beta blocker, but we should also fix it in 3.0 and 3.11

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-beta
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17235703#comment-17235703 ]

Paulo Motta commented on CASSANDRA-16290:

Thanks for the heads up, missed that one [~marcuse]. Mind if we keep this one, since the description is more general (i.e. it also affects decommission)? I won't have cycles to grab this one soon but I'd be happy to review it, feel free to take it.

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Priority: Normal
> Fix For: 4.0-beta
[jira] [Commented] (CASSANDRA-16290) Consistency can be violated when bootstrap or decommission is resumed after node restart
[ https://issues.apache.org/jira/browse/CASSANDRA-16290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17235697#comment-17235697 ]

Marcus Eriksson commented on CASSANDRA-16290:

this is probably a dupe of https://issues.apache.org/jira/browse/CASSANDRA-15264 - feel free to grab that one if you have cycles [~paulo]!

> Consistency can be violated when bootstrap or decommission is resumed after node restart
>
> Key: CASSANDRA-16290
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16290
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Bootstrap and Decommission
> Reporter: Paulo Motta
> Priority: Normal
> Fix For: 4.0-beta