[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097250#comment-13097250 ]

Jonathan Ellis commented on CASSANDRA-957:

Can you add a short how-to to NEWS.txt describing this feature?

convenience workflow for replacing dead node

Key: CASSANDRA-957
URL: https://issues.apache.org/jira/browse/CASSANDRA-957
Project: Cassandra
Issue Type: Wish
Components: Core, Tools
Affects Versions: 0.8.2
Reporter: Jonathan Ellis
Assignee: Vijay
Fix For: 1.0
Attachments: 0001-Support-bringing-back-a-node-to-the-cluster-that-exi.patch, 0001-support-for-replace-token-v3.patch, 0001-support-token-replace-v4.patch, 0001-support-token-replace-v5.patch, 0001-support-token-replace-v6.patch, 0001-support-token-replace-v7.patch, 0002-Do-not-include-local-node-when-computing-workMap.patch, 0002-hints-on-token-than-ip-v4.patch, 0002-hints-on-token-than-ip-v5.patch, 0002-hints-on-token-than-ip-v6.patch, 0002-upport-for-hints-on-token-v3.patch
Original Estimate: 24h
Remaining Estimate: 24h

Replacing a dead node with a new one is a common operation, but nodetool removetoken followed by bootstrap is inefficient (re-replicating data first to the remaining nodes, then to the new one) and manually bootstrapping to a token just less than the old one's, followed by nodetool removetoken is slightly painful and prone to manual errors.

First question: how would you expose this in our tool ecosystem? It needs to be a startup-time option to the new node, so it can't be nodetool, and messing with the config xml definitely takes the convenience out. A one-off -DreplaceToken=XXY argument?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
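For illustration, the one-off startup argument floated in the issue description could be read as a JVM system property. The sketch below is a minimal, hypothetical helper and is not taken from any of the attached patches: the class name ReplaceTokenOption is invented here, the property name replaceToken is simply the one proposed in the description (the committed name may differ), and the token is assumed to be an integer, as it would be with RandomPartitioner.

```java
import java.math.BigInteger;
import java.util.Optional;

/** Hypothetical helper for illustration only; not part of Cassandra. */
final class ReplaceTokenOption {
    // Property name as proposed in the issue description (assumption).
    static final String PROPERTY = "replaceToken";

    private ReplaceTokenOption() {}

    /** Parses a raw property value; returns empty when the option is unset or blank. */
    static Optional<BigInteger> parse(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return Optional.empty();
        }
        try {
            // Assumes integer tokens (RandomPartitioner-style).
            return Optional.of(new BigInteger(raw.trim()));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                "Invalid value for -D" + PROPERTY + ": " + raw, e);
        }
    }

    /** Reads the option from the JVM system properties set at startup. */
    static Optional<BigInteger> fromSystemProperties() {
        return parse(System.getProperty(PROPERTY));
    }
}
```

With a JVM started as `java -DreplaceToken=12345 ...`, `fromSystemProperties()` would yield that token; without the flag it yields an empty Optional, so a node could fall through to its normal startup path.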
[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097708#comment-13097708 ]

Hudson commented on CASSANDRA-957:

Integrated in Cassandra #1076 (See [https://builds.apache.org/job/Cassandra/1076/])
convenience workflow for replacing dead node
patch by Vijay; reviewed by Nick Bailey for CASSANDRA-957

jbellis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1165468
Files :
* /cassandra/trunk/NEWS.txt
* /cassandra/trunk/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/HintedHandOffManager.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/RowMutation.java
* /cassandra/trunk/src/java/org/apache/cassandra/dht/BootStrapper.java
* /cassandra/trunk/src/java/org/apache/cassandra/gms/EndpointState.java
* /cassandra/trunk/src/java/org/apache/cassandra/gms/Gossiper.java
* /cassandra/trunk/src/java/org/apache/cassandra/gms/VersionedValue.java
* /cassandra/trunk/src/java/org/apache/cassandra/service/LoadBroadcaster.java
* /cassandra/trunk/src/java/org/apache/cassandra/service/MigrationManager.java
* /cassandra/trunk/src/java/org/apache/cassandra/service/StorageProxy.java
* /cassandra/trunk/src/java/org/apache/cassandra/service/StorageService.java
[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096078#comment-13096078 ]

Nick Bailey commented on CASSANDRA-957:

Looks like an onRestart() method was added for the state subscribers. I think it was accidental but looks like your change removes that call and replaces it with markDead(). Was that accidental?
[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096360#comment-13096360 ]

Nick Bailey commented on CASSANDRA-957:

+1 on 0001 v7 and 0002 v6
[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091364#comment-13091364 ]

Nick Bailey commented on CASSANDRA-957:

The hint rework looks good. The only comment I have there is that it would be nice if the logging statements for sending and creating hints indicated the IP as well as the token. Even though a hint is stored by token, it would be nice to see the IP in the log immediately without having to look it up.

I'm also unsure about the reasoning behind the last patch. Why increase the initial sleep in joinTokenRing?
[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091376#comment-13091376 ]

Vijay commented on CASSANDRA-957:

* In Gossiper.doStatusCheck() you made it ignore any state that is for the local endpoint and is not a dead state. Shouldn't it just always ignore any state about the local endpoint though? Basically what it was doing previously?
* Basically the same question about Gossiper.applyStateLocally(): the loop continues if the state is for the local node and the state is dead. Why would we want to apply a live local state?
- Fixed. The initial intention was to find the old state of the node; it seems that is not possible now.

* Does the hibernate state need the true/false value? Seems like all we care about is that it is set at all. It looks like when we are starting up right now we automatically go into a hibernate state, then go into a bootstrap state afterwards if the user specified a replace token. Seems like we shouldn't set a state at all until we know we are doing one of replace/bootstrap/just joining.
- It will be either true or false (if not a replace, or overwritten with the normal state). If you don't set it, Gossiper.applyStateLocally will mark the node alive on all the other nodes.

* It looks like right now you could specify a replace token that isn't part of the cluster. If that happens we should throw an exception and tell the user to do the normal bootstrap process.
- As we are ignoring the local states, this information is hard to gather when we are trying to replace the same node. The check is to see that no other live node owns this token.
- We can document in the wiki the effects of replacing a token which is not part of the ring (repair/decommission).

* Why use the last gossip time to determine if the node we are replacing is alive? Why not just check gossip to see if the ring thinks it is alive?
- Because by default, when we hear about a node we consider it to be alive. The idea is to check whether we have heard back from it after the ring delay; if not, it is more probable that the dead node really is dead (that's why we have to wait for 90 + delay).

* We should update the message for the exception that is thrown when you try to bootstrap to an existing token. It should indicate either remove the dead node or follow this replacement process.
- I am not sure I parse that; I have added more to the message, please check.

* I'm not sure why we are calling updateNormalToken() in the StorageService.bootstrap() method when it's a token replacement.
- That's because you don't want the range request sent to a node that does not exist.

* A little bit of doc on this would be good, maybe in cassandra.yaml? Just on how to pass the argument to the startup process.
- YAML is a bad fit because this is a one-time thing. A wiki page, like for the don't-join-ring property?
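The last-gossip-time reasoning in the reply above can be pictured roughly as follows. This is only a sketch under stated assumptions, not the code from the patch: the class and method names are hypothetical, timestamps are plain milliseconds, and the caller is assumed to have already waited out the ring delay before asking.

```java
/** Hypothetical illustration of a "have we heard from it recently?" check; not Cassandra code. */
final class ReplaceLivenessCheck {
    private ReplaceLivenessCheck() {}

    /**
     * Returns true when the node holding the token has been silent for at least
     * the ring delay, so it is more likely to be genuinely dead.
     *
     * @param lastHeardMillis time of the last gossip update seen for the old node (assumed input)
     * @param nowMillis       current time, taken after waiting for the ring delay
     * @param ringDelayMillis how long the replacing node waited (value is an assumption)
     */
    static boolean looksDead(long lastHeardMillis, long nowMillis, long ringDelayMillis) {
        if (ringDelayMillis < 0 || nowMillis < lastHeardMillis) {
            throw new IllegalArgumentException("inconsistent timestamps or negative delay");
        }
        // Merely knowing about a node does not prove it is alive; what matters is
        // whether it has sent anything during the waiting period.
        return nowMillis - lastHeardMillis >= ringDelayMillis;
    }
}
```

A replacing node would proceed only when `looksDead(...)` is true and otherwise refuse to take over the token; the exact threshold (the "90 + delay" mentioned above) is left to the caller here.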
[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091386#comment-13091386 ]

Vijay commented on CASSANDRA-957:

I'm also unsure about the reasoning behind the last patch. Why increase the initial sleep in joinTokenRing?
-- Ring delay + extra time so we can check if there is any live server before actually replacing the node.
[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090577#comment-13090577 ]

Nick Bailey commented on CASSANDRA-957:

So a few questions:

* In Gossiper.doStatusCheck() you made it ignore any state that is for the local endpoint and is not a dead state. Shouldn't it just always ignore any state about the local endpoint though? Basically what it was doing previously?
* Basically the same question about Gossiper.applyStateLocally(): the loop continues if the state is for the local node and the state is dead. Why would we want to apply a live local state?
* Does the hibernate state need the true/false value? Seems like all we care about is that it is set at all. It looks like when we are starting up right now we automatically go into a hibernate state, then go into a bootstrap state afterwards if the user specified a replace token. Seems like we shouldn't set a state at all until we know we are doing one of replace/bootstrap/just joining.
* It looks like right now you could specify a replace token that isn't part of the cluster. If that happens we should throw an exception and tell the user to do the normal bootstrap process.
* Why use the last gossip time to determine if the node we are replacing is alive? Why not just check gossip to see if the ring thinks it is alive?
* We should update the message for the exception that is thrown when you try to bootstrap to an existing token. It should indicate either remove the dead node or follow this replacement process.
* I'm not sure why we are calling updateNormalToken() in the StorageService.bootstrap() method when it's a token replacement.
* A little bit of doc on this would be good, maybe in cassandra.yaml? Just on how to pass the argument to the startup process.

I also need to dive into the hint stuff a little bit more, I'm less familiar with that code.
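The fail-fast behaviour requested in the fourth bullet above (reject a replace token that is not part of the ring) might look something like the sketch below. Everything here is hypothetical and for illustration only: the class name, the map of token to endpoint, and the set of live endpoints are assumed inputs, not Cassandra data structures, and the extra live-owner check is an addition in the same spirit rather than something the comment spells out.

```java
import java.math.BigInteger;
import java.util.Map;
import java.util.Set;

/** Hypothetical validation sketch; not the logic from the attached patches. */
final class ReplaceTokenValidator {
    private ReplaceTokenValidator() {}

    /**
     * Throws if the token cannot sensibly be replaced.
     *
     * @param replaceToken    token supplied at startup
     * @param tokenToEndpoint ring view, token to owning endpoint (assumed input)
     * @param liveEndpoints   endpoints currently believed to be alive (assumed input)
     */
    static void validate(BigInteger replaceToken,
                         Map<BigInteger, String> tokenToEndpoint,
                         Set<String> liveEndpoints) {
        String owner = tokenToEndpoint.get(replaceToken);
        if (owner == null) {
            // Unknown token: point the operator at the normal bootstrap process.
            throw new IllegalStateException("Token " + replaceToken
                + " is not part of the ring; use the normal bootstrap process instead");
        }
        if (liveEndpoints.contains(owner)) {
            // Replacing a token held by a live node would hijack its range.
            throw new IllegalStateException("Token " + replaceToken
                + " is owned by live node " + owner + "; refusing to replace it");
        }
    }
}
```

Throwing at startup keeps the mistake visible to the operator instead of letting the node join with an unintended token.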
[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069029#comment-13069029 ]

Vijay commented on CASSANDRA-957:

Seems like CASSANDRA-2928 fixes the hints issue... so we can ignore 0003 in this ticket.
[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056504#comment-13056504 ]

Vijay commented on CASSANDRA-957:

Thanks Chris, I will give it a try...
[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056293#comment-13056293 ]

Chris Goffinet commented on CASSANDRA-957:

I am going to defer this to anyone else who would like to pick up this ticket. I just do not have the spare time to focus on this.
[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045729#comment-13045729 ]

Jonathan Ellis commented on CASSANDRA-957:

Chris, are you still working on this?
[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045741#comment-13045741 ]

Chris Goffinet commented on CASSANDRA-957:

Yes. I'll post an update next week with an updated patchset.
[jira] Commented: (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991058#comment-12991058 ]

Nick Bailey commented on CASSANDRA-957:

What happens if I bring up a node with the same IP and bootstrap on but forget the replace option? It looks like it will try to bootstrap to an auto-picked token. Am I reading that right?

What happens if I accidentally give the wrong token with the replace option? If I accidentally give the token for a live node, will it try to bootstrap to the same position?
[jira] Commented: (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991095#comment-12991095 ]

Chris Goffinet commented on CASSANDRA-957:

Good observation. I'll work on covering both use cases.
[jira] Commented: (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853687#action_12853687 ]

Ryan King commented on CASSANDRA-957:

It would be easier if a node could start without joining the ring: CASSANDRA-526.