[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2017-08-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146625#comment-16146625
 ] 

ASF GitHub Bot commented on CASSANDRA-8523:
---

Github user asfgit closed the pull request at:

https://github.com/apache/cassandra/pull/145


> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.2.8, 3.0.9, 3.10
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2017-08-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146592#comment-16146592
 ] 

ASF GitHub Bot commented on CASSANDRA-8523:
---

GitHub user kgreav opened a pull request:

https://github.com/apache/cassandra/pull/145

Describe replacement cases based on CASSANDRA-8523 and CASSANDRA-12344

Just updating the docs to cover cases where replacing with a same IP node 
and different IP node.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kgreav/cassandra replace_fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/cassandra/pull/145.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #145


commit e9228182d7f6916eab0dd64afcc57fedb103f25c
Author: kurt 
Date:   2017-08-29T03:36:12Z

Describe replacement cases based on CASSANDRA-8523 and CASSANDRA-12344




> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.2.8, 3.0.9, 3.10
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-08-31 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15453146#comment-15453146
 ] 

Aleksey Yeschenko commented on CASSANDRA-8523:
--

Committed as 
[b39d984f7bd682c7638415d65dcc4ac9bcb74e5f|https://github.com/apache/cassandra/commit/b39d984f7bd682c7638415d65dcc4ac9bcb74e5f]
 to 2.2, merged with 3.0 and trunk. Thanks.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.2.8, 3.0.9, 3.10
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-08-30 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15449771#comment-15449771
 ] 

Paulo Motta commented on CASSANDRA-8523:


from [PR#1155|https://github.com/riptano/cassandra-dtest/pull/1155], it's done. 
:)

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-08-30 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15449749#comment-15449749
 ] 

sankalp kohli commented on CASSANDRA-8523:
--

waiting final +1 from whom?

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-08-30 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15449605#comment-15449605
 ] 

Paulo Motta commented on CASSANDRA-8523:


Waiting final +1 and then will mark as ready to commit.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-08-30 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15449594#comment-15449594
 ] 

Philip Thompson commented on CASSANDRA-8523:


I don't think the tests are waiting on anything at this point, but 
[~pauloricardomg] can confirm

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-08-30 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15449587#comment-15449587
 ] 

sankalp kohli commented on CASSANDRA-8523:
--

Any updates about the Dtests

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-08-18 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15427337#comment-15427337
 ] 

Paulo Motta commented on CASSANDRA-8523:


No, someone from the cassandra-dtest is reviewing, but since this will break 
existing tests it can only be commited after that is merged.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-08-18 Thread Richard Low (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15427335#comment-15427335
 ] 

Richard Low commented on CASSANDRA-8523:


Are you waiting for me to review the dtest PR?

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-08-18 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15427310#comment-15427310
 ] 

Paulo Motta commented on CASSANDRA-8523:


Thanks! This will be ready to commit after 
[PR|https://github.com/riptano/cassandra-dtest/pull/1155] is reviewed and 
merged. I rebased with the updated dtests and submitted a new CI run:

||2.2||3.0||trunk||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-8523]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-8523]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-8523]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-8523-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-8523-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-8523-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-8523-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-8523-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-8523-dtest/lastCompletedBuild/testReport/]|

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-08-10 Thread Richard Low (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15416328#comment-15416328
 ] 

Richard Low commented on CASSANDRA-8523:


+1 on the 3.9 version too.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-07-29 Thread Richard Low (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399949#comment-15399949
 ] 

Richard Low commented on CASSANDRA-8523:


I'll review the 3.9 version. I'm very much in favour of putting this in 2.2 and 
3.0 as this hurts us badly and no doubt others suffer too.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-07-29 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399941#comment-15399941
 ] 

Paulo Motta commented on CASSANDRA-8523:


Thanks for taking a look! Created CASSANDRA-12344 to follow-up with support for 
this when the replacement node has the same address as the original node.

Rebased patch and dtests as well as merged up to 3.0+. All patches and CI 
results available below:

||2.2||3.0||3.9||trunk||dtest||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-8523]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-8523]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.9...pauloricardomg:3.9-8523]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-8523]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:8523]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-8523-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-8523-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.9-8523-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-8523-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-8523-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-8523-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.9-8523-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-8523-dtest/lastCompletedBuild/testReport/]|

There were some minor merge conflicts on 3.0, and a slightly larger conflict on 
3.9 due to CASSANDRA-10134, so I did some refactoring in the 3.9+ version to 
move most of the logic to {{prepareForReplacement}}. Can you take another look 
[~rlow]?

Dtest PR created [here|https://github.com/riptano/cassandra-dtest/pull/1155].

While this is marked an improvement and would theoretically only go to trunk, 
this limitation is pretty counter-intuitive and probably hurts many users in 
the wild, and since the changeset is relatively small and self-contained, I 
think it could be interpreted as a bugfix and perhaps go on 2.2+ or maybe 3.0+. 
WDYT [~brandon.williams] [~jkni] ?

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-07-28 Thread Richard Low (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398667#comment-15398667
 ] 

Richard Low commented on CASSANDRA-8523:


+1 patch looks good. Really like the dtests.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-20 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340771#comment-15340771
 ] 

Paulo Motta commented on CASSANDRA-8523:


bq. The hints that you might care about are writes dropped during the 
replacement on the replacing node. 
Even these should not be a problem, because they are also forwarded to the 
replacement node, and if they fail they are hinted to replacement node ID.

I rebased and resubmitted dtests and fixed minor typo on the original patch. I 
also realized it's not necessary to change original node state to 
{{REMOVED_TOKEN}} because it's already removed from gossip when the replacement 
node changes its state to {{NORMAL}}, so I removed that step.

Also added new dtests to test replace in a mixed-version environment to verify 
backward compatibility and to check for the warning message when replacing a 
node with the same address.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-20 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340204#comment-15340204
 ] 

Joel Knighton commented on CASSANDRA-8523:
--

Assigning you to review [~rlow] - just saw the comment from two days ago. Let 
me know if anything has changed there; I'm available to review as well.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-20 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340200#comment-15340200
 ] 

Joshua McKenzie commented on CASSANDRA-8523:


Missed that comment above - [~rlow] to review. Thanks!

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-17 Thread Richard Low (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15337328#comment-15337328
 ] 

Richard Low commented on CASSANDRA-8523:


It's not those writes that matter - the replacement will get those writes from 
the other nodes during streaming. The hints that you might care about are 
writes dropped during the replacement on the replacing node. But those should 
be extremely rare, and a price well worth paying for getting the vast majority 
of the writes vs none today.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-17 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15337301#comment-15337301
 ] 

Brandon Williams commented on CASSANDRA-8523:
-

.bq Since it's no longer necessary to forward hints to the replacement node 
when replace_address != broadcast_address, the replacement node does not need 
to inherit the same ID of the original node.

I don't think that's necessarily true, if a node dies at point A, and then the 
replacement is issued at point N, the hints between points B and M are still 
missed if the host id is not the same.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-17 Thread Richard Low (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15337290#comment-15337290
 ] 

Richard Low commented on CASSANDRA-8523:


Thanks! I'm happy to review.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-17 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15337288#comment-15337288
 ] 

Paulo Motta commented on CASSANDRA-8523:


Due to the limitations of forwarding writes to replacement nodes with the same 
IP, I propose initially adding this support only to replacement nodes with a 
different IP, since it's much simpler and we can do it in a backward-compatible 
way so it can probably go on 2.2+.

After CASSANDRA-11559, we can extend this support to nodes with the same IP 
quite easily by setting an inactive flag on nodes being replaced and ignore 
these nodes on read.

The central idea is:
{quote}
* Add a new non-dead gossip state for replace BOOT_REPLACE
* When receiving BOOT_REPLACE, other node adds the replacing node as 
bootstrapping endpoint
* Pending ranges are calculated, and writes are sent to the replacing node 
during replace
* When replacing node changes state to NORMAL, the old node is removed and the 
new node becomes a natural endpoint on TokenMetadata
* The final step is to change the original node state to REMOVED_TOKEN so other 
nodes evict the original node from gossip
{quote}

Since it's no longer necessary to forward hints to the replacement node when 
{{replace_address != broadcast_address}}, the replacement node does not need to 
inherit the same ID of the original node.

The replacing process remains unchanged when the replacement node has the same 
IP as the original node. If that's the case, I added a warn message so users 
know they need to run repair if the node is down for longer than 
{{max_hint_window_in_ms}}:
{noformat}
Writes will not be redirected to this node while it is performing replace 
because it has the same address as the node to be replaced ({}). 
If that node has been down for longer than max_hint_window_in_ms, repair must 
be run after the replacement process in order to make this node consistent.
{noformat}

I adapted current dtests to test replace_address for both the old and the new 
path, and when {{replace_address != broadcast_address}} make sure writes are 
being redirected to the replacement node.

Initial patch and tests below (will provide 2.2+ patches after initial review):
||2.2||dtest||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-8523]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:8523]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-8523-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-8523-dtest/lastCompletedBuild/testReport/]|

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-02 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313417#comment-15313417
 ] 

Paulo Motta commented on CASSANDRA-8523:


I had a closer look at CASSANDRA-11559 and it should be possible to support 
that on trunk/3.x series without breaking compatibility, thus making this 
viable before 4.0 in a robust way. I will provide an initial patch for that 
soon.

We now need to decide whether we want to have a partial/fragile solution to 
this ticket before the ideal fix on 3.x leveraging #11559.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-02 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15312567#comment-15312567
 ] 

Paulo Motta commented on CASSANDRA-8523:


Solving this when the replacing node has a different ip address is quite simple:
* Add a new non-dead gossip state for replace BOOT_REPLACE
* When receiving BOOT_REPLACE, other node adds the replacing node as 
bootstrapping AND replacing endpoint
* Pending ranges are calculated, and writes are sent to the replacing node 
during replace
* When replacing node changes state to NORMAL, the old node is removed and the 
new node becomes a natural endpoint on {{TokenMetadata}}

I have a WIP patch with this approach 
[here|https://github.com/pauloricardomg/cassandra/commit/a4be7f7a0f298b1d52821f67ff2e927a1d1f1c5e]
 that is functional. The tricky part is when the replacing node has the same IP 
as the old node, because if this node ever joins gossip with a non-dead state, 
{{FailureDetector.isAlive(InetAddress)}} return true and reads are sent to the 
replacing node, since the original node is a natural endpoint.

My initial idea of removing the old node as a natural endpoint from 
{{TokenMetadata}} obviously does not work because this changes range 
assignments making reads go the following node in the ring, which has no data 
for that range. I also considered replacing the node IP in TokenMetadata with a 
special marker, but this will also not play along well with snitches and 
topologies.

A clean fix to this is to enhance the node representation on {{TokenMetadata}}, 
which is currently a plain {{InetAddresss}}, to include the information if an 
endpoint is available for reads (CASSANDRA-11559), but this cannot be done 
before cassandra 4.0 because it will change {{TokenMetadata}} and 
{{AbstractReplicationStrategy}} public interfaces and other related classes, so 
it's a major refactoring in the codebase. One step further is to change the 
{{FailureDetector}} interface to query nodes by {{UUIDs}} instead of 
{{InetAddress}}, so a replacing endpoint will not be confused with the original 
node since they will have different ids.

One workaround before that would be to send a read to a normal endpoint if both 
{{FailureDetector.isAlive(endpoint)}} is true and the endpoint is on {{NORMAL}} 
state, so we avoid sending reads to a replacing endpoint with the same IP 
address. A big downside is that this check should also be peformed for other 
operations such as bootstrap, repairs, hints, batches, etc. I experimented with 
that route [in this 
commit|https://github.com/pauloricardomg/cassandra/commit/a439edcbd8d4186ea2e3301681b7cc0b5012a265]
 by creating a method {{StorageService.isAvailable(InetAddress endpoint, 
boolean isPending) = \{FailureDetector.instance.isAlive(endpoint) && 
(isPendingEndpoint || Gossiper.instance.isNormalStatus(endpoint))\}}} that 
should be used instead of {{FailureDetector.isAlive(endpoint)}}, but I'm not 
very comfortable with this approach for the following reasons:
* It's pretty fragile, because that check must be used instead of 
{{FailureDetector.isAlive()}} in every place throughout the code
* Other developers/drivers/tools wil also need to use the same approach or a 
similar logic
* We will need to modify the write path to segregate writes between natural and 
pending endpoints (see an example on 
[doPaxosCommit|https://github.com/pauloricardomg/cassandra/commit/a439edcbd8d4186ea2e3301681b7cc0b5012a265#diff-71f06c193f5b5e270cf8ac695164f43aR498])

So unless someone comes up with a magic idea, I think we're left with two 
options to fix this before CASSANDRA-11559:
1. Go ahead with {{StorageService.isAvailable}} approach even with its 
downsides and fragility
2. Fix this only for replacing endpoints with a different IP address, falling 
back to the previous behavior when the replacing node has the same IP as the 
old node, which should already fix this in most cloud deployments, where 
replacement nodes usually have different IPs

I'm leaning more towards 2 since it's much simpler/safer and work on a proper 
general solution for 4.0 after CASSANDRA-11559. WDYT?

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent 

[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-01 Thread Richard Low (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15311526#comment-15311526
 ] 

Richard Low commented on CASSANDRA-8523:


Without understanding the FD details, this sounds good. Losing hints isn't an 
issue, as you say.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-05-27 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304281#comment-15304281
 ] 

Paulo Motta commented on CASSANDRA-8523:


Initial route I will take is to create a new dead state for replace, that adds 
the node as replacing endpoint to TokenMetadata. When a replacement endpoint is 
added to TokenMetadata, if there's an existing down node with the same IP, then 
we remove it from natural endpoints so the replacement node no longer receive 
reads when it becomes alive in the FD and include the replacement node as 
pending joining endpoint so writes are forwarded to it. After pending ranges 
for replace are calculated the final step is to set the node as alive in the FD 
and change the dead state logic to not mark "alive" dead state nodes as DOWN 
automatically (we will probably rename this nomenclature to a better name, like 
invisible or whatever), so the FD will work as usual and remove the replacing 
endpoint from TokenMetadata (and restore it as a natural endpoint) if it 
becomes down. The current streaming logic will probably be unaffected by this, 
and the node will change its state to NORMAL after stream is finished 
completing the replace procedure.

The only downside I can think of this approach is that we will lose hints 
during a failed replace, but this is not a big deal as hints are an 
optimization, and replace will probably take longer than max_hint_window anyway.

I will start going this route, feel free to give any feedback or let me know if 
I'm missing something on this high level flow.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-05-09 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15277068#comment-15277068
 ] 

Brandon Williams commented on CASSANDRA-8523:
-

[~pauloricardomg] assigning to you, please do continue working on this.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-04-15 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243834#comment-15243834
 ] 

Jason Brown commented on CASSANDRA-8523:


I agree with [~jkni] - a new gossip state and updating FD/TMD is the best way 
to go. I'll follow up on #9244 to get better grasp of what's going on there, as 
well.

wrt strongly consistent membership, that effort focuses primarily on correct 
and linearizable changes to the cluster state machine, and not any behaviors 
that should be triggered due to those changes when they arrive at cluster 
nodes. So, I don't *think* we'd run into problems between these two efforts, 
but good to keep the same players involved :)

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Brandon Williams
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-04-06 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15228474#comment-15228474
 ] 

Joel Knighton commented on CASSANDRA-8523:
--

I don't have a good idea for a low-hanging solution here - I think the cleanest 
is the combination of a new gossip state for replacing nodes + enhancement of 
the failure detector/token metadata. Since that's a bigger change, let's also 
make sure that it doesn't conflict with plans for strongly consistent 
membership. I'd really rather not couple gossip status to the internals of the 
read/write paths.

I also agree that we should make sure that we kill two birds with one stone and 
handle this and [CASSANDRA-9244] at the same time.

Let me know if I can help in design and/or review.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Brandon Williams
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-04-06 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15228317#comment-15228317
 ] 

Paulo Motta commented on CASSANDRA-8523:


There are two scenarios we should consider when replacing a node:
1) The replacing node has the same IP as the previous node
2) The replacing node has a different IP as the previous node

On CASSANDRA-9244 I have gotten pretty far in an 
[implementation|https://issues.apache.org/jira/browse/CASSANDRA-9244?focusedCommentId=15211202=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15211202]
 that adds a new non-dead gossip state {{BOOT_REPLACE}} and considers the 
replacing endpoint as a bootstrapping pending endpoint, solving case 2 
transparently.

Case 1 is trickier because when the replacing node enters gossip with a 
non-dead state, other nodes will think the previous node is back up and send 
reads to him (since he is a natural endpoint).

A simple way to solve this is to special-case the read path and ignore nodes in 
"NON-NORMAL" state when sending reads to natural endpoints. While this will 
probably solve the problem, there are quite a few different paths we need to 
hack to make sure this is enforced correctly (paxos, read, hints, etc), so I'm 
not totally comfortable with that.

A more transparent but a bit costlier approach to solve case 1 would be to 
change the {{TokenMetadata}} to keep nodes as {{(InetAddress, UUID)}} pairs, 
and create a new interface to the {{FailureDetector}} indexed by {{UUID}}. This 
way we could keep {{(IP=127.0.0.1,UUID=1)}} in {{TokenMetadata}} as a natural 
endpoint, and add a replacement node {{(IP=127.0.0.1,UUID=2)}} as a pending 
endpoint. So, during reads, {{FD.isAlive(UUID=1)}} would return false, and 
natural reads would not be sent to {{(IP=127.0.0.1,UUID=1)}}, while pending 
writes would be sent to {{(IP=127.0.0.1,UUID=2)}} because 
{{FD.isAlive(UUID=2)}} would return true.

I'd be happy to continue working on this, so feedback on any of the above or 
alternative approaches would be greatly appreciated.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Brandon Williams
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-04-04 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15225483#comment-15225483
 ] 

Stefania commented on CASSANDRA-8523:
-

bq. Hmm, can you can explain further?

Well, it would be a bit convoluted so a new Gossip state would probably be 
cleaner if we can add it. But the idea is to link a host to one or more hosts 
that act as 'shadows', that is they receive copies of all writes or hints but 
are not queried for reads. 

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Brandon Williams
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-04-04 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15225475#comment-15225475
 ] 

Brandon Williams commented on CASSANDRA-8523:
-

bq. I would consider a new, non dead, state for replacing a node rather than 
using hibernate. This should allow us to use the Failure Detector right?

It looks like this would work now, because the gossiper is the only thing that 
registers with the FD.  In the past this wasn't always true, though.

bq. Alternatively, would a new Gossip property that indicates a "shadow" of an 
existing node be easier to implement?

Hmm, can you can explain further?

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Brandon Williams
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-04-04 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15225470#comment-15225470
 ] 

Stefania commented on CASSANDRA-8523:
-

I would consider a new, non dead, state for replacing a node rather than using 
hibernate. This should allow us to use the Failure Detector right?

Alternatively, would a new Gossip property that indicates a "shadow" of an 
existing node be easier to implement? Could this property be made to expire 
unless it gets renewed by the shadow node? 


> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Brandon Williams
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for a nodes that are 
> bootstraping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2015-01-13 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14275862#comment-14275862
 ] 

Jonathan Ellis commented on CASSANDRA-8523:
---

Marking related to CASSANDRA-8494, but may want to resolve as duplicate.

 Writes should be sent to a replacement node while it is streaming in data
 -

 Key: CASSANDRA-8523
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Richard Wagner
Assignee: Brandon Williams
 Fix For: 2.0.12, 2.1.3


 In our operations, we make heavy use of replace_address (or 
 replace_address_first_boot) in order to replace broken nodes. We now realize 
 that writes are not sent to the replacement nodes while they are in hibernate 
 state and streaming in data. This runs counter to what our expectations were, 
 especially since we know that writes ARE sent to nodes when they are 
 bootstrapped into the ring.
 It seems like cassandra should arrange to send writes to a node that is in 
 the process of replacing another node, just like it does for a nodes that are 
 bootstraping. I hesitate to phrase this as we should send writes to a node 
 in hibernate because the concept of hibernate may be useful in other 
 contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
 Among other things, the fact that we don't get writes during this period 
 makes subsequent repairs more expensive, proportional to the number of writes 
 that we miss (and depending on the amount of data that needs to be streamed 
 during replacement and the time it may take to rebuild secondary indexes, we 
 could miss many many hours worth of writes). It also leaves us more exposed 
 to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2015-01-09 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14271464#comment-14271464
 ] 

Brandon Williams commented on CASSANDRA-8523:
-

I completely agree this is an improvement, but it's going to be pretty tricky, 
especially since we can't use the FD to determine if the node has died, at 
least not in its current form since that would mark the node as UP.

 Writes should be sent to a replacement node while it is streaming in data
 -

 Key: CASSANDRA-8523
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Richard Wagner
 Fix For: 2.0.12, 2.1.3


 In our operations, we make heavy use of replace_address (or 
 replace_address_first_boot) in order to replace broken nodes. We now realize 
 that writes are not sent to the replacement nodes while they are in hibernate 
 state and streaming in data. This runs counter to what our expectations were, 
 especially since we know that writes ARE sent to nodes when they are 
 bootstrapped into the ring.
 It seems like cassandra should arrange to send writes to a node that is in 
 the process of replacing another node, just like it does for a nodes that are 
 bootstraping. I hesitate to phrase this as we should send writes to a node 
 in hibernate because the concept of hibernate may be useful in other 
 contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
 Among other things, the fact that we don't get writes during this period 
 makes subsequent repairs more expensive, proportional to the number of writes 
 that we miss (and depending on the amount of data that needs to be streamed 
 during replacement and the time it may take to rebuild secondary indexes, we 
 could miss many many hours worth of writes). It also leaves us more exposed 
 to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)