Hi Anthony,

There is a problem with replacing a dead node as described in the blog: if the 
replacement process takes longer than max_hint_window_in_ms, we must run repair 
to make the replacement node consistent again, since it missed ongoing writes 
during bootstrapping. But for a large cluster, repair is a painful process.
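
To make the timing concern concrete, here is a minimal Python sketch of the check described above. The function name is hypothetical, and the 3-hour default is an assumption based on the "three hours" hint window mentioned in this thread; use the max_hint_window_in_ms value from your own cassandra.yaml.

```python
# Assumption: 3 hours, matching the hint window quoted in this thread.
# Replace with the max_hint_window_in_ms configured on your cluster.
ASSUMED_MAX_HINT_WINDOW_MS = 3 * 60 * 60 * 1000


def repair_needed(unavailable_ms: int,
                  max_hint_window_ms: int = ASSUMED_MAX_HINT_WINDOW_MS) -> bool:
    """Return True when the node was unavailable (down time plus replacement
    time) for longer than the hint window, so hints alone cannot cover the
    writes it missed and a repair is required."""
    return unavailable_ms > max_hint_window_ms


# Example: a replacement that takes 5 hours exceeds a 3-hour hint window.
five_hours_ms = 5 * 60 * 60 * 1000
print(repair_needed(five_hours_ms))
```

This only captures the rule of thumb from the discussion; it does not inspect the cluster in any way.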
 
Thanks,
Peng Xiao






------------------ Original Message ------------------
From: "Anthony Grasso"<anthony.gra...@gmail.com>;
Date: Thursday, 22 March 2018, 7:13
To: "user"<user@cassandra.apache.org>;

Subject: Re: replace dead node vs remove node



Hi Peng,

Depending on the hardware failure you can do one of two things:



1. If the disks are intact and uncorrupted you could just use the disks with 
the current data on them in the new node. Even if the IP address changes for 
the new node that is fine. In that case all you need to do is run repair on the 
new node. The repair will fix any writes the node missed while it was down. 
This process is similar to the scenario in this blog post: 
http://thelastpickle.com/blog/2018/02/21/replace-node-without-bootstrapping.html


2. If the disks are inaccessible or corrupted, then use the method described 
in the blog post you linked to. The operation is similar to bootstrapping a new 
node. There is no need to perform any other remove or join operation on the 
failed or new nodes. As per the blog post, you definitely want to run repair on 
the new node as soon as it joins the cluster. In this case, the data on 
the failed node is effectively lost and replaced with data from other nodes in 
the cluster.
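
The two options above can be summarised as a small decision helper. This is only an illustrative Python sketch: the function name and the step wording are hypothetical, and it assumes the replacement is started with the -Dcassandra.replace_address JVM option, which is the approach the linked replace-a-dead-node post is about. Check the exact procedure for your Cassandra version before relying on it.

```python
from typing import List


def recovery_steps(disks_intact: bool, dead_node_ip: str) -> List[str]:
    """Return the ordered, human-readable steps for the two scenarios.

    Hypothetical helper: it only builds a checklist, it does not run anything.
    """
    if disks_intact:
        # Option 1: reuse the old disks, then repair to pick up missed writes.
        return [
            "attach the existing data disks to the new node and start Cassandra",
            "nodetool repair",
        ]
    # Option 2: data is lost; replace the dead node (similar to a bootstrap),
    # then repair as soon as the new node has joined.
    return [
        f"start Cassandra with -Dcassandra.replace_address={dead_node_ip}",
        "nodetool repair",
    ]


# Example with a made-up address for the dead node.
for step in recovery_steps(disks_intact=False, dead_node_ip="10.0.0.5"):
    print(step)
```

In both branches the last step is a repair on the new node, which is the common point of the two options.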


Hope this helps.


Regards,
Anthony


On Thu, 22 Mar 2018 at 20:52, Peng Xiao <2535...@qq.com> wrote:

Dear All,


When a node fails with hardware errors, it will be in DN status in the 
cluster. If we are not able to fix the error within three hours (the max hint 
window), we will lose data, right? We would have to run repair to restore 
consistency.
As per 
https://blog.alteroot.org/articles/2014-03-12/replace-a-dead-node-in-cassandra.html, 
we can replace this dead node. Is that the same as bootstrapping a new node? 
Does that mean we don't need to remove the node and rejoin?
Could anyone please advise?


Thanks,
Peng Xiao
