I am Pan's colleague; allow me to make it clear. Pan's problem is:
If a node's data has been damaged, you cannot directly replace the old node with a new one unless you run 'removetoken' first.

But (suppose node A is dead) 'removetoken' will first re-create the replicas that went missing with A's death, which generates a lot of data on other nodes, say B, C, D.

After adding the new node and copying data from the other nodes through bootstrapping, you have to 'cleanup' the data that 'removetoken' just generated on B, C, D.

So B/C/D will carry a heavy I/O load (half of it wasted) in order to repair A. In Pan's case it will be 5TB (and will take days...).

Pan is trying to invent a method to repair A directly through streaming, with less impact on the other nodes.

---------END----------

-----Original Message-----
From: XL.Pan [mailto:pan_xiao...@sina.com]
Sent: Friday, January 15, 2010 10:23 AM
To: cassandra-user; cassandra-user
Subject: Re: Re: replace a bad node through bootstrapping

--------------------------------------------------
| Range changes                                  |
| Bootstrap                                      |
| Adding new nodes is called "bootstrapping."    |
--------------------------------------------------

Do you mean that "bootstrapping" is designed for adding new nodes only?
I think the bootstrapping idea is good enough to do something else, for example restoring the data on a bad node, though it would need some modification for that.

What's the difference between a new node which was NOT in the ring before and a new node which was in the ring before? I think there are some similarities and differences.
(Let the new one be called N, and the replacement one be called R.)

* similarities:
Both N and R have no available data, so both need to copy data from the replication sources.

* differences:
1) Writes during startup
N has not been seen before, so of course no other node holds handoff data for it. As a result, it must accept writes routed from other nodes while it is copying data.
R has been seen before and other nodes hold handoff data for it, so it need not worry about missing writes while it copies data from other sources; it will receive the handoff data after startup. That means R does not need to accept writes during that time.

2) Selection of replication sources

A-->B
|   |
D<--C

N wants to insert itself between B and C, so N knows that it can get data from C, D, A.
After bootstrapping, the ring will be:

A-->B
|   |
|   N
|   |
D<--C

Then node N goes down and is replaced with R. Because R has seen a different ring, it will select B and C.

From the comparison, I think it is possible to replace a bad node and restore its data through bootstrapping.

------------------
XL.Pan
2010-01-15

-------------------------------------------------------------
From: Jonathan Ellis
Sent: 2010-01-15 00:51:57
To: cassandra-user
Cc:
Subject: Re: replace a bad node through bootstrapping

On Thu, Jan 14, 2010 at 6:30 AM, XL.Pan <pan_xiao...@sina.com> wrote:
> *Why not the standard bootstrap?

http://wiki.apache.org/cassandra/Operations says that bootstrap is the preferred method for handling node replacement. Please read how that page describes handling this, because your description of how bootstrap works is very off base.

-Jonathan
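
To make the replication-source comparison in Pan's message concrete, here is a minimal sketch of a toy ring. It assumes a replication factor of 3 and a simple placement rule where each range is stored on its primary owner plus the next nodes clockwise. The function names `replicas` and `sources_for` and the RF value are illustrative assumptions, not taken from the thread, and this is not how Cassandra itself chooses bootstrap sources.

```python
def replicas(ring, owner_index, rf):
    """Nodes holding the range whose primary owner is ring[owner_index].

    Toy rule (an assumption): the owner plus the next rf-1 nodes clockwise.
    """
    return [ring[(owner_index + k) % len(ring)] for k in range(rf)]


def sources_for(ring, node, rf):
    """For every range stored on `node`, list the other nodes that hold it.

    Keys are the primary owners of those ranges; values are the nodes a
    replacement for `node` could copy that range from.
    """
    sources = {}
    for i, owner in enumerate(ring):
        holders = replicas(ring, i, rf)
        if node in holders:
            sources[owner] = [h for h in holders if h != node]
    return sources


RF = 3  # assumed replication factor
ring_before = ["A", "B", "C", "D"]       # ring before N joins
ring_after = ["A", "B", "N", "C", "D"]   # ring after N bootstraps

# Before N joins, the range N will take over belongs to C.
print(replicas(ring_before, ring_before.index("C"), RF))
# ['C', 'D', 'A']

# After N has joined, these are the other holders of each range N stores.
print(sources_for(ring_after, "N", RF))
# {'A': ['A', 'B'], 'B': ['B', 'C'], 'N': ['C', 'D']}
```

Under these assumptions the first result matches the C, D, A sources described for N. The second shows that a replacement node sees a different set of candidate sources, because it computes them from the ring that already contains the node being replaced.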