Hi all,

*My issues:

I have a few high-capacity servers, each with about 5 TB of disk space. From the wiki I know there are two solutions for handling a node failure, but neither is very convenient for me:

Solution 1: new node + removetoken

Adding a new node transfers a lot of data between machines, and of course the removetoken operation transfers data as well. That means the data is moved between machines twice.
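To make the "transferred twice" point concrete, here is a minimal toy sketch of a single-replica token ring (the class, the ownerOf helper, and the token values are all made up for illustration; this is not Cassandra code, and real replica placement with RF > 1 is more involved):

```java
import java.util.*;

public class DoubleTransferDemo {
    // token -> node; a key is owned by the node holding the next token
    // clockwise (single replica, RF=1, for simplicity).
    static String ownerOf(NavigableMap<Integer, String> ring, int key) {
        Map.Entry<Integer, String> e = ring.ceilingEntry(key);
        return e != null ? e.getValue() : ring.firstEntry().getValue();
    }

    public static void main(String[] args) {
        NavigableMap<Integer, String> ring = new TreeMap<>();
        ring.put(100, "A");
        ring.put(200, "B");  // B is the dead node we want to replace
        ring.put(300, "C");

        // Transfer 1: bootstrap a new node N at token 150;
        // the range (100..150] must be streamed to N.
        ring.put(150, "N");
        System.out.println(ownerOf(ring, 140)); // N

        // Transfer 2: removetoken for dead B's token 200;
        // B's remaining range (150..200] must be re-streamed to C.
        ring.remove(200);
        System.out.println(ownerOf(ring, 180)); // C
    }
}
```

Both steps move data over the network, even though the new node could in principle have taken over B's range directly.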
Solution 2: repair

Though I have not tried this operation yet, I understand it is a heavy one, because it triggers a major compaction.

*My objective:

I think the most convenient way to handle a single machine's failure is:

1) Replace the bad node with a new one;
2) Copy data from the other nodes to the new node;
3) Start the service on the new node.

*Why not the standard bootstrap?

At first I thought bootstrap could handle this, but I was wrong, because of three problems:

1) The new node, which has no data before bootstrapping, is discovered by the other nodes as soon as the Gossiper is up. Before it switches to bootstrapping mode, it receives many routed messages and, of course, fails to respond to all of them.

2) After it switches to bootstrapping mode, the other nodes remove it from the ring. That means the topology of the ring has changed while the replica count has not yet been restored. As a result, many read requests are routed to the wrong nodes.

3) The new node, being in bootstrapping mode, considers itself a newcomer to a ring whose replica count has already been restored, so it calculates the replica nodes without including itself in the ring. Of course, that is not right.

*My solutions:

1) Hide the new node's state from the other nodes until bootstrapping is done. The new node can then stream the data without worrying about other nodes routing messages to it, because they consider it down. It seems sufficient to remove itself from the deltaEpStateMap in the ACK and ACK2 messages; as a result, the other nodes will not see it until it puts itself back.

2) Add itself to the ring temporarily while calculating the replica nodes. This way, the new node sees the original ring and can find the right replica nodes.

*My questions:

Am I right? What is the problem in my design? Why hasn't this been done before?

--------------
XL.Pan
2010-01-14
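P.S. To illustrate my proposal 2: a minimal sketch of simple clockwise replica selection on a toy ring (replicasFor, the node names, and the tokens are all invented for illustration; this is not Cassandra's real ReplicationStrategy API):

```java
import java.util.*;

public class ReplicaCalcDemo {
    // First `rf` distinct nodes clockwise from the key's position.
    static List<String> replicasFor(NavigableMap<Integer, String> ring,
                                    int key, int rf) {
        List<String> out = new ArrayList<>();
        for (String n : ring.tailMap(key).values())
            if (out.size() < rf && !out.contains(n)) out.add(n);
        for (String n : ring.values())   // wrap around the ring
            if (out.size() < rf && !out.contains(n)) out.add(n);
        return out;
    }

    public static void main(String[] args) {
        NavigableMap<Integer, String> ring = new TreeMap<>();
        ring.put(100, "A");
        ring.put(300, "C");
        ring.put(400, "D");

        // The bootstrapping node N (token 200) is missing from the ring
        // it sees, so it computes the wrong replicas for key 150:
        System.out.println(replicasFor(ring, 150, 2)); // [C, D] -- wrong

        // Proposal 2: temporarily add itself before calculating,
        // so it sees the original ring:
        ring.put(200, "N");
        System.out.println(replicasFor(ring, 150, 2)); // [N, C] -- right
        ring.remove(200); // then take itself out again
    }
}
```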