On Tue, Jun 3, 2014 at 3:48 PM, Matthew Allen <matthew.j.al...@gmail.com> wrote:
> Just out of curiosity, for a dead node, would it be possible to just
>
> - replace the node (no data in data/commit dirs), same IP Address, same
>   hostname.
> - restore the cassandra.yaml (initial_token etc)
> - set auto_bootstrap:false
> - start it up and then run a nodetool rebuild ?
>
> Or would the Host ID value change with the new node ?

That would work, but until CASSANDRA-6961 [1] there is no way to prevent
this node from having a long window, lasting until the rebuild completes,
where it may serve stale reads at CLs below QUORUM.

"rebuild" gets you exactly one replica's worth of data, just like
bootstrap does. If you want to actually sync a node with all of its
replicas and RF>2, you want "repair" and not "rebuild." I wish "rebuild"
had been named something else, because people seem to think it does
something it doesn't do.

This property of decreasing what I call "unique replica count" is why
people like me prefer to back up their nodes with something like
tablesnap [2], so that losing a node does not decrease the "unique
replica count."

A simpler solution, if you want to avoid the chance of inconsistency, is
to operate with CL.QUORUM instead of CL.ONE.

You'd be better off leaving auto_bootstrap set to true and setting
-Dcassandra.replace_address, which bootstraps you (from a single-replica
source per range) to the token owned by the dead node. This is exactly
like your process above, except that you don't serve stale reads while
doing so.

That said, the single-replica source thing is why people want to first
bootstrap (which does the same single-replica source thing as "rebuild"
but does not serve reads while it does so), then repair, and then,
finally, join the ring. Note that if writes are incoming, this does not
actually *close* the race window for stale reads at ONE, it just makes it
much shorter.

=Rob

[1] https://issues.apache.org/jira/browse/CASSANDRA-6961
[2] https://github.com/JeremyGrosser/tablesnap
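
As a rough illustration of the CL.QUORUM versus CL.ONE point above, here
is a minimal Python sketch of the usual overlap arithmetic: a read is
guaranteed to touch at least one replica that took the latest write when
read replicas + write replicas > RF. The function names are made up for
this example and are not part of Cassandra or any driver. It models only
a healthy cluster, and deliberately ignores a replica that has come back
empty, which is the "unique replica count" problem described above.

```python
def quorum(rf: int) -> int:
    """Number of replicas that make up a quorum for replication factor rf."""
    return rf // 2 + 1


def read_overlaps_write(rf: int, write_replicas: int, read_replicas: int) -> bool:
    """True when every read set must share a replica with every write set.

    This is the R + W > RF rule. It assumes every replica that
    acknowledged the write still holds that data.
    """
    return read_replicas + write_replicas > rf


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)  # 2 replicas when RF is 3
    print("QUORUM write, QUORUM read overlaps:", read_overlaps_write(rf, q, q))
    print("QUORUM write, ONE read overlaps:", read_overlaps_write(rf, q, 1))
```

With RF=3, QUORUM on both sides gives 2 + 2 > 3, so the sets always
overlap. A read at ONE gives 1 + 2 = 3, so it can land on the one replica
that missed the write.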