On Sat, Jul 6, 2013 at 1:50 PM, Rodrigo Felix <rodrigofelixdealme...@gmail.com> wrote:

> - Is it normal to take about 9 minutes to add a new node? The log
> generated by the script that adds a new node follows.

Sure.

> - Is there a way to reduce the time to start cassandra?

Not usually.

> - Sometimes the cleanup operation takes many minutes (about 10). Is this
> normal, since the amount of data is small (1.7gb at maximum / seed)?

Compaction is throttled, and cleanup is a type of compaction. Bootstrap is
also throttled via the streaming throttle.

> - Considering that I have two seeds in the beginning, their tokens are
> 0 and 85070591730234615865843651857942052864. When I add a new machine, do
> I need to execute move and cleanup on both seeds? Nowadays, I'm running
> cleanup on seed 0, move + cleanup on the other seed and neither move nor
> cleanup on the just added node. Is this OK?

Only nodes which have "lost" ranges need to run cleanup. In general you
should add new nodes "between" other nodes such that "move" is not required
at all.

> - What if I do not run cleanup on any existing node when adding or
> removing a node? Is the data that was not "cleaned up" still available if I
> send a scan, for instance, and the scan range is still on the node but
> wouldn't be there if I had run cleanup? Would the data be gathered from
> another node, i.e. the one that properly has the range specified in the
> scan query?

If data for range [x] is on node [a] but node [a] is no longer considered an
endpoint for range [x], it will never receive a request to serve range [x].

> - After decommissioning a node, is it advisable to run cleanup on the
> remaining nodes? Are the consequences of not running it the same as when
> adding a node?

Cleanup is only for the node which lost a range. In the decommission case, no
live nodes lost a range, only some nodes gained one.

=Rob
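
To illustrate adding a node "between" existing ones, here is a rough Python sketch. It assumes RandomPartitioner with a token space of 2**127 (consistent with the tokens 0 and 85070591730234615865843651857942052864 quoted above) and considers primary ranges only; with a replication factor above 1, other nodes may also lose replica ranges. The function names are made up for this example and are not part of Cassandra or nodetool.

```python
# Assumption: RandomPartitioner, tokens live on a ring of size 2**127.
RING_SIZE = 2 ** 127


def bisecting_token(tokens):
    """Return a token in the middle of the widest gap between existing tokens."""
    ring = sorted(tokens)
    best_gap, best_start = -1, None
    for i, start in enumerate(ring):
        end = ring[(i + 1) % len(ring)]
        # Distance clockwise from start to end; a single-node ring spans everything.
        gap = (end - start) % RING_SIZE or RING_SIZE
        if gap > best_gap:
            best_gap, best_start = gap, start
    return (best_start + best_gap // 2) % RING_SIZE


def node_losing_primary_range(tokens, new_token):
    """Return the existing token whose primary range shrinks.

    A node with token T owns (previous token, T], so the first existing
    token clockwise from new_token gives up part of its range.
    """
    ring = sorted(tokens)
    for t in ring:
        if t > new_token:
            return t
    return ring[0]  # wrapped around the ring


if __name__ == "__main__":
    seeds = [0, 85070591730234615865843651857942052864]
    new_token = bisecting_token(seeds)
    print("token for the new node:", new_token)
    print("node to run cleanup on:", node_losing_primary_range(seeds, new_token))
```

Under these assumptions the new node lands halfway between the two seeds, no existing node has to "move", and only the seed at the higher token gives up part of its primary range.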