> - When adding nodes to a cluster it's more efficient if you can change the
> range of existing nodes to be a subset of what they were responsible for
> previously. So the node only has to stream out data, rather than stream out
> and stream in data. Say you have this contrived example (where values are

Also, doubling is usually the easiest thing to do because it only involves inserting new nodes at the appropriate places. Any increase that involves moving nodes is more of an issue, because moving a node implies decommission+bootstrap. If your cluster is not under significant load this is not a huge problem, but if you're trying to expand a cluster live, with significant amounts of traffic, you may not want to remove any of your existing nodes even temporarily. So, if possible, I'd recommend doubling.

You can get around the problem of a move being a decommission+bootstrap by jumping through some extra hoops (essentially using an 'extra' node so you can do insertions followed by removals instead of moves), but it's more of a hassle.

-- 
/ Peter Schuller
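To illustrate why doubling avoids moves, here is a minimal sketch in Python. It assumes evenly spaced tokens on a ring of size 2**127 (a RandomPartitioner-style token space, which the thread above does not state). The helper names balanced_tokens and moves_required are hypothetical and are not part of any Cassandra tool.

```python
# Assumption: tokens are integers in [0, 2**127), evenly spaced for a
# balanced ring. The thread itself does not specify the partitioner.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count, ring_size=RING_SIZE):
    """Return evenly spaced tokens for a balanced ring of node_count nodes."""
    return [i * ring_size // node_count for i in range(node_count)]


def moves_required(old_count, new_count, ring_size=RING_SIZE):
    """Count existing tokens that have no matching slot in the new layout.

    Each such token belongs to a node that would have to be moved
    (decommission+bootstrap) to rebalance the ring.
    """
    old = set(balanced_tokens(old_count, ring_size))
    new = set(balanced_tokens(new_count, ring_size))
    return len(old - new)


if __name__ == "__main__":
    # A small ring size keeps the numbers readable.
    print(balanced_tokens(4, 120))    # [0, 30, 60, 90]
    print(balanced_tokens(8, 120))    # [0, 15, 30, 45, 60, 75, 90, 105]
    print(moves_required(4, 8, 120))  # 0: every old token is kept
    print(moves_required(4, 6, 120))  # 2: tokens 30 and 90 have no slot
```

Going from 4 to 8 nodes keeps every existing token, so the new nodes are simply inserted at the midpoints and each existing node only streams data out. Going from 4 to 6 leaves two existing tokens without a slot in the balanced layout, so those nodes would have to be moved.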