It's probably worth mentioning that some operational procedures, such as repairs and bootstrapping, are helped massively by using fewer tokens. Incremental repair is one of the things I would say is most impacted by it, since fewer tokens mean fewer local ranges to iterate through and less anticompaction. I would highly recommend using far fewer than 256 in 3.x.
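To make the trade-off a bit more concrete, here is a rough sketch (my own toy model, not Cassandra code) that assigns tokens purely at random, the way allocation works without the CASSANDRA-7032 allocator. It counts how many primary ranges the ring ends up with and how uneven ownership gets. The function name `simulate_ring`, the 6-node cluster size and the seed are made up for illustration, and replication is ignored, so a real node would iterate over more local ranges than this shows.

```python
import random

# Size of the Murmur3Partitioner token space (tokens span -2^63 .. 2^63 - 1).
RING_SIZE = 2**64


def simulate_ring(num_nodes, num_tokens, seed=0):
    """Give every node `num_tokens` random tokens.

    Returns (ownership_fractions, total_ranges), where ownership_fractions[i]
    is the share of the token space whose primary owner is node i.
    """
    rng = random.Random(seed)
    tokens = [
        (rng.randrange(RING_SIZE), node)
        for node in range(num_nodes)
        for _ in range(num_tokens)
    ]
    tokens.sort()

    owned = [0] * num_nodes
    # Each token owns the range (previous token, token]; the first token
    # wraps around and also covers the gap after the last token.
    prev = tokens[-1][0] - RING_SIZE
    for token, node in tokens:
        owned[node] += token - prev
        prev = token

    return [o / RING_SIZE for o in owned], len(tokens)


if __name__ == "__main__":
    for num_tokens in (1, 16, 256):
        fractions, ranges = simulate_ring(num_nodes=6, num_tokens=num_tokens)
        ideal = 1 / len(fractions)
        print(
            f"num_tokens={num_tokens:>3}  ranges={ranges:>4}  "
            f"largest share = {max(fractions) / ideal:.2f}x ideal"
        )
```

The range count grows linearly with num_tokens, which is the part that hurts repair, while the ownership spread generally tightens as num_tokens grows, which is why very low values are only comfortable with the smarter allocation in 3.0+.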
Chris

On Tue, Jul 11, 2017 at 8:36 PM, Justin Cameron <jus...@instaclustr.com> wrote:

> Hi,
>
> Using fewer vnodes means you'll have a higher chance of hot spots in your
> cluster. Hot spots in Cassandra are nodes that, by random chance, are
> responsible for a higher percentage of the token space than others. This
> means they will receive more data and also more traffic/load than other
> nodes in the cluster.
>
> CASSANDRA-7032 goes a long way towards addressing this issue by allocating
> vnode tokens more intelligently, rather than just randomly assigning them.
> If you're using a version of Cassandra that contains this feature (3.0+),
> you can use a smaller number of vnodes in your cluster.
>
> A high number of vnodes won't affect performance for most Cassandra
> workloads, but if you're running tasks that need to do token-range scans
> (such as Spark), there is usually a significant performance hit.
>
> If you're on C* 3.0+ and are using Spark (or similar workloads; the
> cassandra lucene index plugin is also affected) then I'd recommend using
> fewer vnodes. 16 would be ok. You'll probably still see some variance in
> token-space ownership between nodes, but the trade-off for better Spark
> performance will likely be worth it.
>
> Justin
>
> On Wed, 12 Jul 2017 at 00:34 ZAIDI, ASAD A <az1...@att.com> wrote:
>
>> Hi Folks,
>>
>> Pardon me if I’m missing something obvious. I’m still using
>> apache-cassandra 2.2 and planning to upgrade to 3.x.
>>
>> I came across this jira
>> [https://issues.apache.org/jira/browse/CASSANDRA-7032] that suggests
>> reducing num_tokens may improve the general performance of Cassandra,
>> i.e. having num_tokens=16 instead of 256 may help!
>>
>> Can you please suggest whether a lower num_tokens value would provide
>> real performance benefits, or whether it comes with any downsides that
>> we should also consider? I’d much appreciate your insights.
>>
>> Thank you
>>
>> Asad
>
> --
> Justin Cameron
> Senior Software Engineer