If the GitHub blob link doesn't work for the PDF (mobile browsers don't seem to like it), try the raw link:
https://github.com/jolynch/python_performance_toolkit/raw/master/notebooks/cassandra_availability/whitepaper/cassandra-availability-virtual.pdf

-Joey

On Mon, Apr 16, 2018 at 1:14 PM, Joseph Lynch <joe.e.ly...@gmail.com> wrote:

> Josh Snyder and I have been working on evaluating virtual nodes for large
> scale deployments and while it seems like there is a lot of anecdotal
> support for reducing the vnode count [1], we couldn't find any concrete
> math on the topic, so we had some fun and took a whack at quantifying how
> different choices of num_tokens impact a Cassandra cluster.
>
> According to the model we developed [2] it seems that at small cluster
> sizes there isn't much of a negative impact on availability, but when
> clusters scale up to hundreds of hosts, vnodes have a major impact on
> availability. In particular, the probability of outage during short
> failures (e.g. process restarts or failures) or permanent failure (e.g.
> disk or machine failure) appears to be orders of magnitude higher for large
> clusters.
>
> The model attempts to explain why we may care about this and advances a
> few existing/new ideas for how to fix the scalability problems that vnodes
> fix without the availability (and consistency—due to the effects on repair)
> problems high num_tokens create. We would of course be very interested in
> any feedback. The model source code is on github [3], PRs are welcome or
> feel free to play around with the jupyter notebook to match your
> environment and see what the graphs look like. I didn't attach the pdf here
> because it's too large apparently (lots of pretty graphs).
>
> I know that users can always just pick whichever number they prefer, but I
> think the current default was chosen when token placement was random, and I
> wonder whether it's still the right default.
>
> Thank you,
> -Joey Lynch
>
> [1] https://issues.apache.org/jira/browse/CASSANDRA-13701
> [2] https://github.com/jolynch/python_performance_toolkit/raw/master/notebooks/cassandra_availability/whitepaper/cassandra-availability-virtual.pdf
>     <https://github.com/jolynch/python_performance_toolkit/blob/master/notebooks/cassandra_availability/whitepaper/cassandra-availability-virtual.pdf>
> [3] https://github.com/jolynch/python_performance_toolkit/tree/master/notebooks/cassandra_availability
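
For a rough feel for why cluster size matters in the quoted message, here is a back-of-envelope Python sketch. It is not the model from the whitepaper [2]. It assumes random token placement, a replication factor of 3, and that each token on a host shares replicas with 2 * (RF - 1) other hosts picked uniformly at random. The function names are made up for illustration.

```python
def expected_neighbors(n_hosts, num_tokens, rf=3):
    """Expected number of distinct hosts sharing at least one range with a host.

    Assumption (not from the whitepaper): every token contributes
    2 * (rf - 1) neighbor "picks", each drawn uniformly at random from
    the other hosts, so this is a balls-into-bins occupancy estimate.
    """
    others = n_hosts - 1
    if others <= 0:
        return 0.0
    picks = 2 * (rf - 1) * num_tokens
    # Probability that a given other host is never picked, then complement.
    return others * (1.0 - (1.0 - 1.0 / others) ** picks)


def second_failure_overlap(n_hosts, num_tokens, rf=3):
    """Chance that a second, uniformly random host failure lands on a
    neighbor of an already-failed host, which would leave some range
    with two replicas down under this simplified model."""
    others = n_hosts - 1
    if others <= 0:
        return 0.0
    return expected_neighbors(n_hosts, num_tokens, rf) / others


if __name__ == "__main__":
    for n_hosts in (12, 96, 300):
        for num_tokens in (1, 16, 256):
            p = second_failure_overlap(n_hosts, num_tokens)
            print(f"hosts={n_hosts:4d} num_tokens={num_tokens:4d} overlap={p:.3f}")
```

Under these assumptions, a host with a single token in a large cluster overlaps with only a handful of other hosts, while a host with 256 tokens overlaps with nearly all of them, so almost any second failure affects some shared range. In a small cluster the overlap is already high with a single token, so raising num_tokens adds comparatively little.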