Re: Repair scheduling tools

2018-04-16 Thread Blake Eggleston
This thread is mainly focused on how repairs are scheduled, not on the implementation details of how the repairs themselves work. On 4/16/18, 11:07 AM, "Carl Mueller" wrote: So reading (https://www.datastax.com/dev/blog/anticompaction-in-cassandra-2-1)...

Re: Quantifying Virtual Node Impact on Cassandra Availability

2018-04-16 Thread Joseph Lynch
If the blob link on github doesn't work for the pdf (looks like mobile might not like it), try: https://github.com/jolynch/python_performance_toolkit/raw/master/notebooks/cassandra_availability/whitepaper/cassandra-availability-virtual.pdf -Joey

Quantifying Virtual Node Impact on Cassandra Availability

2018-04-16 Thread Joseph Lynch
Josh Snyder and I have been working on evaluating virtual nodes for large-scale deployments, and while there seems to be a lot of anecdotal support for reducing the vnode count [1], we couldn't find any concrete math on the topic, so we had some fun and took a whack at quantifying how ...

Re: Repair scheduling tools

2018-04-16 Thread Carl Mueller
So reading (https://www.datastax.com/dev/blog/anticompaction-in-cassandra-2-1)... anticompaction problems from repair seem related to the fact that the sstables for a repair range can have data that isn't in the repaired data range, so we then have an sstable with the repaired data (I'm ...

Re: Repair scheduling tools

2018-04-16 Thread Carl Mueller
Is the fundamental nature of sstable fragmentation the big wrench here? I've been trying to imagine aids like an offline repair resolver or a gradual node replacement/regenerator process that could serve as a backstop/insurance for compaction and repair problems. After all, some of the "we don't ...