> Do you have an indication that at least the disk space is in fact
> consistent with the amount of data being streamed between the nodes? I
> think you had 90 -> ~ 450 gig with RF=3, right? Still sounds like a
> lot assuming repairs are not running concurrently (and compactions are
> able to run after a repair before the next repair of a neighbor
> starts).

Hi Peter,

When a repair was running on the 40GB keyspace I'd usually see range repairs for up to a couple thousand ranges per CF. If the number of ranges equals the number of keys, then that's a very small amount of data being moved around. However, at the time I hadn't noticed that multiple repairs were running concurrently on the same nodes and on their neighbors, so I suppose my experience isn't valid evidence of a bug. Still, I suspect it will help someone out along the way, because they'll have multiple repairs going on too, and I now have a much better understanding of what's going on myself.

I've now reloaded all the data in my cluster; the load is 140GB on each node, and I've been able to run a repair on each CF that comes out almost 100% consistent. I'm now starting to run the daily repair crons again to see whether they go out of whack.
