There are a couple of things that could be happening here:
- There will be timing differences between when the nodes participating in
the repair flush their memtables, so on write-heavy tables there will always
be minor differences during validation. These can be accentuated by
low-resolution Merkle trees, which mostly affects larger tables.
- SSTables compacted during an incremental repair will not be marked as
repaired, so nodes with different compaction cadences will have different
data in their unrepaired sets, which will cause mismatches in subsequent
incremental repairs. CASSANDRA-9143 will hopefully fix that limitation.
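If you want to check whether the second point applies, you can inspect the
repaired status of SSTables on each replica with the sstablemetadata tool
(the data directory path below is illustrative; adjust it to your keyspace
and table):

```shell
# "Repaired at: 0" means the SSTable is still in the unrepaired set;
# a non-zero timestamp means it was marked repaired by an incremental repair.
# Path is an example only; substitute your own keyspace/table directory.
sstablemetadata /var/lib/cassandra/data/ks/table-*/*-Data.db | grep "Repaired at"
```

Comparing this across replicas should show whether their unrepaired sets
have diverged due to different compaction timing.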
2016-09-22 7:10 GMT-03:00 Stefano Ortolani <ostef...@gmail.com>:
> I am seeing something weird while running repairs.
> I am testing 3.0.9 so I am running the repairs manually, node after node,
> on a cluster with RF=3. I am using a standard repair command (incremental,
> parallel, full range), and I just noticed that the third node detected some
> ranges out of sync with one of the nodes that just finished repairing.
> Since there were no dropped mutations, that sounds weird to me considering
> that the repairs are supposed to operate on the whole range.
> Any idea why?
> Maybe I am missing something?