Thanks, Stefan, for reviewing this. Please find my comments inline:
> We already provide tons of metrics and some useful logging (e.g.
> when reading too many tombstones), but I think we should still be able to
> implement further checks in-code that highlight potential issues. Maybe
> we
I agree with Stefan that we should use incremental repair and use patches
from Marcus to drop tombstones only from repaired data.
Regarding deep repair, you can bump the read repair and run the repair. The
issue is that you will stream a lot of data, and your blocking read
repairs will also go up.
We've seen this before but couldn't tie it to GCGS, so we ended up
forgetting about it. Now, with a reproducible test case, things make much
more sense and we should be able to fix this.
This is most certainly a bug with partition deletions and the handling
of GC grace seconds. It seems that the