Thanks for sharing this! I added some comments/suggestions on the ticket
for those interested.

On a side note, it's still not clear whether we should hold discussions here on
the dev list, or just call attention to a particular issue/ticket and then
continue the discussion on JIRA. I find the latter more appropriate, since it
avoids spamming those not interested, and we can post here again only if there
are new developments on the ticket.

2016-08-24 18:35 GMT-03:00 Blake Eggleston <beggles...@apple.com>:

> Hi everyone,
>
> I just posted a proposed solution to some issues with incremental repair
> in CASSANDRA-9143. The solution involves non-trivial changes to the way
> incremental repair works, so I’m giving it a shout out on the dev list in
> the spirit of increasing the flow of information here.
>
> Summary of problem:
>
> Anticompaction excludes sstables that have been, or are, compacting.
> Anticompactions can also fail on a single machine due to any number of
> reasons. In either scenario, a potentially large amount of data can end up
> marked as unrepaired on one machine while being marked as repaired on the
> others. During the next incremental repair, that data will be unnecessarily
> streamed out to the other nodes, because it won't match anything in their
> unrepaired data.
>
> Proposed solution:
>
> Add a ‘pending repair’ bucket to the existing repaired and unrepaired
> sstable buckets. We do the anticompaction up front, but put the
> anticompacted data into the pending bucket. From here, the repair proceeds
> normally against the pending sstables, with the streamed sstables also
> going into the pending bucket. Once all nodes have completed streaming,
> the pending sstables are moved into the repaired bucket, or back into
> unrepaired if there's a failure.
>
> - Blake
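
The bucket transitions Blake describes amount to a small state machine per
sstable. Here's a minimal sketch of that idea (this is illustrative only, not
Cassandra's actual implementation; the `Bucket` names and helper functions are
hypothetical):

```python
from enum import Enum

class Bucket(Enum):
    UNREPAIRED = "unrepaired"
    PENDING = "pending"      # the proposed 'pending repair' bucket
    REPAIRED = "repaired"

def anticompact(states, session_sstables):
    # Up-front anticompaction: the sstables included in the repair session
    # move from unrepaired into the pending bucket before streaming starts.
    for s in session_sstables:
        assert states[s] is Bucket.UNREPAIRED
        states[s] = Bucket.PENDING

def finish_session(states, success):
    # If all nodes completed streaming, pending sstables are promoted to
    # repaired; on failure they fall back to unrepaired, so no node ends up
    # marked repaired while the others are not.
    target = Bucket.REPAIRED if success else Bucket.UNREPAIRED
    for s, bucket in states.items():
        if bucket is Bucket.PENDING:
            states[s] = target
```

The point of the intermediate state is that a failure anywhere in the session
rolls every participant back to a consistent view, instead of leaving one
machine's data unrepaired while the others consider it repaired.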
