[
https://issues.apache.org/jira/browse/CASSANDRA-12888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15896342#comment-15896342
]
Stefan Podkowinski commented on CASSANDRA-12888:
------------------------------------------------
The repairedAt value stored in each sstable's metadata indicates the time at
which the sstable was repaired and nothing more. The basic idea behind tracking
such a timestamp was that once an sstable has been repaired, the data it
contains is consistent, in that no node is missing any of it (tombstones
included), and therefore we won't have to repair this data ever again. This is
what makes incremental repairs possible. As simple as the idea is, things start
to become a bit tricky when we want to merge data, either by compactions or, in
the case of this ticket, by applying mutations. Compactions have been
implemented so that we now have two pools of sstables that are compacted
independently of each other: unrepaired and repaired data. Within each pool,
sstables can be compacted together just fine, and in the case of repaired data
the lowest repairedAt value among the compaction candidates is used for the
output sstable. However, the actual timestamp value currently doesn't really
matter, as we only use it to track whether an sstable has been repaired or not.
Future repairs may be executed based on unrepaired data only (incremental) or
on both unrepaired and repaired data (full). Does this answer your question?
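
To make the compaction rule above concrete, here is a minimal sketch of it. It
is not Cassandra's actual code: the SSTable class, the outputRepairedAt helper
and the use of 0 as the "never repaired" marker are assumptions made purely for
illustration.

```java
import java.util.Arrays;
import java.util.List;

class RepairedAtSketch {
    // Assumption: a repairedAt of 0 marks an sstable that was never repaired.
    static final long UNREPAIRED = 0L;

    // Hypothetical, minimal stand-in for an sstable and its metadata.
    static final class SSTable {
        final String name;
        final long repairedAt;

        SSTable(String name, long repairedAt) {
            this.name = name;
            this.repairedAt = repairedAt;
        }

        boolean isRepaired() {
            return repairedAt != UNREPAIRED;
        }
    }

    // Returns the repairedAt value for the sstable produced by compacting the
    // given candidates. All candidates must come from the same pool.
    static long outputRepairedAt(List<SSTable> candidates) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no compaction candidates");
        }
        boolean repairedPool = candidates.get(0).isRepaired();
        long lowest = Long.MAX_VALUE;
        for (SSTable sstable : candidates) {
            if (sstable.isRepaired() != repairedPool) {
                // The two pools are compacted independently and never mixed.
                throw new IllegalArgumentException("mixed repaired/unrepaired candidates");
            }
            lowest = Math.min(lowest, sstable.repairedAt);
        }
        // Repaired pool: the lowest timestamp wins. Unrepaired pool: stays 0.
        return lowest;
    }

    public static void main(String[] args) {
        List<SSTable> repaired = Arrays.asList(
                new SSTable("a", 1488700000000L),
                new SSTable("b", 1488600000000L));
        List<SSTable> unrepaired = Arrays.asList(
                new SSTable("c", UNREPAIRED),
                new SSTable("d", UNREPAIRED));
        System.out.println("repaired pool output:   " + outputRepairedAt(repaired));
        System.out.println("unrepaired pool output: " + outputRepairedAt(unrepaired));
    }
}
```

The only property later repairs rely on in this sketch is whether the result is
non-zero, which matches the point that the concrete timestamp does not matter
much at the moment.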
> Incremental repairs broken for MVs and CDC
> ------------------------------------------
>
> Key: CASSANDRA-12888
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12888
> Project: Cassandra
> Issue Type: Bug
> Components: Streaming and Messaging
> Reporter: Stefan Podkowinski
> Assignee: Benjamin Roth
> Priority: Critical
> Fix For: 3.0.x, 3.11.x
>
>
> SSTables streamed during the repair process will first be written locally and
> afterwards either simply added to the pool of existing sstables or, in case
> of existing MVs or active CDC, replayed on a per-mutation basis.
> As described in {{StreamReceiveTask.OnCompletionRunnable}}:
> {quote}
> We have a special path for views and for CDC.
> For views, since the view requires cleaning up any pre-existing state, we
> must put all partitions through the same write path as normal mutations. This
> also ensures any 2is are also updated.
> For CDC-enabled tables, we want to ensure that the mutations are run through
> the CommitLog so they can be archived by the CDC process on discard.
> {quote}
> Using the regular write path turns out to be an issue for incremental
> repairs, as we lose the {{repaired_at}} state in the process. Eventually the
> streamed rows will end up in the unrepaired set, whereas the same rows on
> the sender side are moved to the repaired set. The next repair run will stream
> the same data back again, causing rows to bounce back and forth between nodes
> on each repair.
> See the linked dtest for steps to reproduce. An example for reproducing this
> manually using ccm can be found
> [here|https://gist.github.com/spodkowinski/2d8e0408516609c7ae701f2bf1e515e8]
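
The bouncing behaviour described in the quoted issue can be illustrated with a
toy model. This is a deliberately simplified sketch, not how Cassandra compares
or streams data: the Node class, the repairRound helper and the viaWritePath
flag are hypothetical names, and rows are reduced to plain string keys.

```java
import java.util.HashSet;
import java.util.Set;

class RepairBounceSketch {
    // Hypothetical node holding row keys in two pools.
    static final class Node {
        final Set<String> unrepaired = new HashSet<>();
        final Set<String> repaired = new HashSet<>();
    }

    // Runs one simplified incremental repair between two nodes and returns
    // the number of rows streamed. Only unrepaired data is compared.
    // viaWritePath = true models the MV/CDC path, where streamed rows are
    // replayed as mutations and lose their repaired state.
    static int repairRound(Node a, Node b, boolean viaWritePath) {
        Set<String> missingOnB = new HashSet<>(a.unrepaired);
        missingOnB.removeAll(b.unrepaired);
        Set<String> missingOnA = new HashSet<>(b.unrepaired);
        missingOnA.removeAll(a.unrepaired);

        // Data that took part in the repair is promoted to the repaired pool.
        a.repaired.addAll(a.unrepaired);
        a.unrepaired.clear();
        b.repaired.addAll(b.unrepaired);
        b.unrepaired.clear();

        // Streamed rows land in the unrepaired pool if they go through the
        // regular write path, otherwise they are added as repaired.
        (viaWritePath ? a.unrepaired : a.repaired).addAll(missingOnA);
        (viaWritePath ? b.unrepaired : b.repaired).addAll(missingOnB);

        return missingOnA.size() + missingOnB.size();
    }

    public static void main(String[] args) {
        Node a = new Node();
        Node b = new Node();
        a.unrepaired.add("row1"); // b is missing this row

        for (int round = 1; round <= 4; round++) {
            int streamed = repairRound(a, b, true);
            System.out.println("round " + round + ": streamed " + streamed + " row(s)");
        }
    }
}
```

In this model every round with viaWritePath set to true streams the same row
again, just in the opposite direction, while passing false lets the second
round find nothing left to stream.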