[ https://issues.apache.org/jira/browse/CASSANDRA-12888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15723449#comment-15723449 ]

Benjamin Roth commented on CASSANDRA-12888:
-------------------------------------------

Another example:

A repair of a KS with approx. 1.7TB total on 8 nodes, 7 base tables and 4 MVs,
increased from roughly 18:30h to 23:30h. I would explain it like this:
this patch causes more validation work, as it now also validates the MVs, not
only the base tables. Depending on the ratio of base tables to MVs and on the
extent of detected inconsistencies and repair streams, it is possible that this
patch performs worse than before.
From my point of view that is still ok, because it will never perform worse
than if all the MVs were normal tables. But if there are a lot of streams, e.g.
due to a node failure recovery, this patch will perform much, much better than
before.
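
A rough back-of-envelope on the numbers above (purely illustrative; it assumes
validation work scales with the number of tables validated, and that validation
is only one part of the total repair time):

{code:java}
public class RepairOverheadEstimate
{
    public static void main(String[] args)
    {
        int baseTables = 7, views = 4;

        // Before the patch only base tables were validated; now the MVs are too.
        double validationWorkRatio = (double) (baseTables + views) / baseTables; // 11/7 ~ 1.57

        // Observed wall-clock change from the comment above: 18:30h -> 23:30h.
        double observedRatio = 23.5 / 18.5;                                      // ~1.27

        System.out.printf("validation work grows by ~%.0f%%, observed repair time by ~%.0f%%%n",
                          (validationWorkRatio - 1) * 100, (observedRatio - 1) * 100);
        // The observed increase is smaller than the validation increase because
        // validation is only part of the repair (streaming, anticompaction, ...).
    }
}
{code}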

> Incremental repairs broken for MVs and CDC
> ------------------------------------------
>
>                 Key: CASSANDRA-12888
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12888
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>            Reporter: Stefan Podkowinski
>            Assignee: Benjamin Roth
>            Priority: Critical
>             Fix For: 3.0.x, 3.x
>
>
> SSTables streamed during the repair process will first be written locally and 
> afterwards either simply added to the pool of existing sstables or, in case 
> of existing MVs or active CDC, replayed on a per-mutation basis, as
> described in {{StreamReceiveTask.OnCompletionRunnable}}:
> {quote}
> We have a special path for views and for CDC.
> For views, since the view requires cleaning up any pre-existing state, we 
> must put all partitions through the same write path as normal mutations. This 
> also ensures any 2is are also updated.
> For CDC-enabled tables, we want to ensure that the mutations are run through 
> the CommitLog so they can be archived by the CDC process on discard.
> {quote}
> Using the regular write path turns out to be an issue for incremental
> repairs, as we lose the {{repaired_at}} state in the process. Eventually the
> streamed rows will end up in the unrepaired set, in contrast to the rows on
> the sender side, which are moved to the repaired set. The next repair run will
> stream the same data back again, causing rows to bounce back and forth
> between nodes on each repair.
> See the linked dtest for steps to reproduce. An example for reproducing this
> manually using ccm can be found
> [here|https://gist.github.com/spodkowinski/2d8e0408516609c7ae701f2bf1e515e8].
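
To make the two receive paths described in the issue concrete, here is a
minimal, purely illustrative decision function. The names ({{ReceivePathSketch}},
{{pathFor}}) are made up for the sketch and are not the actual
{{StreamReceiveTask}} API:

{code:java}
// Toy model of the choice described in the issue, not the actual
// StreamReceiveTask code: streamed sstables are either added to the live set
// directly, or replayed through the normal write path for MV/CDC tables.
public class ReceivePathSketch
{
    enum ReceivePath { ADD_SSTABLES_DIRECTLY, REPLAY_THROUGH_WRITE_PATH }

    static ReceivePath pathFor(boolean tableHasViews, boolean cdcEnabled)
    {
        // Views need the normal write path so pre-existing view state gets
        // cleaned up; CDC needs the mutations to pass through the commit log.
        // It is exactly this path that drops the sstable-level repaired_at value.
        return (tableHasViews || cdcEnabled) ? ReceivePath.REPLAY_THROUGH_WRITE_PATH
                                             : ReceivePath.ADD_SSTABLES_DIRECTLY;
    }
}
{code}

And a small self-contained simulation of the resulting ping-pong between two
replicas' repaired/unrepaired sets, assuming (as the issue describes) that
incremental repair only compares unrepaired data, that anticompaction marks the
sender's copy repaired, and that streamed rows land in the receiver's
unrepaired set. The node names and the {{incrementalRepairRound}} helper are
invented for the sketch:

{code:java}
import java.util.HashSet;
import java.util.Set;

// Toy simulation of the "bouncing rows" effect described in the issue.
// Two replicas A and B; each keeps a repaired and an unrepaired set of rows.
public class RepairedAtBounceSketch
{
    static Set<String> repairedA = new HashSet<>(), unrepairedA = new HashSet<>();
    static Set<String> repairedB = new HashSet<>(), unrepairedB = new HashSet<>();

    // One incremental repair round: only the unrepaired sets are compared.
    static void incrementalRepairRound()
    {
        Set<String> aToB = new HashSet<>(unrepairedA); aToB.removeAll(unrepairedB);
        Set<String> bToA = new HashSet<>(unrepairedB); bToA.removeAll(unrepairedA);

        // Sender side: anticompaction moves the validated data to the repaired set.
        repairedA.addAll(unrepairedA); unrepairedA.clear();
        repairedB.addAll(unrepairedB); unrepairedB.clear();

        // Receiver side: with MVs/CDC the stream is replayed through the write
        // path, repaired_at is lost, so the rows land in the unrepaired set again.
        unrepairedB.addAll(aToB);
        unrepairedA.addAll(bToA);
    }

    public static void main(String[] args)
    {
        unrepairedA.add("row1"); // a row that existed only on A before the first repair

        for (int round = 1; round <= 4; round++)
        {
            incrementalRepairRound();
            System.out.printf("round %d: A unrepaired=%s, B unrepaired=%s%n",
                              round, unrepairedA, unrepairedB);
        }
        // Output: row1 alternates between A's and B's unrepaired sets, i.e. it is
        // streamed again on every repair instead of settling in the repaired sets.
    }
}
{code}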



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
