[
https://issues.apache.org/jira/browse/CASSANDRA-13257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949858#comment-15949858
]
Blake Eggleston commented on CASSANDRA-13257:
---------------------------------------------
bq. I think even so streaming preview covers both full and incremental repair
case, and other streaming usage.
No, I’m afraid it doesn’t. Part of the confusion here is that my linked patch
doesn’t include the fix included in CASSANDRA-13328, which fixes how sstables
are selected for streaming post #9143. Sorry about that. The other part is
that, post CASSANDRA-9143, incremental repair does an anti-compaction before
doing anything else, including validation or streaming. Rewriting a bunch of
sstables just so we can estimate the streaming that would happen if we ran the
repair for real is sort of a non-starter.
So, I still don’t see a way to avoid StreamSession having some notion of what
is being previewed. Since a preview won’t include a pending repair id,
previewing incremental repair streaming means StreamSession needs to know it
should only include unrepaired sstables, instead of all sstables as it would
with a full repair. After #13328, the isIncremental flag in
StreamSession is not doing anything, and I have a note to remove it before 4.0.
We could make the argument that we should leave it to support preview, but then
why not just have the preview enum, which has a much clearer purpose?
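To make the selection difference concrete, here is a rough sketch of how a
preview enum could drive sstable selection. It is not the code from the linked
branch: SSTableInfo, PreviewKind, and their methods are hypothetical names
invented for illustration, and the sketch assumes a repairedAt value of 0 marks
an sstable as unrepaired.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an sstable's repair metadata (not a Cassandra class).
final class SSTableInfo {
    final String name;
    final long repairedAt; // assumption: 0 means "not yet marked repaired"

    SSTableInfo(String name, long repairedAt) {
        this.name = name;
        this.repairedAt = repairedAt;
    }

    boolean isRepaired() {
        return repairedAt != 0L;
    }
}

// Hypothetical preview enum: tells the stream session what is being previewed.
enum PreviewKind {
    NONE,        // a real stream; selection is simplified to "everything" here
    FULL,        // preview of a full repair: all sstables are candidates
    INCREMENTAL; // preview of an incremental repair: unrepaired sstables only

    boolean isPreview() {
        return this != NONE;
    }

    // Only the incremental preview filters out sstables already marked repaired.
    boolean includes(SSTableInfo sstable) {
        return this != INCREMENTAL || !sstable.isRepaired();
    }

    List<SSTableInfo> select(List<SSTableInfo> candidates) {
        List<SSTableInfo> selected = new ArrayList<>();
        for (SSTableInfo sstable : candidates) {
            if (includes(sstable)) {
                selected.add(sstable);
            }
        }
        return selected;
    }
}

final class PreviewKindDemo {
    public static void main(String[] args) {
        List<SSTableInfo> candidates = Arrays.asList(
            new SSTableInfo("a", 0L),
            new SSTableInfo("b", 1490000000000L),
            new SSTableInfo("c", 0L));
        // Two of the three sstables are unrepaired, so this prints 2.
        System.out.println(PreviewKind.INCREMENTAL.select(candidates).size());
    }
}
```

The point of the sketch is that the selection rule hangs off a value that says
what is being previewed, rather than off a boolean whose meaning has to be
inferred at each call site.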
Also, while knowing that there was a merkle tree mismatch is technically enough
to validate whether repaired data is in sync across nodes, having information
about the related streaming we expect does have value which shouldn’t be
dismissed just because it’s a bit abstract. From the development side, it will
provide clues about the cause of the mismatch (i.e., a one-way transfer
indicates that one node failed to promote an sstable). From the operational
side, knowing how much data needs to be streamed to fix the out-of-sync data is
useful: it also indicates the severity of the problem, and the worst-case data
loss risk in the case of corruption. But, we can't do this without
StreamSession having some
notion of what's being previewed.
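As an illustration of the two signals described above, the following sketch
summarizes a list of planned transfers from a preview. PlannedTransfer and
PreviewSummary are hypothetical types made up for this example, not classes
from Cassandra or from the patch. The sketch only assumes that a preview can
report, per pair of nodes, how many bytes would be streamed in each direction.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical description of one transfer a preview expects (not a Cassandra class).
final class PlannedTransfer {
    final String from;
    final String to;
    final long bytes;

    PlannedTransfer(String from, String to, long bytes) {
        this.from = from;
        this.to = to;
        this.bytes = bytes;
    }
}

final class PreviewSummary {
    // Operational signal: how much data would need to be streamed in total.
    static long totalBytes(List<PlannedTransfer> plan) {
        long total = 0L;
        for (PlannedTransfer transfer : plan) {
            total += transfer.bytes;
        }
        return total;
    }

    // Development signal: true when data would flow between a and b in exactly
    // one direction, which hints that one node is missing data the other has.
    static boolean isOneWay(List<PlannedTransfer> plan, String a, String b) {
        boolean aToB = false;
        boolean bToA = false;
        for (PlannedTransfer transfer : plan) {
            if (transfer.bytes <= 0L) {
                continue; // empty transfers carry no signal
            }
            if (transfer.from.equals(a) && transfer.to.equals(b)) {
                aToB = true;
            }
            if (transfer.from.equals(b) && transfer.to.equals(a)) {
                bToA = true;
            }
        }
        return aToB != bToA;
    }

    public static void main(String[] args) {
        List<PlannedTransfer> plan = Arrays.asList(
            new PlannedTransfer("node1", "node2", 4096L),
            new PlannedTransfer("node1", "node2", 1024L));
        System.out.println(totalBytes(plan));
        System.out.println(isOneWay(plan, "node1", "node2"));
    }
}
```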
Rebased against trunk (and CASSANDRA-13325) here:
https://github.com/bdeggleston/cassandra/tree/13257-squashed-trunk
> Add repair streaming preview
> ----------------------------
>
> Key: CASSANDRA-13257
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13257
> Project: Cassandra
> Issue Type: New Feature
> Components: Streaming and Messaging
> Reporter: Blake Eggleston
> Assignee: Blake Eggleston
> Fix For: 4.0
>
>
> It would be useful to be able to estimate the amount of repair streaming that
> needs to be done, without actually doing any streaming. Our main motivation
> for having something like this is validating CASSANDRA-9143 in production,
> but I’d imagine it could also be a useful tool in troubleshooting.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)