[
https://issues.apache.org/jira/browse/CASSANDRA-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15310435#comment-15310435
]
Benedict edited comment on CASSANDRA-11886 at 6/1/16 2:59 PM:
--------------------------------------------------------------
If running that test, don't forget to crank up the cluster size and use vnodes,
as that will have a large impact. A cluster with 100 nodes, old skool vnodes
and 10k sstables would have 256M iterations (assuming we don't prune ones that
cannot overlap with us, which I hope we do, but wouldn't assume).
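
For reference, the 256M figure can be reproduced with a rough count. This is only a sketch: it assumes 256 tokens per node (a value not stated above, chosen because it matches the total) and one iteration per (range, sstable) pair with no pruning.

```python
# Back-of-the-envelope count; the per-node token count is an assumption.
nodes = 100              # cluster size from the comment above
tokens_per_node = 256    # assumed number of vnode tokens per node
sstables = 10_000        # sstable count from the comment above

ranges = nodes * tokens_per_node   # token ranges in the ring
iterations = ranges * sstables     # one check per (range, sstable) pair, no pruning

print(f"{ranges:,} ranges x {sstables:,} sstables = {iterations:,} iterations")
```

Any pruning of sstables that cannot overlap a range would lower this count; the sketch deliberately models the unpruned worst case.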
was (Author: benedict):
If running that test, don't forget to crank up the cluster size and use vnodes,
as that will have a large impact. A cluster with 100 nodes, old skool vnodes
and 10k sstables would have 256M iterations.
> Streaming will miss sections for early opened sstables during compaction
> ------------------------------------------------------------------------
>
> Key: CASSANDRA-11886
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11886
> Project: Cassandra
> Issue Type: Bug
> Reporter: Stefan Podkowinski
> Assignee: Marcus Eriksson
> Priority: Critical
> Labels: correctness, repair, streaming
> Attachments: 9700-test-2_1.patch
>
>
> Once validation compaction has finished, all mismatching sstable
> sections for a token range will be used for streaming, as returned by
> {{StreamSession.getSSTableSectionsForRanges}}. Currently 2.1 will try to
> restrict the sstable candidates by checking if they can be found in
> {{CANONICAL_SSTABLES}} and will ignore them otherwise. At the same time
> {{IntervalTree}} in the {{DataTracker}} will be built based on replaced
> non-canonical sstables as well. In case of early opened sstables this becomes
> a problem, as the tree will be updated with {{OpenReason.EARLY}} replacements
> that cannot be found in canonical. Whenever
> {{getSSTableSectionsForRanges}} gets an early instance from the view, it
> will fail to retrieve the corresponding canonical version from the map, as
> the different generation will cause a hashcode mismatch. Please find a test
> attached.
> As a consequence not all sections for a range are streamed. In our case this
> has caused deleted data to reappear, as sections holding tombstones were left
> out due to this behavior.
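
The lookup failure described above can be illustrated with a simplified model. This is not Cassandra's code: it is a Python sketch in which the `SSTable` class, the `candidates` helper, and the table name `ks.cf` are all hypothetical, and the only property carried over from the description is that the generation is part of the hash/equality identity, so an early-opened instance with a different generation does not resolve in the canonical map.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an sstable reader; identity includes the generation."""
    table: str
    generation: int


# Canonical sstables keyed by themselves (stand-in for the canonical lookup).
canonical = {s: s for s in [SSTable("ks.cf", 1), SSTable("ks.cf", 2)]}

# The view also holds an early-opened instance with a different generation.
view = [SSTable("ks.cf", 1), SSTable("ks.cf", 3)]


def candidates(view, canonical):
    """Keep only view entries that resolve to a canonical sstable; drop the rest."""
    return [canonical[s] for s in view if s in canonical]


streamed = candidates(view, canonical)
skipped = [s for s in view if s not in canonical]

print("streamed:", streamed)
print("skipped:", skipped)
```

In this model the generation-3 instance is silently dropped, so any sections it covers would never be considered for streaming, which mirrors the missing tombstone sections reported above.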