[ 
https://issues.apache.org/jira/browse/CASSANDRA-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15311908#comment-15311908
 ] 

Marcus Eriksson commented on CASSANDRA-11886:
---------------------------------------------

Pushed a new commit to the branches above which builds an IntervalTree over the 
CANONICAL_SSTABLES - this might not be as efficient as using the 
OverlapIterator, but it should be good enough and is a simpler solution than 
trying to use the OverlapIterator (imo).

We should note that the ranges we iterate over are the ranges we are about to 
stream, not all local ranges (though I have to say I'm not sure whether that 
makes things better or worse in real life).
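
To illustrate the idea, here is a simplified sketch and not the code in the 
commit: it selects the canonical sstables that overlap any range about to be 
streamed. The class and method names are hypothetical, tokens are plain longs, 
both bounds are treated as inclusive, and a linear scan stands in for the 
IntervalTree lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CanonicalOverlapSketch {
    // Hypothetical sstable summary: a name plus inclusive [first, last] token bounds.
    static final class SSTable {
        final String name;
        final long first;
        final long last;

        SSTable(String name, long first, long last) {
            this.name = name;
            this.first = first;
            this.last = last;
        }
    }

    // Hypothetical token range about to be streamed (inclusive on both ends for simplicity).
    static final class Range {
        final long left;
        final long right;

        Range(long left, long right) {
            this.left = left;
            this.right = right;
        }
    }

    // Linear-scan stand-in for an interval tree query: returns the names of the
    // canonical sstables whose token bounds overlap at least one of the ranges.
    public static List<String> sstablesToStream(List<SSTable> canonical, List<Range> ranges) {
        List<String> result = new ArrayList<>();
        for (SSTable s : canonical) {
            for (Range r : ranges) {
                if (s.first <= r.right && r.left <= s.last) {
                    result.add(s.name);
                    break; // one overlapping range is enough to select the sstable
                }
            }
        }
        return result;
    }

    // Small demo: only "a" and "b" overlap the single range [5, 25].
    public static List<String> demo() {
        List<SSTable> canonical = Arrays.asList(
                new SSTable("a", 0, 10),
                new SSTable("b", 20, 30),
                new SSTable("c", 40, 50));
        List<Range> ranges = Arrays.asList(new Range(5, 25));
        return sstablesToStream(canonical, ranges);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a, b]
    }
}
```

Because only the canonical set is consulted, early opened replacements never 
enter the lookup in this sketch.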

> Streaming will miss sections for early opened sstables during compaction
> ------------------------------------------------------------------------
>
>                 Key: CASSANDRA-11886
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11886
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Stefan Podkowinski
>            Assignee: Marcus Eriksson
>            Priority: Critical
>              Labels: correctness, repair, streaming
>         Attachments: 9700-test-2_1.patch
>
>
> Once validation compaction has finished, all mismatching sstable sections 
> for a token range will be used for streaming, as returned by 
> {{StreamSession.getSSTableSectionsForRanges}}. Currently 2.1 will try to 
> restrict the sstable candidates by checking if they can be found in 
> {{CANONICAL_SSTABLES}} and will ignore them otherwise. At the same time, the 
> {{IntervalTree}} in the {{DataTracker}} will be built based on replaced 
> non-canonical sstables as well. In the case of early opened sstables this 
> becomes a problem, as the tree will be updated with {{OpenReason.EARLY}} 
> replacements that cannot be found in canonical. Whenever 
> {{getSSTableSectionsForRanges}} gets an early instance from the view, it 
> will fail to retrieve the corresponding canonical version from the map, as 
> the different generation will cause a hashcode mismatch. Please find a test 
> attached.
> As a consequence not all sections for a range are streamed. In our case this 
> has caused deleted data to reappear, as sections holding tombstones were left 
> out due to this behavior.
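
The lookup failure described above can be sketched as follows. This is a 
minimal illustration, assuming a hypothetical {{Descriptor}} class whose 
{{equals}} and {{hashCode}} include the generation; it is not Cassandra's 
actual class, and the map stands in for the canonical lookup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class CanonicalLookupSketch {
    // Hypothetical stand-in for an sstable identity; not Cassandra's real class.
    static final class Descriptor {
        final String table;
        final int generation;

        Descriptor(String table, int generation) {
            this.table = table;
            this.generation = generation;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Descriptor)) return false;
            Descriptor other = (Descriptor) o;
            return table.equals(other.table) && generation == other.generation;
        }

        @Override
        public int hashCode() {
            // The generation is part of the hash, so a different generation
            // maps to a different key.
            return Objects.hash(table, generation);
        }
    }

    // Puts one canonical entry in a map, then looks it up with the key the view holds.
    public static boolean canonicalLookupSucceeds(int canonicalGeneration, int viewGeneration) {
        Map<Descriptor, String> canonical = new HashMap<>();
        canonical.put(new Descriptor("t", canonicalGeneration), "canonical");
        return canonical.containsKey(new Descriptor("t", viewGeneration));
    }

    public static void main(String[] args) {
        System.out.println(canonicalLookupSucceeds(5, 5)); // prints true
        System.out.println(canonicalLookupSucceeds(5, 6)); // prints false
    }
}
```

In the second call the key held by the view differs from the canonical key only 
in its generation, so the lookup misses and the sstable would be ignored.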



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
