[ https://issues.apache.org/jira/browse/CASSANDRA-4883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495757#comment-13495757 ]

Jonathan Ellis commented on CASSANDRA-4883:
-------------------------------------------

Any reason to not use ImmutableSet in DataTracker?

+1 otherwise.
                
> Optimize mostRecentTomstone vs maxTimestamp check in 
> CollationController.collectAllData
> ---------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4883
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4883
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>             Fix For: 1.2.0 rc1
>
>         Attachments: 4883.txt
>
>
> CollationController.collectAllData eliminates an sstable if we've already read 
> a row tombstone more recent than its maxTimestamp. However, this is done in 2 
> passes and is not as efficient as it could be. 
> More precisely, say we have 10 sstables s0, ... s9, where s0 is the most 
> recent and s9 the least recent (and their maxTimestamps reflect that), and s0 
> has a row tombstone that is more recent than all of the s1-s9 maxTimestamps. 
> Now in collectAllData(), we first iterate over the sstables in a "random" 
> order (because DataTracker keeps sstables in a more or less random order), 
> meaning that we may iterate in the order s9, s8, ... s0. In that case, we 
> will end up reading the row header from every sstable (hitting disk each 
> time). Then, and only then, the 2nd pass of collectAllData will eliminate 
> s1 to s9.
> However, if we were to iterate over the sstables in maxTimestamp order (as 
> we do in collectTimeOrdered), we would need only one pass and, more 
> importantly, we would minimize the number of row headers we read to perform 
> that sstable elimination. In my example, we would only ever read the row 
> tombstone from s0 and eliminate all the other sstables directly, based 
> simply on their maxTimestamp.
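
The single-pass idea in the description above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in 4883.txt: the SSTable class, its fields, and the collect() method are hypothetical stand-ins, and a "row header read" is simulated by recording the sstable's name.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class TimestampOrderedCollation {
    /** Hypothetical stand-in for an sstable; holds only what the sketch needs. */
    static final class SSTable {
        final String name;
        final long maxTimestamp;          // newest timestamp of any data in this sstable
        final long rowTombstoneTimestamp; // Long.MIN_VALUE when the row has no tombstone here

        SSTable(String name, long maxTimestamp, long rowTombstoneTimestamp) {
            this.name = name;
            this.maxTimestamp = maxTimestamp;
            this.rowTombstoneTimestamp = rowTombstoneTimestamp;
        }
    }

    /**
     * Visits sstables from highest to lowest maxTimestamp and returns the names
     * of those whose row header had to be read (the simulated disk hits).
     */
    static List<String> collect(List<SSTable> sstables) {
        // Sort a copy so the caller's (arbitrarily ordered) list is left untouched.
        List<SSTable> sorted = new ArrayList<>(sstables);
        sorted.sort(Comparator.comparingLong((SSTable s) -> s.maxTimestamp).reversed());

        long mostRecentRowTombstone = Long.MIN_VALUE;
        List<String> headersRead = new ArrayList<>();
        for (SSTable sstable : sorted) {
            // Everything in this sstable is older than a tombstone already seen.
            // Because of the sort order, the same holds for all remaining sstables.
            if (sstable.maxTimestamp < mostRecentRowTombstone) {
                break;
            }
            headersRead.add(sstable.name); // simulated row header read
            mostRecentRowTombstone =
                Math.max(mostRecentRowTombstone, sstable.rowTombstoneTimestamp);
        }
        return headersRead;
    }
}
```

With the ten sstables from the example handed over in the order s9, s8, ... s0, only the header of s0 is read in this sketch; s1 to s9 are skipped purely on their maxTimestamp.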

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
