[ 
https://issues.apache.org/jira/browse/CASSANDRA-20092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17902419#comment-17902419
 ] 

Jordan West commented on CASSANDRA-20092:
-----------------------------------------

Ironically, I had written a similar patch when exploring options for 
CASSANDRA-15452. Big fan of simplifying the scanner for compaction. We 
switched focus to the reads, since the win there was bigger when we 
benchmarked, but I am all for this. +1 to the concept (I haven't taken a deep 
look at the code, since it looks like other committers have done so already).

> SSTableScanner can be vastly simplified for compaction
> ------------------------------------------------------
>
>                 Key: CASSANDRA-20092
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20092
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Local/Compaction
>            Reporter: Branimir Lambov
>            Assignee: Branimir Lambov
>            Priority: Normal
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> One of the main bottlenecks for compaction performance is its use of the 
> {{SSTableScanner}} class, whose main purpose is to implement partition range 
> queries and as such supports filtering by row and column that is not helpful 
> to compaction. To implement the latter it must rely on the sstable's index, 
> adding a lot of complexity and inefficiency.
> Implementing a simpler version of a scanner that reads off the data file 
> directly for given spans of offsets would speed up compaction significantly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
