[
https://issues.apache.org/jira/browse/CASSANDRA-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709945#comment-13709945
]
Tyler Hobbs commented on CASSANDRA-5722:
----------------------------------------
Quick question about your commit:
https://github.com/jbellis/cassandra/commit/b9194842fc9eacc4fd55813cd200fa55f5713072
In getPosition() (which I lifted parts of), is decorating index keys so
expensive that it outweighs the savings from exiting the loop as soon as a
greater key is found? In other words, if we always decorated the index key and
did a normal compareTo() (even in the EQ case), returning or breaking once the
index key was higher, couldn't that give better performance (skipping ~64
checks on average)?
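To make the trade-off concrete, here is a toy Java sketch. It is not the actual getPosition() code. It assumes the index entries within one interval are sorted in decorated-key order, uses a hypothetical token() hash as a stand-in for the partitioner's decoration step, and assumes an interval of 128 entries. Variant A compares raw keys for equality only, so it has to scan the whole interval on a miss. Variant B pays to decorate every index key but can stop at the first key that sorts at or after the target.

```java
import java.util.ArrayList;
import java.util.List;

class IndexScanSketch {
    // Hypothetical stand-in for decoration: deriving a token from the raw key.
    // In a real partitioner this hash is the per-key cost being questioned.
    static long token(String key) {
        return key.hashCode();
    }

    // Orders keys the way decorated keys would be ordered: token first, raw key as tie-break.
    static int compareDecorated(String a, String b) {
        int c = Long.compare(token(a), token(b));
        return c != 0 ? c : a.compareTo(b);
    }

    // Variant A: raw equality only. No decoration, but no early exit on a miss.
    static int checksRawEquals(List<String> interval, String target) {
        int checks = 0;
        for (String indexKey : interval) {
            checks++;
            if (indexKey.equals(target)) break;
        }
        return checks;
    }

    // Variant B: decorate and compare every index key, and stop as soon as
    // an index key sorts at or after the target.
    static int checksDecorated(List<String> interval, String target) {
        int checks = 0;
        for (String indexKey : interval) {
            checks++;
            if (compareDecorated(indexKey, target) >= 0) break;
        }
        return checks;
    }

    public static void main(String[] args) {
        List<String> interval = new ArrayList<>();
        for (int i = 0; i < 128; i++) interval.add("key" + i); // assumed interval size
        interval.sort(IndexScanSketch::compareDecorated);

        String missing = "absent";
        System.out.println("raw equality checks:  " + checksRawEquals(interval, missing));
        System.out.println("decorated checks:     " + checksDecorated(interval, missing));
    }
}
```

For a key that is absent, variant A always examines the full interval, while variant B stops wherever the target would have sorted. Whether that saving beats the per-key decoration cost is exactly the open question here, and it would need a benchmark against the real partitioner to answer.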
Should I go ahead and roll your changes into a new patch?
I'm totally fine with doing CASSANDRA-2524. Shouldn't be too hard.
> Cleanup should skip sstables that don't contain data outside a nodes ranges
> ---------------------------------------------------------------------------
>
> Key: CASSANDRA-5722
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5722
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Nick Bailey
> Assignee: Tyler Hobbs
> Fix For: 2.0.1
>
> Attachments: 0001-Skip-cleanup-when-unneeded.patch
>
>
> Right now cleanup is optimized to simply delete sstables that *only* contain
> data that doesn't belong on the node. For all other sstables, though, it will
> read them, check each row, and write out new sstables.
> Cleanup could be optimized to look at an sstable and determine that all data
> within the sstable does belong on a node, and therefore skip re-writing that
> sstable. This would make cleanup essentially a no-op in the case where all
> data on a node belongs on that node.
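
The check described above could look roughly like the following Java sketch. This is an illustration under stated assumptions, not the attached patch: the Range class and needsCleanup() are hypothetical names, tokens are plain longs, and ranges are (left, right] without wraparound. It is also conservative: an sstable that spans two adjacent owned ranges is still reported as needing cleanup.

```java
import java.util.List;

class CleanupSkipSketch {
    // Hypothetical token range (left, right]; wraparound is ignored for simplicity.
    static final class Range {
        final long left;
        final long right;

        Range(long left, long right) {
            this.left = left;
            this.right = right;
        }

        boolean contains(long token) {
            return token > left && token <= right;
        }
    }

    // Returns false when the sstable's first and last tokens both fall inside a
    // single owned range. Since the range is contiguous, every token in between
    // is owned too, so the sstable can be skipped rather than rewritten.
    static boolean needsCleanup(long firstToken, long lastToken, List<Range> ownedRanges) {
        for (Range range : ownedRanges) {
            if (range.contains(firstToken) && range.contains(lastToken)) {
                return false;
            }
        }
        return true;
    }
}
```

With owned ranges (0, 100] and (200, 300], an sstable covering tokens 10 through 90 would be skipped, while one covering 50 through 250 would still be rewritten.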