[jira] [Commented] (CASSANDRA-8918) Optimise compaction performance for unique partition keys

Sylvain Lebresne (JIRA) Thu, 05 Mar 2015 08:39:16 -0800

    [ https://issues.apache.org/jira/browse/CASSANDRA-8918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349044#comment-14349044 ]


Sylvain Lebresne commented on CASSANDRA-8918:
---------------------------------------------

What does "large" mean? Unless we're talking really, really large (which would
mean it's almost never used in practice), you can still have tons of "large"
(time-series-like) partitions in an sstable, which could result in a really big
overhead per node (pretty much all sstable stats would have to be
"materialized" that way). Though I suppose if we only use those materialized
stats for compaction, we at least wouldn't have to keep them in memory.

Anyway, just saying that a priori it sounds to me like more complexity than it
first appears (including having to deal with that new "how large is large"
setting). And it might not kick in all that often. Both of which should be
taken into account in our cost/benefit analysis. Personally, I'd rather first
focus on the other optimizations we have open tickets for, and then do a quick
POC to evaluate how much doing this actually gives us on top of those.


> Optimise compaction performance for unique partition keys
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-8918
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8918
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>             Fix For: 3.0
>
>
> Related to the raft of improvements we're looking at for improving the CPU 
> burden of merge, if we can demonstrate that an entire partition key is unique 
> to a given file (which is quite easily done) we can avoid materialising any 
> of the row, and simply copy the data wholesale, with potentially some small 
> modifications to the index file data if it has clustering column index 
> entries, and special treatment of tombstones (the simplest would be to
> check whether there are any tombstones to purge, and abort this approach if so).
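
As a rough illustration of the idea in the description above, here is a minimal sketch in Python. It is not Cassandra code: the Partition class, the dict-based sstable model, the has_purgeable_tombstones flag and the merge callback are all hypothetical names invented for this example, and keys are processed in plain sorted order rather than token order. It only shows the control flow: a partition whose key appears in exactly one input file, and which has no tombstones to purge, is passed through untouched, while everything else takes the normal merge path.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Partition:
    """Hypothetical stand-in for one serialized partition in a file."""
    data: bytes
    has_purgeable_tombstones: bool = False


# Hypothetical model of an sstable: partition key -> partition.
SSTable = Dict[str, Partition]


def compact(
    sstables: List[SSTable],
    merge: Callable[[List[Partition]], Partition],
) -> Tuple[SSTable, List[str]]:
    """Return the compacted output and the keys that were copied wholesale."""
    # Count in how many input files each partition key appears.
    counts = Counter(key for table in sstables for key in table)

    output: SSTable = {}
    copied: List[str] = []
    for key in sorted(counts):
        versions = [table[key] for table in sstables if key in table]
        only = versions[0]
        if counts[key] == 1 and not only.has_purgeable_tombstones:
            # Key is unique to one file and nothing needs purging:
            # reuse the stored partition as-is, without materialising rows.
            output[key] = only
            copied.append(key)
        else:
            # Shared key, or tombstones to purge: abort the fast path
            # and fall back to the regular merge.
            output[key] = merge(versions)
    return output, copied
```

Whether the fast path pays off depends on how often keys really are unique to one file, which is the cost/benefit question raised in the comment above.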



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
