[jira] [Commented] (CASSANDRA-10045) Sparse/Dense decision should be made per-row, not per-file

Benedict (JIRA) Wed, 19 Aug 2015 05:09:07 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-10045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702915#comment-14702915
 ]


Benedict commented on CASSANDRA-10045:
--------------------------------------

Patch available [here|https://github.com/belliottsmith/cassandra/tree/10045]

This patch does away with the concept of sparse/dense, and simply uses the 
subset encoding of CASSANDRA-9894 to encode the row columns. Only present cells 
are then serialized, always in "dense" form. 
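
As a rough illustration of the idea (not the code in the patch): a minimal Java sketch, assuming the file header holds the full, ordered column list (the "superset") and that each row records which of those columns it contains as a bitmap. The class and method names are hypothetical, and the real CASSANDRA-9894 encoding may differ, for example in how it handles more than 64 columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: not Cassandra's actual classes or wire format.
final class SubsetEncodingSketch {

    /**
     * Encodes the columns present in a row as a bitmap over the superset:
     * bit i is set when superset.get(i) is present in the row.
     * Assumes at most 64 superset columns and no duplicates in 'present'.
     */
    public static long encodeSubset(List<String> superset, List<String> present) {
        if (superset.size() > 64)
            throw new IllegalArgumentException("sketch handles at most 64 columns");
        long bitmap = 0L;
        for (int i = 0; i < superset.size(); i++)
            if (present.contains(superset.get(i)))
                bitmap |= 1L << i;
        // Every present column must have matched a superset column.
        if (Long.bitCount(bitmap) != present.size())
            throw new IllegalArgumentException("row has a column outside the superset");
        return bitmap;
    }

    /** Recovers the row's columns, in superset order, from the bitmap. */
    public static List<String> decodeSubset(List<String> superset, long bitmap) {
        List<String> present = new ArrayList<>();
        for (int i = 0; i < superset.size(); i++)
            if ((bitmap & (1L << i)) != 0)
                present.add(superset.get(i));
        return present;
    }

    public static void main(String[] args) {
        List<String> superset = Arrays.asList("a", "b", "c", "d");
        long bitmap = encodeSubset(superset, Arrays.asList("a", "c"));
        System.out.println(Long.toBinaryString(bitmap));
        System.out.println(decodeSubset(superset, bitmap));
    }
}
```

With the subset written ahead of the row, the cells that follow can be serialized back to back with no per-cell column identifier, which is what "dense" form means here; a reader walks the decoded column list in order to know which column each cell belongs to.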

This does mean there are some (very few) situations in which we don't come 
out net positive from the change, but the result is improved clarity as well as 
better behaviour in the majority of cases. Behaviour can be improved further 
with follow-up improvements to the subset serialization. 


> Sparse/Dense decision should be made per-row, not per-file
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-10045
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10045
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Minor
>             Fix For: 3.0 beta 2
>
>
> Marking this as beta 1 in the hope I have time to rustle it up and get it 
> reviewed beforehand. If I do not, I will let it slide, but our behaviour 
> right now is not brilliant for workloads with a variance in density, and it 
> should not be challenging to make a more targeted decision.
> We can also make use of CASSANDRA-9894 to make column encoding more efficient 
> in many, even dense, cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
