[ https://issues.apache.org/jira/browse/CASSANDRA-10045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702915#comment-14702915 ]
Benedict commented on CASSANDRA-10045:
--------------------------------------
Patch available [here|https://github.com/belliottsmith/cassandra/tree/10045]
This patch does away with the concept of sparse/dense, and simply uses the
subset encoding of CASSANDRA-9894 to encode the row columns. Only present cells
are then serialized, always in "dense" form.
This does mean there are a few situations in which the change is not a net
win, but the result is improved clarity as well as better behaviour in the
majority of cases. Behaviour can be improved further with follow-up
improvements to the subset serialization.
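To illustrate the idea of encoding a row's columns as a subset of a known superset, here is a minimal sketch. It is not the code from the patch or from CASSANDRA-9894: the class name SubsetEncodingSketch, the method names, the use of plain strings as column names, and the single 64-bit bitmap are all assumptions made for illustration. It assumes the superset and the row's columns share the same sort order and that there are at most 64 columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; names and layout are not taken from Cassandra.
public class SubsetEncodingSketch {

    // Encode which columns of `superset` are present in a row as a bitmap:
    // bit i is set when superset.get(i) appears in `present`.
    // Assumes both lists are in the same order and superset has <= 64 entries.
    static long encodeSubset(List<String> superset, List<String> present) {
        if (superset.size() > 64)
            throw new IllegalArgumentException("sketch supports at most 64 columns");
        long bitmap = 0L;
        int j = 0;
        for (int i = 0; i < superset.size() && j < present.size(); i++) {
            if (superset.get(i).equals(present.get(j))) {
                bitmap |= 1L << i;
                j++;
            }
        }
        if (j != present.size())
            throw new IllegalArgumentException("present columns are not an ordered subset");
        return bitmap;
    }

    // Recover the row's columns from the bitmap and the shared superset.
    static List<String> decodeSubset(List<String> superset, long bitmap) {
        List<String> present = new ArrayList<>();
        for (int i = 0; i < superset.size(); i++)
            if ((bitmap & (1L << i)) != 0)
                present.add(superset.get(i));
        return present;
    }

    public static void main(String[] args) {
        List<String> superset = Arrays.asList("a", "b", "c", "d");
        List<String> row = Arrays.asList("b", "d");
        long bitmap = encodeSubset(superset, row);
        System.out.println(Long.toBinaryString(bitmap));     // prints 1010
        System.out.println(decodeSubset(superset, bitmap));  // prints [b, d]
    }
}
```

With the column set recorded per row like this, a writer would only need to emit the cells whose bits are set, one after another, which is the "only present cells, always dense" layout described above.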
> Sparse/Dense decision should be made per-row, not per-file
> ----------------------------------------------------------
>
> Key: CASSANDRA-10045
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10045
> Project: Cassandra
> Issue Type: Sub-task
> Components: Core
> Reporter: Benedict
> Assignee: Benedict
> Priority: Minor
> Fix For: 3.0 beta 2
>
>
> Marking this as beta 1 in the hope I have time to rustle it up and get it
> reviewed beforehand. If I do not, I will let it slide, but our behaviour
> right now is not brilliant for workloads with a variance in density, and it
> should not be challenging to make a more targeted decision.
> We can also make use of CASSANDRA-9894 to make column encoding more efficient
> in many, even dense, cases.