[
https://issues.apache.org/jira/browse/CASSANDRA-10045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14709344#comment-14709344
]
Sylvain Lebresne commented on CASSANDRA-10045:
----------------------------------------------
I like the idea. A few remarks on the patch:
* In {{Columns}}, since we introduce a {{size()}} (which I'm good with), I'd
prefer removing {{columnCount()}}.
* {{EncodingStats.avgColumnSetPerRow}} isn't used anymore (if I'm not mistaken)
so we should remove it.
* In {{Row}}, I don't like having {{columns()}} return a {{Columns}} but
{{actualColumns()}} return a {{Collection<ColumnDefinition>}} (the latter
method is in need of a javadoc also). It's an important API and API consistency
is important so I'd prefer one of (with a minor preference for the 2nd one for
the sake of keeping the {{Row}} interface smaller):
## make {{actualColumns}} return a {{Columns}} and pay the allocation.
## don't add that method and just call {{Collections2.transform(row,
ColumnData::column)}} at the call sites.
* In {{BufferCell.Serializer}}, can we "rebase" the "MASK" (from 0x01). The
patch is breaking serialization anyway.
* In {{UnfilteredSerializer}} class javadoc, the bullet about complex cells
starts by "while each ...", where the "while" should be removed (bad
copy-paste). More importantly, the description is not up-to-date: we now encode
the number of cells up-front and there is no {{<emptyCell>}} at the end.
* Any reason for changing the name of the row inserted in {{ScrubTest}}?
Doesn't seem related to this patch in any way.
* Nit: in {{UnfilteredSerializer.readComplexColumn/skipComplexColumn}}, a
{{for}} loop would be more idiomatic (of the code base if not of Java).
> Sparse/Dense decision should be made per-row, not per-file
> ----------------------------------------------------------
>
> Key: CASSANDRA-10045
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10045
> Project: Cassandra
> Issue Type: Sub-task
> Components: Core
> Reporter: Benedict
> Assignee: Benedict
> Priority: Minor
> Fix For: 3.0 beta 2
>
>
> Marking this as beta 1 in the hope I have time to rustle it up and get it
> reviewed beforehand. If I do not, I will let it slide, but our behaviour
> right now is not brilliant for workloads with a variance in density, and it
> should not be challenging to make a more targeted decision.
> We can also make use of CASSANDRA-9894 to make column encoding more efficient
> in many, even dense, cases.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)