[
https://issues.apache.org/jira/browse/CASSANDRA-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14628176#comment-14628176
]
Sylvain Lebresne commented on CASSANDRA-9705:
---------------------------------------------
I've pushed a commit to the branch with some of those first remarks fixed. I
discuss the rest below.
bq. {{ColumnData}} and {{ComplexColumnData}} are a bit confusingly named, to my
mind, at least, because it is {{(Column x Row) Data}}.
The idea is that {{ColumnData}} is the data for a single column within a row.
For a simple column, that data is just a single cell, so making {{Cell}}
implement {{ColumnData}} directly felt simpler (we could, I suppose, have a
{{SimpleColumnData}} that {{Cell}} would extend, but I don't see what that
would buy us). And {{ComplexColumnData}} is the implementation for complex
columns. I understand we don't all find the same things intuitive/confusing,
but I have to admit that as far as naming goes, it's one I'm reasonably happy
with.
bq. Could we not just have ComplexCell (made up of Cell), both implementing
AbstractCell; or a SimpleCell both implementing Cell?
Not totally sure to understand the suggestion. But what is currently called
{{ComplexColumnData}} cannot reasonably implement {{Cell}}, not with the
current definition/API of {{Cell}} at least.
I'll also add that the current cell concept is pretty close to our existing
concept, which is strongly suspect is a good thing, so I'm reluctant to change
what it means too much (in case you were suggesting along those lines).
bq. Is it really better to have Row implement {{Iterable<ColumnData>}} than
{{Iterable<Cell>}}?
I do strongly believe it is. Mainly because when you iterate over the cells,
you're missing the complex column deletion informations, so iterating on cells
should only be done when you know that it's ok to do so. Also, quite a few of
the places where we iterate on cells is for 2ndary index, and that's because
the current API is fundamentally cell based. But that's actually inefficient:
typically, if you're not indexing a collection column, there is no point in
checking all the individual cells for that column, but that's what we do. That
should be fixed by the 2ndary API refactor though, so I expect the direct cell
iterations to diminish, and iteration on {{ColumnData}} to be by far
predominant. As it should be.
That said, totally agreed about the {{Iterable}} point and I've made that
change in the new commit.
bq. I like the migration to Builder, but we still have a lot of methods
labelled {{writeX}}
I've fixed a few ({{RangeTombstoneMarker.Builder}}, which had a few, was
actually unused), but it's hard to make sure I found all occurrences ("write"
is a pretty common symbol prefix in the code base), so if you find some
remaining, I'll be happy to fix them too.
Regarding {{SSTable\[Reversed\]Iterator}} readers, I agree it's not perfect in
term of code duplication. I'll have a fresh look at trying to clean it up but
it will take a tad more time.
> Simplify some of 8099's concrete implementations
> ------------------------------------------------
>
> Key: CASSANDRA-9705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9705
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Sylvain Lebresne
> Assignee: Sylvain Lebresne
> Fix For: 3.0 beta 1
>
>
> As mentioned in the ticket comments, some of the concrete implementations
> (for Cell, Row, Clustering, PartitionUpdate, ...) of the initial patch for
> CASSANDRA-8099 are more complex than they should be (the use of flyweight is
> typically probably ill-fitted), which probably has performance consequence.
> This ticket is to track the refactoring/simplifying those implementation
> (mainly by removing the use of flyweights and simplifying accordingly).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)