[
https://issues.apache.org/jira/browse/CASSANDRA-11877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15310446#comment-15310446
]
Paulo Motta commented on CASSANDRA-11877:
-----------------------------------------
Thanks for the feedback. I definitely agree it doesn't make sense to make the 2
paradigms interoperable and it's better to keep legacy code isolated since it
will probably be removed in the next major release.
Since there are quite a few special cases to consider (range tombstones, index
sampling), let's focus on the simple case first (simple cells, no index
sampling). That way we can leverage existing code, show visible progress, and
create a basic test structure to build on when dealing with the more complex
cases (range tombstones, collections, index sampling, large partitions, etc.),
leaving the necessary improvements/optimizations for later. I will update the
ticket description to reflect that.
I think we can start by:
* Adding support for simple {{RowIndexEntry}} serialization (position only) in
{{LegacyShallowIndexedEntry.serialize}}
* Creating {{LegacyLayout.LegacyBigTableWriter}}, which basically copies the 2.2
BigTableWriter while working with the new {{SSTableWriter}} interface
({{append(UnfilteredRowIterator iterator)}}):
** Port the other necessary classes: {{LegacyLayout.LegacyColumnIndex}},
{{LegacyLayout.MetadataCollector}}, trying to use legacy classes from
{{LegacyLayout}} where applicable ({{LegacyAtom, LegacyDeletionInfo,
LegacyUnfilteredPartition}}), or classes that haven't changed between the two
versions ({{EstimatedHistogram, DeletionTime}} for example).
** Since we're not dealing with range tombstones in this initial version, we can
create an empty stub for {{LegacyRangeTombstoneTracker}} and port that later
when dealing with range tombstones.
** Similarly, since we're not dealing with complex index columns, we can
comment out parts constructing {{IndexInfo}} and always return
{{ColumnIndex.EMPTY}} on {{ColumnIndex.Builder.build()}}
* After the bulk structure of {{LegacyBigTableWriter}} is ported, we can
probably reuse {{LegacyLayout.fromUnfilteredRowIterator}} to convert from
{{UnfilteredRowIterator}} to {{LegacyUnfilteredPartition}} and work from there
on {{LegacyBigTableWriter}}
** At this initial stage, since we're not dealing with range tombstones, we can
probably extract the cell serialization code of
{{LegacyLayout.serializeAsLegacyPartition}} to perform disk cell serialization
on {{LegacyColumnIndex}}
* After we have an initial draft of {{LegacyBigTableWriter}} ready, we can
probably instantiate that when {{!version.storeRows}} on
{{BigFormat.WriterFactory}}
* Adding a few simple tests to guide the development would probably be handy;
maybe we can start by making {{SimpleQuery.testTableWithoutClustering}} work
with a converted sstable.
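To make the intended shape a bit more concrete, here is a rough, self-contained sketch of the last steps: picking the legacy writer when {{!version.storeRows}} and stubbing out range tombstone tracking. Everything in it is a simplified stand-in and not the real Cassandra code: {{Version}}, {{Writer}}, {{RowWriter}}, {{LegacyCellWriter}} and {{NoOpRangeTombstoneTracker}} are hypothetical names, cells are plain strings, and the real {{append}} would take an {{UnfilteredRowIterator}}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class WriterSelectionSketch {

    /** Stand-in for the sstable format version; only the storeRows flag matters here. */
    static final class Version {
        final boolean storeRows;
        Version(boolean storeRows) { this.storeRows = storeRows; }
    }

    /** Stand-in for the writer interface (assumption: cells modelled as strings). */
    interface Writer {
        void append(String partitionKey, List<String> simpleCells);
    }

    /** Placeholder for the current row-based writer: one entry per partition. */
    static final class RowWriter implements Writer {
        final List<String> rows = new ArrayList<>();
        @Override
        public void append(String partitionKey, List<String> simpleCells) {
            rows.add(partitionKey + ":" + String.join(",", simpleCells));
        }
    }

    /** Empty stub: range tombstones are out of scope for the first iteration. */
    static final class NoOpRangeTombstoneTracker {
        void update(String cell) {
            // Intentionally empty until range tombstone support is ported.
        }
    }

    /** Placeholder for the legacy writer: one entry per simple cell, no index sampling. */
    static final class LegacyCellWriter implements Writer {
        final List<String> cells = new ArrayList<>();
        private final NoOpRangeTombstoneTracker tracker = new NoOpRangeTombstoneTracker();
        @Override
        public void append(String partitionKey, List<String> simpleCells) {
            for (String cell : simpleCells) {
                tracker.update(cell);
                cells.add(partitionKey + ":" + cell);
            }
        }
    }

    /** Mirrors the proposed check: use the legacy writer when !version.storeRows. */
    static Writer open(Version version) {
        return version.storeRows ? new RowWriter() : new LegacyCellWriter();
    }

    public static void main(String[] args) {
        Writer writer = open(new Version(false));
        writer.append("k1", Arrays.asList("a=1", "b=2"));
        System.out.println(writer.getClass().getSimpleName());
    }
}
```

The point of the sketch is only that the dispatch can live in one place (the factory) and that the legacy path can start with a no-op tracker, so the simple-cell case can be tested before the harder pieces are ported.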
[~thobbs] Does this sound better to start with and like it's going to work
(even if maybe not efficiently)? Any other particular caveat we are missing or
should be aware of? Thanks in advance for the help!
> Add support to legacy row serialization on BigTableWriter
> ---------------------------------------------------------
>
> Key: CASSANDRA-11877
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11877
> Project: Cassandra
> Issue Type: Sub-task
> Components: Tools
> Reporter: Paulo Motta
> Assignee: Kaide Mu
> Priority: Minor
>
> In order to support writing pre-3.0 sstables, we must add support to legacy
> cell serialization to {{BigTableWriter}}.