[
https://issues.apache.org/jira/browse/CASSANDRA-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842290#action_12842290
]
Jonathan Ellis commented on CASSANDRA-847:
------------------------------------------
Let's keep this simple.
The goal is to create an abstraction that (a) compaction code can apply to both
old and new data formats, while (b) allowing for memory-efficient compactions
on the new format and (c) making new-format indexing of subcolumns possible and
(d) ideally allowing up to 256 levels of new-format subcolumn nesting.
In particular, the goal does not include improving efficiency of compaction of
old format data; if that falls out naturally, fine, but it's not really our
goal.
Nor is it yet our goal to support the new format in this patchset, although
maybe it should be. Compacting from old format to old format, with new data
structures, is not part of our ultimate goal either, and making it an
intermediate step may be making things harder than necessary. It may be
simpler to introduce the new format first, so we can skip to compacting from
old -> new and new -> new, not bothering with old -> old.
Are we on the same page?
I think the simplest way to get there is to continue using IColumn.
It generalizes just fine to multiple levels, and the existing implementation
knows how to use abstractions like mostRecentLiveChangeAt to handle tricky
problems like tombstones. Throwing this away and starting over will lead us
eventually to the same place. [Although certainly some parts like
getObjectCount won't be needed and can ultimately be removed.] Also, sharing
code between old and new formats is, within reason, a good thing. So let's keep
IColumn (I believe the analogue in your patch is Named?) and Column.
ColumnFamily + SuperColumn should be replaced with a more generalized structure
supporting arbitrary nesting. Here I think ColumnGroup is a better name than
Slice; we use the latter term in querying, which could be confusing. But I
think it would have a lot in common with the existing CF/SuperColumn code.
Each ColumnGroup, like Column, only needs a byte[] name.
No need to copy a lot of full paths around; experience with existing code shows
that this is unnecessary.
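
To make the shape concrete, here is a minimal sketch of what such a structure
might look like. It is an illustration only: the nested IColumn, Column and
ColumnGroup types below are hypothetical stand-ins written for this comment,
not the existing Cassandra classes, and the depth() helper is an assumption
added to show that nesting is arbitrary.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; these are not the actual Cassandra classes.
final class ColumnGroupSketch {
    /** Anything that can sit inside a group: a leaf Column or a nested ColumnGroup. */
    interface IColumn {
        byte[] name();
    }

    /** Leaf: a name-data-timestamp triple. */
    static final class Column implements IColumn {
        private final byte[] name;
        final byte[] value;
        final long timestamp;

        Column(byte[] name, byte[] value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }

        @Override
        public byte[] name() { return name; }
    }

    /** Container: knows only its own name, never the full path from the root. */
    static final class ColumnGroup implements IColumn {
        private final byte[] name;
        private final List<IColumn> children = new ArrayList<>();

        ColumnGroup(byte[] name) { this.name = name; }

        @Override
        public byte[] name() { return name; }

        ColumnGroup add(IColumn child) {
            children.add(child);
            return this;
        }

        List<IColumn> children() { return children; }

        /** Group levels at and below this group (1 if it holds only Columns). */
        int depth() {
            int deepest = 0;
            for (IColumn child : children) {
                if (child instanceof ColumnGroup) {
                    deepest = Math.max(deepest, ((ColumnGroup) child).depth());
                }
            }
            return 1 + deepest;
        }
    }
}
```

The full path to any Column is implied by where it sits in the tree, so nothing
needs to carry it around.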
Mapping this to the old data format is hopefully clear, since the two resemble
each other fairly strongly. What about the new format? Here we come back to my
advocating that "all container information goes in the block header, followed
by serialized Columns [not IColumns, just name-data-ts triples]." This is
where we will need something like ColumnKey to contain column boundaries --
i.e., not in this patchset, unless you decide that actually introducing the new
format here is the way to go.
Thus, for compaction, our algorithm goes something like "read all the header
information at once and build the ColumnGroup structure in memory, then iterate
through matching sub-columngroups, merging as necessary." Since we read the
header all at once, and then the subcolumns in-order, all i/o within a single
sstable remains sequential.
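
A rough sketch of the merge step, under simplifying assumptions: both inputs
are already fully built in memory, children are keyed by name in a sorted map,
conflicting Columns are resolved by highest timestamp, and tombstones are
ignored entirely. The class and method names are hypothetical; this is not the
existing compaction code.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of merging two in-memory group trees.
final class CompactionMergeSketch {
    /** Unsigned lexicographic ordering of byte[] names. */
    static final Comparator<byte[]> BYTES = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return Integer.compare(a.length, b.length);
    };

    interface Node {}

    /** Leaf value; its name is the key under which the parent stores it. */
    static final class Column implements Node {
        final byte[] value;
        final long timestamp;

        Column(byte[] value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** Group whose children are kept sorted by name. */
    static final class ColumnGroup implements Node {
        final TreeMap<byte[], Node> children = new TreeMap<>(BYTES);
    }

    /** Merge two nodes that share the same name at the same level. */
    static Node merge(Node a, Node b) {
        if (a instanceof Column && b instanceof Column) {
            // Assumption: the newer timestamp wins; ties keep the first input.
            return ((Column) a).timestamp >= ((Column) b).timestamp ? a : b;
        }
        if (a instanceof ColumnGroup && b instanceof ColumnGroup) {
            ColumnGroup out = new ColumnGroup();
            out.children.putAll(((ColumnGroup) a).children);
            for (Map.Entry<byte[], Node> e : ((ColumnGroup) b).children.entrySet()) {
                // Matching sub-groups recurse; unmatched children are copied over.
                out.children.merge(e.getKey(), e.getValue(), CompactionMergeSketch::merge);
            }
            return out;
        }
        throw new IllegalArgumentException("cannot merge a Column with a ColumnGroup");
    }
}
```

A real implementation would stream the Columns from disk rather than hold them
all in memory; only the header-derived group structure needs to be resident.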
It's not clear to me how to apply the old ReducingIterator approach to
multilevel groups when the data to merge into one Block may be spread across
multiple Blocks in another sstable, although I find the iterator design very
elegant and easy to verify for correctness. So you are probably right that
this has to change.
One other thing about header info / column key: it would be nice to come up
with a scheme that doesn't repeat the full path in the description of each
ColumnGroup [i.e., ColumnKey or its analogue], at least not on-disk; in a
heavily nested structure that would be a lot of duplication of the initial path
elements, although presumably compression would mitigate this some.
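
One possible scheme, offered only as a sketch: write the header entries in
pre-order and store just (depth, own name) for each, so the full path can be
rebuilt on read with a stack. Everything here is hypothetical, including the
Entry type and the use of String names for readability (real names would be
byte[]). With at most 256 levels, the depth would fit in a single byte.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: rebuild full paths from depth-tagged header entries.
final class HeaderPathSketch {
    /** One header entry: nesting depth (0 = top level) and the group's own name. */
    static final class Entry {
        final int depth;
        final String name;

        Entry(int depth, String name) {
            this.depth = depth;
            this.name = name;
        }
    }

    /** Given entries in pre-order, return the full path of each one. */
    static List<List<String>> fullPaths(List<Entry> entries) {
        List<List<String>> paths = new ArrayList<>();
        List<String> stack = new ArrayList<>();
        for (Entry e : entries) {
            // An entry may go at most one level deeper than its predecessor.
            if (e.depth < 0 || e.depth > stack.size()) {
                throw new IllegalArgumentException("bad depth " + e.depth);
            }
            // Pop back up to this entry's parent, then descend into the entry.
            while (stack.size() > e.depth) {
                stack.remove(stack.size() - 1);
            }
            stack.add(e.name);
            paths.add(new ArrayList<>(stack));
        }
        return paths;
    }
}
```

Each name is then written once on disk, regardless of how many descendants
share it as a path prefix.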
What do you think?
> Make the reading half of compactions memory-efficient
> -----------------------------------------------------
>
> Key: CASSANDRA-847
> URL: https://issues.apache.org/jira/browse/CASSANDRA-847
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Stu Hood
> Priority: Critical
> Fix For: 0.7
>
> Attachments:
> 0001-Add-structures-that-were-important-to-the-SSTableSca.patch,
> 0002-Implement-most-of-the-new-SSTableScanner-interface.patch,
> 0003-Rename-RowIndexedReader-specific-test.patch,
> 0004-Improve-Scanner-tests-and-separate-SuperCF-handling-.patch,
> 0005-Add-Scanner-interface-and-a-Filtered-implementation-.patch,
> 0006-Add-support-for-compaction-of-super-CFs-and-some-tes.patch
>
>
> This issue is the next on the road to finally fixing CASSANDRA-16. To make
> compactions memory efficient, we have to be able to perform the compaction
> process on the smallest possible chunks that might intersect and contend with
> one another, meaning that we need a better abstraction for reading from
> SSTables.