[ https://issues.apache.org/jira/browse/CASSANDRA-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stu Hood resolved CASSANDRA-1092.
---------------------------------
Resolution: Invalid
Waaay too invasive.
> Add Slice API, and replace CF and SC for compaction reads
> ---------------------------------------------------------
>
> Key: CASSANDRA-1092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1092
> Project: Cassandra
> Issue Type: Sub-task
> Components: Core
> Reporter: Stu Hood
> Priority: Critical
> Fix For: 0.8
>
> Attachments: 0001-Add-Slice-and-ColumnKey.patch,
> 0002-Refactor-Scanner-interface-into-filtering-and-filter.patch,
> 0003-Add-a-Scanner-for-merging-sorted-slice-lists.patch,
> 0004-Make-CompactionIterator-extend-SliceMergingIterator.patch
>
>
> Currently, we have two read paths for fetching Columns from disk: the
> io.sstable.SSTableScanner interface, and the db.filter.SSTable*Iterator
> interfaces. The latter is intended for iterating over the IColumns contained
> in a single row, while the former iterates over entire rows at once (although
> SSTableScanner supports returning a db.filter implementation per row).
>
> While this separation has allowed for highly optimized pushdown filtering in
> the db.filter classes, the lack of abstraction makes it impossible to reason
> about changes to the file format, and the implementation depends on random
> access into the file.
>
> Additionally, the separation of 'row iteration' from 'icolumn iteration'
> ignores the fact that super columns contain an additional level of columns
> that could be iterated. Rather than introducing a third level of iterators
> that deals with iterating over subcolumns, a unified interface for iterating
> over arbitrarily nested columns would clarify the code, and open the door to
> many interesting possibilities (see CASSANDRA-998).
>
> This ticket deals with implementing an initial cut of the unified interface,
> which reuses the "Scanner" name. The org.apache.cassandra.Scanner interface
> is essentially an extended iterator, which is further enhanced by
> org.apache.cassandra.SeekableScanner to add operations that reposition the
> iterator. By the end of CASSANDRA-998, SeekableScanner will have
> implementations for the Memtable and SSTables, allowing for uniform iteration
> of all sources.
>
> The object that a Scanner iterates over is org.apache.cassandra.Slice, which
> is immutable, and contains parent deletion Metadata
> (markedForDeleteAt/localDeletionTime: like a ColumnFamily or SuperColumn).
> Since only the highest markedForDeleteAt or localDeletionTime matters for
> nested columns, Slices simplify storage of this data by storing a single
> value for all parents. The Metadata in a Slice is bounded at each end by an
> org.apache.cassandra.db.ColumnKey, which is a compound key representing the
> full path to a column, or a parent boundary.
>
> The ColumnKeys in a Slice make it possible to delete column name ranges. By
> convention (in this patch), the ColumnKeys in a Slice always share parents.
> In the future, if we wanted to support range deletes for rows or
> supercolumns, it would be trivial to remove that assumption.
>
> SSTables and Memtables can be abstracted into "sorted lists of Slices" which
> are individually non-intersecting. Client reads and compactions can use
> org.apache.cassandra.SliceMergingIterator to merge the Slices from multiple
> Scanners into a new Scanner which is globally non-intersecting. This process
> will be at the heart of any read from a ColumnFamilyStore by the end of
> CASSANDRA-998,
> but this issue only uses SliceMergingIterator at the core of compaction, by
> making CompactionIterator a subclass of SliceMergingIterator.
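
To make the description above more concrete, here is a minimal sketch of the
merge step. It is not taken from the attached patches. ColumnKey and Slice
borrow their names from the ticket, but their fields here are simplified
assumptions (a row plus a column name, and a single markedForDeleteAt value).
SliceMergeSketch, Head, sameBounds and merge are hypothetical names invented
for illustration. The sketch does a k-way merge of sorted sources with a
priority queue and collapses Slices that have identical bounds, keeping the
highest markedForDeleteAt. It does not split partially overlapping Slices,
which a real SliceMergingIterator would presumably have to do.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical, simplified stand-in for ColumnKey: a (row, column name) path.
final class ColumnKey implements Comparable<ColumnKey> {
    final String row;
    final String name;

    ColumnKey(String row, String name) {
        this.row = row;
        this.name = name;
    }

    @Override
    public int compareTo(ColumnKey other) {
        int c = row.compareTo(other.row);
        return c != 0 ? c : name.compareTo(other.name);
    }
}

// Immutable Slice: deletion metadata bounded at each end by a ColumnKey.
// Assumption: only markedForDeleteAt is modelled, not localDeletionTime.
final class Slice {
    final ColumnKey begin;
    final ColumnKey end;
    final long markedForDeleteAt;

    Slice(ColumnKey begin, ColumnKey end, long markedForDeleteAt) {
        this.begin = begin;
        this.end = end;
        this.markedForDeleteAt = markedForDeleteAt;
    }

    boolean sameBounds(Slice other) {
        return begin.compareTo(other.begin) == 0 && end.compareTo(other.end) == 0;
    }
}

final class SliceMergeSketch {
    // One queue entry per source: its current head Slice and the remaining Slices.
    private static final class Head {
        final Slice slice;
        final Iterator<Slice> rest;

        Head(Slice slice, Iterator<Slice> rest) {
            this.slice = slice;
            this.rest = rest;
        }
    }

    // Merges sources that are each sorted by (begin, end) into one sorted list.
    static List<Slice> merge(List<List<Slice>> sources) {
        PriorityQueue<Head> heads = new PriorityQueue<>((a, b) -> {
            int c = a.slice.begin.compareTo(b.slice.begin);
            return c != 0 ? c : a.slice.end.compareTo(b.slice.end);
        });
        for (List<Slice> source : sources) {
            Iterator<Slice> it = source.iterator();
            if (it.hasNext()) {
                heads.add(new Head(it.next(), it));
            }
        }

        List<Slice> merged = new ArrayList<>();
        while (!heads.isEmpty()) {
            Head head = heads.poll();
            if (head.rest.hasNext()) {
                heads.add(new Head(head.rest.next(), head.rest));
            }
            Slice next = head.slice;
            int last = merged.size() - 1;
            if (last >= 0 && merged.get(last).sameBounds(next)) {
                // Same range seen in another source: keep the highest deletion timestamp.
                Slice prev = merged.get(last);
                long ts = Math.max(prev.markedForDeleteAt, next.markedForDeleteAt);
                merged.set(last, new Slice(prev.begin, prev.end, ts));
            } else {
                merged.add(next);
            }
        }
        return merged;
    }
}
```

Given one source holding the Slices [a, c] (deleted at 5) and [d, f] (deleted
at 1), and a second source holding [a, c] (deleted at 9), merge returns two
Slices: [a, c] with markedForDeleteAt 9, followed by [d, f] with
markedForDeleteAt 1. Ordering the queue by begin and then end keeps Slices
with identical bounds adjacent, so a single look-back at the last merged
entry is enough to collapse them.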