[jira] Created: (CASSANDRA-1106) Use Scanner API for all reads

Stu Hood (JIRA) Wed, 19 May 2010 00:06:19 -0700

Use Scanner API for all reads
-----------------------------

                 Key: CASSANDRA-1106
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1106
             Project: Cassandra
          Issue Type: Sub-task
            Reporter: Stu Hood
            Priority: Minor
             Fix For: 0.8



The goal of this issue is to eliminate the IColumnIterator interface, and to 
use the Slice/Scanner API for all reads. Additionally, this issue begins to 
optimize the interaction between FilteredScanner and QueryFilter to gain back 
speed lost in CASSANDRA-1095.

This issue adds Memtable.Scanner and converts Memtables to maps from 
DecoratedKey -> List<Slice> (where the list represents a row: one entry for 
Standard CFs, and more than one entry for Super CFs). Since Slices are 
immutable, rows in the Memtable are merged using SliceMergingIterator, and 
atomically swapped out. This gives much coarser-grained atomicity than we 
currently support, so this approach to mapping the Memtable to Slices is wide 
open to debate.
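
To make the merge-and-swap idea concrete, here is a minimal sketch and not the code in the patch. SliceSketch and MemtableSketch are hypothetical names, a String stands in for DecoratedKey, and mergeRows is a simplified stand-in for SliceMergingIterator that keeps the newer timestamp per name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical stand-in for an immutable Slice: a name plus a timestamp.
final class SliceSketch {
    final String name;
    final long timestamp;

    SliceSketch(String name, long timestamp) {
        this.name = name;
        this.timestamp = timestamp;
    }
}

// Hypothetical memtable: String keys stand in for DecoratedKey.
final class MemtableSketch {
    private final ConcurrentSkipListMap<String, List<SliceSketch>> rows =
            new ConcurrentSkipListMap<>();

    // Stand-in for SliceMergingIterator: per name, the newer timestamp wins.
    // Always returns a new unmodifiable list; the inputs are never mutated.
    static List<SliceSketch> mergeRows(List<SliceSketch> oldRow, List<SliceSketch> update) {
        TreeMap<String, SliceSketch> byName = new TreeMap<>();
        for (SliceSketch s : oldRow) {
            byName.put(s.name, s);
        }
        for (SliceSketch s : update) {
            byName.merge(s.name, s, (a, b) -> b.timestamp > a.timestamp ? b : a);
        }
        return Collections.unmodifiableList(new ArrayList<>(byName.values()));
    }

    // Build a merged copy of the row and swap it in as a whole.
    // The loop retries if another writer changed the row in between.
    void apply(String key, List<SliceSketch> update) {
        List<SliceSketch> fresh = mergeRows(Collections.<SliceSketch>emptyList(), update);
        while (true) {
            List<SliceSketch> current = rows.get(key);
            if (current == null) {
                if (rows.putIfAbsent(key, fresh) == null) {
                    return;
                }
            } else if (rows.replace(key, current, mergeRows(current, update))) {
                return;
            }
        }
    }

    List<SliceSketch> get(String key) {
        return rows.get(key);
    }
}
```

A reader that fetched a row before a write keeps seeing its old, unchanged list, which is the coarse atomicity described above: the whole row is replaced at once, not individual columns.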

The row cache in this patch mimics the Memtable and becomes a map from 
DecoratedKey -> List<Slice>. In order to reuse the QueryFilter API, a 
db.ListScanner is added to wrap an individual row in the cache for filtering. 
One limitation imposed by this design is that the row cache can't be used as a 
write-through cache, since its entries are immutable.

The common order of operations is:
# Get a SeekableScanner implementation for the Memtable/cache entry/SSTable
# Build a QueryFilter describing the query
# Call QueryFilter.filter(scanner) to wrap the SeekableScanner in a 
FilteredScanner
* Optionally, merge multiple Scanners using MergingScanner
# Call QueryFilter.collect(scanner) to wrap garbage collection around the 
merged input
# Limit the output columns using QueryFilter.limit(scanner)
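
The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the patch's API: ColumnSketch and ReadPathSketch are hypothetical, java.util.stream.Stream stands in for the Scanner types, and the filter/merge/collect/limit signatures are invented. The merge is done eagerly for brevity, whereas the real scanners are presumably lazy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical column: a name, a timestamp, and a tombstone flag.
final class ColumnSketch {
    final String name;
    final long ts;
    final boolean deleted;

    ColumnSketch(String name, long ts, boolean deleted) {
        this.name = name;
        this.ts = ts;
        this.deleted = deleted;
    }
}

final class ReadPathSketch {
    // Step 3: keep only columns whose names fall in [start, end].
    static Stream<ColumnSketch> filter(Stream<ColumnSketch> scanner, String start, String end) {
        return scanner.filter(c -> c.name.compareTo(start) >= 0 && c.name.compareTo(end) <= 0);
    }

    // Optional step: merge several sources; per name, the newest timestamp wins.
    static Stream<ColumnSketch> merge(List<Stream<ColumnSketch>> scanners) {
        TreeMap<String, ColumnSketch> newest = new TreeMap<>();
        for (Stream<ColumnSketch> s : scanners) {
            s.forEach(c -> newest.merge(c.name, c, (x, y) -> y.ts > x.ts ? y : x));
        }
        return newest.values().stream();
    }

    // Step 4: garbage collection, reduced here to dropping tombstones.
    static Stream<ColumnSketch> collect(Stream<ColumnSketch> merged) {
        return merged.filter(c -> !c.deleted);
    }

    // Step 5: cap the number of columns returned.
    static Stream<ColumnSketch> limit(Stream<ColumnSketch> scanner, int count) {
        return scanner.limit(count);
    }

    // Runs the steps in the order listed above and returns the surviving names.
    static List<String> read(List<List<ColumnSketch>> sources, String start, String end, int count) {
        List<Stream<ColumnSketch>> filtered = new ArrayList<>();
        for (List<ColumnSketch> source : sources) {
            filtered.add(filter(source.stream(), start, end));
        }
        return limit(collect(merge(filtered)), count)
                .map(c -> c.name)
                .collect(Collectors.toList());
    }
}
```

In this sketch the order matters: collecting after the merge lets a tombstone in one source shadow older data in another, and limiting last means deleted columns do not count against the limit.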

Optimization between FilteredScanner and QueryFilter is accomplished via the 
MatchResult object, which is pretty ugly, and still a work in progress. 
Inside a QueryFilter, the IFilters for each level return MatchResults 
indicating where their next interesting matches are, and QueryFilter composes 
the levels into a MatchResult that a FilteredScanner uses to seek on its 
underlying Scanner.
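
As a rough illustration of why a "next interesting match" hint helps, here is a single-level sketch. The real MatchResult composes several levels and is not reproduced here. MatchSketch, NamesFilterSketch and FilteredScannerSketch are hypothetical names, and a TreeSet of column names stands in for the underlying seekable Scanner.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

// Hypothetical result of matching one name against a filter.
final class MatchSketch {
    final boolean matched; // the current name is wanted
    final String seekTo;   // next name worth visiting, or null if none remain

    MatchSketch(boolean matched, String seekTo) {
        this.matched = matched;
        this.seekTo = seekTo;
    }
}

// Hypothetical names filter: wants an explicit set of column names.
final class NamesFilterSketch {
    private final TreeSet<String> wanted;

    NamesFilterSketch(Collection<String> wanted) {
        this.wanted = new TreeSet<>(wanted);
    }

    MatchSketch match(String name) {
        // higher() is the smallest wanted name strictly after the current one.
        return new MatchSketch(wanted.contains(name), wanted.higher(name));
    }
}

// Hypothetical filtered scanner: seeks using the hint instead of visiting every name.
final class FilteredScannerSketch {
    int visited = 0; // how many names of the row were actually examined

    List<String> scan(TreeSet<String> row, NamesFilterSketch filter) {
        List<String> out = new ArrayList<>();
        String current = row.isEmpty() ? null : row.first();
        while (current != null) {
            visited++;
            MatchSketch m = filter.match(current);
            if (m.matched) {
                out.add(current);
            }
            if (m.seekTo == null) {
                break; // the filter has nothing further to match
            }
            // "Seek": jump to the first name in the row at or after the hint.
            current = row.ceiling(m.seekTo);
        }
        return out;
    }
}
```

With a row holding the names a through j and a filter wanting c, h and z, this scan examines only a, c and h before stopping.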

These patches remove a lot of deeply nested and complicated logic for dealing 
with super columns and garbage collection, including IFilter.filterSuperColumn 
(replaced naturally by Slice filtering), IFilter.collectReducedColumns (ditto) 
and ColumnFamilyStore.removeDeleted (replaced by ASlice.GCFunction). 
Additionally, they replace scads of AbstractIterator implementations that were 
implementing IColumnIterator on a case-by-case basis.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
