Replace SSTable*Iterators
-------------------------

                 Key: CASSANDRA-1095
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1095
             Project: Cassandra
          Issue Type: Sub-task
            Reporter: Stu Hood
             Fix For: 0.8


Building on the foundation created by CASSANDRA-1092, this issue replaces the 
db.filter.SSTable*Iterator classes with naive filtering wrapped around the 
"sorted list of Slices" abstraction. Speed is _not_ a goal of this issue: the 
filtering done here is very naive, but it will be improved in patch 3 of the 
998 series.

This issue begins simplifying and generalizing the QueryFilter API, to allow 
more complex queries like "keys in [A:G] and columns in [b, h]". In the current 
codebase, such queries are handled by filtering rows independently of columns: 
keys are filtered inside ColumnFamilyStore, and columns are filtered by the 
SSTable*Iterator classes. This more powerful QueryFilter API also opens the 
door for compound filters, which would be prohibitively difficult with the 
current API, since each filter currently requires direct file access.

This patch sets up db.filter.FilteredScanner (added in the previous patch) to 
use an embedded QueryFilter to perform filtering of a source SeekableScanner. 
The QueryFilter class composes implementations of db.filter.IFilter, which 
answer two questions:
# might this Slice match the filter? (IFilter.matchesBetween)
# does this Column match the filter? (IFilter.matches)
If Columns in a Slice might match the filter, the Slice cannot be filtered out, 
because its Metadata might need to be resolved against Metadata from other 
Scanners. The answer to the second question allows individual Columns in a 
Slice that do not match to skip deserialization from disk.
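
As a rough illustration of the two questions above, here is a minimal Java 
sketch. Only the names IFilter, matchesBetween and matches come from this 
issue; the signatures, the use of String column names, and the RangeFilter and 
filterSlice helpers are assumptions made for the example and are not taken 
from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch: these signatures are assumptions, not the patch's API. */
final class FilterSketch {
    /** Stand-in for db.filter.IFilter, reduced to the two questions. */
    interface IFilter {
        /** Might any column name in [first, last] match? Must never say no wrongly. */
        boolean matchesBetween(String first, String last);

        /** Does this exact column name match? */
        boolean matches(String name);
    }

    /** Hypothetical filter matching column names in the inclusive range [start, end]. */
    static final class RangeFilter implements IFilter {
        private final String start;
        private final String end;

        RangeFilter(String start, String end) {
            this.start = start;
            this.end = end;
        }

        public boolean matchesBetween(String first, String last) {
            // Two inclusive ranges overlap unless one ends before the other begins.
            return first.compareTo(end) <= 0 && last.compareTo(start) >= 0;
        }

        public boolean matches(String name) {
            return name.compareTo(start) >= 0 && name.compareTo(end) <= 0;
        }
    }

    /**
     * Naive filtering of one slice whose column names span [first, last].
     * An empty Optional means the whole slice can be skipped. A present but
     * empty list means the slice is kept (its metadata may still matter)
     * even though none of its columns matched.
     */
    static Optional<List<String>> filterSlice(IFilter filter, String first, String last,
                                              List<String> names) {
        if (!filter.matchesBetween(first, last)) {
            return Optional.empty();
        }
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (filter.matches(name)) {
                kept.add(name); // only matching columns would be deserialized
            }
        }
        return Optional.of(kept);
    }
}
```

With a RangeFilter over [b, h], a slice spanning [a, c] is kept with columns b 
and c, while a slice spanning [i, k] is skipped without inspecting its columns.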

For this issue, the SeekableScanner/FilteredScanner API is used to access 
SSTables, but the access is wrapped by db.SliceToRowIterator to allow for 
Row/ColumnFamily iteration. Memtables will be converted to the Scanner API in 
the third issue in the series, which will allow us to remove most of the Slice -> 
CF translation layer.
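
The Slice -> Row wrapping described above could look roughly like the 
following Java sketch. It is not the actual db.SliceToRowIterator: the Slice 
and Row classes, their fields, and the assumption that slices arrive sorted by 
key are all hypothetical and chosen only to show the grouping idea.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical sketch of adapting a sorted stream of slices to row iteration. */
final class SliceToRowSketch {
    /** Hypothetical slice: a row key plus a sorted run of column names. */
    static final class Slice {
        final String key;
        final List<String> columns;

        Slice(String key, List<String> columns) {
            this.key = key;
            this.columns = columns;
        }
    }

    /** Hypothetical row: every column gathered for one key. */
    static final class Row {
        final String key;
        final List<String> columns = new ArrayList<>();

        Row(String key) {
            this.key = key;
        }
    }

    /** Groups consecutive slices sharing a key into one Row; input must be sorted by key. */
    static final class RowIterator implements Iterator<Row> {
        private final Iterator<Slice> slices;
        private Slice pending; // first slice of the next row, or null when exhausted

        RowIterator(Iterator<Slice> slices) {
            this.slices = slices;
            this.pending = slices.hasNext() ? slices.next() : null;
        }

        public boolean hasNext() {
            return pending != null;
        }

        public Row next() {
            if (pending == null) {
                throw new NoSuchElementException();
            }
            Row row = new Row(pending.key);
            // Absorb every consecutive slice that belongs to the same key.
            while (pending != null && pending.key.equals(row.key)) {
                row.columns.addAll(pending.columns);
                pending = slices.hasNext() ? slices.next() : null;
            }
            return row;
        }
    }
}
```

Because the adapter only looks one slice ahead, it never buffers more than a 
single row, which fits a scanner that streams slices in sorted order.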

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.