implementation choice with regard to multiple range slice query filters

David Alves Mon, 02 Apr 2012 21:20:28 -0700

Hi guys

        I'm a PhD student and a longtime fan of Cassandra, and I'm trying to 
dip my feet in the water of Cassandra development.
        I'm implementing CASSANDRA-3885, which is about supporting the return 
of multiple slices of a row.
        
        After looking around at the portion of the code that is involved, two 
implementation options come to mind, and I'd like to get feedback from you on 
whichever you think might work best (or even whether I'm on the right track).


        As a first approach I simply subclassed SliceQueryFilter (setting start 
and finish to firstRange.start and lastRange.finish) and made the subclass not 
return the elements in between the ranges (spinning to the first element of the 
next range whenever the final element of the previous was found). This approach 
only uses one IndexedSliceReader but it scans from firstRange.start to 
lastRange.finish.
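
        To make the first approach concrete, here is a minimal sketch of the 
single-scan idea. It is only a model, not Cassandra code: the row is assumed to 
be a sorted map of column names, each range is a hypothetical {start, finish} 
pair with inclusive bounds, and the ranges are assumed sorted and 
non-overlapping. SingleScanMultiSlice and select are names made up for 
illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

// Hypothetical model of approach 1; none of these names exist in Cassandra.
final class SingleScanMultiSlice {
    // Each range is {start, finish}, inclusive on both ends.
    // Assumes ranges are sorted by start and do not overlap.
    static List<String> select(NavigableMap<String, String> row, List<String[]> ranges) {
        List<String> out = new ArrayList<>();
        if (ranges.isEmpty()) {
            return out;
        }
        String first = ranges.get(0)[0];
        String last = ranges.get(ranges.size() - 1)[1];
        int current = 0;
        // One pass over everything between firstRange.start and lastRange.finish.
        for (String name : row.subMap(first, true, last, true).keySet()) {
            // Move on to the next range once this column is past the current finish.
            while (current < ranges.size() && name.compareTo(ranges.get(current)[1]) > 0) {
                current++;
            }
            if (current == ranges.size()) {
                break;
            }
            // Columns in the gap before the current range's start are read but dropped.
            if (name.compareTo(ranges.get(current)[0]) >= 0) {
                out.add(name);
            }
        }
        return out;
    }
}
```

        The cost is visible in the loop: every column between the first start 
and the last finish is visited, including those in the gaps.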

        Still, as I was finishing, it occurred to me that when the filter 
selects only a small fraction of the row, i.e., the ranges are a sparse 
selection of the total number of columns, I might be doing a full row scan for 
nothing. So another option came to mind: an iterator of iterators, where I use 
one IndexedSliceReader for each of the required slice ranges and simply iterate 
through them in order.
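
        A minimal sketch of the iterator-of-iterators idea, under the same 
assumptions as before: the row is modeled as a sorted map, each range is a 
hypothetical {start, finish} pair with inclusive bounds, and a sub-map iterator 
stands in for one IndexedSliceReader per range. ChainedSliceIterator is a 
made-up name, not a Cassandra class.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NavigableMap;
import java.util.NoSuchElementException;

// Hypothetical model of approach 2; none of these names exist in Cassandra.
final class ChainedSliceIterator implements Iterator<String> {
    private final Iterator<Iterator<String>> slices;
    private Iterator<String> current = Collections.emptyIterator();

    // Each range is {start, finish}, inclusive on both ends.
    ChainedSliceIterator(NavigableMap<String, String> row, List<String[]> ranges) {
        List<Iterator<String>> perRange = new ArrayList<>();
        for (String[] r : ranges) {
            // Stand-in for one reader per range: it only touches columns inside the range.
            perRange.add(row.subMap(r[0], true, r[1], true).keySet().iterator());
        }
        this.slices = perRange.iterator();
    }

    @Override
    public boolean hasNext() {
        // Skip exhausted (or empty) per-range iterators until one has data.
        while (!current.hasNext() && slices.hasNext()) {
            current = slices.next();
        }
        return current.hasNext();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

        Here the columns in the gaps are never visited, at the price of one 
seek per range.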

        Which do you think is the better option? Am I making any sense, or am I 
completely off track?

        Any help would be greatly appreciated.

Cheers
David Ribeiro Alves
