[jira] [Updated] (CASSANDRA-3885) Support multiple ranges in SliceQueryFilter

Sylvain Lebresne (JIRA) Fri, 15 Jun 2012 01:30:50 -0700

     [ 
https://issues.apache.org/jira/browse/CASSANDRA-3885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sylvain Lebresne updated CASSANDRA-3885:
----------------------------------------

    Attachment: 3885-v2.txt

Ok, I like the last patch much more and I've convinced myself that this 
prefetched thing is a good idea with multiple slices.

I believe this can be simpler a tiny bit further though. Typically the 
{{readBackward}} business is a tad confusing and since only SimpleBlockFetcher 
really cares about it, it can be moved there. I also think the switch of 
slice/lookup of index block can be cleaned by doing both together. I'm 
attaching a v2 that does those things, a few other cleanups, adds a bunch of 
comments and also adds two optimizations:
* Since we always know the order on which we read columns, we can remember when 
we've "entered" a slice to avoid a bunch of comparisons until we "leave" it.
* When we switch to the next slice, we must reuse the binary search to locate 
where that new slice starts rather than going to the next block, otherwise we 
may end up reading lots of unnecessary blocks.

I'll note that imho, with those optimizations, SimpleSliceReader (not to be 
confused with SimpleBlockFetcher) isn't really useful anymore, but we probably 
want to make sure by benchmarking it.

Anyway, outside of IndexedSliceReader, there was a few problems (that are all 
solved by the attached v2):
* We needed to deal with multiple slices for the memtable iterator too, 
otherwise we'll end up returning the wrong columns.
* We needed to correctly serialize multiple slices for the inter-node protocol.

I'll note that with this patch, db.SerializationsTest is broken, but that is no 
mystery: it's trying to read old messages using the current protocol version. 
So I could regenerate the binary messages, but I'm confused on what 
SerializationsTest is actually testing. I though it was making sure we don't 
break backward compatibility. If we regenerate the binary message at each 
release we're not testing that at all.

ps: @david, it seems your editor is splitting lines where it shouldn't and is 
reordering imports in a way that don't respect 
http://wiki.apache.org/cassandra/CodeStyle.

                
> Support multiple ranges in SliceQueryFilter
> -------------------------------------------
>
>                 Key: CASSANDRA-3885
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3885
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: David Alves
>             Fix For: 1.2
>
>         Attachments: 3885-v2.txt, CASSANDRA-3885.patch, CASSANDRA-3885.patch, 
> CASSANDRA-3885.patch, CASSANDRA-3885.patch, CASSANDRA-3885.patch
>
>
> This is logically a subtask of CASSANDRA-2710, but Jira doesn't allow 
> sub-sub-tasks.
> We need to support multiple ranges in a SliceQueryFilter, and we want 
> querying them to be efficient, i.e., one pass through the row to get all of 
> the ranges, rather than one pass per range.
> Supercolumns are irrelevant since the goal is to replace them anyway.  Ignore 
> supercolumn-related code or rip it out, whichever is easier.
> This is ONLY dealing with the storage engine part, not the StorageProxy and 
> Command intra-node messages or the Thrift or CQL client APIs.  Thus, a unit 
> test should be added to ColumnFamilyStoreTest to demonstrate that it works.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-3885) Support multiple ranges in SliceQueryFilter

Reply via email to