[ 
https://issues.apache.org/jira/browse/CASSANDRA-1156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893837#action_12893837
 ] 

Jonathan Ellis commented on CASSANDRA-1156:
-------------------------------------------

pushed fancy statistics-based parallelization to CASSANDRA-1337.  this merely 
adds support for scanning across multiple nodes to satisfy a query serially, as 
well as ConsistencyLevel-awareness.
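
for illustration only, a minimal sketch of what "scanning across multiple nodes serially" could look like. this is not the code in the attached patches: `SerialIndexScan`, `Node` and `scan` are hypothetical names, each node is assumed to return its matching row keys already in partitioner order, and consistency level handling is left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SerialIndexScan {
    /** Hypothetical stand-in for one node's slice of the index. */
    interface Node {
        /** Returns up to {@code limit} matching row keys, in partitioner order. */
        List<String> scan(int limit);
    }

    /**
     * Queries nodes one at a time, in token order, and stops as soon as
     * {@code count} rows have been collected.
     */
    static List<String> scan(List<Node> nodesInTokenOrder, int count) {
        List<String> rows = new ArrayList<>();
        for (Node node : nodesInTokenOrder) {
            int remaining = count - rows.size();
            if (remaining <= 0) {
                break; // already satisfied; later nodes are never contacted
            }
            List<String> fromNode = node.scan(remaining);
            // Defensive trim in case a node returns more than it was asked for.
            rows.addAll(fromNode.subList(0, Math.min(remaining, fromNode.size())));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Two fake nodes holding two and three matching rows respectively.
        Node first = limit -> Arrays.asList("k1", "k2").subList(0, Math.min(limit, 2));
        Node second = limit -> Arrays.asList("k3", "k4", "k5").subList(0, Math.min(limit, 3));
        System.out.println(scan(Arrays.asList(first, second), 3));
    }
}
```

with a high-cardinality index the first node may hold too few rows, so the loop moves on to the next node in token order; with a low-cardinality index the first node alone usually satisfies the count.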

on reflection, it seems that forcing the user to wrap multiget/range scan/index 
scan in a RowPredicate is the wrong move, even if the eventual proliferation of 
_count methods pains me.  0001 breaks out the thrift updates for that change. 
(as usual, `ant gen-thrift-java` is left as an exercise for the reader to 
avoid unnecessary noise in the patchset.)

> support querying multiple nodes for index scan
> ----------------------------------------------
>
>                 Key: CASSANDRA-1156
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1156
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Jonathan Ellis
>             Fix For: 0.7 beta 1
>
>         Attachments: 0001-update-thrift.txt, 
> 0002-handle-index-scans-across-multiple-nodes-and-consisten.txt
>
>
> given CASSANDRA-1155, we should query multiple nodes for the rows 
> corresponding to the given index criteria, such that we have a 90% chance of 
> getting enough rows w/o having to do another query (but, if our estimate is 
> incorrect, we do need to loop and do a 2nd query).
> we start with the first node in token order, so that we only have to query a 
> single node for low cardinality (i.e., every index value has many rows 
> associated with it).  we do this by ordering the keys in the index row, in 
> partitioner order.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
