[
https://issues.apache.org/jira/browse/CASSANDRA-1156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893837#action_12893837
]
Jonathan Ellis commented on CASSANDRA-1156:
-------------------------------------------
pushed fancy statistics-based parallelization to CASSANDRA-1337. this merely
adds support for scanning across multiple nodes to satisfy a query serially, as
well as ConsistencyLevel-awareness.
on reflection, it seems that forcing the user to wrap multiget/range scan/index
scan in a RowPredicate is the wrong move, even if the eventual proliferation of
_count methods pains me. 0001 has the updates to thrift to make that change
broken out. (as usual, `ant gen-thrift-java` is left as an exercise for the
reader to avoid unnecessary noise in the patchset.)
> support querying multiple nodes for index scan
> ----------------------------------------------
>
> Key: CASSANDRA-1156
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1156
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Jonathan Ellis
> Fix For: 0.7 beta 1
>
> Attachments: 0001-update-thrift.txt,
> 0002-handle-index-scans-across-multiple-nodes-and-consisten.txt
>
>
> given CASSANDRA-1155, we should query multiple nodes for the rows
> corresponding to the given index criteria, such that we have a 90% chance of
> getting enough rows w/o having to do another query (but, if our estimate is
> incorrect, we do need to loop and do a 2nd query).
> we start with the first node in token order, so that we only have to query a
> single node for low cardinality (i.e., every index value has many rows
> associated with it). we do this by ordering the keys in the index row, in
> partitioner order.
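
A minimal sketch of the serial scan described above, not the code in the attached patches: visit nodes in token order and stop as soon as enough rows have been collected, so a low-cardinality index is satisfied by the first node alone. The `Node` tuple, the `scan_index_serially` name, and the in-memory key lists are all hypothetical stand-ins for illustration; the statistics-based estimate of how many nodes to query up front (moved to CASSANDRA-1337) is deliberately left out.

```python
from typing import List, Sequence, Tuple

# Hypothetical stand-in for a node: (token, keys matching the index
# expression, assumed to be already sorted in partitioner order).
Node = Tuple[int, List[str]]


def scan_index_serially(nodes: Sequence[Node], count: int) -> Tuple[List[str], int]:
    """Collect up to `count` matching keys, querying one node at a time.

    Returns the collected keys and the number of nodes that were queried.
    """
    rows: List[str] = []
    queried = 0
    # Start with the first node in token order.
    for _token, matching_keys in sorted(nodes, key=lambda n: n[0]):
        if len(rows) >= count:
            break  # enough rows already; no further query needed
        queried += 1
        # Take only as many keys as are still missing.
        rows.extend(matching_keys[: count - len(rows)])
    return rows, queried


# Illustrative data only: three nodes, listed out of token order.
example_nodes = [
    (200, ["k5", "k6"]),
    (100, ["k1", "k2", "k3"]),
    (300, ["k9"]),
]
```

With this data, asking for 2 rows touches only the node at token 100, while asking for 4 rows has to continue to the node at token 200.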