[jira] [Commented] (CASSANDRA-1337) parallelize fetching rows for low-cardinality indexes

David Alves (JIRA) Sun, 10 Jun 2012 12:16:45 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292562#comment-13292562
 ]


David Alves commented on CASSANDRA-1337:
----------------------------------------

Thanks for the suggestion Vijay.

My question referred to a particular statement in the patch (carried over 
from the original patch), where we block and wait for the handlers' results 
only once we have at least as many handlers as the concurrency factor. 

My question was: couldn't we reach a point where there are no more ranges 
(so no more handlers will be created) but some handlers remain whose data we 
haven't yet blocked on to read? If these last few number fewer than the 
concurrency factor, they never pass the if's condition 
(if (scanHandlers.size() >= concurrencyFactor)), so it seems their results 
would never be read.
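
To make the concern concrete, here is a minimal, self-contained sketch of the 
pattern in question. It is not the code from the patch: only scanHandlers and 
concurrencyFactor are names taken from it, while ScanDrainSketch, scan, drain 
and the drainAtEnd flag are hypothetical, and a plain ExecutorService stands 
in for the real handlers. Each "range" is just an integer that the handler 
returns as its "row".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not the code from the CASSANDRA-1337 patch.
class ScanDrainSketch {

    // Blocks on every pending handler, collects its result, then clears the list.
    static void drain(List<Future<Integer>> scanHandlers, List<Integer> rows) {
        try {
            for (Future<Integer> handler : scanHandlers) {
                rows.add(handler.get());
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        scanHandlers.clear();
    }

    // Submits one handler per range and blocks only once the batch is full.
    // With drainAtEnd == false, a final partial batch is never read.
    static List<Integer> scan(List<Integer> ranges, int concurrencyFactor, boolean drainAtEnd) {
        ExecutorService pool = Executors.newFixedThreadPool(concurrencyFactor);
        List<Future<Integer>> scanHandlers = new ArrayList<>();
        List<Integer> rows = new ArrayList<>();
        try {
            for (Integer range : ranges) {
                Callable<Integer> fetch = () -> range; // stand-in for a remote range scan
                scanHandlers.add(pool.submit(fetch));
                if (scanHandlers.size() >= concurrencyFactor) {
                    drain(scanHandlers, rows);
                }
            }
            if (drainAtEnd) {
                // Handles the leftover handlers that never filled a batch.
                drain(scanHandlers, rows);
            }
        } finally {
            pool.shutdown();
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Integer> ranges = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(scan(ranges, 2, false)); // prints [1, 2, 3, 4]
        System.out.println(scan(ranges, 2, true));  // prints [1, 2, 3, 4, 5]
    }
}
```

With five ranges and a concurrency factor of 2, the fifth handler is created 
but its result is only read when the loop is followed by a final drain.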

With regard to testing, I guess stress is OK for testing speed, but how (and 
where) would I add the unit/system tests?

                
> parallelize fetching rows for low-cardinality indexes
> -----------------------------------------------------
>
>                 Key: CASSANDRA-1337
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1337
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>            Assignee: David Alves
>            Priority: Minor
>             Fix For: 1.2
>
>         Attachments: 
> 0001-CASSANDRA-1337-scan-concurrently-depending-on-num-rows.txt, 
> CASSANDRA-1337.patch
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> currently, we read the indexed rows from the first node (in partitioner 
> order); if that does not have enough matching rows, we read the rows from the 
> next, and so forth.
> we should use the statistics from CASSANDRA-1155 to query multiple nodes in 
> parallel, such that we have a high chance of getting enough rows w/o having 
> to do another round of queries (but, if our estimate is incorrect, we do need 
> to loop and do more rounds until we have enough data or we have fetched from 
> each node).

