[
https://issues.apache.org/jira/browse/CASSANDRA-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422986#comment-13422986
]
Sylvain Lebresne commented on CASSANDRA-1337:
---------------------------------------------
The patch essentially moves the check for "do we have enough rows for the query"
inside the "don't use the local path for that range" branch, which is broken (as in
"uselessly inefficient"), because it means that even if we do have enough rows
locally, we will still query another range.
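To make that concrete, here is a minimal sketch of the loop shape I have in mind;
Range, Row and queryRange() are made-up stand-ins, not the actual StorageProxy code:
{code}
// Illustrative only: Range, Row and queryRange() are stand-ins for the real types.
static List<Row> scanRanges(List<Range> rangesInPartitionerOrder, int maxResults)
{
    List<Row> rows = new ArrayList<Row>();
    for (Range range : rangesInPartitionerOrder)
    {
        // The "enough rows" check must run for every range, before we query it
        // at all: if the local range already produced enough rows, stop here...
        if (rows.size() >= maxResults)
            break;

        // ...rather than checking only inside the "range is not local" branch,
        // which is effectively what the patch does and why it can uselessly
        // query one more range even when the local one was sufficient.
        rows.addAll(queryRange(range, maxResults - rows.size()));
    }
    return rows;
}
{code}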
But probably more importantly, I'm confused about how this patch works.
First, it estimates how many rows the query of *a range* will likely yield based on
the *total* number of keys on the node. We should at least divide this by the
replication factor to get a proper estimate of the number of keys per range.
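In code, the kind of estimate I mean would look roughly like this (purely
illustrative, self-contained names, not actual APIs):
{code}
// Illustrative sketch: estimate how many rows one range will likely yield,
// and derive the number of ranges to query in parallel.
static int concurrencyFactor(long totalKeysOnNode, int replicationFactor, int maxResults)
{
    // The node stores totalKeysOnNode keys spread over roughly replicationFactor
    // ranges, so a single range holds about this many keys:
    double estimatedKeysPerRange = (double) totalKeysOnNode / replicationFactor;

    // Number of ranges we should query in parallel to likely reach maxResults.
    return (int) Math.ceil(maxResults / Math.max(1.0, estimatedKeysPerRange));
}
{code}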
Second, it seems to correctly parallelize "old-style" range_slice (i.e.
range_slice for thrift without any IndexExpression), but I think it correctly
handles neither secondary indexes (which was the main goal of the patch, I
believe) nor any kind of range slice when CQL3 is involved:
* For secondary index queries, the number of rows returned doesn't depend at
all on the number of row keys the node holds (and thus it follows that
estimating the number of parallel queries based on that parameter is broken);
it depends on how many columns the row for the most selective index contains.
So for the concurrencyFactor we should 1) figure out which index will be used
and 2) probably use the estimated mean columns count for that index (see the
sketch after this list).
* For CQL3 queries, they use the maxIsColumns parameter (for both traditional
*and* 2ndary index queries), so the number of rows returned shouldn't be
directly compared to maxResults (i.e. if the first row we find has enough
columns to satisfy maxResults, we're done). In that case it unfortunately
becomes more complicated to predict how many "results" a query might yield in
general, because this depends on the column filter. I.e. if the filter is a
name filter, or an "identity" slice filter (as in IdentitySliceFilter), we can
attempt an estimate (in the latter case, something like maxResults /
(estimatedKeys * meanColumnsCountPerKey)), but for other kinds of slice filters
I don't think we can estimate much. That being said, it might still be worth
using the estimate in those two cases, because at least for 2ndary index
queries the column filter will very often be an "identity" slice. And in that
"identity" slice case, for 2ndary index and CQL3, the estimate should be
something like maxResults / (meanColumnCountPerKey(index used) *
meanColumnCountPerKey(parent cf)); both estimates are illustrated in the sketch
below.
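To illustrate both bullets, a rough sketch of what those two estimates could look
like (all names are made up for the example, not actual Cassandra APIs):
{code}
// Illustrative only: none of these names are actual Cassandra APIs.

// Secondary index case: the yield of a range depends on how many entries the
// row of the most selective index holds, not on how many row keys the node has.
static int concurrencyFactorForIndex(long meanColumnsCountOfIndex, int maxResults)
{
    return (int) Math.ceil((double) maxResults / Math.max(1, meanColumnsCountOfIndex));
}

// CQL3 "identity" slice over a 2ndary index: maxResults counts columns, and each
// index entry expands to roughly the mean columns count of the parent CF, hence
// maxResults / (meanColumnCountPerKey(index used) * meanColumnCountPerKey(parent cf)).
static int concurrencyFactorForCql3IndexQuery(long meanColumnsCountOfIndex,
                                              long meanColumnsCountOfParentCf,
                                              int maxResults)
{
    double expectedColumnsPerRange =
        (double) meanColumnsCountOfIndex * meanColumnsCountOfParentCf;
    return (int) Math.ceil(maxResults / Math.max(1.0, expectedColumnsPerRange));
}
{code}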
I'll note however that in both cases the meanColumnsCount is not necessarily the
perfect estimate to use, as it pretty much implies that half the time we will
query one more range than necessary. Instead we could either use the
maxColumnsCount (if we really want to be conservative) or add some fudge factor
to the mean, a fudge factor that could perhaps be based on the difference between
the min, mean and max columns count estimates.
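For instance (again purely illustrative, with an arbitrary spread fraction), such
a fudged estimate could look like:
{code}
// Illustrative sketch of a fudged columns-count estimate; the 0.25 fraction of
// the min-to-max spread is an arbitrary example value.
static double fudgedColumnsCountEstimate(long minColumnsCount, long meanColumnsCount, long maxColumnsCount)
{
    // Pad the mean by a fraction of the observed min-to-max spread, so that we
    // less often end up querying one more range than necessary.
    return meanColumnsCount + 0.25 * (maxColumnsCount - minColumnsCount);
}
{code}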
> parallelize fetching rows for low-cardinality indexes
> -----------------------------------------------------
>
> Key: CASSANDRA-1337
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1337
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Jonathan Ellis
> Assignee: David Alves
> Priority: Minor
> Fix For: 1.2
>
> Attachments:
> 0001-CASSANDRA-1337-scan-concurrently-depending-on-num-rows.txt,
> CASSANDRA-1337.patch
>
> Original Estimate: 8h
> Remaining Estimate: 8h
>
> currently, we read the indexed rows from the first node (in partitioner
> order); if that does not have enough matching rows, we read the rows from the
> next, and so forth.
> we should use the statistics from CASSANDRA-1155 to query multiple nodes in
> parallel, such that we have a high chance of getting enough rows w/o having
> to do another round of queries (but, if our estimate is incorrect, we do need
> to loop and do more rounds until we have enough data or we have fetched from
> each node).