[
https://issues.apache.org/jira/browse/CASSANDRA-18515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17740679#comment-17740679
]
Andres de la Peña commented on CASSANDRA-18515:
-----------------------------------------------
I think there isn't a run for j17, and the run for j11 doesn't include repeated
runs of the new {{ConcurrencyFactorTest}}. I'm starting new runs for both
things:
||PR||CI||
|[trunk|https://github.com/apache/cassandra/pull/2463]|[j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/3008/workflows/737510be-0260-4e6b-9b31-335e4631099f] [j17|https://app.circleci.com/pipelines/github/adelapena/cassandra/3008/workflows/32b31597-53e2-4551-8faa-7cd1809d79bf]|
+1 assuming those runs don't find any new failures.
[~maedhroz] are you going to review this one?
> Optimize Initial Concurrency Selection for Range Read Algorithm During SAI
> Queries
> ----------------------------------------------------------------------------------
>
> Key: CASSANDRA-18515
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18515
> Project: Cassandra
> Issue Type: Improvement
> Components: Feature/2i Index
> Reporter: Mike Adamson
> Assignee: Mike Adamson
> Priority: Normal
> Time Spent: 3h 50m
> Remaining Estimate: 0h
>
> The range read algorithm relies on the Index API’s notion of estimated result
> rows to decide how many replicas to contact in parallel during its first
> round of requests. The more results expected from a replica for a token
> range, the fewer replicas the range read will initially try to contact. Like
> SASI, SAI sets that estimate to a huge negative number to make sure it’s
> selected over other indexes, and this clamps the concurrency factor to 1. The
> actual formula looks like this:
> {code:java}
> // resultsPerRange, from SAI, is a giant negative number
> concurrencyFactor = Math.max(1, Math.min(ranges.rangeCount(),
>     (int) Math.ceil(command.limits().count() / resultsPerRange)));
> {code}
> Although that concurrency factor is updated as actual results stream in, only
> sending a single range request to a single replica in every case for SAI is
> not ideal. For example, assume I have a 3 node cluster and a keyspace at
> RF=1, with 10 rows spread across the 3 nodes, without vnodes. Issuing a query
> that matches all 10 rows with a LIMIT of 10 will make 2 or 3 serial range
> requests from the coordinator, one to each of the 3 nodes.
> This can be fixed by allowing indexes to bypass the initial concurrency
> calculation, which lets SAI queries contact the entire ring in a single round
> of queries or, at worst, in the minimum number of rounds allowed by the
> existing statutory maximum ranges per round.
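
For anyone following along, here is a minimal standalone sketch of why the quoted formula always yields an initial concurrency factor of 1 for SAI. It is not the Cassandra code: the class name, the method name, the float type of resultsPerRange and the value -1.0e9f are assumptions made for illustration, standing in for ranges.rangeCount(), command.limits().count() and the negative estimate described above.

```java
class ConcurrencyFactorSketch {
    // Same shape as the formula quoted in the description; parameter names
    // and types are illustrative assumptions, not the real Cassandra API.
    static int initialConcurrencyFactor(int rangeCount, int limit, float resultsPerRange) {
        return Math.max(1, Math.min(rangeCount, (int) Math.ceil(limit / resultsPerRange)));
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the "giant negative number" estimate.
        float negativeEstimate = -1.0e9f;

        // limit / estimate is a tiny negative value, ceil gives -0.0, the cast gives 0,
        // min(rangeCount, 0) is 0, and max(1, 0) clamps the result to 1.
        System.out.println(initialConcurrencyFactor(3, 10, negativeEstimate)); // prints 1

        // With a positive estimate of 2 rows per range, ceil(10 / 2) = 5 is
        // capped by the 3 available ranges.
        System.out.println(initialConcurrencyFactor(3, 10, 2.0f)); // prints 3

        // With 5 rows per range, two ranges are expected to satisfy LIMIT 10.
        System.out.println(initialConcurrencyFactor(3, 10, 5.0f)); // prints 2
    }
}
```

In the 3 node example from the description, the first call corresponds to the SAI case: the coordinator starts with a single range request even though all three ranges are needed to satisfy the limit.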
--
This message was sent by Atlassian Jira
(v8.20.10#820010)