[ https://issues.apache.org/jira/browse/CASSANDRA-18515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Adamson updated CASSANDRA-18515:
-------------------------------------
    Description: 
The range read algorithm relies on the Index API’s notion of estimated result 
rows to decide how many replicas to contact in parallel during its first round 
of requests. The more results expected from a replica for a token range, the 
fewer replicas the range read will initially try to contact. Like SASI, SAI 
pins that estimate to a huge negative number so that it is selected over other 
indexes, and that negative estimate floors the concurrency factor to 1. The 
actual formula looks like this:

{code:java}
// resultsPerRange, from SAI, is a giant negative number
concurrencyFactor = Math.max(1, Math.min(ranges.rangeCount(),
                                         (int) Math.ceil(command.limits().count() / resultsPerRange)));
{code}
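
To make that flooring concrete, here is a minimal, standalone sketch of the 
arithmetic (the class name and literal values are invented for this 
illustration; only the final expression mirrors the formula above):

{code:java}
// Standalone illustration, not Cassandra source. With a huge negative estimate
// (Long.MIN_VALUE, as SAI and SASI report), the division is a tiny negative
// number, Math.ceil rounds it up to -0.0, the int cast yields 0, and
// Math.max(1, ...) floors the concurrency factor to 1.
public class ConcurrencyFactorSketch
{
    public static void main(String[] args)
    {
        int limit = 10;                          // stands in for command.limits().count()
        int rangeCount = 3;                      // stands in for ranges.rangeCount()
        double resultsPerRange = Long.MIN_VALUE; // SAI's "always pick me" estimate

        int concurrencyFactor =
            Math.max(1, Math.min(rangeCount, (int) Math.ceil(limit / resultsPerRange)));

        System.out.println(concurrencyFactor);   // prints 1
    }
}
{code}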

Although that concurrency factor is updated as actual results stream in, SAI 
always starting with a single range request to a single replica is not ideal. 
For example, assume a 3-node cluster and a keyspace at RF=1, with 10 rows 
spread across the 3 nodes and no vnodes. A query that matches all 10 rows with 
a LIMIT of 10 will make 2 or 3 serial range requests from the coordinator, one 
per node, instead of contacting the nodes in parallel.
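
A toy walk-through of that scenario, using an invented row distribution (none 
of this is Cassandra source), shows why the requests end up serial:

{code:java}
// Toy simulation of the example above: 3 ranges (one per node at RF=1 without
// vnodes), 10 matching rows spread across them, LIMIT 10. With an initial
// concurrency factor of 1, the coordinator needs one round per range until the
// limit is satisfied.
public class SerialRoundsSketch
{
    public static void main(String[] args)
    {
        int[] rowsPerRange = { 4, 3, 3 }; // hypothetical distribution of the 10 rows
        int limit = 10;
        int concurrencyFactor = 1;        // floored as shown above

        int fetched = 0;
        int rounds = 0;
        int nextRange = 0;
        while (fetched < limit && nextRange < rowsPerRange.length)
        {
            for (int i = 0; i < concurrencyFactor && nextRange < rowsPerRange.length; i++)
                fetched += rowsPerRange[nextRange++];
            rounds++;
        }
        System.out.println(rounds + " serial rounds to satisfy LIMIT " + limit); // prints 3
    }
}
{code}

With a different distribution (say 5, 5, 0) the limit is met after the second 
round, which is why the count above is 2 or 3.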

This can be fixed by allowing indexes to bypass the initial concurrency 
calculation, letting SAI queries contact the entire ring in a single round of 
requests, or at worst the minimum number of rounds permitted by the existing 
hard cap on the number of ranges queried per round.
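
One possible shape for that change, purely as a hypothetical sketch (the hook 
name, helper method, and surrounding code are invented for illustration and do 
not reflect the actual Cassandra API or the final patch for this ticket):

{code:java}
// Hypothetical sketch only, not the real Index API or the eventual fix.
public class BypassSketch
{
    /** Hypothetical per-index hook consulted by the coordinator. */
    interface IndexHook
    {
        boolean bypassesInitialConcurrency();
    }

    /** Sketch of the coordinator-side decision, assuming a cap on ranges per round. */
    static int initialConcurrency(IndexHook index, int rangeCount, int limit,
                                  double resultsPerRange, int maxRangesPerRound)
    {
        if (index != null && index.bypassesInitialConcurrency())
            return Math.min(rangeCount, maxRangesPerRound); // fan out to the whole ring, capped

        // Existing estimate-based path, as quoted earlier in the description.
        return Math.max(1, Math.min(rangeCount, (int) Math.ceil(limit / resultsPerRange)));
    }

    public static void main(String[] args)
    {
        IndexHook sai = () -> true;
        System.out.println(initialConcurrency(sai, 3, 10, Long.MIN_VALUE, 10)); // prints 3
    }
}
{code}

With such a hook, the 3-node example above would fan out to all 3 ranges in a 
single round.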


> Optimize Initial Concurrency Selection for Range Read Algorithm During SAI 
> Queries
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18515
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18515
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Feature/2i Index
>            Reporter: Mike Adamson
>            Priority: Normal
>


