[jira] [Updated] (CASSANDRA-17523) Reduce histogram snapshot long[] allocation overhead during speculative read and write threshold updates

Caleb Rackliffe (Jira) Tue, 05 Apr 2022 13:29:05 -0700


     [ 
https://issues.apache.org/jira/browse/CASSANDRA-17523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Caleb Rackliffe updated CASSANDRA-17523:
----------------------------------------
    Test and Documentation Plan: added to existing 
{{DecayingEstimatedHistogramReservoirTest}} suite to cover new snapshot type
                         Status: Patch Available  (was: Open)

Pushed a patch that addressed the two points in the description. I didn't 
attempt to avoid array copying altogether, but those allocations should now be 
cut in half for threshold updates without degrading the accuracy of the 
snapshot (and cut to zero when we aren't even using the percentile or hybrid 
policies).
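
For readers following along, the "cut to zero" case can be sketched roughly as below. This is only an illustration under assumed names ({{ThresholdPolicy}}, {{FixedPolicy}}, {{LastBucketPolicy}} and {{CountingReservoir}} are all hypothetical and are not the classes in the patch): the policy is handed a supplier rather than a finished snapshot, so the bucket array is copied only if the policy actually asks for it.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical policy interface: it receives a way to build a snapshot,
// not an already-built snapshot.
interface ThresholdPolicy {
    long calculateThreshold(Supplier<long[]> snapshot, long existingThreshold);
}

// A fixed policy never looks at latency data, so it never triggers a copy.
final class FixedPolicy implements ThresholdPolicy {
    private final long threshold;

    FixedPolicy(long threshold) { this.threshold = threshold; }

    @Override
    public long calculateThreshold(Supplier<long[]> snapshot, long existingThreshold) {
        return threshold;
    }
}

// Placeholder for a percentile-style policy: it needs the bucket counts,
// so this is the only place the supplier is invoked. The "index of the last
// non-empty bucket" result is a stand-in, not a real percentile.
final class LastBucketPolicy implements ThresholdPolicy {
    @Override
    public long calculateThreshold(Supplier<long[]> snapshot, long existingThreshold) {
        long[] counts = snapshot.get();
        for (int i = counts.length - 1; i >= 0; i--)
            if (counts[i] > 0) return i;
        return existingThreshold; // no samples recorded: keep the old value
    }
}

// Hypothetical reservoir that counts how many array copies were made.
final class CountingReservoir {
    private final long[] decaying;
    final AtomicInteger copies = new AtomicInteger();

    CountingReservoir(long[] decaying) { this.decaying = decaying.clone(); }

    long[] snapshotDecaying() {
        copies.incrementAndGet();
        return decaying.clone();
    }
}

final class LazySnapshotDemo {
    public static void main(String[] args) {
        CountingReservoir reservoir = new CountingReservoir(new long[] {1, 2, 3});
        new FixedPolicy(50L).calculateThreshold(reservoir::snapshotDecaying, 10L);
        System.out.println("copies after fixed policy: " + reservoir.copies.get());
        new LastBucketPolicy().calculateThreshold(reservoir::snapshotDecaying, 10L);
        System.out.println("copies after snapshot-using policy: " + reservoir.copies.get());
    }
}
```

A supplier is just one way to defer the work; passing the sampling object itself and letting the policy decide whether to snapshot it would have the same effect.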

|trunk|
|[patch|https://github.com/apache/cassandra/pull/1551]|
|[CircleCI|https://app.circleci.com/pipelines/github/maedhroz/cassandra?branch=CASSANDRA-17523-trunk&filter=all]|

> Reduce histogram snapshot long[] allocation overhead during speculative read 
> and write threshold updates
> --------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17523
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17523
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Consistency/Coordination, Observability/Metrics
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 4.x
>
>
> Every 5 seconds with the default {{read_request_timeout}} (or in the old 
> naming scheme {{read_request_timeout_in_ms}}), a scheduled task updates 
> the speculation thresholds (for reads and writes) for all active tables. 
> However, there are a few issues with the way we do this:
>  
> 1.) Whether or not the {{SpeculativeRetryPolicy}} implementations in use 
> actually look at them, we create latency histogram snapshots to pass to 
> {{calculateThreshold()}}. We could trivially avoid this by having the 
> method take an argument of type {{Sampling}} and build the snapshot only when 
> necessary.
>  
> 2.) The only reason we build the histogram snapshot is to find the new 
> threshold value for the percentile-based policies. 
> {{EstimatedHistogramReservoirSnapshot}} creates copies of both the decaying 
> and non-decaying buckets, but we don’t use the non-decaying values at all for 
> percentile calculation. Just avoiding the non-decaying values array creation 
> would cut allocations in half.
>  
> Given that even our snapshots aren’t perfectly consistent, it might also be 
> possible to calculate a percentile value directly from the reservoir’s 
> decaying buckets, although that might be less accurate, as new values could 
> be added to the buckets after a count is calculated.
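
As a rough illustration of point 2 in the description, a snapshot that copies only the decaying counts could look something like the sketch below. Everything here is an assumption made for the example: the class name {{DecayingOnlySnapshot}}, the use of an {{AtomicLongArray}} for the live buckets, and the convention that each offset is the upper bound of its bucket. It is not the snapshot type added by the patch.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical snapshot that copies only the decaying bucket counts.
// The bucket offsets are treated as immutable and shared, not copied.
final class DecayingOnlySnapshot {
    private final long[] offsets; // assumed: upper bound of each bucket
    private final long[] counts;  // point-in-time copy of the decaying counts

    DecayingOnlySnapshot(long[] offsets, AtomicLongArray liveDecaying) {
        if (offsets.length != liveDecaying.length())
            throw new IllegalArgumentException("offsets and buckets must have the same length");
        this.offsets = offsets;
        this.counts = new long[liveDecaying.length()]; // the single allocation
        for (int i = 0; i < counts.length; i++)
            counts[i] = liveDecaying.get(i);
    }

    // Returns the offset of the first bucket at which the cumulative count
    // reaches ceil(total * quantile), or 0 if nothing has been recorded.
    long valueAt(double quantile) {
        if (quantile < 0.0 || quantile > 1.0)
            throw new IllegalArgumentException("quantile must be in [0, 1]");
        long total = 0;
        for (long c : counts) total += c;
        long target = (long) Math.ceil(total * quantile);
        if (target == 0) return 0;
        long seen = 0;
        for (int i = 0; i < counts.length; i++) {
            seen += counts[i];
            if (seen >= target) return offsets[i];
        }
        return offsets[offsets.length - 1]; // not reached when target <= total
    }
}

final class DecayingOnlySnapshotDemo {
    public static void main(String[] args) {
        long[] offsets = {10, 20, 30, 40};
        AtomicLongArray live = new AtomicLongArray(new long[] {4, 2, 1, 1});
        DecayingOnlySnapshot snapshot = new DecayingOnlySnapshot(offsets, live);
        System.out.println("p50 = " + snapshot.valueAt(0.5));
        System.out.println("p75 = " + snapshot.valueAt(0.75));
    }
}
```

Because the counts are copied before the total is computed, writes that land after the copy cannot skew the result, which is the accuracy concern raised for reading the live buckets directly.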



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
