Caleb Rackliffe created CASSANDRA-17523:
-------------------------------------------

             Summary: Reduce histogram snapshot long[] allocation overhead 
during speculative read and write threshold updates
                 Key: CASSANDRA-17523
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17523
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Caleb Rackliffe
            Assignee: Caleb Rackliffe


With the default {{read_request_timeout}} (or, in the old naming scheme, 
{{read_request_timeout_in_ms}}), a scheduled task runs every 5 seconds and 
updates the speculation thresholds (for reads and writes) for all active 
tables. However, there are a few issues with the way we do this:
 
1.) We create latency histogram snapshots to pass to {{calculateThreshold()}} 
whether or not the {{SpeculativeRetryPolicy}} implementations in use actually 
look at them. We could trivially avoid this by having the method take an 
argument of type {{Sampling}} and build the snapshot only when necessary.
 
2.) The only reason we build the histogram snapshot is to find the new 
threshold value for the percentile-based policies. 
{{EstimatedHistogramReservoirSnapshot}} creates copies of both the decaying and 
non-decaying buckets, but we don’t use the non-decaying values at all for 
percentile calculation. Skipping the creation of the non-decaying values array 
alone would cut these allocations in half.
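
A minimal sketch of the idea behind both points, not the actual Cassandra 
code: the policy receives a supplier instead of a ready-made snapshot, and the 
supplier copies only the decaying buckets. All names here ({{Policy}}, 
{{FIXED}}, {{DATA_DRIVEN}}, {{copiesMadeBy}}) are hypothetical, and the 
data-driven policy just sums the counts as a placeholder for a real 
percentile walk.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLongArray;
import java.util.function.Supplier;

public class LazySnapshotSketch {
    /** Hypothetical stand-in for a speculative retry policy; not Cassandra's interface. */
    interface Policy {
        long calculateThreshold(Supplier<long[]> decayingCopy, long existingThreshold);
    }

    /** A fixed policy never looks at latency data, so it never asks for a copy. */
    static final Policy FIXED = (copy, existing) -> existing;

    /** Placeholder for a percentile-style policy: it does need the bucket counts. */
    static final Policy DATA_DRIVEN = (copy, existing) -> Arrays.stream(copy.get()).sum();

    /** Runs one threshold update and returns how many long[] copies it triggered. */
    static int copiesMadeBy(Policy policy, AtomicLongArray decayingBuckets) {
        AtomicInteger copies = new AtomicInteger();
        // Only the decaying buckets are copied, and only if the policy calls get().
        Supplier<long[]> lazyCopy = () -> {
            copies.incrementAndGet();
            long[] out = new long[decayingBuckets.length()];
            for (int i = 0; i < out.length; i++)
                out[i] = decayingBuckets.get(i);
            return out;
        };
        policy.calculateThreshold(lazyCopy, 0L);
        return copies.get();
    }

    public static void main(String[] args) {
        AtomicLongArray decaying = new AtomicLongArray(new long[] {3, 0, 7});
        System.out.println(copiesMadeBy(FIXED, decaying));       // prints 0
        System.out.println(copiesMadeBy(DATA_DRIVEN, decaying)); // prints 1
    }
}
```

With this shape, tables using a fixed policy would pay no allocation cost on 
each scheduled update, and percentile-based policies would allocate one array 
rather than two.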
 
Given that even our snapshots aren’t perfectly consistent, it might also be 
possible to calculate a percentile value directly from the reservoir’s 
decaying buckets, although that might be less accurate, as new values could be 
added to the buckets after a count is calculated.
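
A rough sketch of what an allocation-free percentile read could look like, 
assuming the decaying buckets are held in an {{AtomicLongArray}} and each 
bucket has a known upper bound. The class, the method, and the 
{{bucketUpperBounds}} parameter are hypothetical. The two passes are not 
atomic with respect to writers, which is the accuracy trade-off described 
above.

```java
import java.util.concurrent.atomic.AtomicLongArray;

public class LivePercentileSketch {
    /**
     * Estimates the p-th percentile (0 < p <= 1) straight from live bucket counts.
     * bucketUpperBounds[i] is the assumed upper bound of the values in bucket i.
     */
    static long percentile(AtomicLongArray counts, long[] bucketUpperBounds, double p) {
        // Pass 1: total number of samples at this moment.
        long total = 0;
        for (int i = 0; i < counts.length(); i++)
            total += counts.get(i);
        if (total == 0)
            return 0;

        // Rank of the sample we are looking for, at least 1.
        long rank = Math.max(1L, (long) Math.ceil(p * total));

        // Pass 2: walk the buckets until the cumulative count reaches the rank.
        // Writers may have changed the counts since pass 1, so the result is approximate.
        long seen = 0;
        for (int i = 0; i < counts.length(); i++) {
            seen += counts.get(i);
            if (seen >= rank)
                return bucketUpperBounds[i];
        }
        // Counts shrank between passes (e.g. a rescale); fall back to the last bucket.
        return bucketUpperBounds[bucketUpperBounds.length - 1];
    }

    public static void main(String[] args) {
        AtomicLongArray counts = new AtomicLongArray(new long[] {1, 2, 3, 4});
        long[] bounds = {10, 20, 30, 40};
        System.out.println(percentile(counts, bounds, 0.5));  // prints 30
        System.out.println(percentile(counts, bounds, 0.99)); // prints 40
    }
}
```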



