[jira] [Commented] (CASSANDRA-20333) Reduce DecayingEstimatedHistogramReservoir update cost

Dmitry Konstantinov (Jira) Sun, 16 Feb 2025 15:35:36 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-20333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17927578#comment-17927578
 ]


Dmitry Konstantinov commented on CASSANDRA-20333:
-------------------------------------------------

Currently, during a rescale we copy and replace the whole decaying buckets 
array; before CASSANDRA-1936 the logic rescaled each bucket individually. Could 
switching to the single-array approach bring back the race condition?

But we can probably deal with it in a different way, for example by delaying 
snapshotting while a rescale is in progress.
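
To make the "delay snapshotting while a rescale is in progress" idea concrete, 
here is a minimal sketch, not the actual DecayingEstimatedHistogramReservoir 
code. The class name, the version counter and the integer rescale factor are 
all hypothetical; it assumes a single rescaling thread and a single shared 
AtomicLongArray that is rescaled in place. The snapshot waits while the 
counter is odd and retries if a rescale started during the copy:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical illustration only; names do not come from the Cassandra code base.
final class RescaleGuardSketch {
    private final AtomicLongArray decayingBuckets;
    // Even value: buckets are stable. Odd value: a rescale is in progress.
    private final AtomicLong rescaleVersion = new AtomicLong();

    RescaleGuardSketch(int size) {
        decayingBuckets = new AtomicLongArray(size);
    }

    void update(int bucket, long weight) {
        decayingBuckets.addAndGet(bucket, weight);
    }

    // Assumption: only one thread rescales at a time.
    void rescale(long factor) {
        rescaleVersion.incrementAndGet(); // becomes odd: snapshots must wait
        try {
            for (int i = 0; i < decayingBuckets.length(); i++) {
                long old;
                do {
                    old = decayingBuckets.get(i);
                } while (!decayingBuckets.compareAndSet(i, old, old / factor));
            }
        } finally {
            rescaleVersion.incrementAndGet(); // becomes even again
        }
    }

    long[] snapshot() {
        while (true) {
            long before = rescaleVersion.get();
            if ((before & 1L) != 0) {
                Thread.yield(); // rescale in progress: delay the snapshot
                continue;
            }
            long[] copy = new long[decayingBuckets.length()];
            for (int i = 0; i < copy.length; i++)
                copy[i] = decayingBuckets.get(i);
            // Retry if a rescale started (or completed) while we were copying.
            if (rescaleVersion.get() == before)
                return copy;
        }
    }
}
```

This only protects readers from seeing a half-rescaled array. It does not 
address updates that compute their weight against the old landmark while a 
rescale is running, which would need separate handling.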

> Reduce DecayingEstimatedHistogramReservoir update cost
> ------------------------------------------------------
>
>                 Key: CASSANDRA-20333
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20333
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Observability/Metrics
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>
> Based on the discussions in CASSANDRA-20250
> [~benedict]:
> {quote}We can probably improve our reservoir performance if we want to, 
> perhaps in a follow-up patch? For instance, we could have a small 
> thread-local buffer of (time, latency) pairs that we periodically flush 
> together, so that we amortise the memory latency costs. Or we could explore 
> maintaining a per-thread HdrHistogram, that we periodically flush. This would 
> be a good time to explore fully migrating to HdrHistogram, as it has built-in 
> merge semantics iirc. I am not sure what the decayed version would look like 
> there, but I am certain we could maintain a separate decayed HdrHistogram.
> Having a thread-local buffer of updates we intend to flush to the histograms 
> would amortise the latency penalties without fundamentally redesigning 
> anything (as well as reducing contention).
> Other possibilities might include e.g. changing the bucket distribution so we 
> don't need a LUT for computing lg2, although the above would gracefully 
> handle any contribution this has as well.
> {quote}
>  
> Other ideas for squeezing more performance out of the current design:
>  * the bucket id can be calculated once (currently we do it twice, for the 
> decaying and the current buckets), like:
> {code:java}
> int stripe = (int) (Thread.currentThread().getId() & (nStripes - 1));
> int bucket = stripedIndex(index, stripe);
> rescaledDecayingBuckets.update(bucket, now);
> updateBucket(buckets, bucket, 1);
> {code}
>  * for histograms on highly loaded paths we can use a different number of 
> stripes (the default is 2; we could set, for example, 4 for them)
>  * I noticed some variation in performance for a micro-benchmark (existing 
> one: DecayingEstimatedHistogramBench) depending on what exact value for 
> distributionPrime is used (but I need to double check it)
>  * the forwardDecayWeight function depends on the SampledClock value, so we 
> can try to recalculate the weight only when the time changes
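
The last idea in the list above (recalculating the weight only when the 
sampled time changes) could look roughly like the sketch below. It is an 
illustration under stated assumptions, not the reservoir's real code: the 
class, field and method names are hypothetical, and the weight formula 
exp(((now - landmark) / 1000.0) / meanLifetimeSeconds) is assumed rather than 
taken from this thread. The cache is a single immutable (time, weight) pair 
published through a volatile field, so a racing thread at worst recomputes 
the same value:

```java
// Hypothetical illustration only; names do not come from the Cassandra code base.
final class CachedDecayWeightSketch {
    // Immutable pair so that time and weight are always read consistently.
    private static final class Entry {
        final long timeMillis;
        final double weight;

        Entry(long timeMillis, double weight) {
            this.timeMillis = timeMillis;
            this.weight = weight;
        }
    }

    private final long landmarkMillis;
    private final double meanLifetimeSeconds;
    private volatile Entry cached;

    CachedDecayWeightSketch(long landmarkMillis, double meanLifetimeSeconds) {
        this.landmarkMillis = landmarkMillis;
        this.meanLifetimeSeconds = meanLifetimeSeconds;
    }

    double weight(long nowMillis) {
        Entry e = cached;
        if (e != null && e.timeMillis == nowMillis)
            return e.weight; // clock has not ticked: reuse the cached weight
        // Assumed formula for the forward decay weight.
        double w = Math.exp(((nowMillis - landmarkMillis) / 1000.0) / meanLifetimeSeconds);
        cached = new Entry(nowMillis, w); // benign race: any winner holds a valid pair
        return w;
    }
}
```

The saving depends on how coarse the sampled clock is: the cache only helps 
when many updates observe the same clock value, and each tick allocates one 
small Entry object.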



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

