[ https://issues.apache.org/jira/browse/CASSANDRA-20333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17927652#comment-17927652 ]

Benedict Elliott Smith commented on CASSANDRA-20333:
----------------------------------------------------

It looks like CASSANDRA-19365 exacerbated other race conditions by overwriting 
newer updates with the snapshot values as the rebase progresses, so we lose any 
newer data.

We can address this properly if we like, but it isn't _trivial_. My preference 
would be to switch to some form of backwards decay rather than forward decay, 
since this means the updater does not need to synchronise with the rebaser. 
This might look like: pick some large constant we add for regular updates; when 
rebasing, adjust the current value down by some decay factor; and on read, 
divide by the constant we use for regular updates. The rebaser would then just 
record the positions it has started and finished rebasing, and when querying 
the collection we would detect that we are racing with the rebase of a bucket 
we're reading and simply wait for it to complete.
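As a minimal sketch of what this backwards-decay scheme could look like (the class name, constants, and the delta-based rescale are all assumptions for illustration, not the existing reservoir code):

```java
import java.util.concurrent.atomic.AtomicLongArray;

/**
 * Hypothetical sketch of "backwards decay": updates add a fixed weight,
 * the rebaser periodically scales every bucket down, and reads divide the
 * stored value back out. Updaters never synchronise with the rebaser,
 * because the per-update weight is a constant.
 */
final class BackwardsDecayReservoir
{
    private static final long UPDATE_WEIGHT = 1L << 20; // large constant added per update
    private static final double DECAY = 0.5;            // per-rebase decay factor (assumed value)

    private final AtomicLongArray buckets;
    // positions the rebaser has started/finished, so a reader can detect it is
    // racing with the rebase of the bucket it is reading and wait
    private volatile int rebaseStarted = Integer.MAX_VALUE;
    private volatile int rebaseFinished = Integer.MAX_VALUE;

    BackwardsDecayReservoir(int nBuckets)
    {
        buckets = new AtomicLongArray(nBuckets);
    }

    void update(int bucket)
    {
        buckets.addAndGet(bucket, UPDATE_WEIGHT); // hot path: one atomic add, no rescale
    }

    void rebase()
    {
        for (int i = 0; i < buckets.length(); i++)
        {
            rebaseStarted = i;
            long cur = buckets.get(i);
            // apply the decay as a delta, so a concurrent update between the
            // get and the add survives (undecayed) instead of being overwritten
            buckets.addAndGet(i, (long) (cur * DECAY) - cur);
            rebaseFinished = i;
        }
    }

    double read(int bucket)
    {
        // a real implementation would wait here while
        // rebaseStarted >= bucket && rebaseFinished < bucket
        return buckets.get(bucket) / (double) UPDATE_WEIGHT;
    }
}
```

Because the rescale is applied as a delta, an update landing concurrently simply survives undecayed, which is the property we want: the rebaser can never overwrite newer data with a snapshot value.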

If we want to retain forward decay, we would probably want to detect that a 
rebase is in progress and take over migrating the bucket we're updating. That 
is, we would have e.g. a bit set backed by an AtomicLongArray recording whether 
the rebase has been started or finished for the bucket in question, and if it 
has not we would take ownership of applying the rebase ourselves before 
updating the bucket.
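A rough sketch of such a bit set (hypothetical names; the real change would live inside DecayingEstimatedHistogramReservoir, and only the claim mechanics are shown here):

```java
import java.util.concurrent.atomic.AtomicLongArray;

/**
 * Hypothetical per-bucket rebase bit set, packed into an AtomicLongArray.
 * Whoever sets a bucket's bit first -- the rebaser or a racing updater --
 * owns applying the rescale for that bucket, so an update can never be
 * clobbered by a stale snapshot value written later.
 */
final class RebaseBitSet
{
    private final AtomicLongArray words; // 64 bucket bits per long

    RebaseBitSet(int nBuckets)
    {
        words = new AtomicLongArray((nBuckets + 63) / 64);
    }

    /** @return true if the caller won ownership of rebasing this bucket */
    boolean tryClaim(int bucket)
    {
        int word = bucket >>> 6;
        long mask = 1L << (bucket & 63);
        while (true)
        {
            long cur = words.get(word);
            if ((cur & mask) != 0)
                return false; // someone else already owns this bucket's rebase
            if (words.compareAndSet(word, cur, cur | mask))
                return true;  // we won the CAS, we apply the rescale
        }
    }
}
```

During a rebase, both the rebaser and any racing updater would call tryClaim(bucket); whichever wins applies the rescale before the updater adds its own value.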

> Reduce DecayingEstimatedHistogramReservoir update cost
> ------------------------------------------------------
>
>                 Key: CASSANDRA-20333
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20333
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Observability/Metrics
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>
> Based on the discussions in CASSANDRA-20250
> [~benedict]:
> {quote}We can probably improve our reservoir performance if we want to, 
> perhaps in a follow-up patch? For instance, we could have a small 
> thread-local buffer of (time, latency) pairs that we periodically flush 
> together, so that we amortise the memory latency costs. Or we could explore 
> maintaining a per-thread HdrHistogram, that we periodically flush. This would 
> be a good time to explore fully migrating to HdrHistogram, as it has built-in 
> merge semantics iirc. I am not sure what the decayed version would look like 
> there, but I am certain we could maintain a separate decayed HdrHistogram.
> Having a thread-local buffer of updates we intend to flush to the histograms 
> would amortise the latency penalties without fundamentally redesigning 
> anything (as well as reducing contention).
> Other possibilities might include e.g. changing the bucket distribution so we 
> don't need a LUT for computing lg2, although the above would gracefully 
> handle any contribution this has as well.
> {quote}
>  
> Other ideas about squeezing extra bits from the current design:
>  * the bucket id can be calculated once (currently we compute it twice, for 
> the decaying and the current buckets), like:
> {code:java}
> int stripe = (int) (Thread.currentThread().getId() & (nStripes - 1));
> int bucket = stripedIndex(index, stripe);
> rescaledDecayingBuckets.update(bucket, now);
> updateBucket(buckets, bucket, 1); {code}
>  * for histograms on highly loaded paths we can use a different number of 
> stripes (the default is 2; we could set, for example, 4 for them)
>  * I noticed some variation in performance for a micro-benchmark (the 
> existing DecayingEstimatedHistogramBench) depending on the exact value of 
> distributionPrime used (but I need to double-check it)
>  * the forwardDecayWeight function depends on the SampledClock value, so we 
> can try to recalculate the weight only when the time changes


