[ 
https://issues.apache.org/jira/browse/CASSANDRA-14281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Burman updated CASSANDRA-14281:
---------------------------------------
    Description: Currently, the DecayingEstimatedHistogramReservoir acquires a 
lock for each update operation, which causes contention when more than one 
thread updates the histogram. This hurts scalability on larger machines. We 
should make it as lock-free as possible and also prevent a single CAS update 
from blocking all other concurrent threads from making an update.  (was: 
Currently, for each write/read/range query/CAS touching the CFS we write a 
latency metric, which takes a lot of processing time (up to 66% of the total 
processing time if the update was empty).

Latencies are recorded using both a dropwizard "Timer" and a "Counter". The 
latter is used for totalLatency and the former is a decaying metric for rates 
and certain percentile metrics. We then replicate all of these CFS writes to 
the KeyspaceMetrics and globalWriteLatencies.

For example, for each CFS write we first write to the CFS's metrics, then to 
the Keyspace's metrics, and finally to globalMetrics. A Timer maintains a 
Histogram and a Meter and updates both whenever the Timer is updated. The 
Meter then updates 4 different values (1 minute rate, 5 minute rate, 15 minute 
rate and a counter).

So for each CFS write we actually do 15 different counter updates, and of 
course we also maintain their state while writing. Combined, these operations 
are very slow.
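
As a rough illustration of where a figure like 15 can come from, the toy model 
below tallies primitive updates along the CFS -> Keyspace -> global chain. It 
assumes each Timer update costs 1 histogram update plus the 4 Meter values 
listed above; that breakdown is an assumption made for this sketch, and 
ToyTimer is a hypothetical class, not dropwizard's Timer.

```java
// Hypothetical toy model: counts primitive updates instead of recording latencies.
final class ToyTimer {
    private final ToyTimer parent;  // null at the top of the chain (global metrics)
    private final long[] tally;     // shared single-element counter of primitive updates

    ToyTimer(ToyTimer parent, long[] tally) {
        this.parent = parent;
        this.tally = tally;
    }

    void update(long latencyNanos) {
        tally[0] += 1;  // assumed: one update for the Histogram
        tally[0] += 4;  // Meter: 1, 5 and 15 minute rates plus a counter
        if (parent != null) {
            parent.update(latencyNanos);  // replicate the write to the parent metric
        }
    }
}
```

With three levels chained together, a single update on the CFS-level timer 
adds 3 x 5 = 15 to the tally.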

A small JMH benchmark doing an update against a single LatencyMetrics with 4 
threads gives us around 5.2M updates / second. With the current writeLatency 
metric (having 2 parents) we get only 1.6M updates / second. 

I'm proposing to replace this with a small circular-buffer HdrHistogram 
implementation. We would maintain a rolling buffer holding the last 15 minutes 
of histograms (30 seconds per histogram) and update the correct bucket each 
time. When metrics are requested, we would merge the requested number of 
buckets into a new histogram and read the results from it. This moves some of 
the load from writing the metrics to reading them (a much less frequent 
operation), including the parent metrics. It also allows us to keep the 
current metrics structure, if we wish to do so.
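
The rolling-buffer idea can be sketched roughly as follows. This is not the 
prototype's code: plain long[] bucket arrays stand in for HdrHistogram 
instances, the class and method names are hypothetical, the clock is passed in 
explicitly, and thread safety is deliberately left out to keep the structure 
visible.

```java
import java.util.Arrays;

// Hypothetical sketch: a ring of per-window histograms, merged only on read.
final class RollingHistogram {
    private final int windows;          // e.g. 30 windows x 30 s = 15 minutes
    private final long windowMillis;
    private final long[][] counts;      // [slot][bucket], stand-in for HdrHistogram
    private final long[] epochs;        // which time window each slot currently holds

    RollingHistogram(int windows, long windowMillis, int buckets) {
        this.windows = windows;
        this.windowMillis = windowMillis;
        this.counts = new long[windows][buckets];
        this.epochs = new long[windows];
        Arrays.fill(epochs, -1L);       // -1 marks a slot that was never written
    }

    // Write path: touch exactly one slot, recycling it if it holds an expired window.
    void record(int bucket, long nowMillis) {
        long epoch = nowMillis / windowMillis;
        int slot = (int) (epoch % windows);
        if (epochs[slot] != epoch) {
            Arrays.fill(counts[slot], 0L);
            epochs[slot] = epoch;
        }
        counts[slot][bucket]++;
    }

    // Read path: merge the most recent lastN windows (lastN <= windows) into a new array.
    long[] merge(int lastN, long nowMillis) {
        long current = nowMillis / windowMillis;
        long[] merged = new long[counts[0].length];
        for (int s = 0; s < windows; s++) {
            long age = current - epochs[s];
            if (epochs[s] >= 0 && age >= 0 && age < lastN) {
                for (int b = 0; b < merged.length; b++) {
                    merged[b] += counts[s][b];
                }
            }
        }
        return merged;
    }
}
```

For example, new RollingHistogram(30, 30_000, buckets) keeps 15 minutes of 
data, and merge(2, now) would cover roughly the last minute. A real 
implementation would also need to make slot recycling safe under concurrent 
writers, which this sketch does not attempt.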

My prototype with this approach improves the performance to around 13.8M 
updates/second, almost 9 times faster than the current approach. HdrHistogram 
is also already in Cassandra's lib, so there are no new dependencies to add 
(java-driver also uses it). 

FUTURE:

This opens up some possibilities, such as replacing all dropwizard 
Histograms/Meters with the new approach (to reduce overhead elsewhere in the 
codebase). It would also allow us to serve downloadable histograms directly 
from Cassandra, or to store them to disk each time a bucket is filled, if the 
user wishes to monitor latency history or graph all percentiles. 

HdrHistogram also provides the ability to "fix" these histograms with pause 
tracking, for pauses such as GC pauses, which we currently can't do (as 
dropwizard histograms can't be merged).)
        Summary: Reduce contention on DecayingEstimatedHistogramReservoir  
(was: LatencyMetrics performance)

> Reduce contention on DecayingEstimatedHistogramReservoir
> --------------------------------------------------------
>
>                 Key: CASSANDRA-14281
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14281
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Michael Burman
>            Assignee: Michael Burman
>            Priority: Major
>
> Currently, the DecayingEstimatedHistogramReservoir acquires a lock for each 
> update operation, which causes contention when more than one thread updates 
> the histogram. This hurts scalability on larger machines. We should make it 
> as lock-free as possible and also prevent a single CAS update from blocking 
> all other concurrent threads from making an update.
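
To make the lock-free direction concrete, here is a minimal sketch of a bucket 
update that takes no lock. It is not the DecayingEstimatedHistogramReservoir 
code and it ignores decay entirely; LockFreeBuckets and its bucket layout are 
hypothetical, chosen only to show per-bucket atomic increments replacing a 
reservoir-wide lock.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical sketch: fixed buckets, no decay, no lock on the update path.
final class LockFreeBuckets {
    private final long[] upperBounds;      // sorted, inclusive upper bound per bucket
    private final AtomicLongArray counts;  // one extra slot for overflow values

    LockFreeBuckets(long[] sortedUpperBounds) {
        this.upperBounds = sortedUpperBounds.clone();
        this.counts = new AtomicLongArray(upperBounds.length + 1);
    }

    // Writers only touch the bucket their value falls into.
    void update(long value) {
        int idx = Arrays.binarySearch(upperBounds, value);
        if (idx < 0) {
            idx = -idx - 1;                // insertion point: first bound above value
        }
        counts.incrementAndGet(idx);       // atomic increment of a single bucket
    }

    long count(int bucket) {
        return counts.get(bucket);
    }

    // Not an atomic snapshot: concurrent updates may land while summing.
    long total() {
        long sum = 0;
        for (int i = 0; i < counts.length(); i++) {
            sum += counts.get(i);
        }
        return sum;
    }
}
```

Threads updating different buckets do not wait on each other here, and threads 
hitting the same bucket contend only on that one counter. Handling decay and 
rescaling without reintroducing a global lock is the harder part and is not 
covered by this sketch.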



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

