[
https://issues.apache.org/jira/browse/CASSANDRA-11751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280143#comment-15280143
]
Jeff Griffith commented on CASSANDRA-11751:
-------------------------------------------
Thanks [~tjake]. Sorry for the duplicate.
> Histogram overflow in metrics
> -----------------------------
>
> Key: CASSANDRA-11751
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11751
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: Cassandra 2.2.6 on Linux
> Reporter: Jeff Griffith
>
> One particular histogram in the Cassandra metrics seems to overflow,
> preventing the calculation of the mean on the Dropwizard "Snapshot". Here is
> the exception thrown when the metrics reporter runs:
> {code}
> java.lang.IllegalStateException: Unable to compute ceiling for max when histogram overflowed
>     at org.apache.cassandra.utils.EstimatedHistogram.rawMean(EstimatedHistogram.java:232) ~[apache-cassandra-2.2.6.jar:2.2.6-SNAPSHOT]
>     at org.apache.cassandra.metrics.EstimatedHistogramReservoir$HistogramSnapshot.getMean(EstimatedHistogramReservoir.java:103) ~[apache-cassandra-2.2.6.jar:2.2.6-SNAPSHOT]
>     at com.addthis.metrics3.reporter.config.SplunkReporter.reportHistogram(SplunkReporter.java:155) ~[reporter-config3-3.0.0.jar:3.0.0]
>     at com.addthis.metrics3.reporter.config.SplunkReporter.report(SplunkReporter.java:101) ~[reporter-config3-3.0.0.jar:3.0.0]
>     at com.codahale.metrics.ScheduledReporter.report(ScheduledReporter.java:162) ~[metrics-core-3.1.0.jar:3.1.0]
>     at com.codahale.metrics.ScheduledReporter$1.run(ScheduledReporter.java:117) ~[metrics-core-3.1.0.jar:3.1.0]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_72]
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_72]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_72]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_72]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_72]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_72]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
> {code}
> On deeper analysis, it seems like this is happening specifically on this
> metric:
> {code}
> ColUpdateTimeDeltaHistogram
> {code}
> I think this is where it is updated, in ColumnFamilyStore.java:
> {code}
> public void apply(DecoratedKey key, ColumnFamily columnFamily, SecondaryIndexManager.Updater indexer, OpOrder.Group opGroup, ReplayPosition replayPosition)
> {
>     long start = System.nanoTime();
>     Memtable mt = data.getMemtableFor(opGroup, replayPosition);
>     final long timeDelta = mt.put(key, columnFamily, indexer, opGroup);
>     maybeUpdateRowCache(key);
>     metric.samplers.get(Sampler.WRITES).addSample(key.getKey(), key.hashCode(), 1);
>     metric.writeLatency.addNano(System.nanoTime() - start);
>     if (timeDelta < Long.MAX_VALUE)
>         metric.colUpdateTimeDeltaHistogram.update(timeDelta);
> }
> {code}
> Considering it's calculating a mean, I don't know whether a large sum
> might be overflowing. But that "if (timeDelta < Long.MAX_VALUE)" check looks
> suspect, doesn't it?
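
To make the question above concrete, here is a minimal sketch of how a fixed-bucket histogram can become unable to report a mean. It is a simplified illustration and not Cassandra's EstimatedHistogram: the class name BucketedHistogram, its bucket bounds, and the sample values are all hypothetical. It assumes that values larger than the last bucket bound are counted in a separate overflow bucket, and that Long.MAX_VALUE serves as a "no delta" sentinel, so a "< Long.MAX_VALUE" guard filters out only that exact value.

```java
import java.util.Arrays;

// Simplified illustration only; this is NOT Cassandra's EstimatedHistogram.
final class BucketedHistogram {
    private final long[] offsets; // inclusive upper bound of each regular bucket (ascending)
    private final long[] counts;  // counts[offsets.length] is the overflow bucket

    BucketedHistogram(long... offsets) {
        this.offsets = offsets.clone();
        this.counts = new long[offsets.length + 1];
    }

    void update(long value) {
        int i = Arrays.binarySearch(offsets, value);
        if (i < 0) {
            i = -i - 1; // insertion point: index of the first bound >= value
        }
        counts[i]++;    // i == offsets.length means the value exceeded every bound
    }

    boolean isOverflowed() {
        return counts[offsets.length] > 0;
    }

    double mean() {
        // The overflow bucket has no upper bound, so no estimate is possible.
        if (isOverflowed()) {
            throw new IllegalStateException("overflow bucket is non-empty; cannot estimate mean");
        }
        long n = 0;
        double sum = 0;
        for (int i = 0; i < offsets.length; i++) {
            n += counts[i];
            sum += (double) counts[i] * offsets[i]; // estimate each sample by its bucket's bound
        }
        return n == 0 ? 0.0 : sum / n;
    }

    public static void main(String[] args) {
        BucketedHistogram h = new BucketedHistogram(10, 100, 1000);
        h.update(5);
        h.update(50);
        System.out.println("mean before overflow: " + h.mean());

        long timeDelta = Long.MAX_VALUE - 1; // hypothetical huge delta, not the sentinel
        if (timeDelta < Long.MAX_VALUE) {    // the guard lets it through
            h.update(timeDelta);             // lands in the overflow bucket
        }
        try {
            h.mean();
        } catch (IllegalStateException e) {
            System.out.println("mean failed: " + e.getMessage());
        }
    }
}
```

Under those assumptions, a single out-of-range sample is enough to make every later mean() call fail, with no sum overflowing. Whether that is what happens in 2.2.6 would need to be confirmed against the actual EstimatedHistogram source.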