[
https://issues.apache.org/jira/browse/HADOOP-14972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran resolved HADOOP-14972.
-------------------------------------
Resolution: Won't Fix
> S3A add histogram metrics types for latency, etc.
> -------------------------------------------------
>
> Key: HADOOP-14972
> URL: https://issues.apache.org/jira/browse/HADOOP-14972
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 2.9.0, 3.0.0
> Reporter: Sean Mackrory
> Assignee: Sean Mackrory
> Priority: Major
>
> We'd like metrics that track latencies for various operations, such as the
> different request types. This may need to be done differently from the
> current metrics types, which are just counters of type long, and it needs
> to be done intelligently: these measurements are very numerous, and they are
> primarily interesting for their outliers, which are unpredictably far from
> normal. A few ideas on how we might implement something like this:
> * An adaptive, sparse histogram type. I envision something configurable with
> a maximum granularity and a maximum number of bins. Initially, datapoints
> are tallied in bins with the maximum granularity. As we reach the maximum
> number of bins, bins are merged in even / odd pairs. There's some complexity
> here, especially to make it perform well and allow safe concurrency, but I
> like the ability to configure reasonable limits and retain as much
> granularity as possible without knowing the exact shape of the data
> beforehand.
> * LongMetrics named "read_latency_600ms", "read_latency_800ms" to represent
> bins. This was suggested to me by [~fabbri]. I initially did not like the
> idea of having so many hard-coded bins for however many op types, but
> this could also be done dynamically (we just hard-code which measurements we
> take, and with what granularity to group them, e.g. read_latency, 200 ms).
> The resulting dataset could be sparse and dynamic to allow for extreme
> outliers, but the granularity is still pre-determined.
> * We could also simply track a certain number of the highest latencies, and
> basic descriptive statistics like a running average, min / max, etc.
> Inherently more limited in what it can show us, but much simpler and might
> still provide some insight when analyzing performance.
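
The first idea in the quoted description (start at the finest granularity, then merge even / odd bin pairs once the bin limit is reached) can be sketched roughly as below. This is only an illustration under stated assumptions: the class name AdaptiveHistogram and all of its methods are hypothetical and are not part of Hadoop or S3A, values are assumed to be non-negative latencies, and plain synchronized methods stand in for whatever concurrency scheme a real implementation would need.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of an adaptive, sparse histogram; not a Hadoop API. */
class AdaptiveHistogram {
  private final int maxBins;
  // Current bin width; starts at the finest granularity, doubles on each merge.
  private long binWidth;
  // Sparse tallies: bin index (value / binWidth) -> number of datapoints.
  private TreeMap<Long, Long> counts = new TreeMap<>();

  AdaptiveHistogram(long initialBinWidth, int maxBins) {
    if (initialBinWidth <= 0 || maxBins < 1) {
      throw new IllegalArgumentException("bin width and bin limit must be positive");
    }
    this.binWidth = initialBinWidth;
    this.maxBins = maxBins;
  }

  /** Tally one datapoint, coarsening until the bin limit is respected. */
  synchronized void record(long value) {
    if (value < 0) {
      throw new IllegalArgumentException("this sketch assumes non-negative values");
    }
    counts.merge(value / binWidth, 1L, Long::sum);
    while (counts.size() > maxBins) {
      coarsen();
    }
  }

  /** Merge bins 2k and 2k+1 into bin k, doubling the bin width. */
  private void coarsen() {
    // multiplyExact throws rather than silently overflowing in extreme cases.
    long doubled = Math.multiplyExact(binWidth, 2L);
    TreeMap<Long, Long> merged = new TreeMap<>();
    for (Map.Entry<Long, Long> e : counts.entrySet()) {
      merged.merge(e.getKey() / 2, e.getValue(), Long::sum);
    }
    counts = merged;
    binWidth = doubled;
  }

  synchronized long binWidth() {
    return binWidth;
  }

  /** Copy of the tallies, keyed by the lower bound of each bin. */
  synchronized TreeMap<Long, Long> snapshot() {
    TreeMap<Long, Long> out = new TreeMap<>();
    for (Map.Entry<Long, Long> e : counts.entrySet()) {
      out.put(e.getKey() * binWidth, e.getValue());
    }
    return out;
  }
}
```

For example, with an initial width of 1 ms and at most 4 bins, recording 0, 1, 2, 3 and then 4 triggers one merge: the width becomes 2 ms and the bins starting at 0, 2 and 4 hold 2, 2 and 1 datapoints. A later outlier at 1000 ms lands in its own sparse bin without forcing another merge. The trade-off is the one the description notes: the limits are configurable, but resolution is lost everywhere when a merge happens, not just near the outliers.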
--
This message was sent by Atlassian Jira
(v8.3.4#803005)