[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics

Steve Loughran (JIRA) Sat, 23 Apr 2016 04:58:30 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15255237#comment-15255237
 ]


Steve Loughran commented on HDFS-10175:
---------------------------------------

Yes, contention is an issue, especially against filesystems which respond 
fast. But contention is not mandatory.

Coda Hale counters use a {{com.codahale.metrics.LongAdder}} class which spreads 
additions across multiple internal cells under load, so threads don't contend 
on a single value. Its javadoc compares it with {{AtomicLong}}:

bq. Under low update contention, the two classes have similar characteristics. 
But under high contention, expected throughput of this class is significantly 
higher, at the expense of higher space consumption.

This class is now built in to Java 8 as 
{{java.util.concurrent.atomic.LongAdder}} alongside 
{{java.util.concurrent.atomic.LongAccumulator}}. 
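
As a rough illustration of the Java 8 class, here is a minimal sketch in which 
several threads bump one shared {{LongAdder}}. The class name 
{{LongAdderDemo}}, the method {{countWith}} and the thread counts are made up 
for this example; none of it comes from the patches on this issue.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical demo class, not part of Hadoop.
class LongAdderDemo {

    /** Runs `threads` workers that each increment one shared LongAdder `perThread` times. */
    static long countWith(int threads, int perThread) {
        LongAdder ops = new LongAdder();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    ops.increment(); // updates are striped across cells under contention
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        // All workers have finished, so sum() sees every increment.
        return ops.sum();
    }

    public static void main(String[] args) {
        System.out.println(countWith(8, 100_000));
    }
}
```

Note that {{sum()}} is only guaranteed to be exact once concurrent updates have 
stopped; while updates are in flight it is a moving snapshot, which is usually 
acceptable for statistics.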

Even with code built against Java 7, whatever is done here should be designed 
so that the switch to Java 8 is seamless and transparent. That is: the 
specific counter implementation should be hidden. I'd almost advocate using the 
Coda Hale one, except that it would add a new dependency everywhere; it's only 
used in a couple of modules right now. And for trunk we may as well switch to 
Java 8.
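
One possible way to hide the counter implementation, sketched under 
assumptions: callers only see a small interface, and a factory picks the 
backing class. The names {{OpCounter}}, {{AtomicOpCounter}}, 
{{AdderOpCounter}} and {{OpCounters}} are hypothetical and are not taken from 
the attached patches or from the existing {{FileSystem.Statistics}} code.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical abstraction: callers depend on this, never on the backing class. */
interface OpCounter {
    void increment();
    void add(long delta);
    long value();
}

/** Implementation that also compiles on Java 7, backed by AtomicLong. */
final class AtomicOpCounter implements OpCounter {
    private final AtomicLong count = new AtomicLong();
    @Override public void increment() { count.incrementAndGet(); }
    @Override public void add(long delta) { count.addAndGet(delta); }
    @Override public long value() { return count.get(); }
}

/** Java 8 implementation backed by LongAdder, cheaper under heavy contention. */
final class AdderOpCounter implements OpCounter {
    private final LongAdder count = new LongAdder();
    @Override public void increment() { count.increment(); }
    @Override public void add(long delta) { count.add(delta); }
    @Override public long value() { return count.sum(); }
}

/** Hypothetical factory: the only place that names a concrete class. */
final class OpCounters {
    private OpCounters() { }

    static OpCounter create() {
        // On a Java 7 branch this line would return new AtomicOpCounter() instead.
        return new AdderOpCounter();
    }
}
```

With this shape, moving a branch from Java 7 to Java 8 would only touch the 
factory; code that records or reads operation counts stays unchanged.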

(life would be so much easier if volatiles implemented atomic add/inc ops the 
way CPUs allow)

> add per-operation stats to FileSystem.Statistics
> ------------------------------------------------
>
>                 Key: HDFS-10175
>                 URL: https://issues.apache.org/jira/browse/HDFS-10175
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Ram Venkatesh
>            Assignee: Mingliang Liu
>         Attachments: HDFS-10175.000.patch, HDFS-10175.001.patch, 
> HDFS-10175.002.patch, HDFS-10175.003.patch, HDFS-10175.004.patch, 
> HDFS-10175.005.patch, HDFS-10175.006.patch, TestStatisticsOverhead.java
>
>
> Currently FileSystem.Statistics exposes the following statistics:
> BytesRead
> BytesWritten
> ReadOps
> LargeReadOps
> WriteOps
> These are in turn exposed as job counters by MapReduce and other frameworks. 
> There is logic within DfsClient to map operations to these counters that can 
> be confusing; for instance, mkdirs counts as a writeOp.
> Proposed enhancement:
> Add a statistic for each DfsClient operation including create, append, 
> createSymlink, delete, exists, mkdirs, rename and expose them as new 
> properties on the Statistics object. The operation-specific counters can be 
> used for analyzing the load imposed by a particular job on HDFS. 
> For example, we can use them to identify jobs that end up creating a large 
> number of files.
> Once this information is available in the Statistics object, the app 
> frameworks like MapReduce can expose them as additional counters to be 
> aggregated and recorded as part of job summary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
