Ram Venkatesh created HDFS-10175:
------------------------------------

             Summary: add per-operation stats to FileSystem.Statistics
                 Key: HDFS-10175
                 URL: https://issues.apache.org/jira/browse/HDFS-10175
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: hdfs-client
            Reporter: Ram Venkatesh


Currently FileSystem.Statistics exposes the following statistics:
BytesRead
BytesWritten
ReadOps
LargeReadOps
WriteOps

These are in turn exposed as job counters by MapReduce and other frameworks. 
The logic within DfsClient that maps operations to these counters can be 
confusing; for instance, mkdirs counts as a writeOp.

Proposed enhancement:
Add a statistic for each DfsClient operation, including create, append, 
createSymlink, delete, exists, mkdirs, and rename, and expose them as new 
properties on the Statistics object. The operation-specific counters can be 
used to analyze the load that a particular job imposes on HDFS. 
For example, we can use them to identify jobs that end up creating a large 
number of files.
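
To make the proposal concrete, here is a minimal sketch of what per-operation 
counters could look like. It is an illustration only, not the existing 
FileSystem.Statistics API: the class name PerOperationStatistics, the Op enum, 
and the increment/get/snapshot methods are all hypothetical names chosen for 
this example. The operation list is the one given above.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch; NOT the actual FileSystem.Statistics API. */
final class PerOperationStatistics {

    /** Operation names taken from the list in this proposal. */
    enum Op { CREATE, APPEND, CREATE_SYMLINK, DELETE, EXISTS, MKDIRS, RENAME }

    // One thread-safe counter per operation, allocated up front so the
    // map itself is never modified after construction.
    private final Map<Op, LongAdder> counters = new EnumMap<>(Op.class);

    PerOperationStatistics() {
        for (Op op : Op.values()) {
            counters.put(op, new LongAdder());
        }
    }

    /** Assumed to be called by the client each time it issues the operation. */
    void increment(Op op) {
        counters.get(op).increment();
    }

    /** Current count for a single operation. */
    long get(Op op) {
        return counters.get(op).sum();
    }

    /** Point-in-time copy, e.g. for a framework to publish as counters. */
    Map<Op, Long> snapshot() {
        Map<Op, Long> copy = new EnumMap<>(Op.class);
        for (Map.Entry<Op, LongAdder> e : counters.entrySet()) {
            copy.put(e.getKey(), e.getValue().sum());
        }
        return copy;
    }
}
```

With counters like these, a job that creates many files would show a high 
CREATE count on its own, instead of being folded into the general write 
operation total together with mkdirs and other calls.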

Once this information is available in the Statistics object, application 
frameworks such as MapReduce can expose it as additional counters to be 
aggregated and recorded as part of the job summary.
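
As a rough sketch of the aggregation step, assuming each task reports its 
operation counts as a simple name-to-count map, a framework could sum them 
into job-level totals as shown here. The class JobCounterAggregator and its 
aggregate method are hypothetical and do not correspond to any existing 
MapReduce API.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch; not part of MapReduce or HDFS. */
final class JobCounterAggregator {

    /** Sums per-task operation counts into one job-level total per operation. */
    static Map<String, Long> aggregate(List<Map<String, Long>> perTaskCounts) {
        // TreeMap keeps the operations in a stable, sorted order for reporting.
        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> task : perTaskCounts) {
            for (Map.Entry<String, Long> e : task.entrySet()) {
                // Add this task's count to the running total for the operation.
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }
}
```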



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
