[ 
https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250807#comment-15250807
 ] 

Colin Patrick McCabe commented on HDFS-10175:
---------------------------------------------

I thought about this a little bit more, and I don't think that 
{{FileSystem#Statistics#StatisticsData}} is the best place to add these new 
statistics.  There are a few reasons.

Firstly, the statistics that we're interested in are inherently
filesystem-specific.  For HDFS, we're interested in the number of RPCs to the
NameNode: calls like primitiveCreate, getBytesWithFutureGS, or concat.  For
something like s3a, we're interested in how many PUT and GET requests we've
made to Amazon S3.  S3 doesn't even support genstamps or the concat operation.
Local filesystems have their own operations which are important.

Secondly, the thread-local-data mechanism is not really that appropriate for
most operations.  Thread-local data is a big performance win when reading or
writing bytes of data from or to a stream, since most such operations don't
involve making an RPC.  We have big client-side buffers which mean that most
reads and writes can return immediately.  In contrast, operations like mkdir,
rename, delete, etc. always end up making at least one RPC, since these
operations cannot be buffered on the client.  In that case, the CPU overhead of
doing an atomic increment is negligible relative to the RPC itself.  But the
overhead of storing all that thread-local data is significant.
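
To make the atomic-increment alternative concrete, here is a minimal sketch of
one shared counter per operation, updated with an atomic increment instead of
being kept per thread.  The names RpcOp and SharedOpCounters are hypothetical
and exist only for this illustration; they are not part of Hadoop or of any
patch attached to this issue.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical operation names, for illustration only.
enum RpcOp { MKDIRS, RENAME, DELETE }

// One shared atomic counter per operation, rather than one copy per thread.
final class SharedOpCounters {
  // Populated once in the constructor and never structurally modified after.
  private final Map<RpcOp, AtomicLong> counts = new EnumMap<>(RpcOp.class);

  SharedOpCounters() {
    for (RpcOp op : RpcOp.values()) {
      counts.put(op, new AtomicLong());
    }
  }

  // Intended to be called once per RPC-backed operation.
  void increment(RpcOp op) {
    counts.get(op).incrementAndGet();
  }

  long get(RpcOp op) {
    return counts.get(op).get();
  }

  public static void main(String[] args) throws InterruptedException {
    SharedOpCounters counters = new SharedOpCounters();
    Thread[] workers = new Thread[4];
    for (int i = 0; i < workers.length; i++) {
      workers[i] = new Thread(() -> {
        for (int j = 0; j < 1000; j++) {
          counters.increment(RpcOp.MKDIRS);
        }
      });
      workers[i].start();
    }
    for (Thread t : workers) {
      t.join();
    }
    // 4 threads x 1000 increments each
    System.out.println("mkdirs = " + counters.get(RpcOp.MKDIRS));
  }
}
```

The memory used here is fixed at one AtomicLong per operation, regardless of
how many threads touch the filesystem.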

I think what we should do is add an API to the FileSystem and FileContext base 
classes, which different types of FS can implement as appropriate.
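
As a rough sketch of what such an API might look like (an assumption for
discussion, not a proposed signature), the base class could expose a method
returning named counters, with each filesystem reporting only the operations
that make sense for it.  SketchFileSystem, SketchHdfs, SketchS3, and
getOperationStatistics are all hypothetical names and are not real Hadoop
types or methods.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the FileSystem base class; not the real Hadoop type.
abstract class SketchFileSystem {
  // Default: a filesystem that reports no operation-specific statistics.
  Map<String, Long> getOperationStatistics() {
    return Collections.emptyMap();
  }
}

// HDFS-like implementation: counts NameNode RPCs by call name.
final class SketchHdfs extends SketchFileSystem {
  private final AtomicLong mkdirsRpcs = new AtomicLong();
  private final AtomicLong concatRpcs = new AtomicLong();

  void mkdirs(String path) {
    mkdirsRpcs.incrementAndGet();  // the actual RPC is elided in this sketch
  }

  void concat(String target) {
    concatRpcs.incrementAndGet();  // the actual RPC is elided in this sketch
  }

  @Override
  Map<String, Long> getOperationStatistics() {
    Map<String, Long> stats = new LinkedHashMap<>();
    stats.put("mkdirs", mkdirsRpcs.get());
    stats.put("concat", concatRpcs.get());
    return stats;
  }
}

// S3-like implementation: counts HTTP requests; it has no concat entry at all.
final class SketchS3 extends SketchFileSystem {
  private final AtomicLong putRequests = new AtomicLong();
  private final AtomicLong getRequests = new AtomicLong();

  void putObject(String key) {
    putRequests.incrementAndGet();  // the actual HTTP request is elided
  }

  void getObject(String key) {
    getRequests.incrementAndGet();  // the actual HTTP request is elided
  }

  @Override
  Map<String, Long> getOperationStatistics() {
    Map<String, Long> stats = new LinkedHashMap<>();
    stats.put("PUT", putRequests.get());
    stats.put("GET", getRequests.get());
    return stats;
  }
}
```

A caller such as a job framework could then iterate over whatever keys a given
filesystem returns, without the base class having to enumerate every possible
operation.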

> add per-operation stats to FileSystem.Statistics
> ------------------------------------------------
>
>                 Key: HDFS-10175
>                 URL: https://issues.apache.org/jira/browse/HDFS-10175
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Ram Venkatesh
>            Assignee: Mingliang Liu
>         Attachments: HDFS-10175.000.patch, HDFS-10175.001.patch, 
> HDFS-10175.002.patch, HDFS-10175.003.patch, HDFS-10175.004.patch, 
> HDFS-10175.005.patch, TestStatisticsOverhead.java
>
>
> Currently FileSystem.Statistics exposes the following statistics:
> BytesRead
> BytesWritten
> ReadOps
> LargeReadOps
> WriteOps
> These are in turn exposed as job counters by MapReduce and other frameworks. 
> There is logic within DfsClient to map operations to these counters that can 
> be confusing; for instance, mkdirs counts as a writeOp.
> Proposed enhancement:
> Add a statistic for each DfsClient operation including create, append, 
> createSymlink, delete, exists, mkdirs, rename and expose them as new 
> properties on the Statistics object. The operation-specific counters can be 
> used for analyzing the load imposed by a particular job on HDFS. 
> For example, we can use them to identify jobs that end up creating a large 
> number of files.
> Once this information is available in the Statistics object, the app 
> frameworks like MapReduce can expose them as additional counters to be 
> aggregated and recorded as part of job summary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
