[
https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254483#comment-15254483
]
Steve Loughran commented on HDFS-10175:
---------------------------------------
In HADOOP-13028 I've just implemented something for S3A, lifted out of Azure's
Metrics2 integration: essentially a per-FS-instance set of Metrics2 counters,
which are incremented in the FS or input stream as things happen. There's no
per-thread tracking; it's collecting overall stats rather than trying to add up
the cost of a single execution, which is presumably what per-thread tracking
would do. This is lower cost but still permits microbenchmark-style analysis of
performance problems against S3A. It doesn't directly give you the results of a
single job ("34 MB of data, 2000 stream aborts, 1998 backward seeks"), which
are the kind of things I'm curious about.
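As a hedged sketch of the idea (not the actual HADOOP-13028 code; the class
and method names here are made up), a per-FS-instance counter set might look
roughly like this:

{code:java}
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical per-FS-instance counters: one object per FileSystem instance,
// incremented from the FS and its input streams as events occur. No
// per-thread tracking; these are whole-instance aggregates.
class S3AInstrumentationSketch {
  private final AtomicLong streamAborts = new AtomicLong();
  private final AtomicLong backwardSeeks = new AtomicLong();
  private final AtomicLong bytesRead = new AtomicLong();

  void streamAborted()       { streamAborts.incrementAndGet(); }
  void seekedBackwards()     { backwardSeeks.incrementAndGet(); }
  void bytesRead(long count) { bytesRead.addAndGet(count); }

  long getStreamAborts()  { return streamAborts.get(); }
  long getBackwardSeeks() { return backwardSeeks.get(); }
  long getBytesRead()     { return bytesRead.get(); }
}
{code}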
I think adding some duration tracking for the blobstore ops would be good too;
I used that in
[Swift|https://github.com/apache/hadoop/tree/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/util],
where it helped show that one of the public endpoints was throttling delete
calls and so timing out tests.
Again, that points more to the classic metrics stuff or, even better, Coda
Hale histograms.
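A duration tracker in the style of the Swift client's utility (a minimal
sketch; the real code lives under org.apache.hadoop.fs.swift.util) could be
as simple as:

{code:java}
// Sketch: time a single blobstore operation. Names are illustrative.
class OperationDuration {
  private final long started = System.currentTimeMillis();
  private long finished;

  void finished() {
    finished = System.currentTimeMillis();
  }

  long value() {
    return finished - started;
  }

  @Override
  public String toString() {
    return value() + " ms";
  }
}

// Usage: wrap the operation, then log or aggregate the elapsed time.
// OperationDuration d = new OperationDuration();
// store.delete(path);   // hypothetical blobstore call
// d.finished();
// LOG.debug("delete took {}", d);
{code}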
Maybe, and this would be nice, whatever is implemented here could be (a)
extensible to support some duration type too, at least in parallel, and (b)
usable as a back end by both Metrics2 and Coda Hale metrics registries. That
way the slightly more expensive metric systems would have access to this
rawer data.
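One way that could look (a sketch only, assuming the Coda Hale MetricRegistry
API; the class and counter names are hypothetical): the FS updates plain
atomic counters, and the metric registries read them lazily.

{code:java}
import java.util.concurrent.atomic.AtomicLong;
import com.codahale.metrics.Gauge;
import com.codahale.metrics.MetricRegistry;

// Raw counters as the shared back end: cheap to update on the hot path,
// readable by more expensive metric systems on demand.
class RawCounterBackEnd {
  final AtomicLong streamAborts = new AtomicLong();

  // Coda Hale view: expose the raw counter as a lazily-read Gauge.
  void registerWith(MetricRegistry registry) {
    registry.register("streamAborts", (Gauge<Long>) streamAborts::get);
  }

  // A Metrics2 MetricsSource could snapshot the same AtomicLong in its
  // getMetrics() callback; omitted to keep the sketch short.
}
{code}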
> add per-operation stats to FileSystem.Statistics
> ------------------------------------------------
>
> Key: HDFS-10175
> URL: https://issues.apache.org/jira/browse/HDFS-10175
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Reporter: Ram Venkatesh
> Assignee: Mingliang Liu
> Attachments: HDFS-10175.000.patch, HDFS-10175.001.patch,
> HDFS-10175.002.patch, HDFS-10175.003.patch, HDFS-10175.004.patch,
> HDFS-10175.005.patch, HDFS-10175.006.patch, TestStatisticsOverhead.java
>
>
> Currently FileSystem.Statistics exposes the following statistics:
> BytesRead
> BytesWritten
> ReadOps
> LargeReadOps
> WriteOps
> These are in turn exposed as job counters by MapReduce and other frameworks.
> There is logic within DfsClient to map operations to these counters that can
> be confusing; for instance, mkdirs counts as a writeOp.
> Proposed enhancement:
> Add a statistic for each DfsClient operation including create, append,
> createSymlink, delete, exists, mkdirs, rename and expose them as new
> properties on the Statistics object. The operation-specific counters can be
> used for analyzing the load imposed by a particular job on HDFS.
> For example, we can use them to identify jobs that end up creating a large
> number of files.
> Once this information is available in the Statistics object, the app
> frameworks like MapReduce can expose them as additional counters to be
> aggregated and recorded as part of job summary.
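A minimal sketch of such per-operation counters (names hypothetical, not the
API committed in the HDFS-10175 patches) could be an enum-keyed counter map:

{code:java}
import java.util.EnumMap;
import java.util.concurrent.atomic.AtomicLong;

// One counter per DfsClient operation: cheap to increment on each call,
// readable afterwards for job-level aggregation.
class PerOperationStats {
  enum Op { CREATE, APPEND, CREATE_SYMLINK, DELETE, EXISTS, MKDIRS, RENAME }

  private final EnumMap<Op, AtomicLong> counters = new EnumMap<>(Op.class);

  PerOperationStats() {
    for (Op op : Op.values()) {
      counters.put(op, new AtomicLong());
    }
  }

  void increment(Op op) { counters.get(op).incrementAndGet(); }

  long get(Op op)       { return counters.get(op).get(); }
}
{code}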