[ https://issues.apache.org/jira/browse/HADOOP-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282629#comment-15282629 ]
Steve Loughran commented on HADOOP-11873:
-----------------------------------------
This could be implemented as a FilterFileSystem; the read/write timing could
then be collected on any FileSystem, rather than just HDFS.
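
A rough illustration of the wrapper idea, not a patch: the sketch below does
not use any Hadoop API. It shows the same decorator pattern on plain java.io
streams, where a wrapping stream times each blocking read and adds it to a
shared counter. TimedInputStream and readNanos are hypothetical names; a
FilterFileSystem-based version would presumably return a stream like this
from its open() call, but that wiring is an assumption and is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical wrapper: accumulates the wall-clock time spent inside
 * read() calls on the wrapped stream. Works for any InputStream, which
 * is the point of doing this as a decorator rather than inside one FS.
 */
class TimedInputStream extends FilterInputStream {
    // Shared counter, in nanoseconds; stands in for a "diskMillis"-style statistic.
    private final AtomicLong readNanos;

    TimedInputStream(InputStream in, AtomicLong readNanos) {
        super(in);
        this.readNanos = readNanos;
    }

    @Override
    public int read() throws IOException {
        long start = System.nanoTime();
        try {
            return in.read();
        } finally {
            // Record elapsed time even if the read throws.
            readNanos.addAndGet(System.nanoTime() - start);
        }
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        long start = System.nanoTime();
        try {
            return in.read(buf, off, len);
        } finally {
            readNanos.addAndGet(System.nanoTime() - start);
        }
    }
}
```

Timing the bulk read(byte[], int, int) call keeps the per-byte cost of the
instrumentation small, which is the same argument the issue description makes
for instrumenting at the packet level. Note that the time measured this way
is time blocked in the wrapped stream, which may include network or cache
effects as well as disk.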
> Include disk read/write time in FileSystem.Statistics
> -----------------------------------------------------
>
> Key: HADOOP-11873
> URL: https://issues.apache.org/jira/browse/HADOOP-11873
> Project: Hadoop Common
> Issue Type: New Feature
> Components: metrics
> Reporter: Kay Ousterhout
> Priority: Minor
>
> Measuring the time spent blocking on reading / writing data from / to disk is
> very useful for debugging performance problems in applications that read data
> from Hadoop, and can give much more information (e.g., to reflect disk
> contention) than just knowing the total amount of data read. I'd like to add
> something like "diskMillis" to FileSystem#Statistics to track this.
> For data read from HDFS, this can be done with very low overhead by adding
> logging around calls to RemoteBlockReader2.readNextPacket (because this reads
> larger chunks of data, the time added by the instrumentation is very small
> relative to the time to actually read the data). For data written to HDFS,
> this can be done in DFSOutputStream.waitAndQueueCurrentPacket.
> As far as I know, if you want this information today, it is only
> accessible by turning on HTrace. It looks like HTrace can't be selectively
> enabled, so a user can't just turn on the tracing on
> RemoteBlockReader2.readNextPacket for example, and instead needs to turn on
> tracing everywhere (which then introduces a bunch of overhead -- so sampling
> is necessary). It would be hugely helpful to have native metrics for time
> reading / writing to disk that are sufficiently low-overhead to be always on.
> (Please correct me if I'm wrong here about what's possible today!)