[ https://issues.apache.org/jira/browse/HADOOP-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518007#comment-14518007 ]
Anu Engineer commented on HADOOP-11873:
---------------------------------------
Nope, it is not possible to get the read time of a particular block. Think of
these as counters that offer a view of the total performance of your datanode. I
have not seen block-level counters even on Linux. If you really need that, I
would suppose it is something you would do in your application as opposed to
in the infrastructure.
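
To illustrate what doing this in the application could look like, here is a
minimal sketch that wraps an input stream and accumulates the time spent
blocked in read() calls. TimedInputStream and its accessors are hypothetical
names, not part of Hadoop; the sketch uses only java.io. The measured time
covers whatever the wrapped stream does inside read(), which for an HDFS
stream would include network transfer as well as disk time. The counters are
not thread-safe.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical application-side wrapper; not part of Hadoop. */
class TimedInputStream extends FilterInputStream {
    private long readNanos;  // total time spent inside read() calls
    private long bytesRead;  // total bytes returned by read() calls

    TimedInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        long start = System.nanoTime();
        try {
            int b = super.read();
            if (b >= 0) {
                bytesRead++;
            }
            return b;
        } finally {
            readNanos += System.nanoTime() - start;
        }
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        long start = System.nanoTime();
        try {
            int n = super.read(buf, off, len);
            if (n > 0) {
                bytesRead += n;
            }
            return n;
        } finally {
            readNanos += System.nanoTime() - start;
        }
    }

    long getReadNanos() { return readNanos; }

    long getBytesRead() { return bytesRead; }

    public static void main(String[] args) throws IOException {
        // Assumption: in a real job the wrapped stream would be the one the
        // application opened for a file; an in-memory stream stands in here.
        InputStream source = new ByteArrayInputStream(new byte[8192]);
        try (TimedInputStream in = new TimedInputStream(source)) {
            byte[] buf = new byte[1024];
            while (in.read(buf, 0, buf.length) != -1) {
                // drain the stream
            }
            System.out.println(in.getBytesRead() + " bytes, "
                    + in.getReadNanos() + " ns blocked in read()");
        }
    }
}
```

Opening one such wrapper per block or per file of interest would give the
application its own per-block timings without any change to the datanode
counters.
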
> Include disk read/write time in FileSystem.Statistics
> -----------------------------------------------------
>
> Key: HADOOP-11873
> URL: https://issues.apache.org/jira/browse/HADOOP-11873
> Project: Hadoop Common
> Issue Type: New Feature
> Components: metrics
> Reporter: Kay Ousterhout
> Priority: Minor
>
> Measuring the time spent blocking on reading / writing data from / to disk is
> very useful for debugging performance problems in applications that read data
> from Hadoop, and can give much more information (e.g., to reflect disk
> contention) than just knowing the total amount of data read. I'd like to add
> something like "diskMillis" to FileSystem#Statistics to track this.
> For data read from HDFS, this can be done with very low overhead by adding
> logging around calls to RemoteBlockReader2.readNextPacket (because this reads
> larger chunks of data, the time added by the instrumentation is very small
> relative to the time to actually read the data). For data written to HDFS,
> this can be done in DFSOutputStream.waitAndQueueCurrentPacket.
> As far as I know, if you want this information today, it is currently
> accessible only by turning on HTrace. It looks like HTrace can't be selectively
> enabled, so a user can't just turn on the tracing on
> RemoteBlockReader2.readNextPacket for example, and instead needs to turn on
> tracing everywhere (which then introduces a bunch of overhead -- so sampling
> is necessary). It would be hugely helpful to have native metrics for time
> reading / writing to disk that are sufficiently low-overhead to be always on.
> (Please correct me if I'm wrong here about what's possible today!)
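
As a rough illustration of the always-on counter proposed in the description,
the sketch below times a single I/O call and adds the elapsed time to a
shared counter. IoTimeStatistics, IoCall, timeRead and timeWrite are
hypothetical names and are not the real FileSystem.Statistics API; where the
timing would actually hook into RemoteBlockReader2.readNextPacket or
DFSOutputStream.waitAndQueueCurrentPacket is left open. LongAdder is used on
the assumption that many reader threads update the same counter.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for a statistics object; not FileSystem.Statistics. */
class IoTimeStatistics {
    /** One I/O step that may throw, e.g. reading or queueing one packet. */
    interface IoCall<T> {
        T run() throws IOException;
    }

    private final LongAdder readNanos = new LongAdder();
    private final LongAdder writeNanos = new LongAdder();

    /** Runs the call and adds its elapsed time to the read counter. */
    <T> T timeRead(IoCall<T> call) throws IOException {
        long start = System.nanoTime();
        try {
            return call.run();
        } finally {
            readNanos.add(System.nanoTime() - start);
        }
    }

    /** Runs the call and adds its elapsed time to the write counter. */
    <T> T timeWrite(IoCall<T> call) throws IOException {
        long start = System.nanoTime();
        try {
            return call.run();
        } finally {
            writeNanos.add(System.nanoTime() - start);
        }
    }

    long getReadMillis() {
        return TimeUnit.NANOSECONDS.toMillis(readNanos.sum());
    }

    long getWriteMillis() {
        return TimeUnit.NANOSECONDS.toMillis(writeNanos.sum());
    }

    public static void main(String[] args) throws IOException {
        IoTimeStatistics stats = new IoTimeStatistics();
        // Simulated packet read: sleep briefly, then report a packet size.
        int packetBytes = stats.timeRead(() -> {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return 65536;
        });
        System.out.println(packetBytes + " bytes read, about "
                + stats.getReadMillis() + " ms blocked");
    }
}
```

The per-call cost is two System.nanoTime() calls and one counter update,
which is small next to a packet-sized read; that is the low-overhead argument
made in the description.
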