Kay Ousterhout created HADOOP-11873:
---------------------------------------

             Summary: Include disk read/write time in FileSystem.Statistics
                 Key: HADOOP-11873
                 URL: https://issues.apache.org/jira/browse/HADOOP-11873
             Project: Hadoop Common
          Issue Type: New Feature
          Components: metrics
            Reporter: Kay Ousterhout
            Priority: Minor


Measuring the time spent blocking on reading / writing data from / to disk is 
very useful for debugging performance problems in applications that read data 
from Hadoop, and reveals much more (e.g., disk contention) than the total 
amount of data read alone. I'd like to add something like "diskMillis" to 
FileSystem#Statistics to track this.

For data read from HDFS, this can be done with very low overhead by adding 
timing instrumentation around calls to RemoteBlockReader2.readNextPacket 
(because each call reads a large chunk of data, the time added by the 
instrumentation is very small relative to the time to actually read the 
data). For data written to HDFS, the same can be done in 
DFSOutputStream.waitAndQueueCurrentPacket.
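To make the idea concrete, here's a minimal sketch of what such a counter could look like. The class and method names (DiskTimeStatistics, time, getDiskMillis) are illustrative only, not existing Hadoop APIs; the real change would live inside FileSystem#Statistics:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the proposed "diskMillis" counter.
class DiskTimeStatistics {
    private final AtomicLong diskMillis = new AtomicLong();

    // Wrap a blocking I/O call and charge the elapsed wall time to the
    // counter. Because the wrapped call (e.g. readNextPacket) moves a
    // large chunk of data, the two nanoTime() calls add negligible
    // overhead relative to the I/O itself.
    <T> T time(Callable<T> io) throws Exception {
        long start = System.nanoTime();
        try {
            return io.call();
        } finally {
            diskMillis.addAndGet((System.nanoTime() - start) / 1_000_000);
        }
    }

    long getDiskMillis() {
        return diskMillis.get();
    }
}
```

A caller would then wrap the blocking read, e.g. stats.time(() -> reader.readNextPacket()), and applications could poll getDiskMillis() the same way they read bytesRead today.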

As far as I know, if you want this information today, it is accessible only by 
turning on HTrace. It looks like HTrace can't be selectively enabled, so a 
user can't turn on tracing just for RemoteBlockReader2.readNextPacket, for 
example, and instead needs to turn on tracing everywhere (which introduces 
enough overhead that sampling becomes necessary). It would be hugely helpful 
to have native metrics for time spent reading / writing to disk that are 
sufficiently low-overhead to be always on. (Please correct me if I'm wrong 
here about what's possible today!)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
