[
https://issues.apache.org/jira/browse/HDFS-3343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398825#comment-13398825
]
Todd Lipcon commented on HDFS-3343:
-----------------------------------
A few quick comments:
- Now that I see again how complicated {{transferToFully}} is, I think I
disagree with my earlier idea that we should copy-paste it. It seems like
instead we should add an API to SocketOutputStream like:
{{void transferToFully(FileChannel ch, int pos, int len, MutableCounterLong
transferTime, MutableCounterLong waitTime)}}
(and have the old call delegate to that and pass null for the metrics)
- The new metrics in DataNode need better names (e.g.
"readDataPacketFromDiskMillis" and "sendDataPacketToNetworkMillis" or
something?), and I think they should be MutableRates instead of counters,
right? I.e., you need to count the number of ops in addition to the sum time, or
else the sum time is uninterpretable.
- I think the counter increments should be summed inside the loop locally, and
then only added to the metric at the end of each packet. Otherwise a packet that
takes several loop iterations gets recorded as several ops, which will skew
the averages.
- It seems like we can add a simple unit test (or just a new assertion to an
existing test like TestPRead) that these counters have non-zero values.
- Maybe the unit for these should be microseconds instead of milliseconds?
Given that a lot of reads should hit the buffer cache, having more precision
seems useful.
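
To make the first three points concrete, here is a rough sketch, not the actual patch. {{TimedTransferSketch}} and {{RateSketch}} are hypothetical stand-ins for SocketOutputStream and a MutableRate-style metric, the {{waitWritable}} body is a placeholder, and the position is a {{long}} only because {{FileChannel.transferTo}} takes one. It assumes one {{transferToFully}} call per packet and records microseconds, per the last point:

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

/** Hypothetical stand-in for a MutableRate-style metric: op count plus total time. */
class RateSketch {
    private long numOps;
    private long totalMicros;

    synchronized void add(long micros) {
        numOps++;
        totalMicros += micros;
    }

    synchronized long numOps() {
        return numOps;
    }

    /** Mean time per op; a bare sum could not provide this. */
    synchronized double meanMicros() {
        return numOps == 0 ? 0.0 : (double) totalMicros / numOps;
    }
}

/** Hypothetical stand-in for SocketOutputStream; not the Hadoop class. */
class TimedTransferSketch {
    private final WritableByteChannel out;

    TimedTransferSketch(WritableByteChannel out) {
        this.out = out;
    }

    /** Placeholder: a real stream would block here until the socket is writable. */
    protected void waitWritable() throws IOException {
    }

    /** Old call: delegates and passes null for both metrics. */
    void transferToFully(FileChannel ch, long pos, int len) throws IOException {
        transferToFully(ch, pos, len, null, null);
    }

    /** New call: times the two phases separately and records one sample per call. */
    void transferToFully(FileChannel ch, long pos, int len,
                         RateSketch transferTime, RateSketch waitTime) throws IOException {
        long transferNanos = 0;
        long waitNanos = 0;
        while (len > 0) {
            long t0 = System.nanoTime();
            waitWritable();
            long t1 = System.nanoTime();
            long sent = ch.transferTo(pos, len, out);
            long t2 = System.nanoTime();

            // Sum locally inside the loop; the metrics are not touched yet.
            waitNanos += t1 - t0;
            transferNanos += t2 - t1;

            if (sent == 0 && pos >= ch.size()) {
                throw new EOFException("EOF with " + len + " bytes left to send");
            }
            pos += sent;
            len -= (int) sent;
        }
        // One sample per call, so the op count tracks the number of packets.
        if (waitTime != null) {
            waitTime.add(waitNanos / 1000);
        }
        if (transferTime != null) {
            transferTime.add(transferNanos / 1000);
        }
    }
}
```

With samples recorded this way, a test along the lines of the TestPRead suggestion could check that {{numOps()}} is non-zero after a read.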
> Improve metrics for DN read latency
> -----------------------------------
>
> Key: HDFS-3343
> URL: https://issues.apache.org/jira/browse/HDFS-3343
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: data-node
> Reporter: Todd Lipcon
> Assignee: Andrew Wang
> Attachments: hdfs-3343.patch
>
>
> Similar to HDFS-3170 on the write side, we should improve the metrics that
> are generated on the DN for read latency. We should have separate metrics for
> the time spent in {{transferTo}} vs {{waitWritable}} so that it's easy to
> distinguish slow local disks from slow readers on the other end of the socket.