[ https://issues.apache.org/jira/browse/HDFS-9579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15285883#comment-15285883 ]
Hudson commented on HDFS-9579:
------------------------------
FAILURE: Integrated in Hadoop-trunk-Commit #9773 (See
[https://builds.apache.org/job/Hadoop-trunk-Commit/9773/])
HDFS-10208. Addendum for HDFS-9579: to handle the case when client (sjlee: rev
61f46be071e42f9eb49a54b1bd2e54feac59f808)
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/net/TestNetworkTopology.java
* hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ClientContext.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java
* hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NodeBase.java
> Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level
> -----------------------------------------------------------------------------
>
> Key: HDFS-9579
> URL: https://issues.apache.org/jira/browse/HDFS-9579
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Ming Ma
> Assignee: Ming Ma
> Fix For: 2.9.0, 3.0.0-alpha1
>
> Attachments: HDFS-9579-10.patch, HDFS-9579-2.patch,
> HDFS-9579-3.patch, HDFS-9579-4.patch, HDFS-9579-5.patch, HDFS-9579-6.patch,
> HDFS-9579-7.patch, HDFS-9579-8.patch, HDFS-9579-9.patch,
> HDFS-9579-branch-2.patch, HDFS-9579.patch, MR job counters.png
>
>
> For cross-DC distcp and other applications, it is useful to see the traffic
> volume at each network distance, to distinguish cross-DC traffic from
> local-DC-remote-rack traffic, etc.
> FileSystem's existing {{bytesRead}} metric tracks all the bytes read. To
> provide a metric for each network distance, we can add new metrics at the
> FileSystem level and have {{DFSInputStream}} update them based on the
> network distance between the client and the datanode.
> {{DFSClient}} will resolve the client machine's network location as part of
> its initialization. It doesn't need to resolve the datanode's network
> location for each read, as {{DatanodeInfo}} already has that info.
> There are existing HDFS-specific metrics such as {{ReadStatistics}} and
> {{DFSHedgedReadMetrics}}, but they are only accessible via {{DFSClient}} or
> {{DFSInputStream}}, which application frameworks such as MR and Tez cannot
> get to. That is the benefit of storing these new metrics in
> FileSystem.Statistics.
> This JIRA only covers metrics generation by HDFS. The consumption of these
> metrics in MR and Tez will be tracked by separate JIRAs.
> We can add similar metrics for the HDFS write scenario later if necessary.
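
As an illustration of the idea in the description above, here is a minimal, self-contained sketch of counting bytes read per network distance. It is not the code from the patch: the class name, method names, and the slash-separated "/dc/rack/host" location format are all assumptions made for this example. Distance is taken to be the number of hops from each node up to their closest common ancestor, so under the assumed three-level layout a same-host read is 0, same-rack is 2, same-DC-other-rack is 4, and cross-DC is 6.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; names and location format are assumptions, not Hadoop APIs.
public class BytesReadByDistanceSketch {
    // Running byte totals keyed by network distance (0 = same host).
    private final Map<Integer, Long> bytesByDistance = new TreeMap<>();
    // Client location is supplied once, mirroring "resolve at client initialization".
    private final String clientLocation;

    public BytesReadByDistanceSketch(String clientLocation) {
        this.clientLocation = clientLocation;
    }

    // Hops between two leaves of a tree given as paths such as "/dc1/rack1/host1".
    // Assumes both paths start with a leading slash.
    static int distance(String a, String b) {
        String[] pa = a.substring(1).split("/");
        String[] pb = b.substring(1).split("/");
        int common = 0;
        while (common < pa.length && common < pb.length
                && pa[common].equals(pb[common])) {
            common++;
        }
        // Steps from each leaf up to the closest common ancestor.
        return (pa.length - common) + (pb.length - common);
    }

    // Called after each read; the datanode location is assumed to be already known.
    public void recordRead(String datanodeLocation, long bytes) {
        bytesByDistance.merge(distance(clientLocation, datanodeLocation), bytes, Long::sum);
    }

    public long bytesRead(int distance) {
        return bytesByDistance.getOrDefault(distance, 0L);
    }

    public static void main(String[] args) {
        BytesReadByDistanceSketch stats = new BytesReadByDistanceSketch("/dc1/rack1/host1");
        stats.recordRead("/dc1/rack1/host2", 4096L); // same rack
        stats.recordRead("/dc2/rack7/host9", 1024L); // other data center
        System.out.println("same-rack bytes: " + stats.bytesRead(2));
        System.out.println("cross-DC bytes:  " + stats.bytesRead(6));
    }
}
```

Keeping the counters keyed by distance, rather than by named buckets, means the same structure works if the topology has more or fewer levels than assumed here.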