[ 
https://issues.apache.org/jira/browse/HDFS-9579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-9579:
--------------------------
    Attachment: HDFS-9579-6.patch

Thanks [~sjlee0] again for the review. Thanks [~liuml07] for the input about 
the test results. Here is the new patch based on the suggestions. Answers 
inline for some of the suggestions.

bq. why does this need to be public now?
In the MR change of the bigger patch, I refactored MR code that relies on this 
method being public. Given that patch hasn't been published yet, I have removed 
that change in the new patch. We can update it later in the MR patch if necessary.

bq. if we're adopting a test using equals(), shouldn't the following code later 
in the same method be fixed too?
Good point. The reason is specific to the NetworkTopology implementation. When a 
new leaf node is added to a NetworkTopology, an inner node is reused if it 
already exists in the tree, whereas different leaf node objects with the same 
path can exist in the same tree. The {{getWeight}} function has the same issue. 
Not sure if we really need to fix it.
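
To illustrate the distinction, here is a minimal sketch. {{TopologySketch}} and its {{Node}} class are hypothetical stand-ins, not Hadoop's actual NetworkTopology code. It assumes inner nodes are cached by path and that leaf nodes are whatever objects the caller passes in, which is why a reference comparison can be safe for inner nodes but not for leaves.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a topology tree; not Hadoop's NetworkTopology.
final class TopologySketch {
    /** A node identified by its full path, e.g. "/dc1/rack1" or "/dc1/rack1/host1". */
    static final class Node {
        final String path;

        Node(String path) {
            this.path = path;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Node && ((Node) o).path.equals(path);
        }

        @Override
        public int hashCode() {
            return path.hashCode();
        }
    }

    // Inner nodes are created once per path and reused afterwards.
    private final Map<String, Node> innerNodes = new HashMap<>();

    Node innerNode(String path) {
        return innerNodes.computeIfAbsent(path, Node::new);
    }

    /** Returns the (shared) inner node that is the parent of the given leaf. */
    Node leafParent(Node leaf) {
        String p = leaf.path;
        return innerNode(p.substring(0, p.lastIndexOf('/')));
    }

    public static void main(String[] args) {
        TopologySketch tree = new TopologySketch();
        // Two distinct leaf objects that describe the same path.
        Node a = new Node("/dc1/rack1/host1");
        Node b = new Node("/dc1/rack1/host1");

        System.out.println(tree.leafParent(a) == tree.leafParent(b)); // inner node is shared
        System.out.println(a == b);      // leaves are different objects
        System.out.println(a.equals(b)); // but equal by path
    }
}
```

In this sketch, comparing parents with {{==}} happens to work because the inner node is shared, while comparing the leaves themselves needs {{equals()}}.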

bq. why is a call to update the stats added here (actualGetFromOneDataNode()) 
instead of the previous location (pread())?
To update metrics by distance, the update has to happen inside 
{{actualGetFromOneDataNode}}, which has access to the datanode info. Since it is 
better to keep the original {{incrementBytesRead}} and the new distance-based 
metrics together, the patch has {{actualGetFromOneDataNode}} call the new 
{{updateFileSystemReadStats}}, which processes both the old and the new read metrics.
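
As a rough sketch of the idea (not the code in the patch), a single update call can bump both the total and the per-distance counter. {{ReadStatsSketch}}, {{updateReadStats}} and the integer distance values are hypothetical names and assumptions made for illustration; the real counters live in FileSystem.Statistics.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for read statistics; not Hadoop's FileSystem.Statistics.
final class ReadStatsSketch {
    private long bytesRead;
    // Bytes read, keyed by an assumed integer network distance between client and datanode.
    private final Map<Integer, Long> bytesReadByDistance = new TreeMap<>();

    /** Updates the total and the distance-based counter in one place. */
    synchronized void updateReadStats(int distance, long nRead) {
        if (nRead <= 0) {
            return; // nothing was read, so there is nothing to record
        }
        bytesRead += nRead;
        bytesReadByDistance.merge(distance, nRead, Long::sum);
    }

    synchronized long getBytesRead() {
        return bytesRead;
    }

    synchronized long getBytesReadByDistance(int distance) {
        return bytesReadByDistance.getOrDefault(distance, 0L);
    }

    public static void main(String[] args) {
        ReadStatsSketch stats = new ReadStatsSketch();
        stats.updateReadStats(0, 100); // assumed: distance 0 means a local read
        stats.updateReadStats(4, 50);  // assumed: a larger distance means a more remote read
        stats.updateReadStats(4, 25);
        System.out.println(stats.getBytesRead());
        System.out.println(stats.getBytesReadByDistance(4));
    }
}
```

Keeping both updates behind one method means the caller that knows the datanode, and therefore the distance, is the only place that touches the counters.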

bq. I suppose we need to add trivial overrides for equals() and hashCode() to 
address the findbugs issue...
That reminds me why the functions were there in earlier patches. :)



> Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-9579
>                 URL: https://issues.apache.org/jira/browse/HDFS-9579
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>         Attachments: HDFS-9579-2.patch, HDFS-9579-3.patch, HDFS-9579-4.patch, 
> HDFS-9579-5.patch, HDFS-9579-6.patch, HDFS-9579.patch, MR job counters.png
>
>
> For cross DC distcp or other applications, it becomes useful to have insight 
> as to the traffic volume for each network distance to distinguish cross-DC 
> traffic, local-DC-remote-rack, etc.
> FileSystem's existing {{bytesRead}} metric tracks all the bytes read. To 
> provide additional metrics for each network distance, we can add additional 
> metrics at the FileSystem level and have {{DFSInputStream}} update the value 
> based on the network distance between the client and the datanode.
> {{DFSClient}} will resolve client machine's network location as part of its 
> initialization. It doesn't need to resolve datanode's network location for 
> each read as {{DatanodeInfo}} already has the info.
> There are existing HDFS-specific metrics such as {{ReadStatistics}} and 
> {{DFSHedgedReadMetrics}}. But these metrics are only accessible via 
> {{DFSClient}} or {{DFSInputStream}}, so application frameworks such as MR 
> and Tez cannot get to them. That is the benefit of storing these new 
> metrics in FileSystem.Statistics.
> This jira only includes metrics generation by HDFS. The consumption of these 
> metrics by MR and Tez will be tracked in separate jiras.
> We can add similar metrics for the HDFS write scenario later if necessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
