[jira] Commented: (HDFS-1338) Improve TestDFSIO

Arun C Murthy (JIRA) Tue, 10 Aug 2010 21:51:43 -0700

    [ https://issues.apache.org/jira/browse/HDFS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12897137#action_12897137 ]


Arun C Murthy commented on HDFS-1338:
-------------------------------------

{quote}
DFSIO benchmark is designed to measure HDFS data transfer performance only.
TestDFSIO is not intended to benchmark typical MR usage pattern.
TestDFSIO intentionally avoids any overhead or optimizations induced by MR 
framework.
{quote}

A benchmark should be something we use to reason about a particular aspect of 
the framework, in this case performance.

The point I'm trying to make is that TestDFSIO, as it stands, is formulated in 
a way that makes it impossible to reason about its results. I don't particularly 
care how we implement it, and I agree it shouldn't be constrained by the 
vagaries of the Map-Reduce scheduler. However, we do need a benchmark which 
does node-local, rack-local, and off-switch reads and writes in a predictable 
manner, so that when we notice a difference in the results of the benchmark we 
are in a position to reason about it.
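
To make the idea concrete, here is a minimal sketch of what "reasoning about 
the results" could look like if each measurement were tagged with its 
locality. This is illustrative only and is not TestDFSIO or any Hadoop API: 
the (rack, host) tuples, the function names classify_locality and summarize, 
and the sample numbers in the usage note are all assumptions made up for the 
example.

```python
from collections import defaultdict


def classify_locality(reader, replica):
    """Classify one read or write by comparing hypothetical (rack, host) pairs."""
    reader_rack, reader_host = reader
    replica_rack, replica_host = replica
    if (reader_rack, reader_host) == (replica_rack, replica_host):
        return "node-local"
    if reader_rack == replica_rack:
        return "rack-local"
    return "off-switch"


def summarize(samples):
    """Average throughput (MB/s) per locality class.

    Each sample is (reader, replica, mb_per_sec), where reader and replica
    are (rack, host) tuples. The sample layout is an assumption of this sketch.
    """
    buckets = defaultdict(list)
    for reader, replica, mb_per_sec in samples:
        buckets[classify_locality(reader, replica)].append(mb_per_sec)
    return {kind: sum(values) / len(values) for kind, values in buckets.items()}
```

With per-class averages like these, a change between two benchmark runs could 
be attributed to node-local, rack-local, or off-switch traffic instead of 
showing up as one blended number. For example, summarize could be called on 
made-up samples such as (("r1", "h1"), ("r1", "h2"), 60.0) to get the 
rack-local average.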

> Improve TestDFSIO
> -----------------
>
>                 Key: HDFS-1338
>                 URL: https://issues.apache.org/jira/browse/HDFS-1338
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Arun C Murthy
>
> Currently the read test in TestDFSIO benchmark just opens a large side file 
> and measures the read performance. The MR scheduler has no opportunity to do 
> *any* optimization for the TestDFSIO MR application. The side effect of this 
> is that it is *very* hard to do any meaningful analysis of the results of the 
> benchmark, i.e. to check whether node-local, rack-local, or off-switch read 
> performance improved or degraded.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
