[
https://issues.apache.org/jira/browse/MAPREDUCE-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122308#comment-16122308
]
Dennis Huo commented on MAPREDUCE-6931:
---------------------------------------
Thanks for the explanation! I have no strong preference about removing the
particular "Total Throughput" metric, but from my own experience using
TestDFSIO in the past, I do find that the "average single-stream throughput"
calculation historically provided by TestDFSIO can itself be somewhat
misleading in characterizing a cluster since it makes it difficult to infer the
level of concurrency corresponding to that per-stream performance without
backing out the numbers manually.
I see the new metric as being a useful measure of "Effective Aggregate
Throughput", all-in including overhead.
For example, if I use memory settings that only fit 1 container per physical
machine at a time, my TestDFSIO will trickle through 1 task per machine at a
time, and those single tasks will have very high single-stream throughput. If I
instead do memory packing so that every machine runs, say, 64 tasks
concurrently, then single-stream throughput will suffer significantly, while
total walltime will decrease significantly. With a walltime-based calculation,
I can see at a glance the approximate total throughput rating of my cluster
when everything is running at full throttle; I'd expect increasing concurrency
to increase aggregate throughput until IO limits are reached, where aggregate
throughput will become flat w.r.t. increasing concurrency or slightly declining
due to thrashing.
This could also be my cloud bias, where it becomes more important to
characterize a full-blast cluster against a remote filesystem vs caring so much
about per-stream throughputs.
It seems like an "effective aggregate throughput" calculation would help
encompass the cluster-wide effects of things like optimal CPU oversubscription
ratios, scheduler settings, speculative execution vs failure rates, etc.
I agree the wording and computation as-is might not be the right fit for this
though. I see a few options that might be worthwhile, possibly in some
combination:
* Change wording to say "Effective Aggregate Throughput" to more accurately
describe what the number means
* Add a metric displaying the "time" as "Slot Seconds" or something like that
so that the user doesn't have to compute it by dividing "Total MBytes processed" by
"Throughput mb/sec" explicitly. This also helps clarify that the throughput is
computed in terms of slot time, not walltime.
* Additionally, maybe provide a measure of "average concurrency" taking total
slot time divided by walltime. This would legitimately consider scheduler
overheads; if my whole test only ran 1 task in an hour, and it only had 30
minutes of slot time, then a concurrency of 0.5 correctly characterizes the
fact that I'm only squeezing out 0.5 utilization after factoring in delays.
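As a rough sketch of the options above (not a patch): the class and method
names below are hypothetical, and I'm assuming the summed per-task IO time and
the job walltime are both available in milliseconds and that 1 MB means
1024 * 1024 bytes.

```java
/** Sketch only: these names are hypothetical, not TestDFSIO's actual fields. */
final class EffectiveThroughputSketch {
  // Assumption: 1 MB = 1024 * 1024 bytes.
  private static final double BYTES_PER_MB = 1024.0 * 1024.0;
  private static final double MILLIS_PER_SEC = 1000.0;

  /** "Slot Seconds": per-task IO time summed over all tasks, in seconds. */
  static double slotSeconds(long totalTaskTimeMs) {
    return totalTaskTimeMs / MILLIS_PER_SEC;
  }

  /** "Average concurrency": total slot time divided by walltime. */
  static double averageConcurrency(long totalTaskTimeMs, long wallTimeMs) {
    return (double) totalTaskTimeMs / wallTimeMs;
  }

  /** "Effective Aggregate Throughput": total MB divided by walltime seconds. */
  static double effectiveAggregateMBPerSec(long totalBytes, long wallTimeMs) {
    return (totalBytes / BYTES_PER_MB) / (wallTimeMs / MILLIS_PER_SEC);
  }

  public static void main(String[] args) {
    // The example from the last bullet: one task, 30 minutes of slot time
    // inside a 1 hour test run.
    long slotMs = 30L * 60 * 1000;
    long wallMs = 60L * 60 * 1000;
    System.out.println("Slot seconds:        " + slotSeconds(slotMs));
    System.out.println("Average concurrency: " + averageConcurrency(slotMs, wallMs));
  }
}
```

Keeping the three values as separate helpers makes it explicit which ones are
divided by slot time and which by walltime.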
In any case, I'm happy to just delete the one line in place so that the
refactorings can be committed, if you feel it's better not to change/add
metrics or that these are better discussed in a followup JIRA; just let me know.
Re: MAPREDUCE and HDFS, I'll be sure to remember TestDFSIO goes under HDFS in the
future. For this one I looked at a search for "TestDFSIO" in JIRA and eyeballed
that a plurality seemed to be under MAPREDUCE, a smaller fraction in HDFS, and
then remaining ones in HADOOP. Combined with this code going under the
hadoop-mapreduce directory, it looked like MAPREDUCE was more correct.
> Fix TestDFSIO "Total Throughput" calculation
> --------------------------------------------
>
> Key: MAPREDUCE-6931
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6931
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: benchmarks, test
> Affects Versions: 2.8.0
> Reporter: Dennis Huo
> Priority: Trivial
>
> The new "Total Throughput" line added in
> https://issues.apache.org/jira/browse/HDFS-9153 is currently calculated as
> {{toMB(size) / ((float)execTime)}} and claims to be in units of "MB/s", but
> {{execTime}} is in milliseconds; thus, the reported number is 1/1000x the
> actual value:
> {code:java}
> String resultLines[] = {
> "----- TestDFSIO ----- : " + testType,
> " Date & time: " + new Date(System.currentTimeMillis()),
> " Number of files: " + tasks,
> " Total MBytes processed: " + df.format(toMB(size)),
> " Throughput mb/sec: " + df.format(size * 1000.0 / (time * MEGA)),
> "Total Throughput mb/sec: " + df.format(toMB(size) / ((float)execTime)),
> " Average IO rate mb/sec: " + df.format(med),
> " IO rate std deviation: " + df.format(stdDev),
> " Test exec time sec: " + df.format((float)execTime / 1000),
> "" };
> {code}
> The different calculated fields can also use toMB and a shared
> milliseconds-to-seconds conversion to make it easier to keep units consistent.
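A minimal sketch of the shared-conversion idea from the description above. The
class name and the msToSecs helper are hypothetical, and the sketch assumes
that MEGA is 1024 * 1024 bytes and that execTime is in milliseconds; it is not
the committed patch.

```java
/** Sketch only: illustrates the unit mismatch, not the actual TestDFSIO code. */
final class ThroughputUnitsSketch {
  // Assumption: MEGA is 1024 * 1024 bytes.
  private static final long MEGA = 1024L * 1024L;

  /** Bytes to megabytes. */
  static double toMB(long bytes) {
    return (double) bytes / MEGA;
  }

  /** Hypothetical shared helper: milliseconds to seconds. */
  static double msToSecs(long timeMillis) {
    return timeMillis / 1000.0;
  }

  /** The calculation as reported: MB per millisecond, labeled as MB/s. */
  static double totalThroughputAsReported(long size, long execTimeMs) {
    return toMB(size) / execTimeMs;
  }

  /** Same calculation with the time converted to seconds first. */
  static double totalThroughputMBPerSec(long size, long execTimeMs) {
    return toMB(size) / msToSecs(execTimeMs);
  }

  public static void main(String[] args) {
    long size = 1024L * MEGA;  // 1024 MB written
    long execTimeMs = 2000L;   // 2 seconds of walltime
    System.out.println("as reported: " + totalThroughputAsReported(size, execTimeMs));
    System.out.println("in MB/s:     " + totalThroughputMBPerSec(size, execTimeMs));
  }
}
```

Routing every field through the same two helpers means a unit mismatch like
this one has only one place to hide.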