Hello, I'm currently using hadoop-0.18.0 and testing it on a cluster with 1 master and 4 slaves. In hadoop-site.xml, "mapred.map.tasks" is set to 10. Because the "throughput" and "average IO rate" values are similar, I'm only posting the "throughput" values from 3 runs of the same command:
> hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 2048 -nrFiles 1

- with "dfs.replication = 1" => 33.60 / 31.48 / 30.95
- with "dfs.replication = 2" => 26.40 / 20.99 / 21.70

I noticed something strange while reading the source code:

- The number of reduce tasks is always set to 1: runIOTest() calls job.setNumReduceTasks(1), and analyzeResult() uses reduceFile = new Path(WRITE_DIR, "part-00000"). So I think that if we instead set mapred.reduce.tasks = 2, we would get 2 output files on the file system, "part-00000" and "part-00001", e.g. /benchmarks/TestDFSIO/io_write/part-00000.

- I also don't understand the line "double med = rate / 1000 / tasks". Shouldn't it be "double med = rate * tasks / 1000"?

--
View this message in context: http://www.nabble.com/TestDFSIO-delivers-bad-values-of-%22throughput%22-and-%22average-IO-rate%22-tp21312088p21312088.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
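P.S. To illustrate the first point about the reduce count: MapReduce writes one output file per reduce task, named with the task's index. This is just a hypothetical sketch of the naming (it is not TestDFSIO code), showing why hardcoding "part-00000" only works while the job is pinned to a single reducer:

```java
// Hypothetical sketch: with N reduce tasks, the job's output directory
// contains one file per reducer, named part-00000 .. part-0000(N-1).
// TestDFSIO's analyzeResult() only ever reads part-00000.
public class PartNames {
    public static void main(String[] args) {
        int numReduceTasks = 2; // e.g. if mapred.reduce.tasks were honored as 2
        for (int i = 0; i < numReduceTasks; i++) {
            // Reducer outputs are zero-padded to five digits.
            System.out.println(String.format("/benchmarks/TestDFSIO/io_write/part-%05d", i));
        }
    }
}
```

So with 2 reducers, half of the collected statistics would land in part-00001 and never be read by analyzeResult() — which is presumably why the job forces setNumReduceTasks(1).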
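P.P.S. On the second point, my reading (an assumption on my part, based on the mapper side of the code) is that each map task reports its own IO rate in MB/s multiplied by 1000, and the reducer sums those values into "rate". If that is right, then dividing by 1000 and by tasks recovers the mean rate, and "rate * tasks / 1000" would not. A small sketch with made-up per-task rates:

```java
// Sketch under the assumption that each task emits (rate_in_MB/s * 1000)
// and the reducer sums them into a single "rate" accumulator.
public class MedCheck {
    public static void main(String[] args) {
        double[] taskRatesMbSec = {30.0, 32.0, 34.0}; // hypothetical per-task rates
        long rate = 0;                                // summed, scaled-by-1000 rates
        for (double r : taskRatesMbSec) {
            rate += (long) (r * 1000);
        }
        int tasks = taskRatesMbSec.length;
        double med = (double) rate / 1000 / tasks;    // undo scaling, then average
        System.out.println(med);                      // 32.0 for these numbers
    }
}
```

So if my assumption about the scaling is correct, the line is the average, not a bug — but I'd appreciate confirmation from someone who knows the code.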
