Hello, I'm currently using hadoop-0.18.0 and testing it on a cluster with 1 master and 4 slaves. In hadoop-site.xml, "mapred.map.tasks" is set to 10. Because the "throughput" and "average IO rate" values are similar, I'm only posting the "throughput" values from 3 runs of the same command:
> hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 2048 -nrFiles 1

- with "dfs.replication = 1" => 33.60 / 31.48 / 30.95
- with "dfs.replication = 2" => 26.40 / 20.99 / 21.70

I noticed something strange while reading the source code:

- The number of reduce tasks is always set to 1: runIOTest() calls job.setNumReduceTasks(1), and analyzeResult() uses reduceFile = new Path(WRITE_DIR, "part-00000"). So I think that if we instead set mapred.reduce.tasks = 2, we would get 2 output files on the file system, "part-00000" and "part-00001", e.g. /benchmarks/TestDFSIO/io_write/part-00000.

- I also don't understand the line "double med = rate / 1000 / tasks". Shouldn't it be "double med = rate * tasks / 1000"?

--
View this message in context: http://www.nabble.com/TestDFSIO-delivers-bad-values-of-%22throughput%22-and-%22average-IO-rate%22-tp21312088p21312088.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
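P.S. To illustrate the first point about the reduce count: MapReduce writes one output file per reduce task, named with the task's index. This is just a hypothetical sketch of the naming (it is not TestDFSIO code), showing why hardcoding "part-00000" only works while the job is pinned to a single reducer:

```java
// Hypothetical sketch: with N reduce tasks, the job's output directory
// contains one file per reducer, named part-00000 .. part-0000(N-1).
// TestDFSIO's analyzeResult() only ever reads part-00000.
public class PartNames {
    public static void main(String[] args) {
        int numReduceTasks = 2; // e.g. if mapred.reduce.tasks were honored as 2
        for (int i = 0; i < numReduceTasks; i++) {
            // Reducer outputs are zero-padded to five digits.
            System.out.println(String.format("/benchmarks/TestDFSIO/io_write/part-%05d", i));
        }
    }
}
```

So with 2 reducers, half of the collected statistics would land in part-00001 and never be read by analyzeResult() — which is presumably why the job forces setNumReduceTasks(1).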
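P.P.S. On the second point, my reading (an assumption on my part, based on the mapper side of the code) is that each map task reports its own IO rate in MB/s multiplied by 1000, and the reducer sums those values into "rate". If that is right, then dividing by 1000 and by tasks recovers the mean rate, and "rate * tasks / 1000" would not. A small sketch with made-up per-task rates:

```java
// Sketch under the assumption that each task emits (rate_in_MB/s * 1000)
// and the reducer sums them into a single "rate" accumulator.
public class MedCheck {
    public static void main(String[] args) {
        double[] taskRatesMbSec = {30.0, 32.0, 34.0}; // hypothetical per-task rates
        long rate = 0;                                // summed, scaled-by-1000 rates
        for (double r : taskRatesMbSec) {
            rate += (long) (r * 1000);
        }
        int tasks = taskRatesMbSec.length;
        double med = (double) rate / 1000 / tasks;    // undo scaling, then average
        System.out.println(med);                      // 32.0 for these numbers
    }
}
```

So if my assumption about the scaling is correct, the line is the average, not a bug — but I'd appreciate confirmation from someone who knows the code.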
