Clearly your input size isn't changing. And depending on how the data blocks are
distributed across the nodes, there could be DataNode/disk contention.
The better way to model this is to scale the input data linearly as well: more
nodes should process more data in the same amount of time.
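For instance, a quick way to sanity-check weak scaling is to keep the per-node data volume fixed and see whether runtime stays roughly flat. A minimal Python sketch along those lines (the node counts, input sizes, and runtimes are made-up placeholders, not real measurements):

# Weak-scaling sketch: the input grows with the cluster, so ideally runtime stays flat.
# All numbers here are hypothetical placeholders.
BASE_NODES = 2        # assumed baseline cluster size
BASE_INPUT_GB = 100   # assumed input size for the baseline run

def scaled_input_gb(nodes):
    """Input size that keeps per-node work constant as the cluster grows."""
    return BASE_INPUT_GB * nodes // BASE_NODES

def weak_scaling_efficiency(base_runtime_s, runtime_s):
    """Ideal weak scaling keeps runtime flat, so efficiency = T_base / T_n."""
    return base_runtime_s / runtime_s

runtimes = {2: 610.0, 4: 655.0, 8: 630.0}   # placeholder seconds per run
for n, t in sorted(runtimes.items()):
    print(n, "nodes:", scaled_input_gb(n), "GB,",
          "efficiency = %.2f" % weak_scaling_efficiency(runtimes[2], t))
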
Thanks,
+Vinod
But I still want to find the most efficient assignment and to scale both the data
and the nodes as you said. For example, in my results, 2 nodes is the best, and 8 is
better than 4.
Why is it sub-linear from 2 to 4 but super-linear from 4 to 8? I find it
hard to model this result. Can you give me some hints about this?
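(For what it's worth, one simple starting point I know of for modeling this is a fixed-overhead fit, T(n) ≈ a + b/n; if the 4-to-8 step beats that fit, the usual suspects are caching effects, fewer map task waves, or less disk contention per DataNode. A rough Python sketch with placeholder runtimes, not the actual measurements:)

# Rough sketch: fit T(n) = a + b/n (fixed serial overhead + divisible work)
# against per-node-count runtimes. Numbers below are placeholders only.
import numpy as np

nodes = np.array([2.0, 4.0, 8.0])
runtimes = np.array([300.0, 220.0, 110.0])   # placeholder seconds

# Least-squares fit of T against 1/n: slope = divisible work, intercept = overhead.
slope, intercept = np.polyfit(1.0 / nodes, runtimes, 1)
print("overhead ~ %.0fs, divisible work ~ %.0f node-seconds" % (intercept, slope))

for n, t in zip(nodes, runtimes):
    print("%d nodes: measured %.0fs, model %.0fs" % (n, t, intercept + slope / n))
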
See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1541/
###
## LAST 60 LINES OF THE CONSOLE
###
[...truncated 31951 lines...]
java.lang.OutOfMemoryError: unable t