On Nov 8, 2007, at 8:39 PM, Doug Judd wrote:

Thanks, Owen.  Did it look like the system was CPU bound?

I looked while the Java one was running, and it was working a couple of the CPUs pretty hard. (I was only running with the default 2 tasks/node, which is really low given these are nice 8-CPU machines.)

I should also mention that I was using a 500-node HDFS cluster that is a superset of the 39-node + 1 job tracker map/reduce cluster, so most of the HDFS reads and writes were outside of the map/reduce cluster.

It would be interesting to see some top output for the various runs. It would also be interesting to profile the Java stuff in both Pipes mode and non-Pipes mode.

What I'm doing is putting together a somewhat representative workload to look at increasing utilization, so at some point I'll dive deep into the details, but the first pass will be looking at the top-level issues.

-- Owen

