On 11/01/11 16:40, Raj V wrote:
Ted


Thanks. I have all the graphs I need, including the map reduce timeline and 
system activity for all the nodes while the sort was running. I will publish 
them once I have them in a presentable format.

For legal reasons, I really don't want to send the complete job history files.

My question is still this: when running terasort, would the CPU, disk and 
network utilization of all the nodes be more or less similar, or completely 
different?

They can be different. The JT (JobTracker) pushes out work to machines when they report in, so some may get more work than others and therefore generate more local data. This has follow-on consequences. In a live system things are different, as the work tends to follow the data: machines with (or near) the data you need get the work.
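
To make the "some may get more work than others" point concrete, here is a toy simulation. It is only a sketch and not Hadoop's actual scheduler: the function name, the random heartbeat order and the random number of free slots per heartbeat are all assumptions made up for illustration.

```python
import random
from collections import Counter


def simulate_assignment(num_nodes=8, num_tasks=100, slots_per_node=2, seed=0):
    """Toy model (hypothetical): hand pending tasks to nodes as they report in."""
    rng = random.Random(seed)
    pending = num_tasks
    # Start every node at zero so idle nodes still show up in the result.
    assigned = Counter({node: 0 for node in range(num_nodes)})
    while pending > 0:
        # Assumption: nodes heartbeat in a different random order each round.
        order = list(range(num_nodes))
        rng.shuffle(order)
        for node in order:
            # Assumption: a random number of slots is free at heartbeat time.
            free = rng.randint(0, slots_per_node)
            take = min(free, pending)
            assigned[node] += take
            pending -= take
            if pending == 0:
                break
    return assigned


if __name__ == "__main__":
    counts = simulate_assignment()
    for node in sorted(counts):
        print(f"node {node}: {counts[node]} tasks")
```

Even with identical nodes, the per-node task counts in a model like this usually come out uneven, which is one reason CPU, disk and network graphs need not match across machines.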

It's really hard to say "is the cluster working right". When bringing one up, everyone is really guessing about expected performance.

-Steve
