Measuring Shuffle time for MR job

2012-08-27 Thread praveenesh kumar
Is there a way to find the total shuffle time of a MapReduce job, such as a command or some output that reports it? I want to measure the total map, total shuffle, and total reduce time for my MR job. How can I achieve that? I am using Hadoop 0.20.205. Regards, Praveenesh

Re: Measuring Shuffle time for MR job

2012-08-27 Thread Bertrand Dechoux
Shuffle time is considered part of the reduce step. Without a reduce phase, there is no need for shuffling. One way to estimate it would be to measure the full reduce time with a '/dev/null' reducer, that is, one that discards its input. Beyond that, I am not aware of any direct way to measure it. Regards Bertrand On Mon, Aug 27, 2012 at 8:18 AM, praveenesh

Re: Measuring Shuffle time for MR job

2012-08-27 Thread Raj Vishwanathan
You can extract the shuffle time from the job log. Take a look at https://github.com/rajvish/hadoop-summary Raj From: Bertrand Dechoux decho...@gmail.com To: common-user@hadoop.apache.org Sent: Monday, August 27, 2012 12:57 AM Subject: Re: Measuring
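As a rough illustration of what "extracting the shuffle time from the job log" could look like, here is a minimal Python sketch. It is not the hadoop-summary script linked above. It assumes the 0.20-era job history format, in which each `ReduceAttempt` line carries `KEY="value"` pairs and the fields `START_TIME`, `SHUFFLE_FINISHED`, `SORT_FINISHED`, and `FINISH_TIME` hold millisecond timestamps. The function name `reduce_phase_times` is hypothetical; check the field names against your own history file before relying on it.

```python
import re
from collections import defaultdict

# Matches KEY="value" pairs; values may contain backslash-escaped characters.
PAIR = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def reduce_phase_times(lines):
    """Return per-attempt shuffle/sort/reduce durations in milliseconds.

    Assumption: the history file logs each reduce attempt on two lines,
    one with START_TIME and one with the *_FINISHED / FINISH_TIME fields,
    both tagged with the same TASK_ATTEMPT_ID.
    """
    attempts = defaultdict(dict)
    for line in lines:
        if not line.startswith("ReduceAttempt "):
            continue
        fields = dict(PAIR.findall(line))
        attempt_id = fields.get("TASK_ATTEMPT_ID")
        if attempt_id:
            # Merge the start record and the finish record of one attempt.
            attempts[attempt_id].update(fields)

    result = {}
    for attempt_id, f in attempts.items():
        if f.get("TASK_STATUS") != "SUCCESS":
            continue  # skip failed, killed, or still-running attempts
        try:
            start = int(f["START_TIME"])
            shuffled = int(f["SHUFFLE_FINISHED"])
            sorted_at = int(f["SORT_FINISHED"])
            finished = int(f["FINISH_TIME"])
        except (KeyError, ValueError):
            continue  # record is incomplete or malformed
        result[attempt_id] = {
            "shuffle_ms": shuffled - start,
            "sort_ms": sorted_at - shuffled,
            "reduce_ms": finished - sorted_at,
        }
    return result


def total_shuffle_ms(lines):
    """Sum of shuffle durations over all successful reduce attempts."""
    return sum(t["shuffle_ms"] for t in reduce_phase_times(lines).values())
```

One would call it with the lines of a job history file, for example `reduce_phase_times(open(path))`, where `path` is wherever your cluster stores job history. Note that reducers can start copying map output before every map has finished, so a per-attempt shuffle duration computed this way may include time spent waiting for maps to complete.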

Number of reducers

2012-08-27 Thread Abhishek
Hi all, I just want to know what factors the MapReduce framework uses to decide the number of reducers to launch for a job. Is it right that, by default, only one reducer is launched for a given job if we do not explicitly set the number via the command line or the driver class? If I