> > At the default log level, Hadoop job logs (the ones you also get in
> > the job's output directory under _logs/history)

Thanks Simone, that's exactly what I was looking for.

> Look at the job history logs. They break down the times for each task.

I understand you guys are talking about the same thing? I'm using the
file in /outputDir/_logs/history. Interestingly, before you told me, I
was convinced that was actually a .jar archive, so it took me a little
while to figure out where these history logs were :)

Thanks again folks!

Antonio

On Wed, Mar 17, 2010 at 4:45 PM, Owen O'Malley <[email protected]> wrote:
>
> On Mar 17, 2010, at 4:47 AM, Antonio D'Ettole wrote:
>
>> Hi everybody,
>> as part of my project work at school I'm running some Hadoop jobs on a
>> cluster. I'd like to measure exactly how long each phase of the process
>> takes: mapping, shuffling (ideally divided in copying and sorting) and
>> reducing.
>
> Look at the job history logs. They break down the times for each task. You
> need to run a script to aggregate them. You can see an example of the
> aggregation on my petabyte sort description:
>
> http://developer.yahoo.net/blogs/hadoop/2009/05/hadoop_sorts_a_petabyte_in_162.html
>
> -- Owen
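[Editor's note: the aggregation script Owen refers to is not included in the thread. Below is a minimal sketch of what such a script could look like, assuming the pre-YARN JobHistory format in which each event line is a sequence of KEY="value" pairs. The specific field names (START_TIME, SHUFFLE_FINISHED, SORT_FINISHED, FINISH_TIME) and the sample lines are illustrative, not copied from a real log.]

```python
import re

# Matches one KEY="value" pair in a history-log event line.
FIELD_RE = re.compile(r'(\w+)="([^"]*)"')

def parse_line(line):
    """Return a dict of KEY -> value for one history-log event line."""
    return dict(FIELD_RE.findall(line))

def phase_times(lines):
    """Sum per-phase durations (ms) across all successful attempts.

    Map time is finish - start; for reducers, shuffle runs from start to
    SHUFFLE_FINISHED, sort from SHUFFLE_FINISHED to SORT_FINISHED, and
    the reduce function itself from SORT_FINISHED to FINISH_TIME.
    """
    totals = {"map": 0, "shuffle": 0, "sort": 0, "reduce": 0}
    for line in lines:
        f = parse_line(line)
        if f.get("TASK_STATUS") != "SUCCESS":
            continue  # skip failed/killed attempts
        if line.startswith("MapAttempt"):
            totals["map"] += int(f["FINISH_TIME"]) - int(f["START_TIME"])
        elif line.startswith("ReduceAttempt"):
            totals["shuffle"] += int(f["SHUFFLE_FINISHED"]) - int(f["START_TIME"])
            totals["sort"] += int(f["SORT_FINISHED"]) - int(f["SHUFFLE_FINISHED"])
            totals["reduce"] += int(f["FINISH_TIME"]) - int(f["SORT_FINISHED"])
    return totals

# Hypothetical sample lines standing in for the _logs/history file contents.
SAMPLE = [
    'MapAttempt TASK_TYPE="MAP" TASKID="task_m_000000" '
    'START_TIME="1000" FINISH_TIME="5000" TASK_STATUS="SUCCESS"',
    'ReduceAttempt TASK_TYPE="REDUCE" TASKID="task_r_000000" '
    'START_TIME="6000" SHUFFLE_FINISHED="9000" SORT_FINISHED="10000" '
    'FINISH_TIME="12000" TASK_STATUS="SUCCESS"',
]

if __name__ == "__main__":
    print(phase_times(SAMPLE))
```

Running it against a real history file would just mean replacing SAMPLE with the lines of /outputDir/_logs/history; timestamps in those logs are epoch milliseconds, so the totals come out in milliseconds too.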
