Take a look at https://github.com/rajvish/hadoop-summary
>________________________________
> From: bharath vissapragada <[email protected]>
> To: [email protected]
> Sent: Monday, October 29, 2012 10:03 PM
> Subject: Re: Tools for extracting data from hadoop logs
>
> Hi Binglin,
>
> Great scripts. Thanks for sharing :D
>
> Regards,
>
> On Tue, Oct 30, 2012 at 8:54 AM, Binglin Chang <[email protected]> wrote:
>
>> Hi,
>>
>> I think you want to analyze Hadoop job logs in the jobtracker history
>> folder? These logs are in a centralized folder and don't need tools like
>> Flume or Scribe to gather them.
>> I once wrote a simple Python script to parse those log files and generate
>> CSV/JSON reports. You can use it to get the execution time, counters, and
>> status of jobs, tasks, and attempts, and you can modify it to meet your
>> needs.
>>
>> Thanks,
>> Binglin
>>
>> On Tue, Oct 30, 2012 at 9:48 AM, bharath vissapragada
>> <[email protected]> wrote:
>>
>>> Hi list,
>>>
>>> Are there any tools for parsing and extracting data from Hadoop's job
>>> logs? I want to do stuff like:
>>>
>>> 1. Getting the run time of each map/reduce task
>>> 2. Total map/reduce tasks run on a particular node in that job, and some
>>> similar stuff
>>>
>>> Any suggestions?
>>>
>>> Thanks
>
> --
> Regards,
> Bharath .V
> w: http://researchweb.iiit.ac.in/~bharath.v
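For anyone following the thread: the parsing approach Binglin describes can be sketched in a few lines of Python. This is a minimal, hedged sketch, not his actual script. It assumes the pre-YARN (Hadoop 1.x) JobTracker history format, where each record is a record type followed by space-separated `KEY="VALUE"` pairs; the field names and the sample line are illustrative assumptions.

```python
import re

# Matches one KEY="VALUE" pair in a Hadoop 1.x job history record.
PAIR_RE = re.compile(r'(\w+)="([^"]*)"')

def parse_record(line):
    """Split one history line into (record_type, {key: value})."""
    record_type, _, rest = line.partition(" ")
    return record_type, dict(PAIR_RE.findall(rest))

def task_runtime_ms(fields):
    """Task run time in ms, when both timestamps are present (else None)."""
    if "START_TIME" in fields and "FINISH_TIME" in fields:
        return int(fields["FINISH_TIME"]) - int(fields["START_TIME"])
    return None

# Illustrative sample record (invented for this sketch).
sample = ('Task TASKID="task_201210300001_0001_m_000000" TASK_TYPE="MAP" '
          'START_TIME="1351500000000" FINISH_TIME="1351500042000"')
rtype, fields = parse_record(sample)
print(rtype, fields["TASK_TYPE"], task_runtime_ms(fields))
```

Aggregating the resulting dicts per `TASKID` (or per hostname field, for the per-node counts Bharath asked about) and writing them out with the `csv` or `json` modules would cover both use cases in the original question.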
