I believe that's one of the use cases for the Hadoop subproject Chukwa.
Check it out.

-D

On Tue, Feb 23, 2010 at 5:40 AM, prasenjit mukherjee
<[email protected]>wrote:

> I have a cluster of 20 nodes, running 3-4 mapred jobs concurrently.
> I'm not sure I can get anything more than job.name (which is the
> pig_script name). What I can do is have a daemon (aka agent) running
> on each node which keeps grepping new hadoop/logs/*/userlog/stderr
> files for some pre-defined, formatted error string and keeps
> uploading the matches to a hadoop-directory.
>
> I was hoping that someone would have already done it :(
>
> -Prasen
>
> On Tue, Feb 23, 2010 at 6:00 PM, Rekha Joshi <[email protected]>
> wrote:
> > The attempt_XXXX_XXXX_N id is in a format that contains the task id,
> > tip, and job id, which can possibly narrow down your search for the
> > error under the syslogs, and I think the logging has the job id, so a
> > grep might help.
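> >
> > For example, something along these lines might locate the attempts
> > worth looking at (the job id here is made up, and <hadoop> is the
> > same placeholder path as below):
> >
> >   grep -l ERROR <hadoop>/logs/userlog/attempt_201002230840_0001_*/syslog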
> >
> > Cheers,
> > /R
> >
> >
> > On 2/23/10 5:22 PM, "prasenjit mukherjee" <[email protected]> wrote:
> >
> > Generally the stderr goes to the file
> > <hadoop>/logs/userlog/attempt_XXXX_XXXX_N/stderr on the hadoop node
> > running that script. But that is not practical, as it requires the user
> > to go and search all the stderr files, most of which will probably be
> > pseudo-empty (just the headers and footers, but no actual stderr
> > output).
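> >
> > Manually, that search would be something like the following on every
> > node (the size threshold for skipping the pseudo-empty files is just
> > a guess):
> >
> >   find <hadoop>/logs/userlog -name stderr -size +1k -print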
> >
> > Are there any available packages/techniques to help users fetch the
> > stderr coming out of custom DEFINE scripts in Pig?
> >
> > -Thanks,
> > Prasen
> >
> >
>
