I believe that's one of the use cases for the Hadoop subproject Chukwa. Check it out.
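In the meantime, a rough sketch of the grep-and-upload agent you describe below could look something like this. It is purely illustrative: the userlogs path, the error-marker string, the HDFS destination, and the polling interval are all placeholders you would adapt, and it simply shells out to the standard `hadoop fs -put` to push matches into a shared directory.

```python
#!/usr/bin/env python
"""Sketch of a per-node agent: scan new task stderr files for a
pre-agreed error marker and push any hits to an HDFS directory.
All paths and the marker below are placeholders, not anything
Hadoop or Pig defines."""

import glob
import os
import subprocess
import time

LOG_GLOB = "/opt/hadoop/logs/userlogs/attempt_*/stderr"  # adjust to your install
ERROR_MARKER = "MYAPP-ERROR:"        # the pre-defined format your scripts emit
HDFS_DEST = "/user/prasen/stderr-reports"  # shared hadoop directory to collect into
POLL_SECONDS = 60

seen = set()  # stderr files already processed

def collect_once():
    for path in glob.glob(LOG_GLOB):
        if path in seen or os.path.getsize(path) == 0:
            continue
        with open(path) as f:
            hits = [line for line in f if ERROR_MARKER in line]
        seen.add(path)
        if not hits:
            continue  # pseudo-empty stderr, nothing worth reporting
        attempt_id = os.path.basename(os.path.dirname(path))
        local_report = "/tmp/%s.stderr" % attempt_id
        with open(local_report, "w") as out:
            out.writelines(hits)
        # push to one HDFS directory so users have a single place to search
        subprocess.call(["hadoop", "fs", "-put", local_report,
                         "%s/%s.stderr" % (HDFS_DEST, attempt_id)])

if __name__ == "__main__":
    while True:
        collect_once()
        time.sleep(POLL_SECONDS)
```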
-D

On Tue, Feb 23, 2010 at 5:40 AM, prasenjit mukherjee <[email protected]> wrote:
> I have a cluster of 20 nodes, running 3-4 mapred jobs concurrently.
> Not sure I can get anything more than job.name (which is the
> pig_script name). What I can do is to have a daemon running (aka
> agent) in each node which will keep doing grep on new
> hadoop/logs/*/userlog/stderr with some pre-defined formatted error
> string, and keep uploading it to a hadoop-directory.
>
> I was hoping that someone would have already done it :(
>
> -Prasen
>
> On Tue, Feb 23, 2010 at 6:00 PM, Rekha Joshi <[email protected]> wrote:
> > The attempt_XXXX_XXXX_N is in a format containing the taskid, tip, and jobid,
> > which can possibly narrow your search down to the error under syslogs, and I
> > think the logging includes the jobid, so a grep might help.
> >
> > Cheers,
> > /R
> >
> > On 2/23/10 5:22 PM, "prasenjit mukherjee" <[email protected]> wrote:
> >
> > Generally the stderr goes to the file
> > <hadoop>/logs/userlog/attempt_XXXX_XXXX_N/stderr on the hadoop node
> > running that script. But that is not practical, as it requires the user to go
> > and search all the stderr files, most of which will probably be
> > pseudo-empty (just the headers and footers, with no actual stderr output).
> >
> > Are there any available packages/techniques to help users fetch the stderr
> > coming out of custom DEFINE scripts in Pig?
> >
> > -Thanks,
> > Prasen
