I have a cluster of 20 nodes running 3-4 mapred jobs concurrently. I'm not sure I can get anything more than job.name (which is the pig_script name). What I could do is run a daemon (aka agent) on each node that keeps grepping the new hadoop/logs/*/userlog/stderr files for a pre-defined, formatted error string and keeps uploading the matches to a hadoop directory, something like the rough sketch below.
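
A minimal sketch of such a per-node agent, as a cron-able shell script. The log path, error marker, and HDFS target directory are my own assumptions for illustration, not anything Pig or Hadoop provide out of the box:

#!/bin/sh
# Hypothetical per-node agent: grep each task attempt's stderr for a
# pre-defined error marker and push the matches into an HDFS directory.
# All paths and the marker string below are assumptions; adjust per install.

LOG_DIR=${HADOOP_HOME:-/usr/local/hadoop}/logs/userlogs   # assumed per-attempt stderr location
ERROR_MARKER="MYUDF-ERROR:"                                # marker the DEFINE scripts are assumed to print
HDFS_DIR=/user/prasen/script-errors                        # assumed collection directory in HDFS
HOST=$(hostname)

for f in "$LOG_DIR"/attempt_*/stderr; do
  [ -s "$f" ] || continue                                  # skip zero-length stderr files
  attempt=$(basename "$(dirname "$f")")
  out=/tmp/${HOST}_${attempt}.err
  # keep only the lines carrying the marker; attempts with no real error output are dropped
  grep "$ERROR_MARKER" "$f" > "$out" || { rm -f "$out"; continue; }
  # -put fails if the target already exists, so re-runs of the script are cheap no-ops
  hadoop fs -put "$out" "$HDFS_DIR/${HOST}_${attempt}.err" 2>/dev/null
  rm -f "$out"
done

Running this from cron on every node would at least centralize the interesting stderr lines under one HDFS directory, keyed by host and attempt id.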
I was hoping that someone would have already done it :(

-Prasen

On Tue, Feb 23, 2010 at 6:00 PM, Rekha Joshi <[email protected]> wrote:
> The attempt_XXXX_XXXX_N is in a format containing the task id, tip, and job id,
> which can possibly narrow down your search for the error under syslogs, and I
> think the logging has the job id, so a grep might help.
>
> Cheers,
> /R
>
>
> On 2/23/10 5:22 PM, "prasenjit mukherjee" <[email protected]> wrote:
>
> Generally the stderr goes to the file
> <hadoop>/logs/userlog/attempt_XXXX_XXXX_N/stderr on the hadoop node
> running that script. But this is not practical, as it requires the user to go
> and search all the stderr files, most of which will probably be
> pseudo-empty (just headers and footers, with no actual stderr output).
>
> Are there any available packages/techniques that help users fetch the stderr
> coming out of custom DEFINE scripts in pig?
>
> -Thanks,
> Prasen
