I have a cluster of 20 nodes, running 3-4 mapred jobs concurrently.
Not sure I can get anything more than job.name (which is the
pig_script name). What I can do is to have a daemon running (aka an
agent) on each node which will keep grepping new
hadoop/logs/userlogs/*/stderr files for some pre-defined, formatted error
string, and keep uploading the matches to a hadoop directory (rough
sketch below).
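Roughly what I have in mind for the per-node agent, just as a sketch:
the log glob, the error-marker string, the HDFS target directory and
the polling interval are all placeholders/assumptions, and it shells
out to the hadoop CLI for the upload.

import glob, os, subprocess, time

LOG_GLOB = "/path/to/hadoop/logs/userlogs/attempt_*/stderr"  # assumed log location
MARKER   = "MYAPP-ERROR"            # assumed pre-defined error-string format
HDFS_DIR = "/user/prasen/stderr-dumps"  # assumed target hadoop directory
seen = set()

while True:
    for path in glob.glob(LOG_GLOB):
        if path in seen or os.path.getsize(path) == 0:
            continue
        # keep only the lines that match the pre-defined error format
        with open(path) as f:
            matches = [line for line in f if MARKER in line]
        if matches:
            # write matches to a local temp file, then push it into HDFS
            attempt = os.path.basename(os.path.dirname(path))
            local = "/tmp/%s.err" % attempt
            with open(local, "w") as out:
                out.writelines(matches)
            subprocess.call(["hadoop", "fs", "-put", local,
                             "%s/%s.err" % (HDFS_DIR, attempt)])
        seen.add(path)
    time.sleep(60)  # poll for new attempt dirs every minute

Each node would run something like this under cron or a supervisor,
and the per-attempt files could then be aggregated from that one
hadoop directory afterwards.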

I was hoping that someone would have already done it :(

-Prasen

On Tue, Feb 23, 2010 at 6:00 PM, Rekha Joshi <[email protected]> wrote:
> The attempt_XXXX_XXXX_N name is in a format containing the taskid, tip and
> jobid, which can possibly narrow down your search for errors under the
> syslogs, and I think the logging includes the jobid, so a grep might help.
>
> Cheers,
> /R
>
>
> On 2/23/10 5:22 PM, "prasenjit mukherjee" <[email protected]> wrote:
>
> Generally the stderr goes to the file
> <hadoop>/logs/userlogs/attempt_XXXX_XXXX_N/stderr on the hadoop node
> running that script. But this is not practical, as it requires the user to
> go and search all the stderr files, most of which will probably be
> pseudo-empty (just the headers and footers, but no actual stderr output).
>
> Are there any available packages/techniques to help users fetch the stderr
> coming out of custom DEFINE scripts in Pig?
>
> -Thanks,
> Prasen
>
>
