Re: Streaming and subprocess error code

Andrey Pankov Tue, 13 May 2008 08:57:39 -0700

Hi Rick,

Thank you for the quick response! I see this feature is in trunk and not available in the last stable release. Anyway, I will try whether it works for me from the trunk, and whether it catches segmentation faults too.



Rick Cox wrote:

Try "-jobconf stream.non.zero.exit.status.is.failure=true".

That will tell streaming that a non-zero exit is a task failure. To
turn that into an immediate whole job failure, I think configuring 0
task retries (mapred.map.max.attempts=1 and
mapred.reduce.max.attempts=1) will be sufficient.
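
As a rough sketch of how these settings might be passed together, the Python helper below assembles an argument list for a streaming job. Only the three -jobconf keys come from the message above; the function name and the jar path, input/output paths and mapper/reducer names passed to it are placeholders for illustration.

```python
def streaming_args(jar, input_path, output_path, mapper, reducer):
    """Build an argv list for a streaming job that should fail fast.

    All arguments are caller-supplied placeholders; nothing here
    is specific to a particular Hadoop installation.
    """
    # Settings suggested in the thread: treat a non-zero exit as a
    # task failure, and allow only one attempt per map/reduce task.
    jobconf = {
        "stream.non.zero.exit.status.is.failure": "true",
        "mapred.map.max.attempts": "1",
        "mapred.reduce.max.attempts": "1",
    }
    args = [
        "hadoop", "jar", jar,
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper,
        "-reducer", reducer,
    ]
    for key, value in jobconf.items():
        args += ["-jobconf", f"{key}={value}"]
    return args
```

The resulting list could be handed to subprocess.run(); whether the flag is honoured depends on the Hadoop version in use.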

rick

On Tue, May 13, 2008 at 8:15 PM, Andrey Pankov <[EMAIL PROTECTED]> wrote:

Hi all,

 I'm looking for a way to force Streaming to shut down the whole job when
any of its subprocesses exits with a non-zero exit code.

 We have the following situation. Sometimes either the mapper or the reducer
crashes; as a rule it returns some non-zero exit code. In this case the entire
streaming job still finishes successfully, which is wrong. Almost the same
happens when a subprocess dies with a segmentation fault.
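
 To illustrate the two failure modes mentioned here, the sketch below shows how a plain Python parent process can tell a non-zero exit apart from death by a signal such as SIGSEGV. It only describes a locally started child process and says nothing about how Streaming itself reports these cases; the function names are made up for the example.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short status string."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by " + name
    return "exit code %d" % returncode


def run_and_describe(argv):
    """Run a command (argv list) and describe how it ended."""
    return describe_exit(subprocess.run(argv).returncode)
```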

 Currently the only way to check automatically whether a subprocess crashed
is via the logs, but that means parsing tons of outputs/logs/dirs/etc.
 To find the logs of your job you have to know its jobid, e.g.
job_200805130853_0016. I don't know an easy way to determine it other than
scanning stdout for the pattern. Then you have to find the logs of each
mapper and each reducer, find a way to parse them, etc, etc...
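
 The stdout scan described above could look roughly like the following. The pattern is an assumption generalised from the single example id job_200805130853_0016 (a 12-digit timestamp followed by a sequence number), and the function name is hypothetical.

```python
import re

# Assumed shape: "job_" + 12-digit timestamp + "_" + 4 or more digits.
JOB_ID_RE = re.compile(r"job_\d{12}_\d{4,}")


def find_job_ids(text):
    """Return the distinct job ids found in text, in order of appearance."""
    seen = []
    for match in JOB_ID_RE.finditer(text):
        job_id = match.group(0)
        if job_id not in seen:
            seen.append(job_id)
    return seen
```

 This is exactly the kind of fragile parsing the question is trying to avoid, so it is shown only to make the workaround concrete.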

 So, is there an easier way to get the correct status of the whole streaming
job, or do I still have to build rather fragile parsing systems for such purposes?

 Thanks in advance.

 --
 Andrey Pankov



--
Andrey Pankov
