Re: Streaming and subprocess error code

Andrey Pankov Thu, 15 May 2008 04:23:03 -0700

Hi Zheng,

Your help was significant - it was my mistake, I had mixed up the option names. Now it works as desired for me. Thanks a lot!


Zheng Shao wrote:

See
https://issues.apache.org/jira/secure/attachment/12369344/exit-status-2057-0.16.patch

The option is called stream.non.zero.exit.is.failure, not
stream.non.zero.exit.status.is.failure.


Some users (including me) are pushing to make this option default to
true, but there is no response yet.
Dhruba, maybe you can help push that?
Zheng
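
A minimal sketch of how the corrected option name could be passed to a streaming job, combined with the single-attempt settings suggested further down in this thread. The jar path, input/output paths and script names are placeholders, and the assumption that the hadoop client itself exits with a non-zero code when the job fails should be verified against the release in use.

```python
import subprocess


def build_streaming_command(streaming_jar, input_path, output_path, mapper, reducer):
    """Build a Hadoop streaming command line that treats a non-zero
    subprocess exit as a task failure and does not retry failed tasks."""
    jobconf = {
        # Option name as corrected in this thread.
        "stream.non.zero.exit.is.failure": "true",
        # One attempt per task, so the first task failure fails the job.
        "mapred.map.max.attempts": "1",
        "mapred.reduce.max.attempts": "1",
    }
    cmd = [
        "hadoop", "jar", streaming_jar,  # streaming_jar is a placeholder path
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper,
        "-reducer", reducer,
    ]
    for key, value in jobconf.items():
        cmd += ["-jobconf", f"{key}={value}"]
    return cmd


def run_streaming_job(cmd):
    """Run the job; return True only if the hadoop client exited with 0.
    Assumption: the client's exit code reflects the job's final status."""
    return subprocess.run(cmd).returncode == 0
```

Calling build_streaming_command("streaming.jar", "in", "out", "./map.py", "./reduce.py") yields a list that can be handed to run_streaming_job, so a calling script can branch on the result instead of parsing logs.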
-----Original Message-----
From: Joydeep Sen Sarma
Sent: Wednesday, May 14, 2008 3:02 PM
To: Zheng Shao
Subject: FW: Streaming and subprocess error code

Looks like the bug is not fixed correctly in trunk ..

-----Original Message-----
From: Andrey Pankov [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, May 14, 2008 8:19 AM
To: [email protected]
Subject: Re: Streaming and subprocess error code

Hi,
I've tested this new option, "-jobconf stream.non.zero.exit.status.is.failure=true". It seems to work, but it is still not good enough for me. When a mapper/reducer program has read all of its input data successfully and fails after that, streaming still finishes successfully,
so there is no chance to learn about data post-processing errors in subprocesses :(
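
The case described here (a subprocess that consumes all of its input and only then fails) can be illustrated with a small mapper. This is only a sketch: map_line and post_process are hypothetical stand-ins for real processing, and the point is simply that the failure has to surface as a non-zero exit code for streaming to be able to notice it.

```python
import sys


def map_line(line):
    """Hypothetical per-record map step: emit the line with a count of 1."""
    return f"{line.rstrip()}\t1"


def post_process(records_seen):
    """Hypothetical step that runs after all input is read and may fail."""
    if records_seen == 0:
        raise RuntimeError("no input records were processed")


def run_mapper(stdin, stdout):
    """Process every input line, then post-process.
    Returns the exit code the mapper script should use."""
    seen = 0
    for line in stdin:
        stdout.write(map_line(line) + "\n")
        seen += 1
    try:
        post_process(seen)
    except Exception as exc:
        sys.stderr.write(f"post-processing failed: {exc}\n")
        return 1  # non-zero exit marks the subprocess as failed
    return 0
```

In an actual mapper script the last line would be sys.exit(run_mapper(sys.stdin, sys.stdout)), so that a late failure is not hidden behind a zero exit code.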
Andrey Pankov wrote:
Hi Rick,

Thank you for the quick response! I see this feature is in trunk and
not available in the last stable release. Anyway, I will try whether it
works for me from trunk, and whether it catches segmentation faults too.
Rick Cox wrote:
Try "-jobconf stream.non.zero.exit.status.is.failure=true".

That will tell streaming that a non-zero exit is a task failure. To
turn that into an immediate whole job failure, I think configuring 0
task retries (mapred.map.max.attempts=1 and
mapred.reduce.max.attempts=1) will be sufficient.

rick
On Tue, May 13, 2008 at 8:15 PM, Andrey Pankov <[EMAIL PROTECTED]> wrote:
Hi all,
I'm looking for a way to force Streaming to shut down the whole job when
one of its subprocesses exits with a non-zero exit code.
We have the following situation. Sometimes either the mapper or the reducer crashes; as a rule it returns some exit code. In this case the entire streaming job still finishes successfully, but that's wrong. Almost the same happens when a subprocess terminates
with a segmentation fault.
It's possible to check automatically whether a subprocess crashed only via the logs,
but that means you need to parse tons of outputs/logs/dirs/etc.
In order to find the logs of your job you have to know its jobid, e.g.
job_200805130853_0016. I don't know an easy way to determine it - just
scan stdout for the pattern. Then find the logs of each mapper and each
reducer, find a way to parse them, etc, etc...
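
The "scan stdout for the pattern" step could look roughly like the sketch below. The pattern is inferred only from the single job ID quoted above (job_, digits, underscore, digits), so it is an assumption about the ID format, and the sample text in the usage note is made up for illustration.

```python
import re

# Assumed ID shape, based on the example job_200805130853_0016:
# "job_", a run of digits, "_", and another run of digits.
JOB_ID_PATTERN = re.compile(r"\bjob_\d+_\d+\b")


def find_job_ids(text):
    """Return the distinct job IDs found in text, in order of first appearance."""
    seen = []
    for match in JOB_ID_PATTERN.finditer(text):
        job_id = match.group(0)
        if job_id not in seen:
            seen.append(job_id)
    return seen
```

For example, find_job_ids("Running job: job_200805130853_0016") returns a one-element list containing that ID, which can then be used to locate the per-task logs.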
So, is there an easier way to get the correct status of the whole streaming job, or do I still have to build rather fragile parsing systems for such purposes?
Thanks in advance.

 --
 Andrey Pankov



--
Andrey Pankov
