Hi all,
I'm looking for a way to force Streaming to shut down the whole job
when one of its subprocesses exits with a non-zero exit code.
Here is our situation. Sometimes either the mapper or the reducer
crashes, and as a rule it returns a non-zero exit code. In this case the
entire streaming job still finishes successfully, which is wrong. Almost
the same happens when a subprocess dies with a segmentation fault.
It's possible to check automatically whether a subprocess crashed only
via the logs, but that means parsing tons of outputs/logs/dirs/etc.
In order to find the logs of your job you have to know its jobid,
something like job_200805130853_0016. I don't know an easy way to
determine it other than scanning stdout for the pattern. Then you have
to find the logs of each mapper and each reducer, find a way to parse
them, etc, etc...
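
To illustrate the stdout scan mentioned above, here is a minimal sketch.
It assumes every job ID has the same shape as the single example given
(job_, a run of digits, an underscore, another run of digits). The
function name extract_job_ids and the pattern itself are my own
assumptions, not anything provided by Hadoop.

```python
import re

# Assumed shape of a job ID, inferred only from the example
# job_200805130853_0016; the real format may differ.
JOB_ID_RE = re.compile(r"\bjob_\d+_\d+\b")


def extract_job_ids(text):
    """Return the distinct job IDs found in text, in order of first appearance."""
    seen = []
    for match in JOB_ID_RE.findall(text):
        if match not in seen:
            seen.append(match)
    return seen
```

The captured stdout of the streaming run would be passed in as one
string. This only recovers the jobid; locating and parsing the
per-mapper and per-reducer logs is still a separate step.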
So, is there an easier way to get the correct status of the whole
streaming job, or do I still have to build a rather fragile parsing
system for this purpose?
Thanks in advance.
--
Andrey Pankov