We have intermittently seen cases where a job "freezes" for some as yet
unknown reason and thereby blocks other processes waiting for that job
to complete. I'm trying to modify our job-launching scripts to kill the
job if it doesn't complete in a reasonable amount of time. To do that,
however, I need to know the job ID (i.e. something along the lines of
job_201104292017_9008).
As far as I can tell there is no good way to map the name we use to
create the job to the job ID from the command line (none of the "hadoop
job" options do it). Am I missing something in the CLI that would help
with this? Alternate ideas are also welcome.
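
To make the goal concrete, here is a rough sketch of the kind of wrapper
I have in mind. It assumes the launching client prints the job ID
somewhere in its own output (e.g. a "Running job: job_..." log line),
which may not hold for every launcher. The names run_with_timeout and
extract_job_id are made up for illustration, and "grep -o" is assumed
to be available.

```shell
#!/bin/sh
# Sketch only: extract_job_id and run_with_timeout are hypothetical helpers.

# Print the first job ID (e.g. job_201104292017_9008) found on stdin.
extract_job_id() {
    grep -o 'job_[0-9][0-9]*_[0-9][0-9]*' | head -n 1
}

# Usage: run_with_timeout SECONDS COMMAND [ARGS...]
# Runs COMMAND in the background and captures its output in a temp log.
# If COMMAND is still running after SECONDS, kill the Hadoop job whose ID
# appears in the log (if any), then the local client, and return 124.
# Otherwise return COMMAND's own exit status.
run_with_timeout() {
    limit=$1; shift
    log=$(mktemp) || return 1
    "$@" >"$log" 2>&1 &
    pid=$!
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            job_id=$(extract_job_id <"$log")
            # Assumption: the ID in the log is the job we launched.
            [ -n "$job_id" ] && hadoop job -kill "$job_id"
            kill "$pid" 2>/dev/null
            rm -f "$log"
            return 124
        fi
        sleep 1
        waited=$((waited + 1))
    done
    wait "$pid"
    status=$?
    rm -f "$log"
    return "$status"
}
```

It would be called as something like "run_with_timeout 3600 hadoop jar
ourjob.jar ..." (jar name hypothetical). Scraping the client output is
fragile, which is why I'd prefer a proper name-to-ID lookup.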
- Adam
- Getting (or setting) a job ID Adam Phelps