We have intermittently seen cases where a job "freezes" for some as-yet-unknown reason and thereby blocks other processes waiting for that job to complete. I'm trying to modify our job-launching scripts to kill the job if it doesn't complete in a reasonable amount of time. To do that, however, I need to know the job ID (i.e., something along the lines of job_201104292017_9008).

As far as I can tell, there is no good way to map the name we use to create the job to its job ID from the command line (none of the "hadoop job" options do it). Am I missing something in the CLI that would help with this? Alternative ideas are also welcome.
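
One possible workaround, sketched here only as an illustration: have the launching script capture the launcher's own output and pull the job ID out of it, then pass that ID to "hadoop job -kill" once the timeout expires. This assumes the launcher prints the ID somewhere in its output (the job client typically logs a "Running job: job_..." line, but that should be checked against the actual logs) and that `hadoop` is on the PATH. The function names below are hypothetical.

```python
import re
import subprocess

# Matches IDs shaped like job_201104292017_9008 (the format quoted above).
JOB_ID_PATTERN = re.compile(r"\bjob_\d+_\d+\b")


def extract_job_id(launcher_output):
    """Return the first job ID found in the launcher's output, or None."""
    match = JOB_ID_PATTERN.search(launcher_output)
    return match.group(0) if match else None


def kill_job(job_id):
    """Ask Hadoop to kill the job; assumes `hadoop` is on the PATH."""
    return subprocess.call(["hadoop", "job", "-kill", job_id])
```

The launching script would still need its own timer; once the deadline passes, it could call kill_job() with whatever extract_job_id() recovered from the captured output.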

- Adam
