Tina Shan created MAPREDUCE-7327:
------------------------------------

             Summary: Job.waitForCompletion can sleep for up to 596 hours when jobclient.completion.poll.interval is misconfigured, causing the job to hang
                 Key: MAPREDUCE-7327
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7327
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: client
    Affects Versions: 3.3.0
            Reporter: Tina Shan

The polling loop in Job.waitForCompletion sleeps for a configurable interval on every iteration, and there is little sanity checking on that value. When jobclient.completion.poll.interval is misconfigured to INT_MAX (2147483647 ms), a single Thread.sleep call can last up to about 596 hours. The thread is blocked for that whole time and does not return to the user even if the job completes in the meantime. We suggest capping the value or logging a warning when it is unreasonably large.

{code:java}
public boolean waitForCompletion(boolean verbose)
    throws IOException, InterruptedException, ClassNotFoundException {
  ...
  while (!isComplete()) {
    try {
      // Sleeps for the configured interval before re-checking job state.
      Thread.sleep(completionPollIntervalMillis);
    } catch (InterruptedException ie) {
    }
  }
  ...
}
{code}
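
One possible shape for the suggested cap is sketched below. This is only an illustration under stated assumptions, not the actual Hadoop fix: the class PollIntervalCap, the method sanitizePollInterval, the one-minute upper bound, and the 5000 ms fallback are all hypothetical names and values chosen for the example.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Hadoop: clamps a configured poll interval
// to a sane range before it is handed to Thread.sleep.
class PollIntervalCap {
    // Assumed upper bound for illustration only.
    static final long MAX_POLL_INTERVAL_MILLIS = TimeUnit.MINUTES.toMillis(1);
    // Assumed fallback used when the configured value is not positive.
    static final long FALLBACK_POLL_INTERVAL_MILLIS = 5000L;

    static long sanitizePollInterval(long configuredMillis) {
        if (configuredMillis < 1) {
            System.err.println("Poll interval " + configuredMillis
                + " ms is not positive; using " + FALLBACK_POLL_INTERVAL_MILLIS + " ms");
            return FALLBACK_POLL_INTERVAL_MILLIS;
        }
        if (configuredMillis > MAX_POLL_INTERVAL_MILLIS) {
            System.err.println("Poll interval " + configuredMillis
                + " ms exceeds the cap; using " + MAX_POLL_INTERVAL_MILLIS + " ms");
            return MAX_POLL_INTERVAL_MILLIS;
        }
        return configuredMillis;
    }

    public static void main(String[] args) {
        // An uncapped INT_MAX interval is a sleep of 596 whole hours.
        System.out.println(TimeUnit.MILLISECONDS.toHours(Integer.MAX_VALUE)); // prints 596
        System.out.println(sanitizePollInterval(Integer.MAX_VALUE));          // prints 60000
        System.out.println(sanitizePollInterval(2000));                       // prints 2000
    }
}
```

With a helper like this, the loop would sleep for sanitizePollInterval(completionPollIntervalMillis) instead of the raw configured value, so a misconfiguration delays completion detection by at most the cap and leaves a warning in the logs.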