[ https://issues.apache.org/jira/browse/MAPREDUCE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15889206#comment-15889206 ]
Jian He commented on MAPREDUCE-6852:
------------------------------------

It looks like getJobID is used in several other places in the same class, so we may just use this method.

> Job#updateStatus() failed with NPE due to race condition
> --------------------------------------------------------
>
>                 Key: MAPREDUCE-6852
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6852
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Junping Du
>            Assignee: Junping Du
>         Attachments: MAPREDUCE-6852.patch
>
>
> Like MAPREDUCE-6762, we found this issue in a cluster where a Pig query
> occasionally failed with an NPE: "Pig uses JobControl API to track MR job status,
> but sometimes Job History Server failed to flush job meta files to HDFS which
> caused the status update failed." Besides the NPE in
> o.a.h.mapreduce.Job.getJobName, we also get an NPE in Job.updateStatus(), with
> the following exception:
> {noformat}
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.mapreduce.Job$1.run(Job.java:323)
>         at org.apache.hadoop.mapreduce.Job$1.run(Job.java:320)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1833)
>         at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320)
>         at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:604)
> {noformat}
> We found that state is null here. However, we already check that the job state
> is RUNNING, as in the code below:
> {noformat}
>   public boolean isComplete() throws IOException {
>     ensureState(JobState.RUNNING);
>     updateStatus();
>     return status.isJobComplete();
>   }
> {noformat}
> The only possible explanation is that two threads call this method at the same
> time: both pass the state check, then one thread updates the state to null
> while the other thread hits the NPE.
> We should fix this NPE.
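The failure mode described in the quoted report, and the suggestion to look the job up through a stable ID accessor, can be illustrated with a minimal sketch. This is not the Hadoop code or the attached patch: JobHandleSketch, its Status record, refreshUnsafe, refreshSafe, and the Function-based client are all hypothetical names chosen for illustration. The sketch assumes that a status lookup may return null and that a null result can end up stored in the shared field.

```java
import java.io.IOException;
import java.util.function.Function;

/** Hypothetical stand-in for a job handle; not org.apache.hadoop.mapreduce.Job. */
class JobHandleSketch {

    /** Hypothetical status record; not the Hadoop JobStatus class. */
    static final class Status {
        final String jobId;
        final boolean complete;

        Status(String jobId, boolean complete) {
            this.jobId = jobId;
            this.complete = complete;
        }
    }

    private final String jobId; // cached at construction, never null
    private Status status;      // may become null after a failed lookup

    JobHandleSketch(String jobId) {
        this.jobId = jobId;
        this.status = new Status(jobId, false);
    }

    /** Fragile variant: derives the ID from the mutable status field. */
    synchronized void refreshUnsafe(Function<String, Status> client) {
        // If an earlier call stored null, status.jobId throws NullPointerException.
        status = client.apply(status.jobId);
    }

    /** Safer variant: uses the cached ID and never stores a null status. */
    synchronized void refreshSafe(Function<String, Status> client) throws IOException {
        Status fresh = client.apply(jobId);
        if (fresh == null) {
            throw new IOException("Job status not available for " + jobId);
        }
        status = fresh;
    }

    synchronized boolean isComplete() {
        return status != null && status.complete;
    }

    public static void main(String[] args) {
        JobHandleSketch job = new JobHandleSketch("job_0001");
        try {
            job.refreshSafe(id -> null); // simulate a lookup that finds nothing
        } catch (IOException e) {
            System.out.println("refresh failed: " + e.getMessage());
        }
        try {
            job.refreshSafe(id -> new Status(id, true)); // later lookup succeeds
        } catch (IOException e) {
            System.out.println("unexpected: " + e.getMessage());
        }
        System.out.println("complete = " + job.isComplete());
    }
}
```

In the fragile variant, one caller's failed lookup leaves the field null and the next caller dereferences it. Using an ID that does not depend on the mutable field removes that dependency, and the null check turns a missing status into a checked exception the caller can handle.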