[jira] [Created] (MAPREDUCE-6852) Job#updateStatus() failed with NPE due to race condition

Junping Du (JIRA) Tue, 28 Feb 2017 12:43:38 -0800

Junping Du created MAPREDUCE-6852:
-------------------------------------

             Summary: Job#updateStatus() failed with NPE due to race condition
                 Key: MAPREDUCE-6852
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6852
             Project: Hadoop Map/Reduce
          Issue Type: Bug
            Reporter: Junping Du
            Assignee: Junping Du



As with MAPREDUCE-6762, we found this issue in a cluster where Pig queries 
occasionally failed with an NPE - "Pig uses JobControl API to track MR job status, 
but sometimes Job History Server failed to flush job meta files to HDFS which 
caused the status update failed." Besides the NPE in o.a.h.mapreduce.Job.getJobName, 
we also get an NPE in Job.updateStatus(), with the following exception:
{noformat}
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.mapreduce.Job$1.run(Job.java:323)
        at org.apache.hadoop.mapreduce.Job$1.run(Job.java:320)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1833)
        at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320)
        at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:604)
{noformat}
We found that state is null here, even though isComplete() already checks 
that the job state is RUNNING, as the code below shows:
{noformat}
  public boolean isComplete() throws IOException {
    ensureState(JobState.RUNNING);
    updateStatus();
    return status.isJobComplete();
  }
{noformat}
The only possible explanation is that two threads call this method at the same 
time: both pass the ensureState() check, then one thread sets the state to null 
while the other thread hits the NPE.
We should fix this NPE.
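
To illustrate the kind of check-then-act race described above, here is a minimal, self-contained sketch. It is not the Hadoop code and not a proposed patch: TrackedJob, Status, StatusSource and both isComplete variants are hypothetical stand-ins, and the assumption that a failed refresh stores null in the shared field is made only for the sake of the example. The unsafe variant dereferences the shared field at the point of use, so a caller that arrives after a failed refresh gets an NPE; the safe variant works on a local snapshot and reports the problem as an IOException instead.

```java
import java.io.IOException;

// Hypothetical stand-in for a job handle; NOT org.apache.hadoop.mapreduce.Job.
class TrackedJob {
    /** Hypothetical stand-in for a job status record. */
    static final class Status {
        final String jobId;
        final boolean complete;
        Status(String jobId, boolean complete) {
            this.jobId = jobId;
            this.complete = complete;
        }
    }

    /** Hypothetical remote lookup; assumed to return null when no status is available. */
    interface StatusSource {
        Status fetch(String jobId) throws IOException;
    }

    private final StatusSource source;
    private volatile Status status; // shared between callers

    TrackedJob(StatusSource source, Status initial) {
        this.source = source;
        this.status = initial;
    }

    // Unsafe: stores whatever the lookup returns (possibly null) and
    // dereferences the shared field directly on the next call.
    boolean isCompleteUnsafe() throws IOException {
        status = source.fetch(status.jobId); // NPE if an earlier refresh stored null
        if (status == null) {
            throw new IOException("Job status not available");
        }
        return status.complete;
    }

    // Safer: read the field once into a local, null-check it, and never
    // overwrite the last known status with null.
    boolean isCompleteSafe() throws IOException {
        Status snapshot = status;
        if (snapshot == null) {
            throw new IOException("Job status not available");
        }
        Status refreshed = source.fetch(snapshot.jobId);
        if (refreshed == null) {
            throw new IOException("Job status not available");
        }
        status = refreshed;
        return refreshed.complete;
    }

    public static void main(String[] args) {
        // The lookup always fails, as if the job's status could not be found.
        TrackedJob job = new TrackedJob(id -> null, new Status("job_1", false));
        for (int i = 0; i < 2; i++) {
            try {
                job.isCompleteUnsafe();
            } catch (IOException | NullPointerException e) {
                System.out.println("unsafe call " + i + ": " + e.getClass().getSimpleName());
            }
        }
    }
}
```

In this sketch the second caller of the unsafe variant plays the role of the thread that hits the NPE. Making the whole check-and-refresh sequence atomic (for example with a lock held across both steps) would be another way to close the window; which approach suits the real Job class is left open here.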



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

