[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated MAPREDUCE-6950:
-------------------------------------------
    Fix Version/s:     (was: 2.7.5)

> Error Launching job : java.io.IOException: Unknown Job job_xxx_xxx
> ------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6950
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6950
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am
>    Affects Versions: 2.7.1
>            Reporter: zhengchenyu
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> Some jobs report an error like this:
> {code}
> hadoop.mapreduce.Job.monitorAndPrintJob(Job.java 1367) [main] :  map 100% 
> reduce 100%
> [2017-08-31T20:27:12.591+08:00] [INFO] 
> hadoop.mapred.ClientServiceDelegate.getProxy(ClientServiceDelegate.java 277) 
> [main] : Application state is completed. FinalApplicationStatus=SUCCEEDED. 
> Redirecting to job history server
> [2017-08-31T20:27:12.821+08:00] [INFO] 
> hadoop.mapred.ClientServiceDelegate.getProxy(ClientServiceDelegate.java 277) 
> [main] : Application state is completed. FinalApplicationStatus=SUCCEEDED. 
> Redirecting to job history server
> [2017-08-31T20:27:13.039+08:00] [INFO] 
> hadoop.mapred.ClientServiceDelegate.getProxy(ClientServiceDelegate.java 277) 
> [main] : Application state is completed. FinalApplicationStatus=SUCCEEDED. 
> Redirecting to job history server
> [2017-08-31T20:27:13.256+08:00] [ERROR] 
> hadoop.streaming.StreamJob.submitAndMonitorJob(StreamJob.java 1034) [main] : 
> Error Launching job : java.io.IOException: Unknown Job job_xxx_xxx
> {code}
> I found the AM container log, shown below. It shows that the error happened in 
> the HDFS write pipeline, possibly because of a DataNode error. I also found 
> other causes that close the JobHistoryEventHandler. In either case the MR AM 
> can't write the job history information, so the client can't tell whether 
> the application has finished. 
> {code}
> 2017-08-31 20:27:10,813 INFO [Thread-1968] 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: In stop, 
> writing event MAP_ATTEMPT_STARTED
> 2017-08-31 20:27:10,814 ERROR [Thread-1968] 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Error writing 
> History Event: 
> org.apache.hadoop.mapreduce.jobhistory.TaskAttemptStartedEvent@2055ea0a
> java.io.EOFException: Premature EOF: no length prefix available
>         at 
> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2292)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1317)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
> 2017-08-31 20:27:10,814 INFO [Thread-1968] 
> org.apache.hadoop.service.AbstractService: Service JobHistoryEventHandler 
> failed in state STOPPED; cause: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.EOFException: 
> Premature EOF: no length prefix available
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.EOFException: 
> Premature EOF: no length prefix available
>         at 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:580)
>         at 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:374)
>  
>         at 
> org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>         at 
> org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
>         at 
> org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
>         at 
> org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157)
>         at 
> org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
> {code}
> This problem is serious, especially for Hive: a job that reported 
> FinalApplicationStatus=SUCCEEDED must be rerun for no reason. 
> So I think we need to retry the operation of writing the history event. 
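
The retry proposed above could look roughly like the following. This is only a minimal sketch of a bounded retry with linear backoff, not the actual JobHistoryEventHandler code and not a patch for this issue. The names HistoryWriteRetry, IoAction and runWithRetry are hypothetical, and the sketch assumes the wrapped write is safe to repeat, which a real fix would have to verify (for example by reopening the history file stream before retrying).

```java
import java.io.EOFException;
import java.io.IOException;

/** Hypothetical helper: retries an I/O action a bounded number of times. */
final class HistoryWriteRetry {

    /** Stand-in for "write one history event"; not a Hadoop interface. */
    interface IoAction {
        void run() throws IOException;
    }

    /**
     * Runs the action until it succeeds or maxAttempts is reached.
     * Returns the number of attempts used; rethrows the last IOException
     * if every attempt failed.
     */
    static int runWithRetry(IoAction action, int maxAttempts, long backoffMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                action.run();
                return attempt;
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Linear backoff: wait a little longer after each failure.
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated write that fails twice with the error from the AM log.
        int attempts = runWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new EOFException("Premature EOF: no length prefix available");
            }
        }, 5, 10L);
        System.out.println("succeeded after " + attempts + " attempts");
    }
}
```

Bounding the attempts keeps AM shutdown from hanging indefinitely when the pipeline failure is persistent. In that case the last exception still propagates, as it does today.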



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
