[ 
https://issues.apache.org/jira/browse/CRUNCH-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105434#comment-17105434
 ] 

Ben Roling commented on CRUNCH-696:
-----------------------------------

I don't think typical Crunch users would encounter this issue.  For it to 
happen, multiple versions of Crunch have to come into play in the execution 
of a single Crunch pipeline.  Normally that would not happen: the Crunch 
version is usually supplied by the user code, and a single classpath is used 
for the pipeline.

One scenario where I can imagine it occurring is if Crunch is treated as a 
cluster-provided dependency.  In that case, around the time of a Crunch 
upgrade in the cluster, a job may be submitted with the old version of Crunch 
and then executed with the new version.  I'm not sure whether any users treat 
Crunch as cluster-provided.  I know in our organization we don't do that.  I 
have created a patch we could apply to address that case if desired:

https://github.com/apache/crunch/pull/33

> EOFException in App Master from FormatBundle#fromSerialized
> -----------------------------------------------------------
>
>                 Key: CRUNCH-696
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-696
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.0
>            Reporter: Andrew Olson
>            Assignee: Josh Wills
>            Priority: Major
>
> After CRUNCH-685, if there's a version mismatch between what a job was 
> launched with and what it runs with in the cluster, the job can fail to 
> start with an EOFException in the application master, as shown below.
> {noformat}
> 2019-10-11 14:36:41,327 INFO [main] org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.io.EOFException
> 2019-10-11 14:36:41,332 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.io.EOFException
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:531)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:511)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1614)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:511)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:301)
>       at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$5.run(MRAppMaster.java:1572)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1569)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1502)
> Caused by: java.lang.RuntimeException: java.io.EOFException
>       at org.apache.crunch.io.FormatBundle.fromSerialized(FormatBundle.java:97)
>       at org.apache.crunch.io.CrunchOutputs.getNamedOutputs(CrunchOutputs.java:127)
>       at org.apache.crunch.io.CrunchOutputs.getOutputCommitter(CrunchOutputs.java:87)
>       at org.apache.crunch.impl.mr.run.CrunchOutputFormat.getOutputCommitter(CrunchOutputFormat.java:52)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:529)
>       ... 11 more
> Caused by: java.io.EOFException
>       at java.io.DataInputStream.readBoolean(DataInputStream.java:244)
>       at org.apache.crunch.io.FormatBundle.readFields(FormatBundle.java:292)
>       at org.apache.crunch.io.FormatBundle.fromSerialized(FormatBundle.java:94)
>       ... 15 more
> {noformat}
> FormatBundle#fromSerialized should have handled the possibility of reaching 
> the end of the stream when attempting to read the new boolean attribute 
> introduced into the serialization.
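
To illustrate the kind of end-of-stream handling described above, here is a 
minimal, self-contained sketch.  It is not the Crunch code and not the patch 
in the pull request: the class BundleSketch, its fields, and its helper 
methods are hypothetical stand-ins, and the sketch assumes the new boolean is 
the last field written to the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a serialized bundle; NOT Crunch's FormatBundle.
class BundleSketch {
    String formatClassName;
    boolean newFlag; // stands in for the boolean added by the newer version

    // Simulates the older writer: the trailing boolean is absent.
    static byte[] writeOld(String name) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Simulates the newer writer: the boolean is appended at the end.
    static byte[] writeNew(String name, boolean flag) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(name);
            out.writeBoolean(flag);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reader that tolerates payloads produced by either writer.
    static BundleSketch read(byte[] data) {
        BundleSketch bundle = new BundleSketch();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            bundle.formatClassName = in.readUTF();
            try {
                bundle.newFlag = in.readBoolean();
            } catch (EOFException e) {
                // Stream ended before the new field: treat the payload as
                // written by the older version and fall back to a default.
                bundle.newFlag = false;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bundle;
    }

    public static void main(String[] args) {
        BundleSketch fromOld = read(writeOld("example.Format"));
        BundleSketch fromNew = read(writeNew("example.Format", true));
        System.out.println(fromOld.formatClassName + " " + fromOld.newFlag);
        System.out.println(fromNew.formatClassName + " " + fromNew.newFlag);
    }
}
```

Catching EOFException only around the read of the trailing field keeps a 
truncated payload from being silently accepted when an earlier field is 
missing.  This approach works only while the new field is the last one in 
the stream; a format with further additions would more likely need an 
explicit version marker.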



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
