Re: Yarnchild error : crunch-0.7.0

Bill Sparks Wed, 26 Feb 2014 14:48:30 -0800

To close the loop on this thread: I went back to first principles, downloaded 
the 0.7.0 example code, and built it against hadoop-2.0.6-alpha using the 
-Dcrunch.platform=2 option. I launched the job jar (with dependencies) and got 
the following error.


2014-02-26 16:32:50,468 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 268435456
2014-02-26 16:32:50,468 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 67108860; length = 16777216
2014-02-26 16:32:50,520 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.crunch.CrunchRuntimeException: Could not read runtime node information
        at org.apache.crunch.impl.mr.run.CrunchTaskContext.<init>(CrunchTaskContext.java:48)
        at org.apache.crunch.impl.mr.run.CrunchMapper.setup(CrunchMapper.java:37)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:757)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)


If I look at the code, it's trying to read the crunch.tmp.dir configuration and 
failing. We are on a Cray … so our HDFS structure is a little different (sorry 
about that). This is what it currently looks like.

+ hdfs dfs -ls -R /
drwxrwxrwx   - jsparks supergroup          0 2014-02-26 16:32 /tmp
drwxrwx---   - jsparks supergroup          0 2014-02-26 16:32 /tmp/hadoop-yarn
drwxrwx---   - jsparks supergroup          0 2014-02-26 16:32 /tmp/hadoop-yarn/staging
drwxrwx---   - jsparks supergroup          0 2014-02-26 16:32 /tmp/hadoop-yarn/staging/history
drwxrwx---   - jsparks supergroup          0 2014-02-26 16:32 /tmp/hadoop-yarn/staging/history/done
drwxrwxrwt   - jsparks supergroup          0 2014-02-26 16:32 /tmp/hadoop-yarn/staging/history/done_intermediate
drwxr-xr-x   - jsparks supergroup          0 2014-02-26 16:32 /user
drwxr-xr-x   - jsparks supergroup          0 2014-02-26 16:32 /user/jsparks
-rw-r--r--   1 jsparks supergroup     610157 2014-02-26 16:32 /user/jsparks/HuckleberryFinn.txt

And yes, we are reading Huck Finn …

--
Jonathan (Bill) Sparks
Software Architecture
Cray Inc.

From: Josh Wills <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, February 25, 2014 4:19 PM
To: "[email protected]" <[email protected]>
Subject: Re: Yarnchild error : crunch-0.7.0

The first error looks like a weird serialization error, as if the Crunch 
version being used on the cluster were different from the one used to compile 
the client. Is Crunch installed on the cluster, or is there another version of 
Crunch in the Hadoop classpath?
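
One way to test the version-mismatch theory is to print the serialVersionUID 
that a given classpath computes for the class named in the 
InvalidClassException and compare it with the two values in the log. What 
follows is only a minimal sketch using the JDK's ObjectStreamClass. The names 
SerialVersionCheck and Sample are hypothetical, and the Crunch jar being 
checked has to be on the classpath when the real class name is passed as the 
argument.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialVersionCheck {

    /** Hypothetical stand-in used when no class name is given on the command line. */
    static class Sample implements Serializable {
        private static final long serialVersionUID = 42L;
    }

    /**
     * Returns the serialVersionUID the local JVM uses for cls,
     * or null if cls is not serializable.
     */
    static Long localSerialVersionUid(Class<?> cls) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(cls);
        return desc == null ? null : desc.getSerialVersionUID();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a binary class name such as org.apache.crunch.types.writable.Writables$4
        // (quote it in the shell so that $4 is not expanded).
        String name = args.length > 0 ? args[0] : Sample.class.getName();
        Class<?> cls = Class.forName(name);
        System.out.println(name + " serialVersionUID = " + localSerialVersionUid(cls));
    }
}
```

Running it once against the jar used to build the client and once against 
whatever Crunch jar the cluster picks up should show whether the two values 
differ.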

The second one still looks to me like the hadoop1/hadoop2 incompatibility 
issue, as if the local client was compiled against the hadoop1 APIs instead of 
the hadoop2 APIs on the cluster.
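
For context, org.apache.hadoop.mapreduce.JobContext is a class in hadoop1 and 
an interface in hadoop2, which is the kind of change that produces an 
IncompatibleClassChangeError at run time. A rough sketch for checking which 
flavor a given classpath sees, and which jar it comes from, is below. The 
names HadoopApiCheck and describe are hypothetical; it uses only JDK 
reflection and reports "not found" if Hadoop is not on the classpath.

```java
import java.security.CodeSource;

public class HadoopApiCheck {

    /** Reports whether the named type is an interface or a class, and where it was loaded from. */
    static String describe(String className) {
        try {
            // Load without initializing; only the type's shape and origin are needed.
            Class<?> cls = Class.forName(className, false, HadoopApiCheck.class.getClassLoader());
            String kind = cls.isInterface() ? "interface" : "class";
            // Classes from the JDK itself typically have no code source.
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            String origin = (src == null || src.getLocation() == null)
                    ? "bootstrap/unknown"
                    : src.getLocation().toString();
            return className + ": " + kind + ", loaded from " + origin;
        } catch (ClassNotFoundException e) {
            return className + ": not found on classpath";
        }
    }

    public static void main(String[] args) {
        // "interface" suggests hadoop2 APIs; "class" suggests hadoop1 APIs.
        System.out.println(describe("org.apache.hadoop.mapreduce.JobContext"));
    }
}
```

Running this with the classpath used to compile the job and again with the 
cluster's Hadoop classpath should show whether the two sides disagree.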

There's a 0.7.0-hadoop2 Maven artifact that should have the right API profile:
http://mvnrepository.com/artifact/org.apache.crunch/crunch-core/0.7.0-hadoop2

I know that we made an error in the 0.8.0 release w/the hadoop2 versioning, so 
0.8.0-hadoop2 doesn't work, but 0.8.1-hadoop2 or 0.8.2-hadoop2 should also work.



On Tue, Feb 25, 2014 at 1:54 PM, Bill Sparks <[email protected]> wrote:
So interesting … same results.

This time I ran two versions: 1) the examples from the Crunch build and 2) a 
standalone application. The result for the standalone application was the same 
as before, which I guess I expected. The other failure was different and a 
little more confusing. My question is whether this can be caused by the JDK 
used to build Crunch. We are using JDK 1.7.

Failure 1)

2014-02-25 14:59:04,252 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.crunch.CrunchRuntimeException: Could not read runtime node information
at org.apache.crunch.impl.mr.run.CrunchTaskContext.<init>(CrunchTaskContext.java:48)
at org.apache.crunch.impl.mr.run.CrunchMapper.setup(CrunchMapper.java:37)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:757)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
at java.security.AccessController.doPrivileged(AccessController.java:366)
at javax.security.auth.Subject.doAs(Subject.java:572)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
Caused by: java.io.InvalidClassException: org.apache.crunch.types.writable.Writables$4; local class incompatible: stream classdesc serialVersionUID = 5855040850180329703, local class serialVersionUID = 4130080921736307351

Failure 2)
2014-02-25 14:59:33,926 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.IncompatibleClassChangeError: org/apache/hadoop/mapreduce/JobContext.getConfiguration()Lorg/apache/hadoop/conf/Configuration;
at org.apache.crunch.impl.mr.run.CrunchTaskContext.<init>(CrunchTaskContext.java:42)
at org.apache.crunch.impl.mr.run.CrunchMapper.setup(CrunchMapper.java:37)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:757)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
at java.security.AccessController.doPrivileged(AccessController.java:366)
at javax.security.auth.Subject.doAs(Subject.java:572)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)


JDK
jsparks@jupiter:/lus/dal/jsparks/example/tmp/hdlogs.jsparks/userlogs> java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)

--
Jonathan (Bill) Sparks
Software Architecture
Cray Inc.

From: Josh Wills <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, February 25, 2014 1:54 PM
To: "[email protected]" <[email protected]>
Subject: Re: Yarnchild error : crunch-0.7.0

Yeah, try it again w/ -Dcrunch.platform=2 instead of -Dhadoop.profile=2.0

J


On Tue, Feb 25, 2014 at 11:47 AM, Bill Sparks <[email protected]> wrote:
Well I did the following and also changed the pom.xml to reference the correct 
hadoop version.

$ mvn clean install -Dhadoop.profile=2.0 -DskipTests

<hadoop.version>2.0.6-alpha</hadoop.version>

--
Jonathan (Bill) Sparks
Software Architecture
Cray Inc.

From: Josh Wills <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, February 25, 2014 1:43 PM
To: "[email protected]" <[email protected]>
Subject: Re: Yarnchild error : crunch-0.7.0

Hrm-- that's usually related to the API changes between hadoop1 and hadoop2. 
How did you build crunch, exactly? Did you use -Dcrunch.platform=2?

J


On Tue, Feb 25, 2014 at 11:37 AM, Bill Sparks <[email protected]> wrote:
Can anyone shed some light on why I would be getting the following error when 
submitting a simple Crunch wordcount example? Other Hadoop MR applications 
work; it just seems that Crunch is confused about some class definitions.

I'm running hadoop-2.0.6-alpha and have built Crunch to match.

Hadoop 2.0.6-alpha
Subversion Unknown -r ca4c88898f95aaab3fd85b5e9c194ffd647c2109
Compiled by jenkins on 2013-10-30T07:19Z
From source with checksum 95e88b2a9589fa69d6d5c1dbd48d4e


2014-02-25 13:23:00,049 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 268435456
2014-02-25 13:23:00,049 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 67108860; length = 16777216
2014-02-25 13:23:00,070 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.IncompatibleClassChangeError: org/apache/hadoop/mapreduce/JobContext.getConfiguration()Lorg/apache/hadoop/conf/Configuration;
at org.apache.crunch.impl.mr.run.CrunchTaskContext.<init>(CrunchTaskContext.java:42)
at org.apache.crunch.impl.mr.run.CrunchMapper.setup(CrunchMapper.java:37)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:757)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
at java.security.AccessController.doPrivileged(AccessController.java:366)
at javax.security.auth.Subject.doAs(Subject.java:572)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)




--
Director of Data Science
Cloudera<http://www.cloudera.com>
Twitter: @josh_wills<http://twitter.com/josh_wills>




