Mike Baretta posted about a similar issue late last year and had an ugly fix that involved copying the Crunch jars into the distributed cache. You can see the whole thread here:
https://www.mail-archive.com/[email protected]/msg00438.html

I myself haven't run into this one.

J

On Tue, Jul 21, 2015 at 2:12 PM, David Ortiz <[email protected]> wrote:

> Hello everyone,
>
> I’m getting an interesting exception running a crunch pipeline from
> Oozie. I have all the crunch dependencies bundled in a fat jar of
> dependencies called crunch-lib. My avro schemas all live in a jar called
> schemas. These all live in a sharelib directory for java actions on HDFS.
> My job itself is in a jar which lives in a directory pointed to by
> oozie.libpath. As far as I can tell the Oozie job is getting all of the
> dependencies since my crunch client code runs and tries to spin up MR
> jobs. However, it fails, with the jobs it creates having the following
> exception:
>
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.crunch.impl.mr.run.CrunchOutputFormat not found
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:472)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:452)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1541)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:452)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:371)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1499)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1496)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1429)
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.crunch.impl.mr.run.CrunchOutputFormat not found
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2112)
>     at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:232)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:468)
>     ... 11 more
> Caused by: java.lang.ClassNotFoundException: Class org.apache.crunch.impl.mr.run.CrunchOutputFormat not found
>     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2018)
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2110)
>     ... 13 more
>
> Anyone have any ideas how the dependencies would be making it to the
> crunch client, but not into the jar that crunch submits to the cluster?
>
> Thanks,
>
> Dave
>
> *This email is intended only for the use of the individual(s) to whom it
> is addressed. If you have received this communication in error, please
> immediately notify the sender and delete the original email.*

--
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>
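
One way to narrow down the question above (which jar supplies the Crunch classes to the client, and therefore which jar would have to be shipped with the submitted job) might be a small diagnostic run from inside the Oozie java action. The sketch below is not from the thread: the `ClassLocator` class and its `locate` method are hypothetical names, and it uses only standard JDK calls. It reports where the current JVM loads a class from, so it describes the client side only, not the MR AppMaster where the exception was thrown.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Crunch, Oozie, or Hadoop): reports which
// jar or directory a class is loaded from in the JVM that runs this code.
final class ClassLocator {

    /** Returns "<class> -> <location>", or a NOT FOUND marker if the class cannot be loaded. */
    static String locate(String className, ClassLoader loader) {
        try {
            // initialize=false: resolve the class without running its static initializers
            Class<?> cls = Class.forName(className, false, loader);
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes that ship with the JDK typically report no code source
            String where = (src == null || src.getLocation() == null)
                    ? "bootstrap/JDK"
                    : src.getLocation().toString();
            return className + " -> " + where;
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " -> NOT FOUND (" + e.getClass().getSimpleName() + ")";
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        // Default to the class named in the stack trace; other names can be passed as arguments
        String[] names = args.length > 0
                ? args
                : new String[] {"org.apache.crunch.impl.mr.run.CrunchOutputFormat"};
        for (String name : names) {
            System.out.println(locate(name, loader));
        }
    }
}
```

If the client reports the class as coming from the crunch-lib jar in the sharelib directory, that would suggest the jar reaches the launcher's classpath but is not among the jars attached to the MR jobs Crunch submits, which is consistent with the distributed-cache workaround mentioned at the top of this reply.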
