Thank you for the quick reply. I am indeed using the Oozie workflow lib directory as described here: http://oozie.apache.org/docs/3.3.2/WorkflowFunctionalSpec.html#a7_Workflow_Application_Deployment.
The primary job, which implements Tool, runs fine; only the jobs launched by the doFn() fail. Is there a step where I might need to tell the Crunch pipeline about the jars loaded by Oozie?

On Fri, Nov 21, 2014 at 5:27 PM, Micah Whitacre <[email protected]> wrote:

> The support of a lib folder inside of a jar is not necessarily guaranteed
> to be supported on all versions of Hadoop.[1]
>
> We typically go with the "uber" jar where we use maven-shade-plugin to
> actually explode the crunch dependencies and others into the assembly jar.
> Another approach since you are using Oozie is to include the jar in the
> workflow lib directory. That should put the jar on the classpath. The
> last approach is obviously to manually use DistributedCache yourself which
> will distribute it out to the cluster.
>
> [1] -
> http://blog.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/
>
> On Fri, Nov 21, 2014 at 4:15 PM, Mike Barretta <[email protected]>
> wrote:
>
>> All,
>>
>> I'm running an MRPipeline from crunch-core 0.11.0-hadoop2 on a CDH5.1
>> cluster via oozie. While the main job runs okay, the doFn() it calls fails
>> due to the CNFE. The jar containing my classes does indeed contain
>> lib/crunch-core-0.11.0-hadoop2.jar.
>>
>> Does the crunch jar need to be added to the hadoop lib on all nodes? It
>> seems like that would/should be unnecessary.
>>
>> Thanks,
>> Mike
