Take a look at the pig-withouthadoop target in the build.xml from your pig
release.  Usage of the target is documented here (although for a different
goal):

http://thedatachef.blogspot.com/2011/01/apache-pig-08-with-cloudera-cdh3.html

Essentially, the target allows you to build pig without hadoop's JARs, but
then you're responsible for supplying the JARs back in at runtime (via
bin/pig modifications or some other method).
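For context on the delegation-model question further down the thread: the stock JDK loaders are parent-first, which is why the jackson/joda copies already on hadoop/pig's classpath win over the copies packed into a UDF jar. Here's a minimal, purely illustrative sketch of the alternative "child-first" policy — this is NOT code that ships with Pig or Hadoop, just a demonstration of what the two lookup orders mean:

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical "child-first" classloader, for illustration only.
// The default URLClassLoader is parent-first: it asks the parent loader
// before looking at its own URLs. This variant inverts that order.
public class ChildFirstClassLoader extends URLClassLoader {

    public ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected synchronized Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        // 1. Already loaded by this loader?
        Class<?> c = findLoadedClass(name);
        if (c == null) {
            try {
                // 2. Child first: search our own URLs before delegating up.
                c = findClass(name);
            } catch (ClassNotFoundException e) {
                // 3. Not ours: fall back to the normal parent-first path.
                c = super.loadClass(name, false);
            }
        }
        if (resolve) {
            resolveClass(c);
        }
        return c;
    }

    public static void main(String[] args) throws Exception {
        // With no URLs of its own, the child finds nothing locally and
        // falls back to the parent, so core classes still resolve.
        ChildFirstClassLoader cl = new ChildFirstClassLoader(
                new URL[0], ClassLoader.getSystemClassLoader());
        System.out.println(cl.loadClass("java.lang.String") == String.class);
    }
}
```

Given a loader like this with the UDF jar's URL, the bundled jackson/joda classes would shadow the container's copies — which is exactly the behavior parent-first delegation prevents.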

Norbert

On Wed, Oct 12, 2011 at 1:46 AM, Babak Farhang <[email protected]> wrote:

> Thanks for chiming in, Dmitriy.
>
> Yes, I think I have verified that the version that ends up getting
> called in my udf is in fact the one bundled with hadoop; not the one I
> bundled w/ my udf jar. As I understand it, this is a common problem
> that any "container" app such as hadoop (or, say, tomcat) that loads
> "external" user-defined classes must deal with. Usually, the "how" of
> achieving the desired behavior involves setting up rules for which
> classes are looked up first in the parent loader (and if not found
> there then in the child loader) and which classes are first looked up
> in reverse (child loader first, then in parent loader). The rules that
> govern this behavior are called the "delegation model" for the [child]
> classloader. I'm new to this hadoop/pig environment and would like to
> learn what these rules are.
>
> Hope I've made my question clearer :-)
>
> Regards
> -Babak
>
> On Tue, Oct 11, 2011 at 8:04 PM, Dmitriy Ryaboy <[email protected]>
> wrote:
> > I think the problem isn't the classloader -- it's the fact that both
> jackson
> > and joda are already bundled into the pig jar (presumably, different
> > versions of both libraries than the ones you are using). You need to
> either
> > repackage pig to not bundle those libraries, or make your code work with
> > Pig's versions of joda and jackson.
> >
> > D
> >
> > On Tue, Oct 11, 2011 at 7:39 PM, Babak Farhang <[email protected]>
> wrote:
> >
> >> Reading my original post over again, I see that I should have been
> >> clearer. I *am* including a copy of the versions of the jackson and
> >> joda libs that I need in my udf jar file. These libs are included in
> >> "exploded" form (i.e. not as embedded jars within the udf jar file,
> >> but in unzipped form alongside my own class files). However, they
> >> don't seem to get picked up by the hadoop/pig classloader. Am I doing
> >> this all wrong?
> >>
> >> On Tue, Oct 11, 2011 at 4:39 PM, Babak Farhang <[email protected]>
> wrote:
> >> > Greetings everyone,
> >> >
> >> > My pig script contains a call to my custom udf and I seem to be
> >> > running into a couple of classloader issues when running it. Below are
> >> > the specifics (the call stack), but I have some beginner general
> >> > questions regarding classloaders in pig:
> >> >
> >> > 1. Is there a way to configure the classloader used to load the udf
> >> > class and its deps? (I see, for example, a setClassloader method in
> >> > the impl PigContext class which is not directly exposed to the user)
> >> >
> >> > 2. What delegation model does pig's udf classloader use when
> >> > resolving classes (e.g. parent classloader first, then child--or more
> >> > likely something a bit more complicated)?
> >> >
> >> > Any info/ideas you can share would be much appreciated.
> >> > Thx!
> >> >
> >> > -Babak
> >> >
> >> > ===============================
> >> >
> >> > Now about the context/specifics of my error:
> >> >
> >> > My UDF uses the joda time and the jackson json libs (versions 2.0, and
> >> > 1.8.6, respectively) which I package along with my UDF in the jar that
> >> > ends up being registered in my pig script. Here are the call stacks:
> >> >
> >> > Joda-related:
> >> > Exception in thread "main" java.io.IOException: Resource not found:
> >> > "org/joda/time/tz/data/ZoneInfoMap" ClassLoader:
> >> > sun.misc.Launcher$AppClassLoader@4aad3ba4
> >> >        at org.joda.time.tz.ZoneInfoProvider.openResource(ZoneInfoProvider.java:211)
> >> >        at org.joda.time.tz.ZoneInfoProvider.<init>(ZoneInfoProvider.java:123)
> >> >        at org.joda.time.tz.ZoneInfoProvider.<init>(ZoneInfoProvider.java:82)
> >> >        at org.joda.time.DateTimeZone.getDefaultProvider(DateTimeZone.java:462)
> >> >        at org.joda.time.DateTimeZone.setProvider0(DateTimeZone.java:416)
> >> >        at org.joda.time.DateTimeZone.<clinit>(DateTimeZone.java:115)
> >> >        at org.joda.time.chrono.GregorianChronology.<clinit>(GregorianChronology.java:71)
> >> >        at org.joda.time.chrono.ISOChronology.<clinit>(ISOChronology.java:66)
> >> >        at org.joda.time.base.BaseDateTime.<init>(BaseDateTime.java:97)
> >> >        at org.joda.time.DateTime.<init>(DateTime.java:193)
> >> >        at com.qf.util.time.GregorianDate.<init>(GregorianDate.java:46)
> >> >        at com.qf.timeseries.TimeSeriesBinner.initBinDate(TimeSeriesBinner.java:482)
> >> >        at com.qf.timeseries.TimeSeriesBinner.toBinned(TimeSeriesBinner.java:368)
> >> >        at com.qf.timeseries.TimeSeriesBinner.toBinned(TimeSeriesBinner.java:267)
> >> >        at com.qf.timeseries.TimeSeriesBinner.toBinned(TimeSeriesBinner.java:186)
> >> >        at com.qf.timeseries.TimeSeriesBinner.bin(TimeSeriesBinner.java:148)
> >> >        at com.qf.timeseries.BinnedTimeSeries.newInstance(BinnedTimeSeries.java:69)
> >> >        at com.qf.timeseries.BinnedTimeSeries.newInstance(BinnedTimeSeries.java:46)
> >> >        at com.qf.pig.udf.BinnedTargetEntityDelayCorrelationMatrices.exec(BinnedTargetEntityDelayCorrelationMatrices.java:171)
> >> >        at com.qf.pig.udf.BinnedTargetEntityDelayCorrelationMatrices.exec(BinnedTargetEntityDelayCorrelationMatrices.java:29)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:229)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:273)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:343)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:240)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:240)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:240)
> >> >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:433)
> >> >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:401)
> >> >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:381)
> >> >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:251)
> >> >        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
> >> >        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:571)
> >> >        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:413)
> >> >        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
> >> >        at java.security.AccessController.doPrivileged(Native Method)
> >> >        at javax.security.auth.Subject.doAs(Subject.java:396)
> >> >        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
> >> >        at org.apache.hadoop.mapred.Child.main(Child.java:262)
> >> >
> >> > Jackson-related:
> >> >
> >> > 2011-10-11 15:37:30,435 FATAL org.apache.hadoop.mapred.Child: Error
> >> > running child : java.lang.NoSuchFieldError: WRITE_NULL_MAP_VALUES  <--
> >> > (Note: there's a way in Jackson to config this to be more lenient, but
> >> > I don't think I should be mucking w/ pig/hadoop's jackson lib)
> >> >        at com.pico.result.JSONFactory.getObjectMapper(JSONFactory.java:20)
> >> >        at com.qf.timeseries.TargetTimeSeriesInput.newInstance(TargetTimeSeriesInput.java:48)
> >> >        at com.qf.pig.udf.BtedcmClasspathImpl.getTargetSeries(BtedcmClasspathImpl.java:163)
> >> >        at com.qf.pig.udf.BinnedTargetEntityDelayCorrelationMatrices.exec(BinnedTargetEntityDelayCorrelationMatrices.java:210)
> >> >        at com.qf.pig.udf.BinnedTargetEntityDelayCorrelationMatrices.exec(BinnedTargetEntityDelayCorrelationMatrices.java:29)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:229)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:273)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:343)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:240)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:240)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
> >> >        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:240)
> >> >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:433)
> >> >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:401)
> >> >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:381)
> >> >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:251)
> >> >        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
> >> >        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:571)
> >> >        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:413)
> >> >        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
> >> >        at java.security.AccessController.doPrivileged(Native Method)
> >> >        at javax.security.auth.Subject.doAs(Subject.java:396)
> >> >        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
> >> >        at org.apache.hadoop.mapred.Child.main(Child.java:262)
> >> >
> >>
> >
>
