Greetings everyone,
My Pig script calls a custom UDF, and I seem to be running into a couple of classloader issues when running it. The specifics (call stacks) are below, but first some general beginner questions about classloaders in Pig:
1. Is there a way to configure the classloader used to load the UDF class and its dependencies? (I see, for example, a setClassloader method on the implementation class PigContext, but it isn't directly exposed to the user.)
2. What delegation model does Pig's UDF classloader use when resolving classes (e.g. parent classloader first, then child, or more likely something a bit more complicated)?
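To make question 2 concrete, here is the standalone sketch I used to remind myself how the standard parent-first model behaves; I'm assuming (but don't know) that Pig's UDF loader is ultimately a URLClassLoader-style loader with this default delegation. The class and method names here are mine, purely illustrative:

```java
import java.net.URL;
import java.net.URLClassLoader;

public class DelegationDemo {
    // Returns true when a child loader with no URLs of its own resolves a
    // class to the very same Class object as its parent -- i.e. the lookup
    // was delegated up the chain (standard parent-first behavior).
    static boolean parentFirst(String className) throws Exception {
        ClassLoader parent = ClassLoader.getSystemClassLoader();
        // An empty URL list means this child can't define anything itself,
        // so every request must be answered by delegation to the parent.
        URLClassLoader child = new URLClassLoader(new URL[0], parent);
        return child.loadClass(className) == parent.loadClass(className);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parentFirst("java.lang.String"));
    }
}
```

My worry is that if Pig does the opposite (child-first for registered jars) or something hybrid, that would explain version conflicts like the Jackson one below.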
Any info/ideas you can share would be much appreciated.
Thx!
-Babak
===============================
Now for the context/specifics of my errors: my UDF uses the Joda-Time and Jackson JSON libraries (versions 2.0 and 1.8.6, respectively), which I package along with my UDF classes in the jar that ends up being registered in my Pig script. Here are the call stacks:
Joda-related:
Exception in thread "main" java.io.IOException: Resource not found: "org/joda/time/tz/data/ZoneInfoMap" ClassLoader: sun.misc.Launcher$AppClassLoader@4aad3ba4
    at org.joda.time.tz.ZoneInfoProvider.openResource(ZoneInfoProvider.java:211)
    at org.joda.time.tz.ZoneInfoProvider.<init>(ZoneInfoProvider.java:123)
    at org.joda.time.tz.ZoneInfoProvider.<init>(ZoneInfoProvider.java:82)
    at org.joda.time.DateTimeZone.getDefaultProvider(DateTimeZone.java:462)
    at org.joda.time.DateTimeZone.setProvider0(DateTimeZone.java:416)
    at org.joda.time.DateTimeZone.<clinit>(DateTimeZone.java:115)
    at org.joda.time.chrono.GregorianChronology.<clinit>(GregorianChronology.java:71)
    at org.joda.time.chrono.ISOChronology.<clinit>(ISOChronology.java:66)
    at org.joda.time.base.BaseDateTime.<init>(BaseDateTime.java:97)
    at org.joda.time.DateTime.<init>(DateTime.java:193)
    at com.qf.util.time.GregorianDate.<init>(GregorianDate.java:46)
    at com.qf.timeseries.TimeSeriesBinner.initBinDate(TimeSeriesBinner.java:482)
    at com.qf.timeseries.TimeSeriesBinner.toBinned(TimeSeriesBinner.java:368)
    at com.qf.timeseries.TimeSeriesBinner.toBinned(TimeSeriesBinner.java:267)
    at com.qf.timeseries.TimeSeriesBinner.toBinned(TimeSeriesBinner.java:186)
    at com.qf.timeseries.TimeSeriesBinner.bin(TimeSeriesBinner.java:148)
    at com.qf.timeseries.BinnedTimeSeries.newInstance(BinnedTimeSeries.java:69)
    at com.qf.timeseries.BinnedTimeSeries.newInstance(BinnedTimeSeries.java:46)
    at com.qf.pig.udf.BinnedTargetEntityDelayCorrelationMatrices.exec(BinnedTargetEntityDelayCorrelationMatrices.java:171)
    at com.qf.pig.udf.BinnedTargetEntityDelayCorrelationMatrices.exec(BinnedTargetEntityDelayCorrelationMatrices.java:29)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:229)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:273)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:343)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:240)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:240)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:240)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:433)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:401)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:381)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:251)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:571)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:413)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Jackson-related:
2011-10-11 15:37:30,435 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.NoSuchFieldError: WRITE_NULL_MAP_VALUES
(Note: there's a way in Jackson to config this to be more lenient, but I don't think I should be mucking w/ pig/hadoop's jackson lib)
    at com.pico.result.JSONFactory.getObjectMapper(JSONFactory.java:20)
    at com.qf.timeseries.TargetTimeSeriesInput.newInstance(TargetTimeSeriesInput.java:48)
    at com.qf.pig.udf.BtedcmClasspathImpl.getTargetSeries(BtedcmClasspathImpl.java:163)
    at com.qf.pig.udf.BinnedTargetEntityDelayCorrelationMatrices.exec(BinnedTargetEntityDelayCorrelationMatrices.java:210)
    at com.qf.pig.udf.BinnedTargetEntityDelayCorrelationMatrices.exec(BinnedTargetEntityDelayCorrelationMatrices.java:29)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:229)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:273)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:343)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:240)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:240)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:240)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:433)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:401)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:381)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:251)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:571)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:413)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
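For the NoSuchFieldError, my theory is a version conflict: I compiled against Jackson 1.8.6 but the task JVM may be resolving the class from Hadoop/Pig's older bundled Jackson, which lacks that field. Here's the small diagnostic I plan to drop into the UDF to check; it reports where a class was actually loaded from. On the cluster I'd probe the Jackson class behind the error, but in this standalone sketch I probe a JDK class so it runs anywhere:

```java
import java.security.CodeSource;

public class WhichJar {
    // Reports the jar (CodeSource location) a class was loaded from;
    // a null CodeSource means the bootstrap/JDK loader supplied it.
    static String origin(String className) throws ClassNotFoundException {
        Class<?> c = Class.forName(className);
        CodeSource src = c.getProtectionDomain().getCodeSource();
        return src == null ? "bootstrap/JDK" : String.valueOf(src.getLocation());
    }

    public static void main(String[] args) throws Exception {
        // On the cluster the interesting probe would be the class behind my
        // NoSuchFieldError (the Jackson SerializationConfig class), to see
        // whether my bundled 1.8.6 or Hadoop's copy wins.
        System.out.println("java.util.ArrayList <- " + origin("java.util.ArrayList"));
    }
}
```

If someone can confirm whether a jar registered in a Pig script shadows Hadoop's own classpath (or is shadowed by it), that would settle it.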