[ https://issues.apache.org/jira/browse/PIG-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030188#comment-13030188 ]
Woody Anderson commented on PIG-1821:
-------------------------------------
This checkin has caused a classloader failure when using my loader UDF.
I've narrowed it to the checkin that references this bug:
r1099860 | thejas | 2011-05-05 09:10:26 -0700 (Thu, 05 May 2011) | 3 lines
PIG-1821: UDFContext.getUDFProperties does not handle collisions
in hashcode of udf classname (+ arg hashcodes) (thejas)
I rebuilt my loader against the new Pig jar, but it still fails.
Here's the output of my code (it works if I run using Pig built with the
previous revision):
Backend error message during job submission
-------------------------------------------
java.io.IOException: Deserialization error: com.yahoo.ymail.pigfunctions.AsStorage
    at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:55)
    at org.apache.pig.impl.util.UDFContext.deserialize(UDFContext.java:183)
    at org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil.setupUDFContext(MapRedUtil.java:155)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setupUdfEnvAndStores(PigOutputFormat.java:228)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecs(PigOutputFormat.java:185)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:770)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
    at java.lang.Thread.run(Thread.java:680)
Caused by: java.lang.ClassNotFoundException: com.yahoo.ymail.pigfunctions.AsStorage
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:603)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1574)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1495)
    at java.io.ObjectInputStream.readClass(ObjectInputStream.java:1461)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1311)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
    at java.util.HashMap.readObject(HashMap.java:1029)
    at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:974)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
    at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:53)
    ... 10 more
> UDFContext.getUDFProperties does not handle collisions in hashcode of udf
> classname (+ arg hashcodes)
> -----------------------------------------------------------------------------------------------------
>
> Key: PIG-1821
> URL: https://issues.apache.org/jira/browse/PIG-1821
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Thejas M Nair
> Assignee: Thejas M Nair
> Fix For: 0.9.0
>
> Attachments: PIG-1821.1.patch, PIG-1821.2.patch
>
>
> In the code below, if generateKey() returns the same value for two UDFs, the
> UDFs would end up sharing the same Properties object.
> {code}
> private HashMap<Integer, Properties> udfConfs = new HashMap<Integer, Properties>();
>
> public Properties getUDFProperties(Class c) {
>     Integer k = generateKey(c);
>     Properties p = udfConfs.get(k);
>     if (p == null) {
>         p = new Properties();
>         udfConfs.put(k, p);
>     }
>     return p;
> }
>
> private int generateKey(Class c) {
>     return c.getName().hashCode();
> }
>
> public Properties getUDFProperties(Class c, String[] args) {
>     Integer k = generateKey(c, args);
>     Properties p = udfConfs.get(k);
>     if (p == null) {
>         p = new Properties();
>         udfConfs.put(k, p);
>     }
>     return p;
> }
>
> private int generateKey(Class c, String[] args) {
>     int hc = c.getName().hashCode();
>     for (int i = 0; i < args.length; i++) {
>         hc <<= 1;
>         hc ^= args[i].hashCode();
>     }
>     return hc;
> }
> {code}
> To prevent this, a new class (say X) that can hold the classname and args
> should be created, and instead of HashMap<Integer, Properties>, HashMap<X,
> Properties> should be used. Then HashMap will deal with the collisions.
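
For illustration, here is a minimal sketch of the composite-key idea suggested in the issue description. It is not the committed patch: the names UdfKey, UdfPropertiesRegistry, UdfKeyDemo and oldKey are hypothetical. The key keeps the class name as a String rather than a Class reference. That choice is an assumption, motivated by the trace above, where the ClassNotFoundException surfaces while ObjectInputStream is reading a Class object inside HashMap.readObject.

```java
import java.io.Serializable;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical composite key ("X" in the issue text): class name plus args.
final class UdfKey implements Serializable {
    private static final long serialVersionUID = 1L;
    // Stored as a String (assumption), so reading the key back does not
    // require the UDF class to be loadable at that point.
    private final String className;
    private final String[] args; // null means "no args were given"

    UdfKey(Class<?> c, String[] args) {
        this.className = c.getName();
        this.args = (args == null) ? null : args.clone();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof UdfKey)) return false;
        UdfKey other = (UdfKey) o;
        return className.equals(other.className) && Arrays.equals(args, other.args);
    }

    @Override
    public int hashCode() {
        return 31 * className.hashCode() + Arrays.hashCode(args);
    }
}

// Hypothetical stand-in for the map handling in UDFContext.
class UdfPropertiesRegistry {
    private final Map<UdfKey, Properties> udfConfs = new HashMap<>();

    Properties getUDFProperties(Class<?> c) {
        return getUDFProperties(c, null);
    }

    Properties getUDFProperties(Class<?> c, String[] args) {
        // HashMap resolves hash collisions through UdfKey.equals().
        return udfConfs.computeIfAbsent(new UdfKey(c, args), k -> new Properties());
    }
}

class UdfKeyDemo {
    // The hashing scheme quoted in the issue, reproduced to show a collision.
    static int oldKey(Class<?> c, String[] args) {
        int hc = c.getName().hashCode();
        for (int i = 0; i < args.length; i++) {
            hc <<= 1;
            hc ^= args[i].hashCode();
        }
        return hc;
    }

    public static void main(String[] unused) {
        String[] x = {"Aa"};
        String[] y = {"BB"}; // "Aa" and "BB" have the same String.hashCode()
        // Old scheme: both arg lists map to the same Integer key.
        System.out.println(oldKey(String.class, x) == oldKey(String.class, y)); // true
        // Composite key: same hash, but equals() differs, so no sharing.
        UdfPropertiesRegistry r = new UdfPropertiesRegistry();
        System.out.println(r.getUDFProperties(String.class, x)
                == r.getUDFProperties(String.class, y)); // false
    }
}
```

In this sketch a null args array and an empty one are treated as different keys; whether those two cases should share a Properties object is a design decision the issue text does not settle.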