[
https://issues.apache.org/jira/browse/PIG-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413532#comment-13413532
]
Raghu Angadi commented on PIG-2815:
-----------------------------------
An example:
{noformat}
register elephant-bird.jar; -- (1) for working with Thrift objects.
register T_One.jar; -- (2)
-- ThriftPigLoader takes name of a Thrift class that corresponds to input.
a = load '/logs/T_One' using ThriftPigLoader('thrift.gen.T_One'); -- (3)
register second.jar; -- (4)
b = load '/logs/T_two' using ThriftPigLoader('thrift.gen.T_two'); -- (5)
-- FAIL!
{noformat}
* (1): a new classloader cl_A is created with the root classloader as its parent.
* (2): cl_B is created with root as the parent.
* (3): {{ThriftPigLoader.class}} is loaded with cl_B and cached.
* (4): cl_C is created with root as the parent.
* (5): {{thrift.gen.T_two.class}} is loaded with cl_C, but the cached
{{ThriftPigLoader.class}} from cl_B is reused by Pig. Since cl_B is not a
parent of cl_C, the Thrift classes seen by ThriftPigLoader are entirely
different from the Thrift classes seen by {{thrift.gen.T_two}}. That can lead
to a number of issues, and it does.
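The effect in step (5) can be reproduced outside Pig. The following is only a minimal sketch, not Pig's code: the class names {{SiblingLoaderDemo}} and {{Gen}} are hypothetical, a JDK (for the in-process compiler) and Java 11+ are assumed, and a temp directory stands in for a registered jar. It compares two loaders that both have the root loader as parent with a pair where the newer loader has the older one as parent.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class SiblingLoaderDemo {

    // Compiles a tiny stand-in class ("Gen", a hypothetical name) into a temp
    // directory so that several loaders can read the same .class file.
    static URL compileStandIn() throws Exception {
        Path dir = Files.createTempDirectory("cl-demo");
        Path src = dir.resolve("Gen.java");
        Files.writeString(src, "public class Gen {}");
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler(); // null without a JDK
        if (javac == null
                || javac.run(null, null, null, "-d", dir.toString(), src.toString()) != 0) {
            throw new IllegalStateException("could not compile the stand-in class");
        }
        return dir.toUri().toURL(); // ends with '/' because dir is an existing directory
    }

    // Mirrors steps (2) and (4): both loaders have the root (bootstrap) loader
    // as parent, so each one defines its own copy of Gen.
    static boolean siblingsShareClass() throws Exception {
        URL[] path = { compileStandIn() };
        try (URLClassLoader clB = new URLClassLoader(path, null);
             URLClassLoader clC = new URLClassLoader(path, null)) {
            return clB.loadClass("Gen") == clC.loadClass("Gen");
        }
    }

    // Alternative wiring: clC delegates to clB first, so both names resolve
    // to the single class that clB defined.
    static boolean chainedShareClass() throws Exception {
        URL[] path = { compileStandIn() };
        try (URLClassLoader clB = new URLClassLoader(path, null);
             URLClassLoader clC = new URLClassLoader(path, clB)) {
            return clB.loadClass("Gen") == clC.loadClass("Gen");
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("siblings share class: " + siblingsShareClass()); // false
        System.out.println("chained share class:  " + chainedShareClass());  // true
    }
}
```

With sibling loaders the two {{Class}} objects differ even though they come from the same file on disk; with the chained pair they are identical, because {{URLClassLoader}} asks its parent before searching its own path.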
> class loader management in PigContext
> -------------------------------------
>
> Key: PIG-2815
> URL: https://issues.apache.org/jira/browse/PIG-2815
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.9.0
> Reporter: Raghu Angadi
> Fix For: 0.11
>
>
> The way {{PigContext.classloader}} and resolveClassName() are managed can
> lead to strange class loading issues, especially when not all {{register}}
> statements are at the top (example in the first comment).
> Two factors contribute to this, sometimes individually and sometimes
> together:
> # a new classloader (CL) is created after registering each jar.
> ** but the new CL's parent is the root CL rather than the previous CL,
> effectively throwing the previous CL away.
> # resolveClassName() caches classes based on just the name.
> ** A class is not defined by name alone. Classes loaded by two different
> unrelated CLs are different objects even if both load the class from the same
> physical jar file.
> ** because of (1), the cached class is not necessarily the same as the class
> that would be loaded by the 'current' CL.
> Having different class objects for the same class has many subtle side
> effects, e.g. there would be two instances of static variables.
> I think both should be fixed, though fixing one of them might be good
> enough in many cases. I will add a patch.
>
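The second factor and the static-variable side effect described above can be illustrated with a small sketch. This is not Pig's {{resolveClassName()}}: the {{resolve}} method and its name-keyed map are a hypothetical stand-in for any cache that ignores the loader, {{Gen}} is a made-up class compiled on the fly, and a JDK with Java 11+ is assumed.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class NameKeyedCacheDemo {

    // Hypothetical cache keyed by class name only; the loader is not part of the key.
    static final Map<String, Class<?>> CACHE = new HashMap<>();

    static Class<?> resolve(String name, ClassLoader current) throws ClassNotFoundException {
        Class<?> c = CACHE.get(name);
        if (c == null) {
            c = current.loadClass(name);
            CACHE.put(name, c);
        }
        return c;
    }

    // Compiles a stand-in class with one static field into a temp directory.
    static URL compileStandIn() throws Exception {
        Path dir = Files.createTempDirectory("cl-cache-demo");
        Path src = dir.resolve("Gen.java");
        Files.writeString(src, "public class Gen { public static int counter; }");
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler(); // null without a JDK
        if (javac == null
                || javac.run(null, null, null, "-d", dir.toString(), src.toString()) != 0) {
            throw new IllegalStateException("could not compile the stand-in class");
        }
        return dir.toUri().toURL();
    }

    // True when the cache hands back a class that the 'current' loader would not load.
    static boolean cacheReturnsStaleClass() throws Exception {
        CACHE.clear();
        URL[] path = { compileStandIn() };
        try (URLClassLoader older = new URLClassLoader(path, null);
             URLClassLoader current = new URLClassLoader(path, null)) {
            resolve("Gen", older);                     // populates the cache
            Class<?> cached = resolve("Gen", current); // cache hit; 'current' is never asked
            return cached != current.loadClass("Gen");
        }
    }

    // Sets the static field through one copy of Gen and reads it through the other.
    static int counterInSecondCopy() throws Exception {
        URL[] path = { compileStandIn() };
        try (URLClassLoader a = new URLClassLoader(path, null);
             URLClassLoader b = new URLClassLoader(path, null)) {
            a.loadClass("Gen").getField("counter").setInt(null, 42);
            return b.loadClass("Gen").getField("counter").getInt(null);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("stale class from cache: " + cacheReturnsStaleClass()); // true
        System.out.println("counter in second copy: " + counterInSecondCopy());    // 0
    }
}
```

The write of 42 is invisible to the second copy because each unrelated loader defines its own {{Gen}} with its own static storage. One possible remedy, under the same assumptions, is to key such a cache on the loader as well as the name, or to avoid creating unrelated loaders in the first place.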