[
https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15028192#comment-15028192
]
Ratandeep Ratti commented on HIVE-11878:
----------------------------------------
That's correct [~jdere]
> ClassNotFoundException can possibly occur if multiple jars are registered
> one at a time in Hive
> ------------------------------------------------------------------------------------------------
>
> Key: HIVE-11878
> URL: https://issues.apache.org/jira/browse/HIVE-11878
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.2.1
> Reporter: Ratandeep Ratti
> Assignee: Ratandeep Ratti
> Labels: URLClassLoader
> Attachments: HIVE-11878 ClassLoader Issues when Registering
> Jars.pptx, HIVE-11878.2.patch, HIVE-11878.patch, HIVE-11878_approach3.patch,
> HIVE-11878_approach3_per_session_clasloader.patch,
> HIVE-11878_approach3_with_review_comments.patch,
> HIVE-11878_approach3_with_review_comments1.patch, HIVE-11878_qtest.patch
>
>
> When we register a jar on the Hive console, Hive creates a fresh
> URLClassLoader which includes the path of the jar being registered and
> all the jar paths of the parent classloader. The parent classloader is the
> current ThreadContextClassLoader. Once the URLClassLoader is created, Hive
> sets it as the current ThreadContextClassLoader.
> So if we register multiple jars in Hive, there will be multiple
> URLClassLoaders created, each classloader including the jars from its parent
> and the one extra jar to be registered. The last URLClassLoader created will
> end up as the current ThreadContextClassLoader. (See details:
> org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath)
> Now here's an example in which the above strategy can lead to a
> ClassNotFoundException.
> We register 2 jars *j1* and *j2* in Hive console. *j1* contains the UDF class
> *c1* and internally relies on class *c2* in jar *j2*. We register *j1* first,
> the URLClassLoader *u1* is created and also set as the
> ThreadContextClassLoader. We register *j2* next, the new URLClassLoader
> created will be *u2* with *u1* as parent and *u2* becomes the new
> ThreadContextClassLoader. Note *u2* includes paths to both jars *j1* and *j2*
> whereas *u1* only has paths to *j1* (For details see:
> org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath).
> Now when we register class *c1* under a temporary function in Hive, we load
> the class using {code} Class.forName("c1", true,
> Thread.currentThread().getContextClassLoader()) {code} . The
> current thread context class-loader is *u2*, and it has the path to the class
> *c1*, but note that class-loaders work by delegating to the parent
> class-loader first. In this case class *c1* will be found and *defined* by
> class-loader *u1*.
> Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say
> initialize) is called in *c1* which references the class *c2*, *c2* will not
> be found, since the class-loader used to search for *c2* will be *u1*, which
> only has the path to *j1* (the defining class-loader of the calling class is
> the one used to resolve the classes it references).
> I've added a qtest to explain the problem. Please see the attached patch.
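
The loader chain described in the issue can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not Hive's implementation: the class name ChainedLoaderSketch, the helpers register and jarUrl, and the /tmp/j1.jar and /tmp/j2.jar paths are all hypothetical, and the jar files do not need to exist because a URLClassLoader only opens its URLs when something is loaded.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChainedLoaderSketch {

    /** Hypothetical helper: turns a "file:" URI string into a URL without a checked exception. */
    static URL jarUrl(String uri) {
        try {
            return URI.create(uri).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /**
     * Hypothetical stand-in for the registration step described in the issue
     * (not Hive's code): the new loader gets the parent's URLs plus the new jar,
     * uses the previous loader as its parent, and becomes the context loader.
     */
    static URLClassLoader register(URLClassLoader parent, URL jar) {
        List<URL> urls = new ArrayList<>(Arrays.asList(parent.getURLs()));
        urls.add(jar);
        URLClassLoader child = new URLClassLoader(urls.toArray(new URL[0]), parent);
        Thread.currentThread().setContextClassLoader(child);
        return child;
    }

    public static void main(String[] args) {
        ClassLoader original = Thread.currentThread().getContextClassLoader();
        try {
            // Placeholder jar locations, assumed for illustration only.
            URL j1 = jarUrl("file:/tmp/j1.jar");
            URL j2 = jarUrl("file:/tmp/j2.jar");

            URLClassLoader base = new URLClassLoader(new URL[0], original);
            URLClassLoader u1 = register(base, j1); // sees j1 only
            URLClassLoader u2 = register(u1, j2);   // sees j1 and j2, parent is u1

            System.out.println("u1 URLs: " + Arrays.toString(u1.getURLs()));
            System.out.println("u2 URLs: " + Arrays.toString(u2.getURLs()));
            System.out.println("u2 parent is u1: " + (u2.getParent() == u1));
            System.out.println("context loader is u2: "
                    + (Thread.currentThread().getContextClassLoader() == u2));
        } finally {
            // Restore the loader that was in place before the sketch ran.
            Thread.currentThread().setContextClassLoader(original);
        }
    }
}
```

Because u2 delegates to u1 first and u1 can see j1, any class from j1 ends up defined by u1. From there, classes that exist only in j2 are out of reach, which is the failure the issue describes.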