Lewis John McGibbney created NUTCH-2163:
-------------------------------------------
Summary: Utilize current JVM threads to augment URLClassLoader
with newly discovered classes
Key: NUTCH-2163
URL: https://issues.apache.org/jira/browse/NUTCH-2163
Project: Nutch
Issue Type: Bug
Components: util
Reporter: Lewis John McGibbney
I found
[this code|https://github.com/apache/nutch/compare/trunk...infolinks:nutch-osgi]
a while back and have been thinking about OSGi again for Nutch.
Our justification here is that we want to dynamically create
[InteractiveSeleniumHandlers|https://github.com/apache/nutch/blob/trunk/src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/handlers/InteractiveSeleniumHandler.java]
and inject the code into the .job artifacts, which can then be used in the next
round of fetching.
The code looks like the following:
{code}
List<URL> nutchConfigurationClasspathURLs = new ArrayList<URL>();

// Collect classpath URLs from Hadoop's Configuration class CL
URLClassLoader hadoopBundleConfigurationClassLoader =
    (URLClassLoader) conf.getClassLoader();
for (URL hadoopBundleClasspathURL :
    hadoopBundleConfigurationClassLoader.getURLs()) {
  nutchConfigurationClasspathURLs.add(hadoopBundleClasspathURL);
}

// Append classpath URLs from current thread, which ostensibly include a
// Nutch job file
URLClassLoader tccl =
    (URLClassLoader) Thread.currentThread().getContextClassLoader();
for (URL tcclClasspathURL : tccl.getURLs()) {
  nutchConfigurationClasspathURLs.add(tcclClasspathURL);
}

URLClassLoader nutchConfigurationClassLoader =
    new URLClassLoader(nutchConfigurationClasspathURLs.toArray(new URL[0]));
// Reset the Configuration object's CL to the new one
conf.setClassLoader(nutchConfigurationClassLoader);
{code}
The Thread.currentThread().getContextClassLoader() call is the secret sauce.
What are people's thoughts on this approach?
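To make the idea easier to discuss, here is a minimal, self-contained sketch of the same merge step. It is not taken from the linked branch: the class name ClassLoaderMerger and its merge method are hypothetical. The sketch assumes that either loader may turn out not to be a URLClassLoader (on Java 9 and later the application class loader is not one), so it checks with instanceof instead of casting, and it drops duplicate URLs while keeping their order.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Nutch or Hadoop. */
public final class ClassLoaderMerger {

  private ClassLoaderMerger() {}

  /** Adds the loader's URLs to the map, but only if it is a URLClassLoader. */
  private static void collect(ClassLoader loader, Map<String, URL> into) {
    if (loader instanceof URLClassLoader) {
      for (URL url : ((URLClassLoader) loader).getURLs()) {
        // Key on the string form so duplicates are detected without
        // relying on URL.equals(), which may resolve host names.
        into.putIfAbsent(url.toExternalForm(), url);
      }
    }
  }

  /**
   * Builds a new URLClassLoader whose search path is the URLs of
   * {@code first} followed by any additional URLs of {@code second}.
   * Loaders that are null or not URLClassLoaders contribute nothing.
   */
  public static URLClassLoader merge(ClassLoader first, ClassLoader second) {
    Map<String, URL> urls = new LinkedHashMap<>();
    collect(first, urls);
    collect(second, urls);
    return new URLClassLoader(urls.values().toArray(new URL[0]));
  }
}
```

With the conf object from the snippet above, this could be wired in as conf.setClassLoader(ClassLoaderMerger.merge(conf.getClassLoader(), Thread.currentThread().getContextClassLoader())). Like the original snippet, the merged loader uses the default parent, so classes reachable only through a non-URL loader would not be visible through it.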
We have, from time to time over the years, discussed
[OSGi for Nutch|http://wiki.apache.org/nutch/NutchOSGi], and I spoke with
[~bdelacretaz] a good few years ago at ApacheCon, but I don't have the time to
implement total OSGi coverage of the Nutch codebase.