Lewis John McGibbney created NUTCH-2163:
-------------------------------------------

             Summary: Utilize current JVM threads to augment URLClassLoader with newly discovered classes
                 Key: NUTCH-2163
                 URL: https://issues.apache.org/jira/browse/NUTCH-2163
             Project: Nutch
          Issue Type: Bug
          Components: util
            Reporter: Lewis John McGibbney


I found [this code|https://github.com/apache/nutch/compare/trunk...infolinks:nutch-osgi] a 
while back and have been thinking about OSGi for Nutch again. 
Our justification here is that we want to dynamically create 
[InteractiveSeleniumHandlers|https://github.com/apache/nutch/blob/trunk/src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/handlers/InteractiveSeleniumHandler.java]
 and inject the code into the .job artifacts, which can then be used in the next 
round of fetching. 
The code looks like the following:
{code}
+    List<URL> nutchConfigurationClasspathURLs = new ArrayList<URL>();
+
+    // Collect classpath URLs from Hadoop's Configuration class CL
+    URLClassLoader hadoopBundleConfigurationClassLoader = (URLClassLoader) conf.getClassLoader();
+    for (URL hadoopBundleClasspathURL : hadoopBundleConfigurationClassLoader.getURLs()) {
+      nutchConfigurationClasspathURLs.add(hadoopBundleClasspathURL);
+    }
+
+    // Append classpath URLs from current thread, which ostensibly include a Nutch job file
+    URLClassLoader tccl = (URLClassLoader) Thread.currentThread().getContextClassLoader();
+    for (URL tcclClasspathURL : tccl.getURLs()) {
+      nutchConfigurationClasspathURLs.add(tcclClasspathURL);
+    }
+
+    URLClassLoader nutchConfigurationClassLoader = new URLClassLoader(nutchConfigurationClasspathURLs.toArray(new URL[0]));
+    // Reset the Configuration object's CL to the new one
+    conf.setClassLoader(nutchConfigurationClassLoader);
{code}
The call to Thread.currentThread().getContextClassLoader() is the secret sauce. 
What are people's thoughts on this approach?
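
For experimenting with the idea outside of Hadoop, the merge step can be pulled into a small standalone helper. This is only a sketch under stated assumptions: the class name ClassLoaderMerge and the methods urlsOf and merge are hypothetical and not part of Nutch or Hadoop. Unlike the snippet above, it skips any loader that is not a URLClassLoader rather than casting unconditionally (on Java 9 and later the application class loader is no longer a URLClassLoader, so an unchecked cast can throw ClassCastException), and it drops duplicate URLs.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Nutch or Hadoop.
public class ClassLoaderMerge {

  /** Returns the loader's classpath URLs, or an empty array if it exposes none. */
  public static URL[] urlsOf(ClassLoader loader) {
    if (loader instanceof URLClassLoader) {
      return ((URLClassLoader) loader).getURLs();
    }
    // Not a URLClassLoader (or null): nothing to collect, so skip it
    // instead of failing with a ClassCastException.
    return new URL[0];
  }

  /** Builds a new URLClassLoader over the URLs of both loaders, first loader's URLs first. */
  public static URLClassLoader merge(ClassLoader first, ClassLoader second) {
    // Key by external form so duplicates are dropped without calling URL.equals.
    Map<String, URL> urls = new LinkedHashMap<>();
    for (URL url : urlsOf(first)) {
      urls.putIfAbsent(url.toExternalForm(), url);
    }
    for (URL url : urlsOf(second)) {
      urls.putIfAbsent(url.toExternalForm(), url);
    }
    // Same single-argument constructor as the snippet above, so the parent
    // is the system class loader.
    return new URLClassLoader(urls.values().toArray(new URL[0]));
  }
}
```

With a Hadoop Configuration named conf, as in the snippet above, the equivalent call would presumably be conf.setClassLoader(ClassLoaderMerge.merge(conf.getClassLoader(), Thread.currentThread().getContextClassLoader())). Whether silently skipping a non-URLClassLoader is acceptable depends on the deployment; logging or failing loudly may be the better choice.
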
We have, from time to time over the years, discussed 
[OSGi for Nutch|http://wiki.apache.org/nutch/NutchOSGi], and I spoke with [~bdelacretaz] 
a good few years ago at ApacheCon, but I don't have the time to implement total 
OSGi coverage of the Nutch codebase. 



