[ 
https://issues.apache.org/jira/browse/NUTCH-2163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated NUTCH-2163:
----------------------------------------
    Summary: Utilize current JVM threads to augment URLClassLoader with newly 
discovered classes  (was: Utilize current JVM threads to augment URLClassLoader 
with newlt discovered classes)

> Utilize current JVM threads to augment URLClassLoader with newly discovered 
> classes
> -----------------------------------------------------------------------------------
>
>                 Key: NUTCH-2163
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2163
>             Project: Nutch
>          Issue Type: Bug
>          Components: util
>            Reporter: Lewis John McGibbney
>
> I found [this 
> code|https://github.com/apache/nutch/compare/trunk...infolinks:nutch-osgi] a 
> while back and have been thinking about OSGi again for Nutch. 
> Our motivation is that we want to create 
> [InteractiveSeleniumHandler|https://github.com/apache/nutch/blob/trunk/src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/handlers/InteractiveSeleniumHandler.java]
> implementations dynamically and inject the code into the .job artifacts, 
> so that it can be used in the next round of fetching. 
> The code looks like the following:
> {code}
> +    List<URL> nutchConfigurationClasspathURLs = new ArrayList<URL>();
> +
> +    // Collect classpath URLs from Hadoop's Configuration class CL
> +    URLClassLoader hadoopBundleConfigurationClassLoader = (URLClassLoader) conf.getClassLoader();
> +    for (URL hadoopBundleClasspathURL : hadoopBundleConfigurationClassLoader.getURLs()) {
> +      nutchConfigurationClasspathURLs.add(hadoopBundleClasspathURL);
> +    }
> +
> +    // Append classpath URLs from current thread, which ostensibly include a Nutch job file
> +    URLClassLoader tccl = (URLClassLoader) Thread.currentThread().getContextClassLoader();
> +    for (URL tcclClasspathURL : tccl.getURLs()) {
> +      nutchConfigurationClasspathURLs.add(tcclClasspathURL);
> +    }
> +
> +    URLClassLoader nutchConfigurationClassLoader = new URLClassLoader(nutchConfigurationClasspathURLs.toArray(new URL[0]));
> +    // Reset the Configuration object's CL to the new one
> +    conf.setClassLoader(nutchConfigurationClassLoader);
> {code}
> The call to Thread.currentThread().getContextClassLoader() is the secret 
> sauce. However, I wonder what people think of this approach.
> We have discussed 
> [OSGi for Nutch|http://wiki.apache.org/nutch/NutchOSGi] from time to time 
> over the years, and I spoke with [~bdelacretaz] a good few years ago at 
> ApacheCon, but I don't have the time to implement total OSGi coverage of 
> the Nutch codebase. 
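
For discussion, here is a minimal standalone sketch of the merge step from the snippet above, written without the Hadoop dependency. The class and method names (ClassLoaderMerger, collect, merge) are hypothetical and are not part of Nutch or of the linked branch. It differs from the quoted code in two ways that are assumptions on my part rather than anything the branch does: the casts are guarded with instanceof, because on Java 9 and later the system class loader is no longer a URLClassLoader and an unchecked cast would throw ClassCastException, and duplicate URLs are dropped while preserving order. Like the quoted code, the new loader uses the default parent.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Nutch: merges the search paths of two class loaders.
final class ClassLoaderMerger {

  private ClassLoaderMerger() {}

  // Copies the loader's URLs into the map, keyed by their string form so that
  // duplicates are skipped without relying on URL.equals (which may resolve hosts).
  // Loaders that are not URLClassLoaders (or are null) contribute nothing.
  static void collect(ClassLoader loader, Map<String, URL> into) {
    if (loader instanceof URLClassLoader) {
      for (URL url : ((URLClassLoader) loader).getURLs()) {
        into.putIfAbsent(url.toExternalForm(), url);
      }
    }
  }

  // Builds a new URLClassLoader whose search path is the URLs of 'first'
  // followed by any URLs of 'second' that were not already present.
  static URLClassLoader merge(ClassLoader first, ClassLoader second) {
    Map<String, URL> urls = new LinkedHashMap<>();
    collect(first, urls);
    collect(second, urls);
    return new URLClassLoader(urls.values().toArray(new URL[0]));
  }
}
```

With this helper, the quoted block would reduce to something like conf.setClassLoader(ClassLoaderMerger.merge(conf.getClassLoader(), Thread.currentThread().getContextClassLoader())). Whether the default parent is the right choice inside a Hadoop task is an open question.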



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
