[ 
https://issues.apache.org/jira/browse/NUTCH-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14847281#comment-14847281
 ] 

Sebastian Nagel commented on NUTCH-2106:
----------------------------------------

Avoiding conflicting dependencies is the reason for the Nutch plugin system 
[[1|https://wiki.apache.org/nutch/WhatsTheProblemWithPluginsAndClass-loading]]. 
However, if a plugin depends on another plugin and both depend on the same 
library, there is no way around it: both plugins must rely on the same version 
(or on two versions with a compatible API).
- protocol-selenium depends on lib-selenium
- both depend on selenium-java (currently the same version)
- when the plugin protocol-selenium is loaded, lib-selenium.jar is simply 
added to the classpath of protocol-selenium's own class loader. The classes 
from lib-selenium.jar do not live in a class loader of their own! They are 
used directly (and not via the lib-selenium plugin instance) by classes in 
protocol-selenium.
- the same applies to protocol-interactiveselenium

As a consequence, the Selenium version used by lib-selenium dictates the 
version to be used by the two protocol plugins. So, why not bundle Selenium 
jars and dependencies in lib-selenium?
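
To make the loading behaviour described above concrete, here is a minimal 
sketch in Java. It is not Nutch's actual plugin code: the Plugin class, 
classpathOf() and exampleRegistry() are hypothetical names, the model only 
follows direct imports, and the jar names are simply the ones mentioned in 
this issue. It only shows how a single class loader per plugin ends up with 
its own jars plus the jars exported by the plugins it imports.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PluginClasspathSketch {

    /** Hypothetical, simplified stand-in for a plugin descriptor. */
    static final class Plugin {
        final String id;
        final List<String> ownJars;      // jars shipped in the plugin's folder
        final List<String> exportedJars; // jars marked with <export name="*"/>
        final List<String> imports;      // ids of the plugins this one imports

        Plugin(String id, List<String> ownJars,
               List<String> exportedJars, List<String> imports) {
            this.id = id;
            this.ownJars = ownJars;
            this.exportedJars = exportedJars;
            this.imports = imports;
        }
    }

    /**
     * Jars visible to the single class loader of the given plugin:
     * its own jars first, then the exported jars of each directly
     * imported plugin (assumption: no transitive imports in this model).
     */
    static List<String> classpathOf(String id, Map<String, Plugin> registry) {
        Plugin plugin = registry.get(id);
        List<String> classpath = new ArrayList<>(plugin.ownJars);
        for (String dependencyId : plugin.imports) {
            for (String jar : registry.get(dependencyId).exportedJars) {
                if (!classpath.contains(jar)) {
                    classpath.add(jar);
                }
            }
        }
        return classpath;
    }

    /** Example data modelled on the situation discussed in this issue. */
    static Map<String, Plugin> exampleRegistry() {
        Map<String, Plugin> registry = new LinkedHashMap<>();
        registry.put("lib-selenium", new Plugin("lib-selenium",
                List.of("lib-selenium.jar", "selenium-java-2.44.0.jar"),
                List.of("lib-selenium.jar", "selenium-java-2.44.0.jar"),
                List.of()));
        registry.put("protocol-selenium", new Plugin("protocol-selenium",
                List.of("protocol-selenium.jar"),
                List.of(),
                List.of("lib-selenium")));
        return registry;
    }

    public static void main(String[] args) {
        // prints [protocol-selenium.jar, lib-selenium.jar, selenium-java-2.44.0.jar]
        System.out.println(classpathOf("protocol-selenium", exampleRegistry()));
    }
}
```

In this model, a second selenium-java jar shipped by protocol-selenium would 
land on the same classpath as the one exported by lib-selenium, which is why 
the two versions cannot be chosen independently.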

> Runtime to contain Selenium and dependencies only once
> ------------------------------------------------------
>
>                 Key: NUTCH-2106
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2106
>             Project: Nutch
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 1.11
>            Reporter: Sebastian Nagel
>             Fix For: 1.11
>
>         Attachments: NUTCH-2106.patch
>
>
> All Selenium-based plugins contain the same dependent jars, which 
> significantly increases the size of the runtime and bin packages:
> {noformat}
> % du -hs runtime/local/plugins/*selenium/ runtime/deploy/*.job
> 25M runtime/local/plugins/lib-selenium/
> 25M runtime/local/plugins/protocol-interactiveselenium/
> 25M runtime/local/plugins/protocol-selenium/
> 182M runtime/deploy/apache-nutch-1.11-SNAPSHOT.job
> {noformat}
> Since all plugins depend on the same Selenium version we could bundle the 
> dependencies in lib-selenium and let the other plugins load it from there:
> - let lib-selenium export all dependent libs, e.g.:
> {code:xml|title=lib-selenium/plugin.xml}
> <runtime>
>   ...
>   <library name="selenium-java-2.44.0.jar">
>     <export name="*"/>
>   </library>
> {code}
> - both protocol plugins already import lib-selenium: the dependencies in 
> ivy.xml can be removed
> As expected, these changes make the runtime smaller:
> {noformat}
> 25M runtime/local/plugins/lib-selenium/
> 20K runtime/local/plugins/protocol-interactiveselenium/
> 16K runtime/local/plugins/protocol-selenium/
> 138M runtime/deploy/apache-nutch-1.11-SNAPSHOT.job
> {noformat}
> Open points:
> - I've tested only protocol-selenium using chromedriver. Should 
> protocol-interactiveselenium be tested as well?
> - What about phantomjsdriver-1.2.1.jar? It was contained in lib-selenium and 
> protocol-selenium but not protocol-interactiveselenium. Is there a reason for 
> this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
