[ https://issues.apache.org/jira/browse/NUTCH-2083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated NUTCH-2083:
----------------------------------------
Attachment: NUTCH-2083v2.patch
Hi [~kwhitehall], please see the attached patch.
* It adds the default configuration to HttpWebClient and to
nutch-default.xml instead of nutch-site.xml. The latter should be kept empty.
* The patch includes instructions in
src/plugin/protocol-selenium/README.md, including references to the
Selenium hub/grid setup. It is up to users to define that setup themselves. If
we provide all of the documentation, it goes out of date, which is painful to maintain.
Please review, test and comment. Thank you [~kwhitehall]
> Implement functionality to shadow nutch-selenium-grid-plugin from Mo Omer
> -------------------------------------------------------------------------
>
> Key: NUTCH-2083
> URL: https://issues.apache.org/jira/browse/NUTCH-2083
> Project: Nutch
> Issue Type: Improvement
> Components: plugin
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Fix For: 1.11
>
> Attachments: NUTCH-2083.patch, NUTCH-2083v2.patch
>
>
> This issue should augment the lib-selenium class
> src/plugin/lib-selenium/src/java/org/apache/nutch/protocol/selenium/HttpWebClient.java
> and implement the same functionality as provided by [~momer]'s
> [nutch-selenium-grid-plugin|https://github.com/momer/nutch-selenium-grid-plugin].