[
https://issues.apache.org/jira/browse/NUTCH-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2062:
----------------------------------------
Attachment: NUTCH-2062v2.patch
[~mjoyce] can you please try this patch out? I've
* renamed all relevant properties within nutch-default.xml to your new
convention of libselenium.blah
* included the new package within default.properties
* added license headers and corrected package naming within the new handlers
interfaces
* modularized
src/plugin/lib-selenium/src/java/org/apache/nutch/protocol/selenium/HttpWebClient.java
such that we can now getDriver based upon the new adaptive driver
configuration which defaults to FirefoxDriver.
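
For anyone following along, here is a minimal sketch of the kind of config-driven driver selection described in the last bullet. It is not the code from the patch. The property key `libselenium.driver`, the class `DriverSelector`, and the `DriverKind` enum are hypothetical names used for illustration only, and the enum stands in for real WebDriver classes so the example runs without Selenium on the classpath.

```java
import java.util.Locale;
import java.util.Properties;

class DriverSelector {

  // Hypothetical key following the "libselenium.*" naming convention;
  // the actual key used in the patch may differ.
  static final String DRIVER_KEY = "libselenium.driver";

  // Stand-ins for real WebDriver implementations (assumption for this sketch).
  enum DriverKind { FIREFOX, CHROME }

  // Resolve the configured driver, falling back to Firefox when the
  // property is unset or names a driver this sketch does not know.
  static DriverKind resolve(Properties conf) {
    String name = conf.getProperty(DRIVER_KEY, "firefox")
        .trim()
        .toLowerCase(Locale.ROOT);
    switch (name) {
      case "chrome":
        return DriverKind.CHROME;
      case "firefox":
      default:
        return DriverKind.FIREFOX;
    }
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    System.out.println(resolve(conf)); // no property set, so the default applies

    conf.setProperty(DRIVER_KEY, "chrome");
    System.out.println(resolve(conf));
  }
}
```

Falling back to Firefox for unknown values keeps a mistyped setting from failing the fetch outright; a real implementation might prefer to log a warning in that case.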
One thing to consider: the dependency inclusions within the new
plugin.xml may conflict with what already exists in lib-selenium and
protocol-selenium. I think we may have to ensure that these are in sync.
Excellent job on this one, Jimmy.
Please let me know how this tests out. Thanks.
> Add Plugin for interacting with Selenium WebDriver
> --------------------------------------------------
>
> Key: NUTCH-2062
> URL: https://issues.apache.org/jira/browse/NUTCH-2062
> Project: Nutch
> Issue Type: Improvement
> Components: plugin
> Affects Versions: 1.10
> Reporter: Michael Joyce
> Assignee: Michael Joyce
> Fix For: 1.11
>
> Attachments: NUTCH-2062v2.patch
>
>
> The protocol-selenium plugin is great for pulling webpages that dynamically
> load content. However, I've run into use cases where I need to actively
> interact with a page in Selenium before it becomes useful. For instance, I
> may need to paginate through a table to get all results that I'm interested
> in. This plugin will handle that use case.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)