[
https://issues.apache.org/jira/browse/NUTCH-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14958233#comment-14958233
]
ASF GitHub Bot commented on NUTCH-2141:
---------------------------------------
GitHub user balajig17 opened a pull request:
https://github.com/apache/nutch/pull/77
fix for NUTCH-2141 contributed by Balaji Gurumurthy
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/balajig17/nutch NUTCH-2141
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/nutch/pull/77.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #77
----
commit d9486a5567ceb9a6c77e6fe3994350f37a433510
Author: Balaji <[email protected]>
Date: 2015-10-15T03:10:16Z
fix for NUTCH-2141 contributed by Balaji Gurumurthy
----
> Change the InteractiveSelenium plugin handler Interface to return page content
> ------------------------------------------------------------------------------
>
> Key: NUTCH-2141
> URL: https://issues.apache.org/jira/browse/NUTCH-2141
> Project: Nutch
> Issue Type: Improvement
> Components: plugin
> Reporter: Balaji Gurumurthy
> Labels: selenium
>
> The handler interface in the protocol-interactiveselenium plugin currently
> provides methods to manipulate the page content, and the HTTPResponse class
> reads the page content from the driver. This limits the amount of HTML
> content that can be returned to Nutch.
> The processDriver method could return a String object instead. This is
> particularly helpful in cases such as handling pagination, where the content
> of multiple pages can be appended and returned from the handler.
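
For illustration only, the proposed change might look roughly like the sketch below. It is not the code from the pull request. PageDriver is a hypothetical stand-in for Selenium's WebDriver so that the example compiles without dependencies, and ContentHandler, PaginationHandler, FakeDriver and clickNext() are names invented here. Only processDriver and the idea of returning a String come from the issue text.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Selenium's WebDriver (assumption, not the real API).
interface PageDriver {
    String getPageSource();  // HTML of the page currently loaded
    boolean clickNext();     // move to the next page; false when there is none
}

// Assumed shape of the handler once processDriver returns the page content.
interface ContentHandler {
    String processDriver(PageDriver driver);
}

// Pagination case from the issue: append every page's HTML and return it all.
class PaginationHandler implements ContentHandler {
    @Override
    public String processDriver(PageDriver driver) {
        StringBuilder content = new StringBuilder();
        do {
            content.append(driver.getPageSource());
        } while (driver.clickNext());
        return content.toString();
    }
}

// In-memory driver used only to exercise the sketch.
class FakeDriver implements PageDriver {
    private final List<String> pages;
    private int index = 0;

    FakeDriver(List<String> pages) {
        this.pages = pages;
    }

    @Override
    public String getPageSource() {
        return pages.get(index);
    }

    @Override
    public boolean clickNext() {
        if (index + 1 < pages.size()) {
            index++;
            return true;
        }
        return false;
    }
}

class PaginationDemo {
    public static void main(String[] args) {
        PageDriver driver = new FakeDriver(Arrays.asList("<p>one</p>", "<p>two</p>"));
        // The caller receives the combined content from the handler
        // instead of reading a single page from the driver afterwards.
        System.out.println(new PaginationHandler().processDriver(driver));
    }
}
```

With a void processDriver, the caller could only read whatever page the driver ended on. Returning the String lets the handler decide what content is handed back.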
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)