[ 
https://issues.apache.org/jira/browse/NUTCH-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris A. Mattmann resolved NUTCH-2141.
--------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.11

Thanks [~BalaJira] [[email protected]] plenty to  improve on but a great start!

{noformat}
[chipotle:~/tmp/nutch1.11] mattmann% svn commit -m "Fix for NUTCH-2141: Change 
the InteractiveSelenium plugin handler Interface to return page content 
contributed by Balaji <[email protected]> this closes #77 #75"
Sending        CHANGES.txt
Sending        
src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/HttpResponse.java
Sending        
src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/handlers/DefalultMultiInteractionHandler.java
Sending        
src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/handlers/DefaultClickAllAjaxLinksHandler.java
Sending        
src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/handlers/DefaultHandler.java
Sending        
src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/handlers/InteractiveSeleniumHandler.java
Transmitting file data ......
Committed revision 1709307.
[chipotle:~/tmp/nutch1.11] mattmann% 
{noformat}


> Change the InteractiveSelenium plugin handler Interface to return page content
> ------------------------------------------------------------------------------
>
>                 Key: NUTCH-2141
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2141
>             Project: Nutch
>          Issue Type: Improvement
>          Components: plugin
>            Reporter: Balaji Gurumurthy
>            Assignee: Chris A. Mattmann
>              Labels: selenium
>             Fix For: 1.11
>
>
> The handler interface in the protocol-interactiveselenium plugin currently 
> provide methods to manipulate the page content and the HTTPResponse class 
> read's the page content from the driver. This limits the amount of HTML 
> content that could be returned to nutch.
> The processDriver method could return a String object instead. This is 
> particularly helpful  in cases such as handling pagination when multiple 
> pages' content can be appended and returned from the handler. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to