Balaji Gurumurthy created NUTCH-2141:
----------------------------------------

             Summary: Change the InteractiveSelenium plugin handler Interface 
to return page content
                 Key: NUTCH-2141
                 URL: https://issues.apache.org/jira/browse/NUTCH-2141
             Project: Nutch
          Issue Type: Improvement
          Components: plugin
            Reporter: Balaji Gurumurthy


The handler interface in the protocol-interactiveselenium plugin currently 
provides methods to manipulate the page content, and the HTTPResponse class 
reads the page content from the driver. This limits the amount of HTML content 
that can be returned to Nutch.

The processDriver method could return a String object instead. This is 
particularly helpful in cases such as handling pagination, where the content 
of multiple pages can be appended and returned from the handler.
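
A minimal sketch of what the proposed change might look like follows. It is 
not the plugin's actual code: PageDriver is a hypothetical stand-in for 
Selenium's WebDriver (so the example has no external dependencies), and 
ContentReturningHandler, PaginationHandler, FakeDriver and goToNextPage are 
illustrative names assumed for this example. Only processDriver and the idea 
of returning a String come from the proposal above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Selenium's WebDriver (assumption, not the real API).
interface PageDriver {
    String getPageSource();

    // Assumed helper: moves to the next page, returns false when there is none.
    boolean goToNextPage();
}

// Sketch of the proposed contract: processDriver returns the page content
// instead of leaving the caller to read it from the driver afterwards.
interface ContentReturningHandler {
    String processDriver(PageDriver driver);

    boolean shouldProcessURL(String url);
}

// Illustrative handler that appends the content of every page it visits.
class PaginationHandler implements ContentReturningHandler {
    @Override
    public String processDriver(PageDriver driver) {
        StringBuilder content = new StringBuilder(driver.getPageSource());
        while (driver.goToNextPage()) {
            content.append(driver.getPageSource());
        }
        return content.toString();
    }

    @Override
    public boolean shouldProcessURL(String url) {
        return true; // process every URL in this sketch
    }
}

// In-memory driver used only to exercise the handler; expects at least one page.
class FakeDriver implements PageDriver {
    private final List<String> pages;
    private int index = 0;

    FakeDriver(List<String> pages) {
        this.pages = pages;
    }

    @Override
    public String getPageSource() {
        return pages.get(index);
    }

    @Override
    public boolean goToNextPage() {
        if (index + 1 < pages.size()) {
            index++;
            return true;
        }
        return false;
    }
}

class PaginationDemo {
    public static void main(String[] args) {
        PageDriver driver = new FakeDriver(Arrays.asList("<p>page 1</p>", "<p>page 2</p>"));
        System.out.println(new PaginationHandler().processDriver(driver));
    }
}
```

With a contract along these lines, the class that builds the response could 
use the returned String directly rather than reading only the page the driver 
happens to be on when the handler finishes.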




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
