Hi Dzmitry,

have a look at
 https://issues.apache.org/jira/browse/NUTCH-1870

Work is ongoing (I'm about to push an improved patch).
Help in testing and improving the patches is always welcome! :)
It's currently only for 1.x, but plugins are relatively easy
to port.

Best,
Sebastian

On 02/25/2015 05:11 PM, Dzmitry wrote:
> Hi everybody,
> 
> 
> Could you point me to a relevant sample/tutorial on how to implement a custom
> XPath parser for HTML pages in Nutch 2.x? The idea is that I want to get from
> a page not the whole page content but only a set of text values that can be
> extracted by XPath expressions.
> 
> 
> Regards,
> Dzmitry
> 
> —
> Sent from Mailbox
> 
