Dzmitry,

I have only used the approach described at the link below with Nutch 1.x, but I 
have found it to be excellent and easy to extend.

http://www.atlantbh.com/precise-data-extraction-with-apache-nutch/
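
For reference, the XPath step on its own, independent of Nutch, can be sketched 
with the JDK's built-in javax.xml.xpath API. This is only a rough illustration: 
it assumes the page is already well-formed XHTML (real-world HTML normally has 
to go through a tolerant HTML parser first), and the class name, method name and 
sample page are made up for the example; they are not taken from Nutch or from 
the article above.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XPathFieldExtractor {

    // Hypothetical helper (not a Nutch API): evaluates each named XPath
    // expression against the page and returns field name -> extracted text.
    // Assumes the input is well-formed XHTML.
    public static Map<String, String> extract(String xhtml,
                                              Map<String, String> expressions)
            throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        Map<String, String> fields = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : expressions.entrySet()) {
            // evaluate(String, Object) returns the string value of the result;
            // for a node set, that is the text of the first matching node.
            fields.put(e.getKey(), xpath.evaluate(e.getValue(), doc).trim());
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample page, for illustration only.
        String page = "<html><body><h1 id=\"title\">Widget</h1>"
                + "<span class=\"price\">9.99</span></body></html>";

        Map<String, String> expressions = new LinkedHashMap<>();
        expressions.put("title", "//h1[@id='title']");
        expressions.put("price", "//span[@class='price']");

        System.out.println(extract(page, expressions));
    }
}
```

Keeping the expressions in a name-to-XPath map means that adding a field is just 
one more entry, which is roughly the kind of extension I had in mind.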


Iain

-----Original Message-----
From: Dzmitry [mailto:[email protected]] 
Sent: Wednesday, February 25, 2015 10:11 AM
To: [email protected]
Subject: custom parser (xpath)

Hi everybody,


Could you point me to a relevant sample/tutorial on how to implement a custom 
XPath parser for HTML pages in Nutch 2.x? The idea is that I want to get from a 
page not the whole page content but only a set of text values that can be 
extracted by XPath expressions.


Regards,
Dzmitry

