Boilerpipe and getting all URLs

Markus Jelsma Tue, 20 Dec 2011 07:42:39 -0800

Hi,

How can I parse documents with the Boilerpipe content handler and still be 
able to read all hyperlinks? Right now we parse twice: once to get the text 
without boilerplate, and once to get all hyperlinks.
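
To make the current setup concrete, here is a minimal sketch of the two 
passes. It is not our actual code. The class and method names are 
hypothetical, and plain JDK SAX handlers stand in for the Boilerpipe content 
handler and the link collector, so no boilerplate removal actually happens 
here. The point is only that the same document is run through the parser 
twice, once per handler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class TwoPassParse {

    // Hypothetical stand-in for the text-extracting handler:
    // it simply collects all character data (no boilerplate removal).
    static class TextHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    // Hypothetical stand-in for the link collector:
    // it records the href attribute of every <a> element.
    static class LinkHandler extends DefaultHandler {
        final List<String> links = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            if ("a".equalsIgnoreCase(qName)) {
                String href = atts.getValue("href");
                if (href != null) {
                    links.add(href);
                }
            }
        }
    }

    // One full parse of the document with a single handler.
    private static void parse(String xhtml, DefaultHandler handler) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xhtml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    // Pass 1: parse the whole document just to get the text.
    static String extractText(String xhtml) {
        TextHandler handler = new TextHandler();
        parse(xhtml, handler);
        return handler.text.toString();
    }

    // Pass 2: parse the same document again just to get the links.
    static List<String> extractLinks(String xhtml) {
        LinkHandler handler = new LinkHandler();
        parse(xhtml, handler);
        return handler.links;
    }

    public static void main(String[] args) {
        String doc = "<html><body><p>Hello "
            + "<a href=\"http://example.org/a\">link</a></p></body></html>";
        System.out.println(extractText(doc));
        System.out.println(extractLinks(doc));
    }
}
```

Each call to parse() walks the full document, which is the duplicated work 
we would like to avoid.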


Any advice?
Thanks
-- 
Markus Jelsma - CTO - Openindex
