Re: Is there a way to tell nutch fetcher not to parse for text in the page? (i.e. just links)

joel.gump Fri, 26 Oct 2007 05:45:43 -0700

Maybe you could try HTML::LinkExtractor:

http://search.capan.org/~podmaster/HTML-LinkExtractor-0.13
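
For illustration, here is a minimal sketch of the same idea (pull out only the links and ignore the page text) using just the Python standard library. This is not the Perl module linked above and it is not part of Nutch. The names LinkCollector and extract_links are made up for this example, and treating only href and src attributes as links is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects link targets from start tags and ignores all text content."""

    # Attributes treated as links here; an illustrative assumption.
    LINK_ATTRS = {"href", "src"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in self.LINK_ATTRS and value:
                # Resolve relative links against the URL of the page.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs found in html, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    page = '<a href="/a">x</a><img src="b.png"><p>some text</p>'
    for link in extract_links(page, "http://example.com/dir/page.html"):
        print(link)
```

Since no text handler is defined, the page text is simply skipped, so something like this would have to run as a separate step on the fetched content rather than inside the fetcher itself.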


eyal edri wrote:

Hi,

Is there a way to tell Nutch not to parse the pages it fetches, i.e. to just
extract the links from them?
I know there is a "-noParsing" option, but I still need to download some
content types using the parse-XXX plugins, so I'm not sure it will work if I
use that option.

Thank you,
