Re: Using Nutch to Crawl News via RSS

Sourajit Basak Mon, 31 Dec 2012 20:55:53 -0800

Nutch's default feedparser plugin only indexes the RSS feed itself and does not
follow the referenced URLs. Do you want to crawl the referenced URLs?
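
If the goal is an index that holds only the articles, one possible approach is to pull the item links out of the feed yourself and hand them to the crawler as a seed URL list. The sketch below is not part of Nutch or its feed plugin. It is a standalone Python illustration using only the standard library, and the function name `extract_item_links`, the sample feed, and the example.com URLs are all made up for the example. It assumes a plain RSS 2.0 feed where each `<item>` has a `<link>` child.

```python
import xml.etree.ElementTree as ET


def extract_item_links(rss_xml: str) -> list[str]:
    """Return the <link> URL of every <item> in an RSS 2.0 document.

    Hypothetical helper for illustration only; it is not a Nutch API.
    """
    root = ET.fromstring(rss_xml)
    links = []
    # Only look inside <item> elements, so the channel-level <link>
    # (the feed's own home page) is skipped.
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link.strip():
            links.append(link.strip())
    return links


# Made-up sample feed; a real feed would be downloaded first.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://news.example.com/</link>
    <item><title>First</title><link>http://news.example.com/a.html</link></item>
    <item><title>Second</title><link> http://news.example.com/b.html </link></item>
  </channel>
</rss>"""

if __name__ == "__main__":
    # Print one URL per line, the usual shape of a seed list file.
    for url in extract_item_links(SAMPLE_FEED):
        print(url)
```

Because the feed URL itself never goes into the seed list, only the article pages would be fetched and indexed. Atom feeds use a different element layout and would need separate handling.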


On Tue, Jan 1, 2013 at 4:53 AM, Rendy Bambang Junior <[email protected]> wrote:

> Hi guys,
>
> Could I use Nutch to crawl a feed, then crawl the news from that feed? I've
> succeeded in crawling the RSS feed itself, but what I want is for my index
> to contain only news, without the RSS feed. Does anybody know how to do this
> using Nutch? Or would it be better to use another tool for this?
>
> Thank you, and happy new year!
>
> --
> Regards,
> Rendy Bambang Junior
> Informatics Engineering '09
> Bandung Institute of Technology
>
