Nutch's default feedparser plugin only indexes the RSS feed itself and does not follow the URLs referenced in it. Do you want to crawl the referenced URLs?
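
If the answer is yes, one possible workaround outside the feed plugin is to pull the item links out of the feed yourself and inject them as seed URLs, so that only the news pages are fetched and indexed. Below is a minimal Python sketch of that idea, not something Nutch ships. It assumes a plain RSS 2.0 feed where each `<item>` carries a `<link>`; the function name `extract_item_links` and the sample feed are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List


def extract_item_links(rss_xml: str) -> List[str]:
    """Return the <link> text of every <item> in an RSS 2.0 document.

    Assumption: plain RSS 2.0 without namespaces on <item>/<link>.
    The channel-level <link> is skipped because it is not inside an <item>.
    """
    root = ET.fromstring(rss_xml)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link.strip():
            links.append(link.strip())
    return links


# Hypothetical feed content, used only to demonstrate the function.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example news</title>
    <link>http://example.com/</link>
    <item><title>First</title><link>http://example.com/news/1</link></item>
    <item><title>Second</title><link>http://example.com/news/2</link></item>
  </channel>
</rss>"""

if __name__ == "__main__":
    # One URL per line, which is the layout a plain-text seed list uses.
    for url in extract_item_links(SAMPLE_FEED):
        print(url)
```

Redirecting the output into a seed file and injecting that instead of the feed URL would keep the feed itself out of the index, at the cost of running this step before each crawl.
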
On Tue, Jan 1, 2013 at 4:53 AM, Rendy Bambang Junior <[email protected]> wrote:
> Hi guys,
>
> could I use nutch to crawl a feed, then crawl news from that feed? I've
> been succeed in crawling the rss feed itself, but what I want is my index
> contains only news, without the rss feed. Do anybody know how to do this
> using nutch? Or, it will be better to use another tools to do this?
>
> Thank you, and happy new year!
>
> --
> Regards,
> Rendy Bambang Junior
> Informatics Engineering '09
> Bandung Institute of Technology

