So can I ask you a few questions?

1. Can I disable parse-rss and leave only feed active?
2. Do I need to have any other plugins (html, text) enabled to have only RSS feeds parsed and indexed?
3. Can I use the crawl command (used for intranet crawling) to perform the crawl?

Thanks,
Alexander

2009/2/3 Doğacan Güney <doga...@gmail.com>

> On Tue, Feb 3, 2009 at 10:30 AM, Alexander Aristov
> <alexander.aris...@gmail.com> wrote:
> > People
> >
> > Question about RSS feed parsers.
> >
> > I am trying to configure Nutch to crawl RSS feeds. I have enabled the feed
> > and parse-rss plugins. I found out that these are two separate plugins and
> > that parse-rss is older. That's OK.
> >
> > I expected these parsers to produce a separate document for each
> > item in a feed, but instead only the RSS header is parsed and stored in the
> > index. Items are not included in the Lucene indexes.
> >
> > Who can point me to the configuration params I should change to have
> > RSS feeds indexed?
> >
>
> Plugin feed should work like that on Nutch trunk (separate document for
> separate entry)
>
> > --
> > Best Regards
> > Alexander Aristov
> >
>
> --
> Doğacan Güney
>

--
Best Regards
Alexander Aristov
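
For illustration only, here is a minimal Python sketch of the behaviour discussed in this thread: one separate document per feed entry, rather than a single document for the feed header. It is not Nutch code and does not use the feed or parse-rss plugins. The sample feed, the `split_feed` helper, and the `url`/`title`/`content` field names are all made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, invented for this sketch.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <item>
      <title>First entry</title>
      <link>http://example.com/1</link>
      <description>Body of the first entry</description>
    </item>
    <item>
      <title>Second entry</title>
      <link>http://example.com/2</link>
      <description>Body of the second entry</description>
    </item>
  </channel>
</rss>"""


def split_feed(rss_text):
    """Return one dict per <item> in an RSS 2.0 feed (hypothetical helper)."""
    channel = ET.fromstring(rss_text).find("channel")
    docs = []
    for item in channel.findall("item"):
        # Each entry becomes its own document, identified by its own link.
        docs.append({
            "url": item.findtext("link", default=""),
            "title": item.findtext("title", default=""),
            "content": item.findtext("description", default=""),
        })
    return docs


docs = split_feed(SAMPLE_RSS)
for doc in docs:
    print(doc["url"], "-", doc["title"])
```

With a feed like the sample above, an index built this way would hold two entries, one per item, which is the result the original question expected.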