2009/2/3 Alexander Aristov <alexander.aris...@gmail.com>:
> So can I ask you a few questions?
>
> 1. Can I disable parse-rss and leave only feed active?

That is a good idea, because parse-rss takes priority over feed: if both
are active, parse-rss handles the feed and the feed plugin is not used.
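
For example, here is a rough sketch of the relevant property in
conf/nutch-site.xml, assuming the plugin ids are "feed" and "parse-rss" as
named in this thread. The other ids in the value are only illustrative; keep
whatever your installation already lists and just make sure parse-rss is
absent and feed is present:

```xml
<property>
  <name>plugin.includes</name>
  <!-- Assumed example value: parse-rss is left out, feed and parse-html are in.
       The remaining ids are placeholders for your existing plugin list. -->
  <value>protocol-http|urlfilter-regex|parse-html|feed|index-basic|query-(basic|site|url)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
```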

> 2. Do I need to have any other plugins enabled to have only RSS feeds parsed
> and indexed? (html, text)
>

You do not need to, but it is advisable to leave parse-html active. This way,
if a feed entry contains HTML, parse-html can parse that content.

> Can I use the crawl command (used for intranet crawling) to perform
> crawling?
>

I don't know :) I have never used it, but I think you can.

> thanks
>
> Alexander
>
>
> 2009/2/3 Doğacan Güney <doga...@gmail.com>
>
>> On Tue, Feb 3, 2009 at 10:30 AM, Alexander Aristov
>> <alexander.aris...@gmail.com> wrote:
>> > People
>> >
>> > Question about rss feed parsers.
>> >
>> > I am trying to configure Nutch to crawl RSS feeds. I have enabled the feed
>> > and parse-rss plugins. I found out that these are two separate plugins and
>> > that parse-rss is older. That's OK.
>> >
>> > I expected these parsers to produce a separate document for each
>> > item in a feed, but instead only the RSS header is parsed and stored in the
>> > index. Items are not included in the Lucene indexes.
>> >
>> > Who can point me to the configuration params I should change to have
>> > RSS items indexed?
>> >
>>
>> The feed plugin should work like that on Nutch trunk (a separate document
>> for each entry).
>>
>> > --
>> > Best Regards
>> > Alexander Aristov
>> >
>>
>>
>>
>> --
>> Doğacan Güney
>>
>
>
>
> --
> Best Regards
> Alexander Aristov
>



-- 
Doğacan Güney
