On Fri, Jul 10, 2009 at 13:21, Beats<tarun_agrawal...@yahoo.com> wrote:
>
> hi,
>
> thanx for the help
>
> but it is giving parsing error. is there some other changes to b made???
>
>
> the error is
> fetcher.Fetcher (Fetcher.java:output(796)) - Error parsing:
> http://www.indeed.co.in/rss: failed(2,0)
>

See http://www.indeed.co.in/robots.txt: /rss is disallowed there, so Nutch doesn't crawl it.

>
> Doğacan Güney-3 wrote:
>>
>> On Fri, Jul 10, 2009 at 10:01, Beats<tarun_agrawal...@yahoo.com> wrote:
>>>
>>> hi,
>>>
>>> i m new to nutch.
>>> i m trying to crawl and index the rss feed using feed plugin.
>>>
>>> what i want is to parse the rss page and index each item's content
>>> seperately.
>>> so that when the user search the content , the content in the item is
>>> searched and displayed...(not the whole rss feed page content).
>>>
>>
>> Try using the feed plugin. It extracts each item in rss as a different
>> page.
>>
>>> any suggestion would b appriciated..
>>>
>>>
>>> thanx in advance
>>>
>>> Beats
>>> --
>>> View this message in context:
>>> http://www.nabble.com/indexing-each-item-in-seperate-page-tp24422674p24422674.html
>>> Sent from the Nutch - User mailing list archive at Nabble.com.
>>>
>>>
>>
>>
>>
>> --
>> Doğacan Güney
>>
>>
>
> --
> View this message in context:
> http://www.nabble.com/indexing-each-item-in-seperate-page-tp24422674p24424901.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>
>

--
Doğacan Güney
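
P.S. For anyone who wants to check this kind of thing before crawling, here is a rough sketch using Python's standard urllib.robotparser. It is not how Nutch does it internally. The robots.txt text below is an assumption made up for illustration (the only thing known from this thread is that /rss is disallowed), and the agent name "nutch-test" is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: illustrative rules only. The real robots.txt on the site
# may contain more entries; this thread only states that /rss is disallowed.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /rss
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# "nutch-test" is a hypothetical agent name used only for this sketch.
rss_allowed = is_allowed(ASSUMED_ROBOTS_TXT, "nutch-test", "http://www.indeed.co.in/rss")
home_allowed = is_allowed(ASSUMED_ROBOTS_TXT, "nutch-test", "http://www.indeed.co.in/")

print("rss allowed: ", rss_allowed)
print("home allowed:", home_allowed)
```

To test against the live file instead, RobotFileParser also has set_url() and read(), which fetch robots.txt over the network.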