Thanks RemyA. My ignore outlinks setting was set to true, and most of the URLs on the feed were outlinks. That was the problem.
When I set it to false, it worked. Also, I had to remove ? and = from the regex URL filter to allow feed URLs like ?format=rss.

Thanks for your time.

On Thu, Jun 7, 2012 at 1:10 AM, Rémy Amouroux <[email protected]> wrote:

> The first problem that comes to mind: is your regexp-urlfilter accepting those
> URLs?
>
> You should also do a readseg on the crawled segment to see if those URLs
> are listed in the outlinks of the feeds.
>
> Regards
>
> RemyA
>
> On 6 June 2012 at 19:14, Shameema Umer wrote:
>
> > I have added the feed plugin to the nutch-site.xml
> >
> > and provided some feed URLs in the seed.txt.
> >
> > But Nutch is not crawling the URLs found in the feed file. Please help.
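
As a rough illustration of the regex URL filter change described at the top of this message, the following Python sketch mimics how ordered +/- rules decide whether a URL is kept. It assumes the stock regex-urlfilter.txt contains the rule -[?*!@=] followed by +. and that the first matching rule wins. The helper name accepts and the example.com URL are hypothetical and not part of Nutch.

```python
import re

# Rules are (sign, pattern) pairs checked in order; the first match decides.
# DEFAULT_RULES is assumed to mirror the stock filter; RELAXED_RULES drops ? and =.
DEFAULT_RULES = [("-", r"[?*!@=]"), ("+", r".")]
RELAXED_RULES = [("-", r"[*!@]"), ("+", r".")]


def accepts(url, rules):
    """Return True if the first rule whose pattern is found in the URL is a '+' rule."""
    for sign, pattern in rules:
        if re.search(pattern, url):
            return sign == "+"
    # No rule matched: treat the URL as rejected.
    return False


# Hypothetical feed URL with a query string, similar to ?format=rss above.
feed_url = "http://example.com/blog?format=rss"

print(accepts(feed_url, DEFAULT_RULES))
print(accepts(feed_url, RELAXED_RULES))
```

With the assumed default rule, the query-string feed URL is rejected. Once ? and = are removed from the character class, it passes through to the catch-all + rule.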

