What is in your regex-urlfilter.txt?
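
If it helps, here is a rough way to check a seed URL against a set of filter rules outside of Nutch. It is only a sketch under a few assumptions: that rules are read top to bottom, that the first matching pattern decides (a leading + includes, a leading - excludes), and that a URL matching no rule is dropped. The function name and the rules list are hypothetical stand-ins, not your actual file, and Python's re module is not identical to the Java regex engine Nutch uses, so treat the result as a hint only.

```python
import re

def first_match_decision(rules, url):
    """Return True (include) or False (exclude) from the first matching rule,
    or None if no rule matches at all."""
    for line in rules:
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            continue
        # Unanchored search: a pattern only has to match somewhere in the URL.
        if re.search(pattern, url):
            return sign == "+"
    return None

# Hypothetical filter contents, for illustration only.
rules = [
    "# skip URLs containing certain characters as probable queries",
    "-[?*!@=]",
    r"+^http://([a-z0-9]*\.)*fmforums.com/",
]

print(first_match_decision(rules, "http://www.fmforums.com/"))
print(first_match_decision(rules, "http://www.fmforums.com/index.php?showforum=1"))
```

With these stand-in rules the seed URL is accepted, while a URL containing a query string is rejected by the earlier - rule before the + rule is ever reached. Rule order is worth checking in both filter files.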

> -----Original Message-----
> From: joshua paul [mailto:jos...@neocodesoftware.com]
> Sent: Wednesday, 21 April 2010 9:44 AM
> To: nutch-user@lucene.apache.org
> Subject: nutch says No URLs to fetch - check your seed list and URL
> filters when trying to index fmforums.com
>
> nutch says No URLs to fetch - check your seed list and URL filters when
> trying to index fmforums.com.
>
> I am using this command:
>
> bin/nutch crawl urls -dir crawl -depth 3 -topN 50
>
> - urls directory contains urls.txt which contains
> http://www.fmforums.com/
> - crawl-urlfilter.txt contains +^http://([a-z0-9]*\.)*fmforums.com/
>
> Note - my nutch setup indexes other sites fine.
>
> For example I am using this command:
>
> bin/nutch crawl urls -dir crawl -depth 3 -topN 50
>
> - urls directory contains urls.txt which contains
> http://dispatch.neocodesoftware.com
> - crawl-urlfilter.txt contains
> +^http://([a-z0-9]*\.)*dispatch.neocodesoftware.com/
>
> And nutch generates a good crawl.
>
> How can I troubleshoot why nutch says "No URLs to fetch"?