nutch says No URLs to fetch - check your seed list and URL filters when trying to index fmforums.com

2010-04-20 Thread joshua paul
nutch says "No URLs to fetch - check your seed list and URL filters" when trying to index fmforums.com.
I am using this command:
bin/nutch crawl urls -dir crawl -depth 3 -topN 50
- urls directory contains urls.txt, which contains http://www.fmforums.com/
- crawl-urlfilter.txt contains +^http://([

Re: nutch says No URLs to fetch - check your seed list and URL filters when trying to index fmforums.com

2010-04-20 Thread joshua paul
times, to break loops
-.*(/[^/]+)/[^/]+\1/[^/]+\1/
+^http://([a-z0-9]*\.)*fmforums.com/
# skip everything else
-.

arkadi.kosmy...@csiro.au wrote on 2010-04-20 4:49 PM:
What is in your regex-urlfilter.txt?
-Original Message-----
From: joshua paul [mailto:jos...@neocodesoftware
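
The filter rules quoted in the reply above can be checked against the seed URL outside of Nutch. The following is a minimal sketch, not Nutch's own code: it assumes the filter evaluates rules top to bottom, that the first rule whose pattern is found anywhere in the URL decides the outcome (+ accepts, - rejects), and that a URL matching no rule is rejected. Python's re module stands in for Java's regex engine here, and the names RULES and accepts are hypothetical.

```python
import re

# Rules copied from the quoted filter file, in order. Each entry is
# (sign, pattern): "+" means accept, "-" means reject.
RULES = [
    ("-", r".*(/[^/]+)/[^/]+\1/[^/]+\1/"),          # repeated path segments (loops)
    ("+", r"^http://([a-z0-9]*\.)*fmforums.com/"),  # the fmforums.com domain
    ("-", r"."),                                    # skip everything else
]


def accepts(url, rules=RULES):
    """Return True if the first rule whose pattern is found in url is a '+' rule."""
    for sign, pattern in rules:
        if re.search(pattern, url):
            return sign == "+"
    # Assumption: a URL that matches no rule at all is rejected.
    return False


if __name__ == "__main__":
    for url in ("http://www.fmforums.com/", "http://example.com/"):
        print(url, "accepted" if accepts(url) else "rejected")
```

Under these assumptions the seed URL http://www.fmforums.com/ passes the quoted rules, which would suggest looking at which filter file the crawl command actually reads rather than at these patterns.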

Re: nutch says No URLs to fetch - check your seed list and URL filters when trying to index fmforums.com

2010-04-21 Thread joshua paul
filter.txt?
-Original Message-----
From: joshua paul [mailto:jos...@neocodesoftware.com]
Sent: Wednesday, 21 April 2010 9:44 AM
To: nutch-user@lucene.apache.org
Subject: nutch says No URLs to fetch - check your seed list and URL filters when trying to index fmforums.com

nutch says No URL