Hello, I'm trying to (eventually) crawl a large number of sites, one by one. After playing with Nutch and Solr for a couple of days now, I'm not really sure why crawling takes such a long time.
I was crawling ONE website with 5 pages of very minimal text content and about 10 pictures on it, and it took ~3 minutes.

- I turned off external-link crawling in the configuration.
- Command: bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 2 -topN 10000
- The URL file has one URL in it (MYDOMAIN.COM as an example!).
- conf/crawl-urlfilter.txt has one rule set: +^http://([a-z0-9]*\.)*MYDOMAIN.COM/
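To make the setup concrete, here is a rough sketch of the files and command described above. MYDOMAIN.COM is a placeholder, and the seed file name (urls/seed.txt) is an assumption, since only the urls directory is given:

    # urls/seed.txt -- seed list containing the single start URL
    http://MYDOMAIN.COM/

    # conf/crawl-urlfilter.txt -- single include rule restricting the crawl to the domain
    +^http://([a-z0-9]*\.)*MYDOMAIN.COM/

    # one-shot crawl-and-index command
    # -depth 2    : two generate/fetch/parse rounds
    # -topN 10000 : at most 10000 URLs selected per round (far more than the 5 pages here)
    bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 2 -topN 10000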
Is there a way I can speed this up?

thanks,
--i