Hi,

There is only one URL in the 'webpage' table. I run the command:

  bin/nutch crawl -solr http://localhost:8080/solr/collection2 -threads 10 -depth 2 -topN 10000

and then I find that the URL is crawled twice.
Here is the relevant part of hadoop.log:

55  2013-02-17 20:45:00,965 INFO fetcher.FetcherJob - fetching http://www.p5w.net/stock/lzft/gsyj/201209/t4470475.htm
84  2013-02-17 20:45:11,021 INFO parse.ParserJob - Parsing http://www.p5w.net/stock/lzft/gsyj/201209/t4470475.htm
215 2013-02-17 20:45:38,922 INFO fetcher.FetcherJob - fetching http://www.p5w.net/stock/lzft/gsyj/201209/t4470475.htm
244 2013-02-17 20:45:46,031 INFO parse.ParserJob - Parsing http://www.p5w.net/stock/lzft/gsyj/201209/t4470475.htm

Do you know how to fix this? Also, when I run the command again, the same log entries are written to hadoop.log once more. I don't understand why the 'db.fetch.interval.default' setting in my nutch-site.xml does not take effect.

Thanks.

Regards,
Rui
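P.S. For reference, the property I am referring to is defined in nutch-site.xml with a block like the one below. The value shown is only an example (2592000 seconds, i.e. 30 days, which is the stock Nutch default), not necessarily the value I am actually using:

<property>
  <name>db.fetch.interval.default</name>
  <!-- number of seconds before the same page is eligible to be re-fetched -->
  <value>2592000</value>
</property>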

