Hi all, I am using the crawl tool in Nutch 0.8.1 under Cygwin, trying to retrieve pages from about 2,000 websites, and the crawl process has been running for nearly 20 hours. But for the past 10 hours, the fetch status has stayed exactly the same:

TOTAL urls:     165212
retry 0:        164110
retry 1:        814
retry 2:        288
min score:      0.0
avg score:      0.029228665
max score:      2.333
status 1 (DB_unfetched):        134960
status 2 (DB_fetched):  27812
status 3 (DB_gone):     2440

All of these numbers remain unchanged; the DB_fetched count is stuck at 27812. From the console output and hadoop.log I can see that the page-fetching process is running without any errors.
The size of the crawl db has not changed either; it stays at 328 MB. I have been trying to solve this problem for the whole of last week. Any hints on this problem would be appreciated. Thanks!
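For anyone wanting to reproduce the check: the figures above come from dumping crawldb statistics. Assuming a standard Nutch 0.8.x layout with the crawl output under `crawl/` (the path is my assumption, adjust to your crawl directory), the stats can be re-read at any time with:

```shell
# Dump CrawlDb statistics: TOTAL urls, retry counts, min/avg/max score,
# and per-status counts such as DB_unfetched / DB_fetched / DB_gone.
# "crawl/crawldb" is a hypothetical path; substitute your own crawl dir.
bin/nutch readdb crawl/crawldb -stats
```

Running this between fetch cycles and comparing the DB_fetched count is a quick way to tell whether the crawl is actually making progress.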
