crawl returned just one url

Rizwan Raza Fri, 24 Dec 2010 19:59:53 -0800

Hi guys:

I am using Nutch 1.2. I crawled www.thedogplace.com using the command
below:


./bin/nutch crawl urls -dir crawl-thedogplace -depth 3 >& crawl.log

It finished the crawl successfully.

I then checked the stats to see how many URLs it retrieved, and I was
surprised to see it returned only 1 URL.

From the readdb command below

./bin/nutch readdb crawl/crawldb -dump outdir

it dumped the following output

http://www.thedogplace.com/ Version: 7
Status: 4 (db_redir_temp)
Fetch time: Sun Jan 23 01:54:24 CST 2011
Modified time: Wed Dec 31 18:00:00 CST 1969
Retries since fetch: 0
Retry interval: 2592000 seconds (30 days)
Score: 1.0
Signature: null
Metadata: _pst_: temp_moved(13), lastModified=0: http://www.thedogplace.com/Main/Default.aspx?p=5&s=0
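
For anyone who wants to tally a dump like the one above, here is a minimal Python sketch. It assumes the text layout shown in this message (a URL line ending in "Version: N", followed by a "Status:" line); the function name summarize_dump and the embedded sample are only illustrative and are not part of Nutch.

```python
import re
from collections import Counter

# Sample record copied from the readdb dump in this message.
SAMPLE_DUMP = """\
http://www.thedogplace.com/ Version: 7
Status: 4 (db_redir_temp)
Fetch time: Sun Jan 23 01:54:24 CST 2011
Modified time: Wed Dec 31 18:00:00 CST 1969
Retries since fetch: 0
Retry interval: 2592000 seconds (30 days)
Score: 1.0
Signature: null
Metadata: _pst_: temp_moved(13), lastModified=0: http://www.thedogplace.com/Main/Default.aspx?p=5&s=0
"""

# Assumed layout: each record starts with "<url> Version: <n>".
URL_RE = re.compile(r"^(https?://\S+)\s+Version:\s+\d+")
# Assumed layout: "Status: <code> (<name>)".
STATUS_RE = re.compile(r"^Status:\s+(\d+)\s+\((\w+)\)")


def summarize_dump(text):
    """Return (record URLs, Counter of status names) for a crawldb text dump."""
    urls = []
    statuses = Counter()
    for line in text.splitlines():
        url_match = URL_RE.match(line)
        if url_match:
            urls.append(url_match.group(1))
            continue
        status_match = STATUS_RE.match(line)
        if status_match:
            statuses[status_match.group(2)] += 1
    return urls, statuses


urls, statuses = summarize_dump(SAMPLE_DUMP)
print(len(urls), dict(statuses))
```

On this sample it finds a single record, and that record's status is a temporary redirect rather than a fetched page.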

I was expecting the crawl to bring back multiple URLs, but it brought back only
http://www.thedogplace.com/Main/Default.aspx?p=5&s=0

Is there anything I am missing?

Thanks
-rizwan
