Hello

I installed Nutch 2.2 on my Linux machine.

I defined the seed directory with one file containing:
http://en.wikipedia.org/
http://edition.cnn.com/


I ran the following:
sh bin/nutch inject ~/DataExplorerCrawl_gpfs/seed/

After this step, the call
-bash-4.1$ sh bin/nutch readdb -stats

returns
TOTAL urls:     2
status 0 (null):        2
avg score:      1.0


Then, I ran the following:
bin/nutch generate -topN 10
bin/nutch fetch -all
bin/nutch parse -all
bin/nutch updatedb
bin/nutch generate -topN 1000
bin/nutch fetch -all
bin/nutch parse -all
bin/nutch updatedb
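
For reference, the two rounds above could also be written as a loop. This is only a sketch using the same commands as above; NUTCH and crawl_rounds are names I made up for illustration, they are not part of Nutch.

```shell
#!/bin/sh
# Sketch only: NUTCH and crawl_rounds are illustrative names, not part of Nutch.
# NUTCH is assumed to point at the nutch launcher script used above.
NUTCH="${NUTCH:-bin/nutch}"

# Run one generate/fetch/parse/updatedb cycle per topN value given as argument.
# Stops at the first command that exits with a non-zero status.
crawl_rounds() {
    for top_n in "$@"; do
        "$NUTCH" generate -topN "$top_n" || return 1
        "$NUTCH" fetch -all || return 1
        "$NUTCH" parse -all || return 1
        "$NUTCH" updatedb || return 1
    done
}
```

Calling crawl_rounds 10 1000 would issue the same eight commands in the same order as listed above.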


However, this is what the stats call returns after these steps:
-bash-4.1$ sh bin/nutch readdb -stats
status 5 (status_redir_perm):   1
max score:      2.0
TOTAL urls:     3
avg score:      1.3333334



Only 3 URLs?!
What am I missing?

Thanks

Benjamin
