Hi,
 
When we increased topN to 100k, why did the number of fetched URLs
increase only slightly?
 
I ran the command "grep -c fetching 2009*.indexweb.log" to count the
lines containing the word "fetching" in each log file:
   log file name                  fetched   command line
   20081001_221000.indexweb.log   30000     crawl -depth 1 -topN 30000
   20081002_221001.indexweb.log   50000     crawl -depth 1 -topN 50000
   20081203_221000.indexweb.log   53812     crawl -depth 1 -topN 100000
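In case the way I counted matters, here is a rough Python sketch of what I understand "grep -c" to be doing: it counts matching lines, so a line that mentions "fetching" twice still counts once. The function names count_fetching and count_all and the "*.indexweb.log" pattern are only mine for illustration, not anything from Nutch.

```python
import glob


def count_fetching(path, needle="fetching"):
    """Count the lines in `path` that contain `needle` (like `grep -c`)."""
    with open(path, errors="replace") as log:
        return sum(1 for line in log if needle in line)


def count_all(pattern):
    """Map each log file matching the glob `pattern` to its line count."""
    return {path: count_fetching(path) for path in sorted(glob.glob(pattern))}


if __name__ == "__main__":
    # Hypothetical pattern; adjust it to wherever the crawl logs live.
    for path, count in count_all("*.indexweb.log").items():
        print(f"{path}:{count}")
```

If the logs print more than one "fetching" line per URL, these numbers would overstate the real fetch count, so please treat them as approximate.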
 
What is the right combination of depth and topN?
We are missing some of our pages. It looks like pages beyond the 5th
URL jump / level of depth (is this called an iteration?) are no longer
fetched or indexed.

Thanks,
Ann
