Hi there,

I have a webdb with over 60,000 pages (according to the nutch/admin
dumptxt command), and the refetch interval is set to 1
day:

<property>
  <name>db.default.fetch.interval</name>
  <value>1</value>
  <description>The default number of days between re-fetches of a page.
  </description>
</property>

But when I crawl from this webdb the next day,
the generate log shows only around 8,000 pages
being generated for fetching, and only 7,500 pages
are actually fetched.

Is there any reason why it behaves like this? Shouldn't all
60,000 pages be fetched this time?

Thanks,

Michael

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general