Raghavendra Prabhu wrote:
> Hi Andrzej,
> After applying the patch, I found some strange behaviour.
> The fetch list for each URL was being created even though
> db.default.fetch.interval had not been reached.
You probably forgot to change the interval from days to seconds. It's
now expressed in seconds. This defines the maximum allowed interval, and
any pages with an interval higher than that will be refetched anyway - so
if it's 30 (seconds :) ) then there is a high probability that you will
reach this limit before each cycle completes...
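
To make the unit change concrete, here is a minimal sketch (not Nutch code) of
converting a day-based value to seconds and of the "refetched anyway" rule. The
class and method names are hypothetical, and capping a page's interval at the
maximum is only one assumed reading of that rule; the only name taken from this
thread is db.default.fetch.interval.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Nutch: illustrates the days-to-seconds change.
final class FetchIntervalUnits {

    // Convert an interval written in days (old convention) to seconds (new convention).
    static long daysToSeconds(long days) {
        return TimeUnit.DAYS.toSeconds(days);
    }

    // Assumption: a page whose own interval exceeds the configured maximum is
    // treated as if its interval were the maximum, so it is refetched anyway.
    static long effectiveIntervalSeconds(long pageIntervalSeconds, long maxIntervalSeconds) {
        return Math.min(pageIntervalSeconds, maxIntervalSeconds);
    }
}
```

Under this reading, a value of 30 left over from the day-based setting caps every
page at 30 seconds, which is shorter than a typical crawl cycle, so every URL
looks due again on each run.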
> I thought it was supposed to work in this order:
> 1) For the particular URL/file, get the db fetch interval (which changes).
> 2) If the current date exceeds the db fetch interval, generate a fetch list
> for that file/URL.
> 3) When the fetch list is processed, the file's modified date is checked to
> decide whether to fetch the latest contents of the file/URL.
> It is supposed to function in the above manner, right? Did I miss
> anything?
Yes, this is how it's supposed to work.
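
The three steps above could be sketched roughly as follows. This is only an
illustration under stated assumptions: the class, the method names and the use
of epoch seconds are hypothetical and do not correspond to actual Nutch APIs.

```java
// Hypothetical sketch of the three steps; names are illustrative, not Nutch APIs.
final class RefetchDecision {

    // Steps 1-2: a URL goes on the fetch list once its own fetch interval
    // has elapsed since the last fetch (all times in epoch seconds).
    static boolean isDue(long nowSeconds, long lastFetchSeconds, long fetchIntervalSeconds) {
        return nowSeconds - lastFetchSeconds >= fetchIntervalSeconds;
    }

    // Step 3: for a URL on the fetch list, download the content only if it
    // was modified after the last fetch.
    static boolean shouldDownload(long lastModifiedSeconds, long lastFetchSeconds) {
        return lastModifiedSeconds > lastFetchSeconds;
    }
}
```

A URL is refetched only when isDue(...) and then shouldDownload(...) both hold;
with an interval of 30 seconds, isDue(...) is true on practically every cycle.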
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general