Andrew Chen wrote:

All,

I'm working on a project where one component is looking at RSS feeds.
Although Nutch has been quite focused on crawling lots of pages without
necessarily having "freshness" as the goal, for RSS feeds it's pretty
crucial to be able to hit the server every couple of hours, rather than
once a day or less often...

Right now Page.getFetchInterval() stores an integer carrying the
number of days, and it would be great either to have it more granular
(number of hours) or for it to be a float variable where I can say 0.1
days.

I'm getting around this by storing the more granular time information
off in a database, and querying the DB to set "nextFetchTime" to be
right now + 2 hours.

Is there a better way to handle this that I'm missing?

Before vacation I was almost ready with a patch to change fetchInterval to float; however, Nutch is a rapidly moving target... I could send you what I've got, relative to a version from ca. the end of May, or you can wait 2-3 weeks until I can find the time to update it... ;-)
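
To make the idea concrete, here is a minimal sketch of how a fractional-day interval could be turned into a next fetch time. This is not the patch mentioned above and does not use any Nutch classes: the class FetchIntervalSketch and its methods intervalMillis() and nextFetchTime() are hypothetical names invented for illustration, and the choice of milliseconds as the time unit is an assumption.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Nutch: shows one way a float
// "days" interval could be converted into a next fetch timestamp.
class FetchIntervalSketch {
    private static final long MILLIS_PER_DAY = TimeUnit.DAYS.toMillis(1);

    /** Convert an interval in (possibly fractional) days to milliseconds. */
    static long intervalMillis(float days) {
        if (Float.isNaN(days) || days <= 0f) {
            throw new IllegalArgumentException("interval must be positive: " + days);
        }
        // Widen to double before multiplying to limit rounding error.
        return Math.round((double) days * MILLIS_PER_DAY);
    }

    /** Next fetch time = last fetch time + interval (both in milliseconds). */
    static long nextFetchTime(long lastFetchMillis, float days) {
        return lastFetchMillis + intervalMillis(days);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        float twoHours = 2f / 24f; // an RSS feed refetched every two hours
        System.out.println("interval (ms): " + intervalMillis(twoHours));
        System.out.println("next fetch at: " + nextFetchTime(now, twoHours));
    }
}
```

With something along these lines, "now + 2 hours" becomes an interval of 2/24 days, so the extra database lookup would no longer be needed just to carry the finer-grained value.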


--
Best regards,
Andrzej Bialecki

-------------------------------------------------
Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
-------------------------------------------------
FreeBSD developer (http://www.freebsd.org)



_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
