[ http://issues.apache.org/jira/browse/NUTCH-61?page=comments#action_12449170 ]

Andrzej Bialecki commented on NUTCH-61:
----------------------------------------

Unfortunately, this patch hasn't been applied yet, due to its complexity and lack of testing. But it will be, sooner or later, because this functionality is required for any serious use.

I'm planning to bring this patch up to date with the latest trunk, and then apply it piece-wise over the next couple of weeks.

> Adaptive re-fetch interval. Detecting unmodified content
> --------------------------------------------------------
>
>                 Key: NUTCH-61
>                 URL: http://issues.apache.org/jira/browse/NUTCH-61
>             Project: Nutch
>          Issue Type: New Feature
>          Components: fetcher
>            Reporter: Andrzej Bialecki
>         Assigned To: Andrzej Bialecki
>         Attachments: 20050606.diff, 20051230.txt, 20060227.txt, nutch-61-417287.patch
>
>
> Currently Nutch doesn't automatically adjust its re-fetch period, no matter
> whether individual pages change seldom or frequently. The goal of these
> changes is to extend the current codebase to support various possible
> adjustments to re-fetch times and intervals, and specifically a re-fetch
> schedule which tries to adapt the period between consecutive fetches to the
> period of content changes.
> Also, these patches implement checking whether the content has changed since
> the last fetch; protocol plugins are also changed to make use of this
> information, so that if content is unmodified it doesn't have to be fetched
> and processed.
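
For readers unfamiliar with the idea in the issue description, here is a minimal sketch of an adaptive re-fetch interval combined with a content-signature check. It is not taken from the attached patches and does not use any Nutch API: the class name AdaptiveIntervalSketch, the method names, the rates and the interval bounds are all hypothetical, and MD5 is used only as a convenient example digest. The assumption is that an unchanged page should have its interval lengthened, a changed page should have it shortened, and the result should stay within fixed bounds.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

/** Hypothetical illustration only; not the NUTCH-61 patch and not a Nutch class. */
public class AdaptiveIntervalSketch {
    private final long minIntervalSec; // lower bound for the re-fetch interval
    private final long maxIntervalSec; // upper bound for the re-fetch interval
    private final double incRate;      // growth factor when content is unmodified
    private final double decRate;      // shrink factor when content has changed

    public AdaptiveIntervalSketch(long minIntervalSec, long maxIntervalSec,
                                  double incRate, double decRate) {
        this.minIntervalSec = minIntervalSec;
        this.maxIntervalSec = maxIntervalSec;
        this.incRate = incRate;
        this.decRate = decRate;
    }

    /** Computes a digest of the page content (MD5 chosen as an example). */
    public static byte[] signature(byte[] content) {
        try {
            return MessageDigest.getInstance("MD5").digest(content);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** A page counts as modified if there is no old signature or the digests differ. */
    public static boolean isModified(byte[] oldSignature, byte[] newSignature) {
        return oldSignature == null || !Arrays.equals(oldSignature, newSignature);
    }

    /** Shrinks the interval for modified pages, grows it otherwise, then clamps it. */
    public long nextInterval(long currentIntervalSec, boolean modified) {
        double next = modified
                ? currentIntervalSec * (1.0 - decRate)
                : currentIntervalSec * (1.0 + incRate);
        long rounded = Math.round(next);
        return Math.max(minIntervalSec, Math.min(maxIntervalSec, rounded));
    }

    public static void main(String[] args) {
        // Assumed bounds: one hour to thirty days; assumed rates: +50% / -25%.
        AdaptiveIntervalSketch schedule =
                new AdaptiveIntervalSketch(3600L, 2592000L, 0.5, 0.25);

        byte[] previous = signature("old page".getBytes(StandardCharsets.UTF_8));
        byte[] current = signature("new page".getBytes(StandardCharsets.UTF_8));

        boolean modified = isModified(previous, current);
        long interval = schedule.nextInterval(86400L, modified);
        System.out.println("modified=" + modified + ", next interval (s)=" + interval);
    }
}
```

In a real crawler the signature comparison would only be needed when the protocol layer cannot already report that the content is unmodified (for example through a conditional HTTP request); the sketch leaves that part out.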
