Hi,

Yeah, this is something I noticed a while ago too. Although it does not directly break crawling, it is not a nice implementation. Note that the Generator tries to correct for fetch times that are too far in the future (in the shouldFetch method of AbstractFetchSchedule).
As a matter of fact, I have refactored the update process slightly so that the fetch time is updated only once (directly after a fetch, that is). The best part is that this change allows running several generate-fetch cycles without running the updater every time. There is a slight downside, but I will post it in the issue after I have attached a patch for this improvement:
https://issues.apache.org/jira/browse/NUTCH-1457

Ferdy.

On Wed, Aug 15, 2012 at 2:11 PM, lin weijian <[email protected]> wrote:
>
> Hi,
> When DbUpdateReducer executes, it calls setFetchSchedule for a
> fetched page. This function adds the fetch interval to the new fetch
> time, regardless of whether it has already been added. This makes the
> fetch time grow larger and larger. It should add the fetch interval
> to the last fetch time instead.
>
> Thanks.
>
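
To make the drift described in the quoted report concrete, here is a minimal sketch in plain Java. It does not use Nutch's actual classes: Page, updateCumulative and updateFromLastFetch are hypothetical names made up for illustration, and the real DbUpdateReducer and fetch schedule code is more involved. It only shows why adding the interval to an already scheduled time drifts, while adding it to the last fetch time does not.

```java
public class FetchTimeDrift {

    // Hypothetical stand-in for a crawled page record (NOT Nutch's WebPage).
    static final class Page {
        final long lastFetchTime; // when the page was actually fetched
        final long interval;      // fetch interval
        long fetchTime;           // next scheduled fetch time

        Page(long fetchedAt, long interval) {
            this.lastFetchTime = fetchedAt;
            this.interval = interval;
            this.fetchTime = fetchedAt; // assumed: set to "now" at fetch time
        }
    }

    // Behaviour as reported: every update adds the interval on top of the
    // value already stored, so repeated updates push the time further out.
    static void updateCumulative(Page p) {
        p.fetchTime = p.fetchTime + p.interval;
    }

    // Suggested behaviour: anchor on the last fetch time, so running the
    // update more than once yields the same result.
    static void updateFromLastFetch(Page p) {
        p.fetchTime = p.lastFetchTime + p.interval;
    }

    public static void main(String[] args) {
        Page drifting = new Page(1000L, 100L);
        Page stable = new Page(1000L, 100L);

        // Simulate three updater runs without a new fetch in between.
        for (int i = 0; i < 3; i++) {
            updateCumulative(drifting);
            updateFromLastFetch(stable);
        }

        System.out.println("cumulative:      " + drifting.fetchTime);
        System.out.println("from last fetch: " + stable.fetchTime);
    }
}
```

With the anchored variant the update is idempotent, which is the property that matters when the updater may see the same fetched page more than once.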

