Re: setting modifiedTime in DefaultFetchSchedule

2012-11-20 Thread Cesare Zavattari
On Tue, Nov 20, 2012 at 12:12 AM, Sebastian Nagel wastl.na...@googlemail.com wrote: Hi Cesare, Ciao Sebastian and thanks for your email. modifiedTime = fetchTime; instead of: if (modifiedTime <= 0) modifiedTime = fetchTime; This will always overwrite modified time with the time the

setting modifiedTime in DefaultFetchSchedule

2012-11-15 Thread Cesare Zavattari
Hi all, the AdaptiveFetchSchedule has the following line: if (modifiedTime <= 0) modifiedTime = fetchTime; which DefaultFetchSchedule lacks. This seems to prevent DefaultFetchSchedule from correctly handling possible 403 responses (modifiedTime seems to be always zero and HttpRequest.java doesn't set

Re: setting modifiedTime in DefaultFetchSchedule

2012-11-15 Thread Sebastian Nagel
Hi Cesare, hmhh... Good catch! The modifiedTime is also set in CrawlDbReducer.reduce right after FetchSchedule.setFetchSchedule is called and the signature hasn't changed compared to the previous fetch, cf. NUTCH-1341. At first glance, it looks like the modifiedTime is indeed never set with
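
For readers following the thread, the difference between the two assignments being discussed can be sketched in isolation. This is only an illustration, not the Nutch source: the class ModifiedTimeGuard, its two methods, and the sample timestamps are hypothetical names and values chosen for the example. Only the guarded statement itself is taken from the messages above.

```java
// Hypothetical stand-alone sketch; it does not use or reproduce Nutch classes.
public class ModifiedTimeGuard {

    // Guarded form quoted in the thread: keep a known modification time and
    // fall back to the fetch time only when none has been recorded (<= 0).
    static long guarded(long modifiedTime, long fetchTime) {
        if (modifiedTime <= 0) {
            modifiedTime = fetchTime;
        }
        return modifiedTime;
    }

    // Unconditional form: the previous modification time is always discarded
    // and replaced by the fetch time.
    static long unconditional(long modifiedTime, long fetchTime) {
        return fetchTime;
    }

    public static void main(String[] args) {
        long fetchTime = 1_353_369_600_000L; // sample value, epoch milliseconds
        long known = 1_352_000_000_000L;     // sample earlier modification time

        // No modification time recorded: both forms yield the fetch time.
        System.out.println(guarded(0L, fetchTime));
        System.out.println(unconditional(0L, fetchTime));

        // A modification time is already known: only the guarded form keeps it.
        System.out.println(guarded(known, fetchTime));
        System.out.println(unconditional(known, fetchTime));
    }
}
```

The two forms agree when no modification time is known and differ only when one is already stored, which is the case the reply on 2012-11-20 appears to be concerned with.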