Thank you very much, feng lu! I have the same problem with the URL https://www.mail-archive.com/[email protected]/msg12111.html. Can it be solved?
Regards.

2014-05-19 15:37 GMT+08:00 feng lu <[email protected]>:

> I see that the ModifiedTime used in protocol plugins that mean if the
> webpage has not changed , no need to download again. And their have also
> used in FetchSchedule implementation that used for continuously monitor a
> site and crawl updates.
>
>
> On Mon, May 19, 2014 at 2:52 PM, 韩驰 <[email protected]> wrote:
>
> > Hi everyone!
> >
> >
> > After reading the issue:
> > https://issues.apache.org/jira/browse/NUTCH-1651, I have some doubts.
> > What is the modifiedTime and prevmodifiedTime? And is the target to
> > avoid fetching the same urls when fetching for a second time?
> >
> >
> > Thank you in advance!
> >
> >
>
>
> --
> Don't Grow Old, Grow Up... :-)
>

