[ https://issues.apache.org/jira/browse/NUTCH-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516623 ]
Doğacan Güney commented on NUTCH-532:
-------------------------------------

> res.getFetchTime() - Math.round(res.getFetchInterval() * 1000d); always gives
> the last fetch time, even in AdaptiveFetchSchedule.

That's good to hear.

[...snip...]

> Regarding your suggestion to change the CrawlDatum object, I'm wondering about
> a few things:
> - Why do we keep the fetch interval as a float? Storing it as a long would
>   avoid this kind of conversion.
> - Why do we keep the fetch interval in seconds? It seems that we always use
>   it in milliseconds, so we could read it from the config files in seconds
>   and store it in the CrawlDatum in milliseconds.

Unfortunately, I don't know :). Andrzej Bialecki wrote that code and he knows the fetch interval / fetch time stuff better than I do, so he can make a much more informed comment.

> - Perhaps we could also create a new method
>   (setFetchTimeBasedOnInterval(time)) instead of repeating
>   datum.setFetchTime(fetchTime + Math.round(datum.getFetchInterval() * 1000.0d))
>   in different classes, don't you think?

+1

IMHO, any code duplicated in more than a few places should be refactored into a method.

[...snip...]

> CrawlDbMerger: wrong computation of last fetch time
> ---------------------------------------------------
>
>                 Key: NUTCH-532
>                 URL: https://issues.apache.org/jira/browse/NUTCH-532
>             Project: Nutch
>          Issue Type: Bug
>            Reporter: Emmanuel Joke
>            Assignee: Emmanuel Joke
>             Fix For: 1.0.0
>
>         Attachments: NUTCH-532.patch
>
>
> CrawlDbMerger.reduce analyses the last fetch time of each record and keeps
> the more recent record.
> This comparison is based on a FetchInterval in days:
>   resTime = res.getFetchTime() - Math.round(res.getFetchInterval() * 3600 * 24 * 1000);
> This was not really noticeable, because the Math.round method returns
> Integer.MAX_VALUE, i.e. about 25 days in milliseconds.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
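The arithmetic discussed in this thread can be sketched in Java as follows. This is an illustration only, not Nutch code: the class FetchTimeSketch, the nested Datum class (a stand-in for CrawlDatum), and the helpers buggyLastFetchTime and lastFetchTime are hypothetical names. Only setFetchTimeBasedOnInterval is the name proposed in the quoted comment, and its body is assumed from the expression quoted there. The sketch also assumes the fetch interval is a float in seconds and the fetch time a long in milliseconds, as the discussion implies.

```java
class FetchTimeSketch {

    /** Hypothetical stand-in for CrawlDatum; not the real Nutch class. */
    static final class Datum {
        private long fetchTime;            // milliseconds since the epoch
        private final float fetchInterval; // seconds (assumed from the thread)

        Datum(long fetchTime, float fetchInterval) {
            this.fetchTime = fetchTime;
            this.fetchInterval = fetchInterval;
        }

        long getFetchTime() { return fetchTime; }

        float getFetchInterval() { return fetchInterval; }

        /** Proposed helper: one place for the "time + interval" computation. */
        void setFetchTimeBasedOnInterval(long time) {
            // float * double is evaluated in double, so Math.round returns a long.
            this.fetchTime = time + Math.round(fetchInterval * 1000.0d);
        }
    }

    /** The expression from the issue description, which treats the interval as days. */
    static long buggyLastFetchTime(Datum res) {
        // float * int stays float, so Math.round(float) returns an int and
        // saturates at Integer.MAX_VALUE (about 24.9 days in milliseconds).
        return res.getFetchTime() - Math.round(res.getFetchInterval() * 3600 * 24 * 1000);
    }

    /** The corrected expression quoted at the top of the comment. */
    static long lastFetchTime(Datum res) {
        return res.getFetchTime() - Math.round(res.getFetchInterval() * 1000d);
    }

    public static void main(String[] args) {
        // A 30-day interval expressed in seconds.
        Datum res = new Datum(10_000_000_000L, 2_592_000f);
        System.out.println("buggy:     " + buggyLastFetchTime(res));
        System.out.println("corrected: " + lastFetchTime(res));
    }
}
```

The buggy variant always subtracts the same clamped amount once the product exceeds Integer.MAX_VALUE, which would explain why the error went unnoticed. Multiplying by a double literal (1000d) keeps the computation in long range.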
_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers