[ https://issues.apache.org/jira/browse/NUTCH-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Nagel resolved NUTCH-2297.
------------------------------------
    Resolution: Fixed

See comments in NUTCH-2474.

> CrawlDbReader -stats wrong values for earliest fetch time and shortest interval
> -------------------------------------------------------------------------------
>
>                 Key: NUTCH-2297
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2297
>             Project: Nutch
>          Issue Type: Bug
>          Components: crawldb
>    Affects Versions: 1.13
>            Reporter: Sebastian Nagel
>            Assignee: Sebastian Nagel
>            Priority: Minor
>             Fix For: 1.14
>
>
> NUTCH-2286 added min, max and average for fetch interval and fetch time.
> When running in distributed mode (not reproducible in local mode), the
> reported minima (earliest fetch time and shortest fetch interval) may be
> wrong. The values are implausible, e.g. a shortest fetch interval longer
> than the longest one and an earliest fetch time in the year 3106:
> {noformat}
> TOTAL urls: 7180518032
>  shortest fetch interval:    175 days, 00:00:00             <<<<<< ????
>  avg fetch interval: 10 days, 08:01:36
>  longest fetch interval:     15 days, 18:00:00
>  earliest fetch time:        Thu Dec 20 05:30:00 UTC 3106   <<<<<< ????
>  avg of fetch times: Fri Feb 19 00:07:00 UTC 2016
>  latest fetch time:  Mon Jul 18 05:22:00 UTC 2016
>  retry 0:    6907984913
>  retry 1:    148125397
>  retry 2:    82761892
>  retry 3:    41645830
>  min score:  0.0
>  avg score:  0.014360981
>  max score:  9.25
>  ...
> {noformat}
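
The inconsistency in the quoted output can be checked mechanically: for any
sample, min <= avg <= max must hold. The following is only a minimal sketch in
Python, not part of Nutch and not the fix applied for this issue. The helper
name implausible_stats is hypothetical; the numbers are copied from the
-stats output quoted above.

```python
from datetime import datetime, timedelta


def implausible_stats(minimum, average, maximum):
    """Return the violations of min <= avg <= max (hypothetical helper)."""
    problems = []
    if minimum > maximum:
        problems.append("min exceeds max")
    if minimum > average:
        problems.append("min exceeds avg")
    if average > maximum:
        problems.append("avg exceeds max")
    return problems


# Fetch intervals as reported in the quoted -stats output.
shortest = timedelta(days=175)
avg_interval = timedelta(days=10, hours=8, minutes=1, seconds=36)
longest = timedelta(days=15, hours=18)

# Fetch times as reported in the quoted -stats output (all UTC).
earliest = datetime(3106, 12, 20, 5, 30)
avg_time = datetime(2016, 2, 19, 0, 7)
latest = datetime(2016, 7, 18, 5, 22)

print(implausible_stats(shortest, avg_interval, longest))
print(implausible_stats(earliest, avg_time, latest))
```

Both calls flag the minimum as larger than the average and the maximum, which
matches the two lines marked with "<<<<<< ????" in the report.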



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
