[
https://issues.apache.org/jira/browse/NUTCH-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598479#comment-14598479
]
Sebastian Nagel edited comment on NUTCH-2045 at 6/23/15 10:20 PM:
------------------------------------------------------------------
+1 to patch 2.x
Is 1.x (1.10) really affected? BasicIndexingFilter uses the fetch time of the
fetch datum (from segment) for "tstamp". The next fetch time is contained in
CrawlDb's crawl datum, which is not passed to the indexing filters.
was (Author: wastl-nagel):
Is 1.x (1.10) really affected? BasicIndexingFilter uses the fetch time of the
fetch datum (from segment) for "tstamp". The next fetch time is contained in
CrawlDb's crawl datum, which is not passed to the indexing filters.
> index-basic incorrect assignment of next fetch time (page.getFetchTime()) as
> page fetch time
> --------------------------------------------------------------------------------------------
>
> Key: NUTCH-2045
> URL: https://issues.apache.org/jira/browse/NUTCH-2045
> Project: Nutch
> Issue Type: Bug
> Components: plugin
> Affects Versions: 2.3, 1.10
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Fix For: 1.11, 2.3.1
>
> Attachments: NUTCH-2045.patch
>
>
> The issue here was flagged up when using the indexer-elastic plugin, where the
> page fetch time is incorrectly assigned as the NEXT fetch time as opposed to
> the time at which the page was actually fetched (prevFetchTime).
> The ML thread for this issue can be found below
> http://www.mail-archive.com/user%40nutch.apache.org/msg13661.html
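
For illustration only, the mix-up described in the report can be sketched as below. PageRecord, indexBuggy and indexFixed are hypothetical stand-ins invented for this sketch; they are not Nutch's WebPage or BasicIndexingFilter, and the sketch is not the attached patch. It only assumes, as the report states, that one field holds the next scheduled fetch time and another (prevFetchTime) holds the time of the actual fetch.

```java
import java.util.HashMap;
import java.util.Map;

public class TstampSketch {

    /** Hypothetical stand-in for a stored page record; NOT Nutch's WebPage class. */
    static final class PageRecord {
        final long fetchTime;      // when the page is next due to be fetched
        final long prevFetchTime;  // when the page was actually fetched

        PageRecord(long fetchTime, long prevFetchTime) {
            this.fetchTime = fetchTime;
            this.prevFetchTime = prevFetchTime;
        }
    }

    /** Behaviour described in the report: "tstamp" is taken from the NEXT fetch time. */
    static Map<String, Long> indexBuggy(PageRecord page) {
        Map<String, Long> doc = new HashMap<>();
        doc.put("tstamp", page.fetchTime);
        return doc;
    }

    /** Intended behaviour: "tstamp" reflects when the page was actually fetched. */
    static Map<String, Long> indexFixed(PageRecord page) {
        Map<String, Long> doc = new HashMap<>();
        doc.put("tstamp", page.prevFetchTime);
        return doc;
    }

    public static void main(String[] args) {
        long fetchedAt = 1_435_000_000_000L;              // arbitrary example timestamp (ms)
        long refetchInterval = 30L * 24 * 60 * 60 * 1000; // assumed 30-day interval (ms)
        PageRecord page = new PageRecord(fetchedAt + refetchInterval, fetchedAt);

        System.out.println("buggy tstamp: " + indexBuggy(page).get("tstamp"));
        System.out.println("fixed tstamp: " + indexFixed(page).get("tstamp"));
    }
}
```

With the buggy variant the indexed "tstamp" lies one refetch interval in the future, which is how the problem would show up in an index such as the one written by indexer-elastic.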