[ https://issues.apache.org/jira/browse/CONNECTORS-735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karl Wright updated CONNECTORS-735:
-----------------------------------
Fix Version/s: ManifoldCF 1.3
> Include crawling date as metadata in OutputConnector
> ----------------------------------------------------
>
> Key: CONNECTORS-735
> URL: https://issues.apache.org/jira/browse/CONNECTORS-735
> Project: ManifoldCF
> Issue Type: New Feature
> Components: Framework core
> Affects Versions: ManifoldCF 1.2
> Reporter: Stephane Gamard
> Assignee: Karl Wright
> Fix For: ManifoldCF 1.3
>
>
> While dating documents is a nightmare (not all connectors obtain their dates
> in the same way), it might be interesting to leverage the crawl itself to
> date some volatile media (such as web pages).
> In the case of web crawling, there are three dates that can certainly be
> inferred from the crawler's activity:
> - Date the page first appeared in the queue (loosely equivalent to a
> created date)
> - Date the page was last checked by the crawler (might not reflect a version
> update; the content could still be exactly the same)
> - Date of last update (since the URL exists in the queue, it might have
> changed over time and the crawler might know about this).
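
As a rough illustration of the three dates listed above, here is a minimal
Java sketch of how a crawler could track them per URL and expose them as
metadata. Everything in it is an assumption made for illustration: the class
CrawlDateTracker, its methods, and the metadata keys (crawl_first_queued,
crawl_last_checked, crawl_last_changed) are hypothetical and are not part of
the ManifoldCF API. The "version" argument stands in for whatever value the
crawler uses to detect a content change (for example a checksum).

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not a ManifoldCF class.
class CrawlDateTracker {
    private final Instant firstQueued; // when the URL first entered the queue
    private Instant lastChecked;       // most recent crawler visit, changed or not
    private Instant lastChanged;       // most recent visit that saw a new version
    private String lastVersion;        // last version string seen (e.g. a checksum)

    CrawlDateTracker(Instant queuedAt) {
        this.firstQueued = queuedAt;
    }

    // Record one crawler visit. The last-changed date only moves forward
    // when the version string differs from the one seen on the prior visit.
    void recordCheck(Instant when, String version) {
        lastChecked = when;
        if (!version.equals(lastVersion)) {
            lastVersion = version;
            lastChanged = when;
        }
    }

    // Expose the dates as string metadata (ISO-8601), skipping unset ones.
    Map<String, String> toMetadata() {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("crawl_first_queued", firstQueued.toString());
        if (lastChecked != null) {
            metadata.put("crawl_last_checked", lastChecked.toString());
        }
        if (lastChanged != null) {
            metadata.put("crawl_last_changed", lastChanged.toString());
        }
        return metadata;
    }
}
```

In this sketch, a page that is checked twice with the same version string
keeps its earlier last-changed date while its last-checked date advances,
which matches the distinction drawn between the second and third dates.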
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira