[ 
https://issues.apache.org/jira/browse/CONNECTORS-735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-735:
-----------------------------------

    Fix Version/s: ManifoldCF 1.3
    
> Include crawling date as metadata in OutputConnector
> ----------------------------------------------------
>
>                 Key: CONNECTORS-735
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-735
>             Project: ManifoldCF
>          Issue Type: New Feature
>          Components: Framework core
>    Affects Versions: ManifoldCF 1.2
>            Reporter: Stephane Gamard
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.3
>
>
> While dating documents is a nightmare (not all connectors obtain their dates 
> in the same manner), it might be interesting to leverage the crawl itself 
> to date some volatile media (such as the web). 
> In the case of web crawling, there are 3 dates that can certainly be inferred 
> from the crawler's activity: 
> - Date the page first appeared in the queue (loosely equivalent to a 
> created date)
> - Date the page was last checked by the crawler (might not reflect a version 
> update; the content could still be exactly the same)
> - Date of last update (since the URL exists in the queue, it might have 
> changed over time and the crawler might know about this). 
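
The three dates listed above could be tracked roughly as in the sketch below. This is only an illustration under stated assumptions, not ManifoldCF code: the CrawlRecord class, the record_check and to_metadata helpers, the use of a content hash to detect changes, and the crawl_* metadata field names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CrawlRecord:
    """Hypothetical per-URL queue entry; not an actual ManifoldCF class."""
    first_queued: datetime                    # date the URL first appeared in the queue
    last_checked: Optional[datetime] = None   # date of the most recent crawler visit
    last_updated: Optional[datetime] = None   # date the content was last seen to change
    content_hash: Optional[str] = None        # assumed change-detection mechanism


def record_check(record: CrawlRecord, checked_at: datetime, content_hash: str) -> CrawlRecord:
    """Record one crawler visit; bump last_updated only if the content changed."""
    record.last_checked = checked_at
    if content_hash != record.content_hash:
        # The first fetch also lands here, so it counts as the first "update".
        record.content_hash = content_hash
        record.last_updated = checked_at
    return record


def to_metadata(record: CrawlRecord) -> dict:
    """Render the known dates as ISO 8601 strings under hypothetical field names."""
    fields = {
        "crawl_first_queued": record.first_queued,
        "crawl_last_checked": record.last_checked,
        "crawl_last_updated": record.last_updated,
    }
    return {name: value.isoformat() for name, value in fields.items() if value is not None}


if __name__ == "__main__":
    queued = datetime(2013, 1, 1, tzinfo=timezone.utc)
    rec = CrawlRecord(first_queued=queued)
    record_check(rec, datetime(2013, 1, 2, tzinfo=timezone.utc), "hash-a")
    record_check(rec, datetime(2013, 1, 3, tzinfo=timezone.utc), "hash-a")  # unchanged content
    print(to_metadata(rec))
```

In this sketch a second visit that finds identical content moves only the last-checked date, which matches the distinction drawn in the list between "last checked" and "last update".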

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
