[ https://issues.apache.org/jira/browse/CONNECTORS-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karl Wright updated CONNECTORS-1009:
------------------------------------
Priority: Major (was: Minor)
Affects Version/s: ManifoldCF 1.7
Fix Version/s: ManifoldCF 1.7
> Cmis Repository Connector does not handle Document updating properly
> --------------------------------------------------------------------
>
> Key: CONNECTORS-1009
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1009
> Project: ManifoldCF
> Issue Type: Bug
> Components: CMIS connector
> Affects Versions: ManifoldCF 1.7
> Reporter: Prasad Perera
> Assignee: Karl Wright
> Fix For: ManifoldCF 1.7
>
> Attachments: std_logs.txt, std_prints.diff
>
>
> As part of the fix for CONNECTORS-1004, it seems that CmisRepositoryConnector
> does not handle document updates properly.
> Scenario:
> * Create a continuous crawling job using CmisRepositoryConnector.
> * Update a document on the repository side.
> * The document keeps being submitted to the OutputConnector at each crawling
> interval, even though it was not updated again afterwards.
> One possible fix is in CmisRepositoryConnector:processDocument, at the call
> activities.ingestDocumentWithException(nodeId, version, documentURI, rd);
> The documentURI should point to the old document URI. (It now points to the
> latest documentURI discovered, which may be confusing the document
> references?)
> Also, in ECM systems such as Alfresco, the document IDs include the version
> number as well.
> Ex: workspace://SpacesStore/8e12a887-3fa8-48d6-8516-5bcfad358ba2;1.0 -->
> version 1.0
> workspace://SpacesStore/8e12a887-3fa8-48d6-8516-5bcfad358ba2;1.1 -->
> version 1.1
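
The ID layout in the example above can be split into a version-independent part and a version label. What follows is only a minimal sketch in Java, assuming the ';' separator shown in the Alfresco example holds for every ID (other CMIS repositories may format IDs differently). The class VersionedObjectId and its fields are hypothetical names used for illustration; they are not part of ManifoldCF or of any CMIS client library.

```java
// Hypothetical helper for illustration; not part of ManifoldCF or any CMIS API.
final class VersionedObjectId {
    final String baseId;        // everything before the last ';'
    final String versionLabel;  // everything after the last ';', or "" if absent

    private VersionedObjectId(String baseId, String versionLabel) {
        this.baseId = baseId;
        this.versionLabel = versionLabel;
    }

    // Assumption: the version label, when present, follows the last ';'.
    static VersionedObjectId parse(String objectId) {
        int sep = objectId.lastIndexOf(';');
        if (sep < 0) {
            return new VersionedObjectId(objectId, "");
        }
        return new VersionedObjectId(
            objectId.substring(0, sep), objectId.substring(sep + 1));
    }

    public static void main(String[] args) {
        VersionedObjectId id = parse(
            "workspace://SpacesStore/8e12a887-3fa8-48d6-8516-5bcfad358ba2;1.1");
        System.out.println(id.baseId + " --> version " + id.versionLabel);
    }
}
```

With this split, the two example IDs share the same baseId and differ only in versionLabel, which is what makes them look like two separate documents when the full ID is used as the document identifier.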
> When we set up a query to crawl a repository folder, we discover content by
> referring to the child nodes. Because of that, it now seems to queue all the
> document versions and submit them to the OutputConnector, thus producing
> duplicate documents on the output (search) side.
> Is there a way to avoid this problem? It would be great if the repository
> connector could just take the latest document version and submit it as an
> update.
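
The request above, keeping only the latest version of each document, can be sketched as a filter over the discovered IDs. This is not the connector's actual behaviour or the eventual fix, only an illustration under stated assumptions: IDs carry the version after a ';' as in the Alfresco example, and version labels are dotted integers such as 1.0 or 1.10 (labels of any other shape are not handled). LatestVersionFilter and its methods are hypothetical names, not ManifoldCF or CMIS API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of ManifoldCF or any CMIS API.
final class LatestVersionFilter {

    // Returns the label after the last ';', or "0" when there is none.
    static String labelOf(String objectId) {
        int sep = objectId.lastIndexOf(';');
        String label = sep < 0 ? "" : objectId.substring(sep + 1);
        return label.isEmpty() ? "0" : label;
    }

    // Returns the ID without its version suffix.
    static String baseOf(String objectId) {
        int sep = objectId.lastIndexOf(';');
        return sep < 0 ? objectId : objectId.substring(0, sep);
    }

    // Compares dotted integer labels component by component, so 1.10 > 1.9.
    static int compareLabels(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Keeps one ID per base ID: the one with the highest version label.
    // Results are returned in the order each document was first seen.
    static List<String> latestOnly(List<String> objectIds) {
        Map<String, String> latestByBase = new LinkedHashMap<>();
        for (String id : objectIds) {
            String base = baseOf(id);
            String current = latestByBase.get(base);
            if (current == null
                    || compareLabels(labelOf(id), labelOf(current)) > 0) {
                latestByBase.put(base, id);
            }
        }
        return new ArrayList<>(latestByBase.values());
    }
}
```

Passing both example IDs to latestOnly would leave only the ;1.1 entry, so a single document would be submitted instead of one per version.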
--
This message was sent by Atlassian JIRA
(v6.2#6252)