[ https://issues.apache.org/jira/browse/CONNECTORS-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498003#comment-13498003 ]
Karl Wright commented on CONNECTORS-567:
----------------------------------------
I think not being able to handle deletions is a significant problem, since this
is an incremental crawler. We'd have to have a solution to that problem before
this could be a possibility. Right now the only way a deletion is detected is
by requesting the version string for a document that no longer exists.
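To make the deletion point concrete, here is a toy sketch in Python. It is not
ManifoldCF code: the dict-based repository and the names get_document_versions
and find_deleted are hypothetical stand-ins. It only shows the idea that a
version lookup which comes back empty is what signals a deletion.

```python
from typing import Dict, List, Optional


def get_document_versions(repository: Dict[str, str],
                          doc_ids: List[str]) -> Dict[str, Optional[str]]:
    """Hypothetical stand-in for a connector's version lookup.

    Returns None for any document the repository no longer has.
    """
    return {doc_id: repository.get(doc_id) for doc_id in doc_ids}


def find_deleted(known_ids: List[str], repository: Dict[str, str]) -> List[str]:
    """Previously crawled documents whose version lookup now comes back empty."""
    versions = get_document_versions(repository, known_ids)
    return [doc_id for doc_id in known_ids if versions[doc_id] is None]


# Previously crawled documents; "b" has since been removed from the repository.
known = ["a", "b", "c"]
repo = {"a": "v1", "c": "v3"}
deleted = find_deleted(known, repo)
```

If seeding alone supplied versions, a document that simply stops appearing in
the seed list would never go through this lookup, so it would never be noticed
as deleted.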
Also, FWIW, I make free copies of ManifoldCF in Action available to all
committers, upon request.
> Extended seeding interface which provides document versions
> -----------------------------------------------------------
>
> Key: CONNECTORS-567
> URL: https://issues.apache.org/jira/browse/CONNECTORS-567
> Project: ManifoldCF
> Issue Type: Wish
> Reporter: Maciej Lizewski
>
> There are some cases where the seeding function can provide the document
> version from data it already has.
> The current data flow needs one call to addSeedDocuments, then a call to
> getDocumentVersions, which essentially must fetch the same data, and after that
> one more call to processDocuments. The last one probably needs a separate call
> because it has to fetch the document body; seeding and getting versions,
> however, in many cases work on the very same data (and probably duplicate
> requests to the repository).
> Reducing the number of requests to the repository by eliminating the
> getDocumentVersions call for documents whose version was already returned by
> addSeedDocuments could significantly reduce load.
> getDocumentVersions would still be called for older documents (not returned
> by addSeedDocuments) to check whether they were modified or deleted.
> This is only a proposal...
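The proposed flow can be sketched with a toy model in Python. Nothing here is
the ManifoldCF API: ToyRepository, seed, fetch_version and crawl are
hypothetical names, and the assumption that seeding returns (id, version) pairs
for recently touched documents only is made purely for illustration. The sketch
counts version lookups to show where the saving would come from.

```python
from typing import Dict, List, Optional, Set, Tuple


class ToyRepository:
    """Hypothetical in-memory repository that counts version lookups."""

    def __init__(self, docs: Dict[str, str], recent: Set[str]) -> None:
        self.docs = docs          # doc id -> current version
        self.recent = recent      # ids that seeding reports in this pass
        self.version_requests = 0

    def seed(self) -> List[Tuple[str, str]]:
        # Assumption: seeding already sees (id, version) pairs.
        return sorted((d, v) for d, v in self.docs.items() if d in self.recent)

    def fetch_version(self, doc_id: str) -> Optional[str]:
        self.version_requests += 1
        return self.docs.get(doc_id)  # None means the document is gone


def crawl(repo: ToyRepository, previous: Dict[str, str]) -> Dict[str, List[str]]:
    """One incremental pass; previous maps doc id -> version seen last time."""
    seeded = dict(repo.seed())
    # Seeded documents carry their version, so no extra lookup is needed.
    changed = [d for d, v in seeded.items() if previous.get(d) != v]
    deleted = []
    # Only older documents, not returned by seeding, need a version lookup.
    for doc_id in sorted(set(previous) - set(seeded)):
        current = repo.fetch_version(doc_id)
        if current is None:
            deleted.append(doc_id)
        elif current != previous[doc_id]:
            changed.append(doc_id)
    return {"changed": changed, "deleted": deleted}


repo = ToyRepository(docs={"a": "v1", "b": "v2", "d": "v1"}, recent={"b", "d"})
result = crawl(repo, previous={"a": "v1", "b": "v1", "c": "v1"})
```

In this run only "a" and "c" trigger a version lookup; "b" and "d" are decided
from the seeding data alone. The deletion of "c" is still found only through
the lookup, which is the case the comment above is concerned with.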