My fault, actually.
        I was experimenting, so I had:
        indexed document D directly into Solr
        added a reference to doc D in processDocuments
        getDocumentVersions() was returning null for doc D
        but the document wasn't removed…

        Then I realized that ManifoldCF doesn't remove what it didn't index
itself (not all crawlers behave this way).
        So I ran another test, indexing doc D through ManifoldCF, and
everything works as expected.
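
For anyone hitting the same thing, here is a rough toy model of the behaviour described above. It is only a sketch: the class DeletionSketch and its methods are hypothetical names made up for illustration and are not the ManifoldCF API. The assumption is simply that the crawler keeps its own record of what it has indexed, and a null version only triggers a delete for documents in that record.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: hypothetical names, NOT the ManifoldCF API.
class DeletionSketch {
    // Stand-in for the target index (e.g. Solr): document id -> content.
    final Map<String, String> targetIndex = new HashMap<>();
    // Ids of the documents the crawler itself has sent to the target.
    final Set<String> indexedByCrawler = new HashSet<>();

    // Simulates adding a document to the index without going through the crawler.
    void indexDirectly(String id, String content) {
        targetIndex.put(id, content);
    }

    // Simulates one crawl step for a single document.
    // A null version stands for "the document no longer exists in the source".
    // Returns true if a deletion was sent to the target.
    boolean crawl(String id, String version, String content) {
        if (version != null) {
            targetIndex.put(id, content);
            indexedByCrawler.add(id);
            return false;
        }
        // Only documents the crawler indexed itself are removed.
        if (indexedByCrawler.remove(id)) {
            targetIndex.remove(id);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        DeletionSketch s = new DeletionSketch();
        s.indexDirectly("D", "indexed by hand");
        System.out.println(s.crawl("D", null, null));       // false
        System.out.println(s.targetIndex.containsKey("D")); // true
        s.crawl("D", "v1", "indexed by crawler");
        System.out.println(s.crawl("D", null, null));       // true
        System.out.println(s.targetIndex.containsKey("D")); // false
    }
}
```

The first null version does nothing because D only ever reached the index directly; once D has gone through the crawler, the same null version removes it.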

        Hope this helps others.
-- 
Matteo Grolla
Sourcesense - making sense of Open Source
http://www.sourcesense.com

On 16 Jun 2014, at 19:11, Karl Wright wrote:

> Hi Matteo,
> 
> The document should be deleted from the target repository when you return a
> null document version.  Why do you think it does not?
> 
> As for your second question, please read up on the various models that the
> crawler supports.  They're described pretty thoroughly in ManifoldCF in
> Action.
> 
> Karl
> 
> 
> 
> On Mon, Jun 16, 2014 at 12:47 PM, Matteo Grolla <[email protected]>
> wrote:
> 
>> Hi,
>>        I see that if I return null in getDocumentVersions() (actually
>> the array values are null),
>> the method processDocuments is not called for the corresponding identifiers,
>> but the document is not deleted from the target repository.
>> I'm using the filesystem connector, so those are my settings for the
>> crawling mode.
>> Supposing that my source repository gives me the list of deleted
>> documents, what should I do to handle the deletion?
>> 
>> Cheers
>> 
>> --
>> Matteo Grolla
>> Sourcesense - making sense of Open Source
>> http://www.sourcesense.com
>> 
>> 
