Hi Prasad,

Since the CMIS and Alfresco connectors do not pay attention to the scanOnly flag, they are not correctly written and should be fixed. Could you create a ticket to address this?
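
To illustrate the intended behavior, here is a minimal sketch of how a connector could honor the flag. It is not the actual CMIS or Alfresco connector code: the class ScanOnlySketch, the Ingester interface, and the simplified processDocuments signature are hypothetical stand-ins for the framework types, and the sketch assumes that a true entry in scanOnly means the document's version is unchanged.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; names and signatures are illustrative assumptions,
// not the real ManifoldCF connector API.
class ScanOnlySketch {

    /** Stand-in for the framework's ingestion activity (hypothetical). */
    interface Ingester {
        void ingest(String documentId, String version);
    }

    /**
     * Walks a batch of documents and skips ingestion for every entry whose
     * scanOnly flag is true (assumed here to mean "version unchanged").
     * Returns the identifiers that were actually ingested.
     */
    static List<String> processDocuments(String[] documentIds, String[] versions,
                                         boolean[] scanOnly, Ingester ingester) {
        List<String> ingested = new ArrayList<>();
        for (int i = 0; i < documentIds.length; i++) {
            if (scanOnly[i]) {
                // Unchanged document: do not fetch or index it again.
                continue;
            }
            ingester.ingest(documentIds[i], versions[i]);
            ingested.add(documentIds[i]);
        }
        return ingested;
    }

    public static void main(String[] args) {
        List<String> result = processDocuments(
            new String[] {"doc-a", "doc-b"},
            new String[] {"v1", "v2"},
            new boolean[] {true, false},
            (id, version) -> System.out.println("ingesting " + id + " at " + version));
        System.out.println("ingested: " + result);
    }
}
```

With a check like this in place, a re-crawl would only re-ingest documents whose flag is false, rather than every seeded document.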

Thanks,
Karl

On Tue, Jul 15, 2014 at 5:06 PM, Paththamestrige Perera <[email protected]> wrote:

> Hello All,
>
> I'm new to Apache ManifoldCF and I have spent some time referring to the
> publication 'ManifoldCF in Action' as well. I have started using the
> ManifoldCF system with the available repository connectors: the CMIS
> Repository Connector, the Alfresco Repository Connector and the File
> System Connector.
>
> I have used them as continuous crawlers with specific re-crawl intervals.
> What I have noticed is that, regardless of the document version (whether
> it has changed or not), the CMIS and Alfresco connectors process all
> seeded documents in every re-crawl job. I took a look at their
> implementations and, as far as I could see, these repository connectors
> do not use the 'scanOnly' property when processing seeded documents,
> which indicates whether the document version has changed. It seems
> intentional by design. So I'm hoping to learn why it is necessary to
> process all seeded documents (as opposed to processing only the documents
> that were updated within the re-crawl interval)?
>
> Thanks!
>
> Prasad Perera.
