I have checked out trunk from the location below and made the build, but I can still see it crawling the same file again and again.
svn checkout http://svn.apache.org/repos/asf/manifoldcf/trunk mcf-trunk

My configuration:

Nuxeo input connector:
  Max connections: 10
  Connection type: CMIS
  Authority group: None (global authority)
  Parameters:
    username=Administrator
    password=********
    binding=atom
    protocol=http
    server=localhost
    port=8080
    path=/nuxeo/atom/cmis
    repositoryId=

Output connector: Solr connector with max connections 10. As far as I know, the output connector has no information about whether it is the same file or a different one.

Job configuration:
  Priority: 5
  Start method: Start at beginning of schedule window
  Schedule type: Rescan documents dynamically
  Minimum recrawl interval: 10 minutes
  Maximum recrawl interval: Infinity
  Expiration interval: Infinity
  Reseed interval: 60 minutes
  No scheduled run times
  No forced metadata
  Maximum hop count for link type 'child': Unlimited
  Hop count mode: Delete unreachable documents

I have only one file in my Nuxeo repository, and every 10 minutes I see the same file sent to the output connector again. That is, the call goes to the addOrReplaceDocument method inside the output connector even though there is no change to the file in the Nuxeo repository.

Regards,
Jitu

On Tue, Jul 29, 2014 at 11:27 PM, Jitu <[email protected]> wrote:

> Thanks Karl and Prasad. It's great to hear back so quickly. Thanks for the
> info, it really helped me.
>
> Thanks for the support.
>
> Regards,
> Jitu
>
> On Tue, Jul 29, 2014 at 10:41 PM, Karl Wright <[email protected]> wrote:
>
>> Hi Jitu,
>>
>> The bug is that the CMIS and Alfresco connectors reindexed documents even
>> though they had not changed. This is now corrected.
>>
>> Karl
>>
>> On Tue, Jul 29, 2014 at 12:28 PM, Jitu <[email protected]> wrote:
>>
>>> Hi Prasad,
>>> Thanks for the reply. The bug says "The CMIS and Alfresco
>>> connectors currently do not look at scanOnly but should".
>>> Does that mean the CMIS connector and Alfresco connector crawl all the
>>> files and hand them over to the output connector no matter whether they
>>> are modified or not? Ideally it should crawl a file only if it is
>>> modified. Am I correct?
>>>
>>> Regards,
>>> Jitu
>>>
>>> On Tue, Jul 29, 2014 at 9:19 PM, Paththamestrige Perera <
>>> [email protected]> wrote:
>>>
>>>> Hello Jitu, I had the same issue and this was fixed with CONNECTORS-994
>>>> <https://issues.apache.org/jira/browse/CONNECTORS-994> for MCF 1.7.
>>>> If you check out mcf-trunk, it will work as expected.
>>>>
>>>> On Tue, Jul 29, 2014 at 11:31 AM, Jitu <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am a freelancer. For my current project I am using the ManifoldCF
>>>>> framework, where I need to pull documents from a CMIS repository and
>>>>> output them to the Solr connector.
>>>>>
>>>>> But I noticed that when I set the job type to continuous, it crawls all
>>>>> the files every time, no matter whether they are modified or not. My
>>>>> requirement is to crawl the files again only if there is any
>>>>> modification.
>>>>>
>>>>> How can I do this with ManifoldCF?
>>>>>
>>>>> Regards,
>>>>> abjitu
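
For reference, the behaviour being asked for in this thread (hand a document to the output connector only when it has changed since the last crawl) can be sketched roughly as below. This is only an illustration in plain Java, not ManifoldCF code: the class VersionCheckSketch, the method processDocument, and the idea of using a single version string per document are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the ManifoldCF API.
class VersionCheckSketch {
    // Version string recorded for each document the last time it was indexed.
    private final Map<String, String> indexedVersions = new HashMap<>();
    // Stand-in for the output connector: records every document handed over.
    private final List<String> sentToOutput = new ArrayList<>();

    // Returns true if the document was handed to the output stage,
    // false if it was skipped because its version string is unchanged.
    boolean processDocument(String documentId, String currentVersion) {
        String previousVersion = indexedVersions.get(documentId);
        if (currentVersion.equals(previousVersion)) {
            return false; // unchanged since last crawl, so do not re-index
        }
        sentToOutput.add(documentId);
        indexedVersions.put(documentId, currentVersion);
        return true;
    }

    // Number of times a document was handed to the output stage.
    int outputCallCount() {
        return sentToOutput.size();
    }

    public static void main(String[] args) {
        VersionCheckSketch crawler = new VersionCheckSketch();
        // The version strings are assumed to be last-modified timestamps.
        System.out.println(crawler.processDocument("doc-1", "2014-07-29T10:00"));
        System.out.println(crawler.processDocument("doc-1", "2014-07-29T10:00"));
        System.out.println(crawler.processDocument("doc-1", "2014-07-29T10:20"));
        System.out.println(crawler.outputCallCount());
    }
}
```

With a check of this kind, a 10 minute recrawl of an unmodified file would still visit the file but would not reach the output step.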
