You can either add your connector's class name to the connectors.xml file, in which case it will be registered when ManifoldCF is started, or you can use the command-line command for registering a connector. See the Programmatic Operation page for that option.
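As a rough sketch of the first option, and assuming your connectors.xml follows the usual layout with one element per connector, the entry for an output connector would look something like the fragment below. The class name is the example one from your message; the display name "My File Output" is just a placeholder I made up, and the jar containing your class has to be somewhere ManifoldCF can load it from.

```xml
<!-- Sketch only: add this inside the existing <connectors> element of connectors.xml. -->
<!-- "My File Output" is a hypothetical display name; the class is the example from the question. -->
<outputconnector name="My File Output" class="sample_package.myfileoutputconnector"/>
```

Once it is registered, you should be able to use that class name in the JSON API wherever you currently use the stock FileOutputConnector class name.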
Karl

On Tue, Sep 17, 2013 at 11:47 AM, Pranesh Vadhirajan <[email protected]> wrote:

> Hi Karl,
>
> Thanks for your response. I've decided to not use my file monitoring
> approach for a more elegant solution, I think. I have started to implement
> my own File system output connector by extending the
> org.apache.manifoldcf.agents.output.filesystem.FileOutputConnector class.
> I have a question about this however: Once I've created my own
> implementation of an output connector from the above class, how can I
> register my output connector with ManifoldCF so that jobs can run properly
> with my own output connector implementation?
>
> I.e., at the moment (before implementing my own connector) when I create an
> output connection in ManifoldCF, I use the JSON API with
> "org.apache.manifoldcf.agents.output.filesystem.FileOutputConnector" as my
> class name. How can I register my own class name (let's say
> "sample_package.myfileoutputconnector", which extends
> "org.apache.manifoldcf.agents.output.filesystem.FileOutputConnector") with
> ManifoldCF, so that jobs can be defined to use my own output connector?
>
> Thanks,
> Pranesh Vadhirajan
>
> -----Original Message-----
> From: Karl Wright [mailto:[email protected]]
> Sent: Monday, September 16, 2013 5:59 PM
> To: dev
> Subject: Re: File System crawler question
>
> Hi Pranesh,
>
> The API basically allows you to do anything you can do in the UI. In the
> UI you would use the Document Status report to figure out what documents
> belongs to a given job that were in a particular state, and that's exactly
> what you will need to do here. See the "Programmatic Operation" page.
> Here are some sections of interest:
>
> http://manifoldcf.apache.org/release/trunk/en_US/programmatic-operation.html#Queue+query+parameters
>
> ...
> and the following REST operation:
>
> repositoryconnectionquery/<encoded_connection_name>
>
> Karl
>
>
> On Mon, Sep 16, 2013 at 5:50 PM, Pranesh Vadhirajan <[email protected]> wrote:
>
> > Hi All,
> >
> > I am implementing my own java client to crawl file system resources
> > via the ManifoldCF JSON based API. I have been able to define and run
> > a job to crawl a file system repository and output to a file system
> > destination.
> >
> > The trouble I'm having currently is to be able to know which documents
> > have been crawled via the ManifoldCF API. I have looked through the
> > API documentation on the ManifoldCF release pages, but I'm unable to
> > find this information. Could someone point me in the right direction?
> >
> > When I try to use java API for file system monitoring (to check on the
> > contents of the output folder), I'm having issues with the files being
> > locked during the execution of the job. Therefore, I need to query
> > ManifoldCF engine to understand what documents have been changed in
> > the output area so that I can run my file system monitoring code on a
> > different schedule.
> >
> > Please let me know if I didn't explain myself well here.
> >
> > Thanks,
> > Pranesh
> >
> > Pranesh Vadhirajan
