unsubscribe

From: Karl Wright <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, 13 March 2018 20:51
To: "[email protected]" <[email protected]>
Subject: Re: Modify job to add excludes files and directory

Hi Maxence,

If you EXPORT a working job as JSON and then IMPORT the exported JSON into a new job, is that new job broken?

Karl

On Tue, Mar 13, 2018 at 1:50 PM, msaunier <[email protected]> wrote:

Hello Karl,

I have created 3 situations:

1. Job created manually (1_job_manually.json | 1_job_manually.png)
2. Job created with the script, then the order modified manually (2_job_mixte.json | 2_job_mixte.png)
3. Job created with the script (3_job_script.json | 3_job_script.png)

I do not see the difference. Jobs 1 and 2 work correctly, with the right order, but job 3 has the included files and directories first.

Thanks,
Maxence

From: Karl Wright [mailto:[email protected]]
Sent: Monday, 12 March 2018 21:29
To: [email protected]
Cc: Fabien Harrang <[email protected]>; REUILLON Dominique <[email protected]>
Subject: Re: Modify job to add excludes files and directory

Here is an idea. Define your job in the UI and use the API to fetch the JSON for it.

Karl

On Mon, Mar 12, 2018, 12:51 PM Karl Wright <[email protected]> wrote:

I will need to look at this later tonight before I can respond in detail. The document specification part of the API uses EXACTLY the same data as is stored for the job. The only difference is that the job specification is stored in XML, not JSON. The converters between the two do preserve ordering, however.

Karl

On Mon, Mar 12, 2018 at 12:38 PM, msaunier <[email protected]> wrote:

1: I think I have found a problem in the file system connector part of this page: https://manifoldcf.apache.org/release/release-2.9.1/en_US/programmatic-operation.html

The page shows this JSON:

{"startpoint":[{"_attribute_path":"c:\path_to_files","include":[{"_attribute_type":"file","_attribute_match":"*.txt"},{"_attribute_type":"file","_attribute_match":"*.doc"\,"_attribute_type":"directory","_attribute_match":"*"],"exclude":["*.mov"]]}

I think the JSON syntax is bad. I think the correct JSON is:

{"startpoint":[{"_attribute_path":"c:\\path_to_files","include":[{"_attribute_type":"file","_attribute_match":"*.txt"},{"_attribute_type":"file","_attribute_match":"*.doc","_attribute_type":"directory","_attribute_match":"*"}],"exclude":["*.mov"]}]}

Corrections list:

{"startpoint":[{"_attribute_path":"c:\\path_to_files","include":[{"_attribute_type":"file","_attribute_match":"*.txt"},{"_attribute_type":"file","_attribute_match":"*.doc"\,"_attribute_type":"directory","_attribute_match":"*"}],"exclude":["*.mov"]}]}

But this configuration does not work with the Windows Share connector: syntax error on the exclude.

2: For my problem, the JSON format is not the issue. It works. I attach the JSON generated by my Python script and my database (srvics33.json). If I open the interface after PUTting the configuration, the included files come first and the excluded ones second (image1.png). In my JSON I put the excludes first, but they end up second. I am forced to go into the interface and modify the order manually to obtain a good result (image2.png).

Can I set an order parameter [1-*] to place the excluded files and directories first?

Thanks.

Maxence

From: Karl Wright [mailto:[email protected]]
Sent: Monday, 12 March 2018 14:38
To: [email protected]
Cc: Fabien Harrang <[email protected]>; REUILLON Dominique <[email protected]>
Subject: Re: Modify job to add excludes files and directory

Hi Maxence,

You can have as many clauses in your JSON rule list as you like. You do not need to have both include and exclude rules in each clause. So you can do in the JSON precisely what you do in the UI.

Thanks,
Karl

On Mon, Mar 12, 2018 at 9:07 AM, msaunier <[email protected]> wrote:

OK. I have read this in the documentation:

Rules are evaluated from top to bottom, and the first rule that matches the file name is the one that is chosen.

But with the API, if I PUT a new job definition with the right order, ManifoldCF always puts the included documents first. If I need the excludes first, I cannot do it with the API definition. I attach the JSON to this email.

Does the API have an order parameter for the startpoint and the included and excluded files/directories?

(PS: I prefer to exclude first and then include * to have total control over the document management system (GED), to keep an eye on the documents.)
(PS2: I generate this JSON and send it with a Python script, and that works well.)

Thanks

From: Karl Wright [mailto:[email protected]]
Sent: Friday, 9 March 2018 12:53
To: [email protected]
Cc: Fabien Harrang <[email protected]>; REUILLON Dominique <[email protected]>
Subject: Re: Modify job to add excludes files and directory

Hi Maxence,

In the middle of a job run, if you change the specification of what documents are included and excluded, the implementation of the connector determines how it will behave.
There is no guarantee that documents that are excluded will be removed, for example if the connector filters documents only when they are queued. You may need to run the job a second time to be sure everything is removed. So the official answer is "it depends".

Karl

On Fri, Mar 9, 2018 at 5:38 AM, msaunier <[email protected]> wrote:

Hello Karl,

If I add new files and directories to exclude to a live job, does ManifoldCF delete the previously indexed files that match these exclusions? Or do I need to reseed all of my documents?

Thank you.
Maxence SAUNIER
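
The documentation sentence quoted in this thread (rules are evaluated from top to bottom, and the first matching rule is chosen) explains why the order of include and exclude rules matters. Below is a minimal Python sketch of that behaviour. The function name `first_match`, the `(action, pattern)` tuple layout, the use of `fnmatchcase` for wildcard matching, and returning `None` when nothing matches are all assumptions made for illustration; they are not ManifoldCF's own code or matching semantics.

```python
from fnmatch import fnmatchcase


def first_match(rules, name):
    """Return the action of the first rule whose pattern matches name.

    rules is an ordered list of (action, pattern) tuples (hypothetical layout).
    Returns None when no rule matches (a choice made for this sketch only).
    """
    for action, pattern in rules:
        if fnmatchcase(name, pattern):
            return action
    return None


# Same two rules, different order.
exclude_first = [("exclude", "*.mov"), ("include", "*")]
include_first = [("include", "*"), ("exclude", "*.mov")]

print(first_match(exclude_first, "clip.mov"))  # the exclude rule is reached first
print(first_match(include_first, "clip.mov"))  # "include *" wins; the exclude is never reached
```

With the includes first, the catch-all `*` rule matches every name, so a later exclude rule can never take effect. That is consistent with the need, described in the thread, to move the excludes to the top.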

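
The JSON snippets compared earlier in this thread can be checked mechanically. The sketch below uses Python's standard `json` module with an `object_pairs_hook` that rejects repeated keys inside one object, because a plain `json.loads` silently keeps only the last value for a repeated key. The helper names `reject_duplicates` and `check` are hypothetical, and the third sample, which splits the `*.doc` and directory rules into separate objects, is an assumption about what the documentation intended. The check covers JSON syntax only and says nothing about what a given connector accepts.

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that raises if a JSON object repeats a key."""
    keys = [key for key, _ in pairs]
    dupes = sorted({key for key in keys if keys.count(key) > 1})
    if dupes:
        raise ValueError(f"duplicate keys: {dupes}")
    return dict(pairs)


def check(text):
    """Return 'ok' or an 'invalid: ...' message for a JSON string."""
    try:
        json.loads(text, object_pairs_hook=reject_duplicates)
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        return f"invalid: {exc}"
    return "ok"


# Raw strings keep the backslashes exactly as they appear in the thread.
documented = (
    r'{"startpoint":[{"_attribute_path":"c:\path_to_files","include":['
    r'{"_attribute_type":"file","_attribute_match":"*.txt"},'
    r'{"_attribute_type":"file","_attribute_match":"*.doc"\,'
    r'"_attribute_type":"directory","_attribute_match":"*"],"exclude":["*.mov"]]}'
)
proposed = (
    r'{"startpoint":[{"_attribute_path":"c:\\path_to_files","include":['
    r'{"_attribute_type":"file","_attribute_match":"*.txt"},'
    r'{"_attribute_type":"file","_attribute_match":"*.doc",'
    r'"_attribute_type":"directory","_attribute_match":"*"}],"exclude":["*.mov"]}]}'
)
# Assumption: three separate include entries were intended.
split = (
    r'{"startpoint":[{"_attribute_path":"c:\\path_to_files","include":['
    r'{"_attribute_type":"file","_attribute_match":"*.txt"},'
    r'{"_attribute_type":"file","_attribute_match":"*.doc"},'
    r'{"_attribute_type":"directory","_attribute_match":"*"}],"exclude":["*.mov"]}]}'
)

for label, text in [("documented", documented), ("proposed", proposed), ("split", split)]:
    print(label, "->", check(text))
```

The proposed correction parses as JSON, but its second include object repeats `_attribute_type` and `_attribute_match`, so a parser that keeps the last value would see only the directory rule and lose the `*.doc` rule.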