Hi,

Yes, this works. The only difference is that when a single file is ingested, the ingested one appears as C:/Users/Dell/Desktop/abc.txt/ with an unwanted slash at the end.
*The file spec part should include the file name:* I have tried it this way, and I am getting "Access denied". I have also checked that all the required access is granted to the user who is accessing it.

On Wed, Aug 11, 2021 at 4:43 PM Karl Wright <daddy...@gmail.com> wrote:

> The "path" attribute is not meant to include terminal file names, only
> directories. I'm surprised that this works at all. The file spec part
> should include the file name.
>
> Karl
>
> On Wed, Aug 11, 2021 at 2:14 AM ritika jain <ritikajain5...@gmail.com> wrote:
>
>> *Dynamic Job*
>>
>> {"job":{"_children_":[{"_type_":"id","_value_":"1628595470228"},{"_type_":"description","_value_":"DEMo TEMP API-1628595484"},{"_type_":"repository_connection","_value_":"Demo_Repo"},{"_type_":"document_specification","_children_":[{"_type_":"startpoint","include":[{"_attribute_indexable":"yes","_attribute_filespec":"\/*.pdf","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.doc","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.docx","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.docb","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.dotx","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.dot","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.docm","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.ppt","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.pptx","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.wpd","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.wp","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.wp5","_v
alue_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.wp4","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.wp6","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.wp7","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.png","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.jpg","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.jpeg","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.gif","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.bmp","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.mpg","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.xlsm","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.xlsb","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.xls","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.xlsx","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.doc","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.mpeg","_value_":"","_attribute_type":"file"},{"_attribute_filespec":"*","_value_":"","_attribute_type":"directory"}],"_attribute_path":*"windows\/Job\/Demo >> School >> 
Network\/Information\/restpuntion.docx"*,"_value_":""},{"_type_":"maxlength","_value_":"","_attribute_value":"2000000"},{"_type_":"security","_value_":"","_attribute_value":"on"},{"_type_":"sharesecurity","_value_":"","_attribute_value":"on"},{"_type_":"parentfoldersecurity","_value_":"","_attribute_value":"on"}]},{"_type_":"pipelinestage","_children_":[{"_type_":"stage_id","_value_":"0"},{"_type_":"stage_isoutput","_value_":"false"},{"_type_":"stage_connectionname","_value_":"Tika"},{"_type_":"stage_specification","_children_":[{"_type_":"keepAllMetadata","_value_":"","_attribute_value":"true"},{"_type_":"ignoreException","_value_":"","_attribute_value":"true"},{"_type_":"lowerNames","_value_":"","_attribute_value":"false"},{"_type_":"writeLimit","_value_":"","_attribute_value":""},{"_type_":"boilerplateprocessor","_value_":"","_attribute_value":"de.l3s.boilerpipe.extractors.KeepEverythingExtractor"}]}]},{"_type_":"pipelinestage","_children_":[{"_type_":"stage_id","_value_":"1"},{"_type_":"stage_prerequisite","_value_":"0"},{"_type_":"stage_isoutput","_value_":"false"},{"_type_":"stage_connectionname","_value_":"Metadata >> >> Adjuster"},{"_type_":"stage_specification","_children_":[{"_type_":"expression","_attribute_parameter":"d_connector_type","_value_":"","_attribute_value":"FileShare"},{"_type_":"expression","_attribute_parameter":"d_description","_value_":"","_attribute_value":"\"${dc:description}\""},{"_type_":"keepAllMetadata","_value_":"","_attribute_value":"true"},{"_type_":"filterEmpty","_value_":"","_attribute_value":"true"}]}]},{"_type_":"pipelinestage","_children_":[{"_type_":"stage_id","_value_":"2"},{"_type_":"stage_prerequisite","_value_":"1"},{"_type_":"stage_isoutput","_value_":"true"},{"_type_":"stage_connectionname","_value_":"Deltares_Output"},{"_type_":"stage_specification"}]},{"_type_":"start_mode","_value_":"manual"},{"_type_":"run_mode","_value_":"scan >> >> 
once"},{"_type_":"hopcount_mode","_value_":"accurate"},{"_type_":"priority","_value_":"1"},{"_type_":"recrawl_interval","_value_":"86400000"},{"_type_":"max_recrawl_interval","_value_":"infinite"},{"_type_":"expiration_interval","_value_":"infinite"},{"_type_":"reseed_interval","_value_":"3600000"}]}} >> >> >> *Other Manual Job* >> >> {"job":{"_children_":[{"_type_":"id","_value_":"1599130705168"},{"_type_":"description","_value_":"Demo_job"},{"_type_":"repository_connection","_value_":"mas_Repo"},{"_type_":"document_specification","_children_":[{"_type_":"startpoint","include":[{"_attribute_indexable":"yes","_attribute_filespec":"\/*.pdf","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.doc","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.docm","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.docx","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.docb","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.dot","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.dotx","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.wpd >> >> 
","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.pptx","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.ppt","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.wp","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.wp4","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.wp5","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.wp6","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.wp7","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.xlsm >> >> ","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.xls","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.xls","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.xlsb","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.xlsx","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.png","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.jpg","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.jpeg","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.bmp","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.gif","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.mpeg","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.mpg","_value_":"","_attribute_type":"file"},{"_attribute_filespec":"*","
_value_":"","_attribute_type":"directory"}],"_attribute_path":"*windows\/Job\/Demo >> School >> Network\/Information\*","_value_":""},{"_type_":"maxlength","_value_":"","_attribute_value":"5000000"},{"_type_":"security","_value_":"","_attribute_value":"on"},{"_type_":"sharesecurity","_value_":"","_attribute_value":"on"},{"_type_":"parentfoldersecurity","_value_":"","_attribute_value":"off"}]},{"_type_":"pipelinestage","_children_":[{"_type_":"stage_id","_value_":"0"},{"_type_":"stage_isoutput","_value_":"false"},{"_type_":"stage_connectionname","_value_":"Tika"},{"_type_":"stage_specification","_children_":[{"_type_":"keepAllMetadata","_value_":"","_attribute_value":"true"},{"_type_":"lowerNames","_value_":"","_attribute_value":"false"},{"_type_":"writeLimit","_value_":"","_attribute_value":""},{"_type_":"ignoreException","_value_":"","_attribute_value":"true"},{"_type_":"boilerplateprocessor","_value_":"","_attribute_value":"de.l3s.boilerpipe.extractors.KeepEverythingExtractor"}]}]},{"_type_":"pipelinestage","_children_":[{"_type_":"stage_id","_value_":"1"},{"_type_":"stage_prerequisite","_value_":"0"},{"_type_":"stage_isoutput","_value_":"false"},{"_type_":"stage_connectionname","_value_":"Metadata >> >> Adjuster"},{"_type_":"stage_specification","_children_":[{"_type_":"expression","_attribute_parameter":"d_connector_type","_value_":"","_attribute_value":"FileShare"},{"_type_":"expression","_attribute_parameter":"d_description","_value_":"","_attribute_value":"\"${dc:description}\" >> >> "},{"_type_":"keepAllMetadata","_value_":"","_attribute_value":"true"},{"_type_":"filterEmpty","_value_":"","_attribute_value":"true"}]}]},{"_type_":"pipelinestage","_children_":[{"_type_":"stage_id","_value_":"2"},{"_type_":"stage_prerequisite","_value_":"1"},{"_type_":"stage_isoutput","_value_":"true"},{"_type_":"stage_connectionname","_value_":"Deltares_Output"},{"_type_":"stage_specification"}]},{"_type_":"start_mode","_value_":"manual"},{"_type_":"run_mode","_value_":"scan 
once"},{"_type_":"hopcount_mode","_value_":"accurate"},{"_type_":"priority","_value_":"5"},{"_type_":"recrawl_interval","_value_":"86400000"},{"_type_":"max_recrawl_interval","_value_":"infinite"},{"_type_":"expiration_interval","_value_":"infinite"},{"_type_":"reseed_interval","_value_":"3600000"}]}}
>>
>> Basically, these two job structures are identical, except that the path is
>> given as:
>> 1) the complete path up to the file location
>> 2) the path up to the folder only
>>
>> In the first case the ingested file has a slash at the end, and in the
>> second case it does not.
>>
>> Thanks,
>> Ritika
>>
>> On Tue, Aug 10, 2021 at 6:52 PM Karl Wright <daddy...@gmail.com> wrote:
>>
>>> I am sorry, but I'm having trouble understanding how exactly you are
>>> configuring the JCIFS connector in these two cases. Can you view the job
>>> in each case and provide cut-and-paste of the view?
>>>
>>> Karl
>>>
>>> On Tue, Aug 10, 2021 at 9:09 AM ritika jain <ritikajain5...@gmail.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I am using the Windows Shares connector in ManifoldCF version 2.14, with
>>>> Elastic as the output.
>>>> I have created a dynamic ManifoldCF job API, via which a job is created
>>>> in ManifoldCF with an inclusions list and a path; only a particular file
>>>> path is to be specified. Example file path:
>>>> C:/Users/Dell/Desktop/abc.txt
>>>>
>>>> A job will be created to crawl only this single file.
>>>>
>>>> *Issue:*
>>>> When this job ingests the document into Elasticsearch, a slash gets
>>>> appended at the end.
>>>>
>>>> *Ingested file is:* C:/Users/Dell/Desktop/abc.txt/
>>>>
>>>> But when the same file is crawled via the ManifoldCF job settings by
>>>> specifying the path up to the folder (manual job creation does not allow
>>>> a path to a particular file; it allows folders only), it does not
>>>> append the /.
>>>>
>>>> *Ingested file in this case:*
>>>> C:/Users/Dell/Desktop/abc.txt
>>>> as expected, matching the original file.
>>>>
>>>> *Query*
>>>> Why is this the case? It makes searching in ES ambiguous.
>>>>
>>>> Thanks
>>>> Ritika
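
For reference, the structure Karl describes (directory only in the "path" attribute, file name in the file spec) could be sketched roughly as below. This is only an illustration under stated assumptions: `build_startpoint` is a hypothetical helper, the keys are copied from the job JSON quoted in this thread, and the leading slash on the file spec simply mirrors the existing "/*.pdf" style entries rather than any confirmed connector rule. It shows the shape of the startpoint only; it does not address the "Access denied" result reported above.

```python
import json
import posixpath


def build_startpoint(full_path):
    """Split a full file path into a directory path plus a file spec.

    Hypothetical helper: the keys mirror the job JSON quoted in this thread.
    """
    # Drop any trailing slash, then separate the directory from the file name.
    directory, file_name = posixpath.split(full_path.rstrip("/"))
    return {
        "_type_": "startpoint",
        "_attribute_path": directory,  # directories only, no terminal file name
        "_value_": "",
        "include": [
            {
                "_attribute_type": "file",
                "_attribute_indexable": "yes",
                # Assumption: leading slash copied from the "/*.pdf" style entries.
                "_attribute_filespec": "/" + file_name,
                "_value_": "",
            },
            # Directory rule kept as it appears in both quoted jobs.
            {"_attribute_type": "directory", "_attribute_filespec": "*", "_value_": ""},
        ],
    }


startpoint = build_startpoint(
    "windows/Job/Demo School Network/Information/restpuntion.docx"
)
print(json.dumps(startpoint, indent=2))
```

With the example path from the dynamic job, the path attribute would become "windows/Job/Demo School Network/Information" and the file spec "/restpuntion.docx".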