Addendum:
We actually pass the regular expression below, with the dot escaped, yet the 
mlcp.sh import still does not filter for the files we want:

-input_file_pattern '.*\.xml'
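
To make the difference between the two patterns concrete, here is a minimal sketch. It uses Python's re module purely as a stand-in for regular-expression matching in general; mlcp is a Java tool, and the sketch says nothing about whether mlcp applies the pattern to the entries inside a zip file or only to the files in the input directory. The names in the list are hypothetical, and whole-string matching is an assumption.

```python
import re

# The two patterns discussed in this thread.
UNESCAPED = r".*.xml"   # the dot before "xml" matches ANY single character
ESCAPED = r".*\.xml"    # the dot before "xml" matches only a literal "."

# Hypothetical entry names, invented for illustration only.
names = ["a/doc1.xml", "a/image.png", "a/notxml", "b/data-xml"]

def full_matches(pattern, candidates):
    """Return the candidates that the pattern matches in their entirety.

    Whole-string matching is an assumption made for this sketch.
    """
    return [n for n in candidates if re.fullmatch(pattern, n)]

unescaped_hits = full_matches(UNESCAPED, names)
escaped_hits = full_matches(ESCAPED, names)

print("unescaped:", unescaped_hits)
print("escaped:  ", escaped_hits)
```

The unescaped pattern also accepts names such as "a/notxml", so escaping the dot is the stricter choice. Neither pattern would explain binary files such as "a/image.png" being loaded, which suggests the pattern itself may not be the cause.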

From: Morales-Martin, Kristina
Sent: Monday, July 13, 2015 11:43 AM
To: '[email protected]'
Subject: mlcp.sh help with filtering to ingest only XML files in zip files

Dear all,

We need help in ingesting a directory of many* zip files, each with many* XML 
files.

We are using mlcp (MarkLogic Content Pump) out of the box to import 
content as-is from a directory of zip files.

In particular, we are using these options:
-mode local \
-input_file_path [a directory of zip files, each containing a 
heterogeneous mix of .xml files and binary files] \
-input_compressed true \
-input_file_pattern '.*.xml' \
-output_uri_replace "(\/.+\/+)(?=.+\.zip),'/ourOverrideOfTheURIToRemoveTheLeadingNASPath/'" \
...

Can anyone help with the -input_file_pattern option? Our intent is to load only 
the .xml files inside the zip files in the directory; we do not want to load 
any other files. For some reason, -input_file_pattern is not filtering the 
other files out.
If you have encountered this non-filtering behavior, what did you do to make 
it work?

Thank you,
Kristina Morales-Martin
Sr. Technical Information Specialist, e-Content Operations
CAS, a division of the American Chemical Society
2540 Olentangy River Road
Columbus, OH 43202
Phone: 614-447-3600, ext. 2322
Fax: 614-447-3827
www.cas.org


