When I'm loading directories of slightly fewer than 100,000 XML files into
a large MarkLogic instance, I often get timeout and transaction errors. If
I re-run the same directory of files which got those errors, I typically
don't get any errors.

So, I have a few questions:

* Can I prevent the errors from happening in the first place - e.g. by
tuning MarkLogic parameters or altering my use of mlcp?
* If I do get errors, what is the best way to get a report on the files
which failed, so I can retry just those ones? Is the best option for me to
write some code to pick out the errors from the log file? And, if so, am I
guaranteed to get all of the files reported?

Some Details

The command line template is

mlcp.sh import -username {1} -password {2} -host localhost -port {4}
-input_file_path {5} -output_uri_replace \"{6},'{7}'\"

Sometimes, the imports run just fine. However, often I get a large number
of SVC-EXTIME errors followed by an XDMP-NOTXN error. For example:

16/09/22 17:54:03 ERROR mapreduce.ContentWriter: SVC-EXTIME: Time limit
exceeded
16/09/22 17:54:03 WARN mapreduce.ContentWriter: Failed document
029ccd8ac3323658277ca28fead7a73d.0.xml in
file:/mnt/ingestion/MarkLogicIngestion/smyles/todo/2014_0005.done/029ccd8ac3323658277ca28fead7a73d.0.xml
16/09/22 17:54:03 ERROR mapreduce.ContentWriter: SVC-EXTIME: Time limit
exceeded
16/09/22 17:54:03 WARN mapreduce.ContentWriter: Failed document
02eb4562784255e249c4ec3ed472f9aa.1.xml in
file:/mnt/ingestion/MarkLogicIngestion/smyles/todo/2014_0005.done/02eb4562784255e249c4ec3ed472f9aa.1.xml
16/09/22 17:54:04 INFO contentpump.LocalJobRunner:  completed 33%
16/09/22 17:54:21 ERROR mapreduce.ContentWriter: XDMP-NOTXN: No transaction
with identifier 9076269665213828952

So far, I'm just rerunning the entire directory again. Most of the time, it
ingests fine on the second attempt. However, I have thousands of these
directories to process. So, I would prefer to avoid getting the errors in
the first place. Failing that, I would like to capture the errors and just
retry the files which failed.

Any help much appreciated.

Regards,

Stuart