Is there an option to run mlcp with SKIP_EXISTING? I received an error partway
through an mlcp run, and I can't figure out from any of the logs which of the
roughly 5 million documents failed to load. 5.2 million were supposed to load,
but only 5.1 million are in the DB. The input type is aggregates, but I was
hoping that if I simply reran the import, it would split the aggregate into
exactly the same documents, so that it could skip the existing docs and insert
only the ones that didn't make it. The DB has only one forest.

The command I ran and the errors it produced:

mlcp-Hadoop2-1.2-1/bin/mlcp.sh import -host host -port port -username uname \
    -password pwd -input_file_path /tmp/input -input_file_type aggregates \
    -aggregate_record_element element -mode local

14/02/11 17:32:33 WARN mapreduce.ContentWriter: SVC-EXTIME: Time limit exceeded
14/02/11 17:32:33 WARN mapreduce.ContentWriter: XDMP-NOTXN: No transaction with identifier 6710880463393030084
14/02/11 17:32:33 WARN mapreduce.ContentWriter: XDMP-NOTXN: No transaction with identifier 7695774781503062893
2014-02-11 17:32:33.797 SEVERE [16] (SessionImpl.throwIllegalState): Cannot commit without an active transaction
14/02/11 17:32:33 ERROR contentpump.LocalJobRunner: Error running task:
java.lang.IllegalStateException: Cannot commit without an active transaction
    at com.marklogic.xcc.impl.SessionImpl.throwIllegalState(SessionImpl.java:531)
    at com.marklogic.xcc.impl.SessionImpl.commit(SessionImpl.java:176)
    at com.marklogic.mapreduce.ContentWriter.write(ContentWriter.java:425)
    at com.marklogic.mapreduce.ContentWriter.write(ContentWriter.java:62)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:106)
    at com.marklogic.contentpump.DocumentMapper.map(DocumentMapper.java:46)
    at com.marklogic.contentpump.DocumentMapper.map(DocumentMapper.java:32)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
    at com.marklogic.contentpump.LocalJobRunner$LocalMapTask.call(LocalJobRunner.java:385)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
14/02/11 17:32:33 WARN mapreduce.ContentWriter: XDMP-NOTXN: No transaction with identifier 16003411726161326417
2014-02-11 17:32:34.122 SEVERE [16] (SessionImpl.throwIllegalState): Cannot commit without an active transaction
14/02/11 17:32:34 ERROR contentpump.LocalJobRunner: Error committing task:
java.lang.IllegalStateException: Cannot commit without an active transaction
    at com.marklogic.xcc.impl.SessionImpl.throwIllegalState(SessionImpl.java:531)
    at com.marklogic.xcc.impl.SessionImpl.commit(SessionImpl.java:176)
    at com.marklogic.mapreduce.ContentWriter.close(ContentWriter.java:559)
    at com.marklogic.contentpump.LocalJobRunner$LocalMapTask.call(LocalJobRunner.java:394)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
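
To make the "skip existing" idea concrete: what I am after is essentially a set
difference between the URIs the import should have produced and the URIs that
are actually in the DB. Below is a minimal Python sketch of that comparison. It
assumes the URIs generated for the aggregate records are identical on every run
(which I have not verified) and that both URI lists can be obtained somehow.
The function name and the sample URIs are made up for illustration and are not
part of mlcp.

```python
def missing_uris(expected, loaded):
    """Return the expected URIs that are absent from the loaded ones.

    The result keeps the order of `expected`, so it can be fed straight
    into a follow-up load of only the missing documents.
    """
    loaded_set = set(loaded)  # set gives constant-time membership checks
    return [uri for uri in expected if uri not in loaded_set]


if __name__ == "__main__":
    # Hypothetical URIs, for illustration only.
    expected = ["/doc/1.xml", "/doc/2.xml", "/doc/3.xml"]
    loaded = ["/doc/1.xml", "/doc/3.xml"]
    print(missing_uris(expected, loaded))
```

With about 5.2 million expected URIs, both lists should fit comfortably in
memory on an ordinary machine, so nothing more elaborate seems necessary.
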
Thanks,
David