Jakob,
Sorry about the unresponsiveness. It looks like you might have uncovered a 
bug. I’ll have an engineer on our end try to reproduce this and contact you 
offline for more details. I’ll follow up on the list with the eventual 
outcome. Thanks for your feedback.

Justin



Justin Makeig
Director, Product Management
MarkLogic Corporation
[email protected]
www.marklogic.com



On Nov 6, 2013, at 7:12 AM, Jakob Fix <[email protected]> wrote:

Hello,

I was just wondering whether the silence in response to my question is 
"stunned", "not interested", "well, can't he figure this out by himself" or 
"oops, didn't see this one"?

:-)

cheers,
Jakob.


On Fri, Nov 1, 2013 at 12:01 AM, Jakob Fix <[email protected]> wrote:
Hi,

We've run into what we think might be a bug in the most recent version of 
mlcp. We exported a database containing XML documents and many binary 
documents, then imported the exported data into another database. During the 
second step, the import into the new database, the error below appeared (the 
line with "Archive damaged ..."). Apparently, mlcp stores XML and binary 
documents in different zip files, and each binary document gets its own 
metadata document. In our case, the export created two zip files containing 
the binaries. For some reason, the binary file and the metadata file of one 
document ended up in different zip files, as shown below:

20131031140432+0100-000001-BINARY.zip ==> RO-GE_DTC.pdf.metadata
20131031140432+0100-000002-BINARY.zip ==> RO-GE_DTC.pdf

which seems to have caused the error below. The PDF file was indeed not loaded 
into the database.

Moving the PDF file and its metadata file into the same binary zip file made 
the import run without errors.
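
For anyone who wants to check an export for the same problem before importing, 
here is a minimal Python sketch. It is not part of mlcp, and the function names 
are made up for illustration. It assumes that a metadata entry is always named 
after its document with ".metadata" appended, which is what the listing above 
shows for this one PDF; I have not verified that against the mlcp archive 
format in general.

```python
import zipfile
from pathlib import Path

# Assumption: a metadata entry is named "<document entry name>.metadata",
# as in the RO-GE_DTC.pdf / RO-GE_DTC.pdf.metadata listing above.
METADATA_SUFFIX = ".metadata"


def list_archive_entries(export_dir):
    """Map each zip file name in export_dir to the entry names it contains."""
    entries_by_zip = {}
    for zip_path in sorted(Path(export_dir).glob("*.zip")):
        with zipfile.ZipFile(zip_path) as archive:
            entries_by_zip[zip_path.name] = archive.namelist()
    return entries_by_zip


def find_split_pairs(entries_by_zip):
    """Return (document, zip_with_document, zips_with_metadata) tuples for
    every document whose metadata entry is not in the same zip file."""
    # First pass: record which zip files hold the metadata for each document.
    metadata_zips = {}
    for zip_name, entries in entries_by_zip.items():
        for entry in entries:
            if entry.endswith(METADATA_SUFFIX):
                document = entry[: -len(METADATA_SUFFIX)]
                metadata_zips.setdefault(document, set()).add(zip_name)

    # Second pass: flag documents stored without their metadata alongside.
    problems = []
    for zip_name, entries in entries_by_zip.items():
        for entry in entries:
            if entry.endswith(METADATA_SUFFIX) or entry.endswith("/"):
                continue  # skip metadata entries and directory entries
            found_in = metadata_zips.get(entry, set())
            if zip_name not in found_in:
                problems.append((entry, zip_name, sorted(found_in)))
    return sorted(problems)
```

Running find_split_pairs(list_archive_entries("db-prod-20131031")) against the 
export directory should report each document that sits in one zip file while 
its metadata sits in another (or is missing), so the pair can be moved together 
before the import.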

thanks,
Jakob.


marklogic-contentpump-1.0.3\bin\mlcp.bat EXPORT -host 192.168.56.90 -port 50000 -username abc -password abc -output_type archive -output_file_path db-prod-20131031

marklogic-contentpump-1.0.3\bin\mlcp.bat IMPORT -host 192.168.56.90 -port 40100 -username abc -password abc -input_file_path db-prod-20131031 -input_file_type archive

13/10/31 14:09:54 INFO contentpump.LocalJobRunner: Content type: XML
13/10/31 14:09:54 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
13/10/31 14:09:54 INFO input.FileInputFormat: Total input paths to process : 3
13/10/31 14:09:55 ERROR contentpump.ArchiveRecordReader: Archive damaged: no/incorrect metadata for /content/assets/agreements/RO-GE_DTC.pdf in /D:/Projects/EOI/deployment/mlcp/eoi-db-prod-20131032/20131031140432+0100-000002-BINARY.zip
13/10/31 14:09:55 ERROR contentpump.LocalJobRunner: Error running task: attempt__0000_m_000001_0
java.lang.NullPointerException
        at com.marklogic.contentpump.DatabaseContentWriter.write(DatabaseContentWriter.java:231)
        at com.marklogic.contentpump.DatabaseContentWriter.write(DatabaseContentWriter.java:58)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at com.marklogic.contentpump.DocumentMapper.map(DocumentMapper.java:46)
        at com.marklogic.contentpump.DocumentMapper.map(DocumentMapper.java:32)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at com.marklogic.contentpump.LocalJobRunner$LocalMapTask.call(LocalJobRunner.java:375)
        at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
13/10/31 14:09:56 INFO contentpump.LocalJobRunner:  completed 0%
13/10/31 14:14:42 INFO contentpump.LocalJobRunner:  completed 33%
13/10/31 14:15:41 WARN contentpump.DatabaseContentWriter: SEC-PERMDENIED: Permission denied
13/10/31 14:18:27 INFO contentpump.LocalJobRunner:  completed 66%
13/10/31 14:18:27 INFO contentpump.LocalJobRunner: com.marklogic.contentpump.ContentPumpStats:
13/10/31 14:18:27 INFO contentpump.LocalJobRunner: ATTEMPTED_INPUT_RECORD_COUNT: 20230
13/10/31 14:18:27 INFO contentpump.LocalJobRunner: SKIPPED_INPUT_RECORD_COUNT: 0
13/10/31 14:18:27 INFO contentpump.LocalJobRunner: Total execution time: 512 sec




_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
