Hi Chris,

To help narrow down the issue, keep image extraction disabled and convert only a small range of pages (for example, pages 4-5) on a few of the documents in your set.
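A minimal sketch of that test, in case it helps. The document URI and filename are hypothetical placeholders, and the `page-start` / `page-end` option names are an assumption based on the macros of the same name in the converter log; please check them against the xdmp:pdf-convert documentation for your release.

```xquery
xquery version "1.0-ml";

declare namespace error = "http://marklogic.com/xdmp/error";

(: Hypothetical URI: point this at one of the PDFs loaded in your database. :)
let $uri := "/pdfs/sample.pdf"

(: Same config and image setting as the failing call, restricted to pages 4-5.
   page-start and page-end are assumed option names (see the note above). :)
let $options :=
  <options xmlns="xdmp:pdf-convert">
    <config>PDFtoXHTML_exact.cfg</config>
    <image-output>false</image-output>
    <page-start>4</page-start>
    <page-end>5</page-end>
  </options>
return
  try {
    xdmp:pdf-convert(fn:doc($uri), "sample.pdf", $options)
  } catch ($e) {
    (: Report the error code and message instead of aborting the whole run. :)
    <failed uri="{$uri}">{ $e/error:code, $e/error:message }</failed>
  }
```

Running this over a handful of URIs, including the failing one, should show whether the failure follows the document or the configuration.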
This error usually means the PDF conversion process terminated abnormally. If the conversion succeeds on a few pages of a few PDFs, then there may be corruption in the PDF that the readers are compensating for. Otherwise, it might be an issue with the configuration file you are using; in that case, try a different one to see if it succeeds.

--
Alex Ebadirad
Senior Consultant, DoD Team
MarkLogic Corporation
[email protected]
Cell +1 928 246 7318
www.marklogic.com
________________________________
From: [email protected] [[email protected]] on behalf of Chris Hamlin [[email protected]]
Sent: Friday, December 06, 2013 10:31 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] xdmp:pdf-convert failure

Hello,

I'm converting about 1000 PDFs to XHTML for some extraction. One file throws an error:

<error:error xsi:schemaLocation="http://marklogic.com/xdmp/error error.xsd" xmlns:error="http://marklogic.com/xdmp/error" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<error:code>ICN-FAILED</error:code>
<error:name/>
<error:xquery-version>1.0-ml</error:xquery-version>
<error:message>Conversion failed due to abnormal process termination</error:message>
<error:format-string>ICN-FAILED: xdmp:pdf-convert(document{binary{"255044462d312e360d25e2e3cfd30d0a3135342030206f626a0d3c3c2f4d61726b496e666f3c3c2f4d61726b656420747275653e3e2f4d657461646174612031..."}}, "RR:D04-1015", <options xmlns:tidy="xdmp:tidy" xmlns="xdmp:pdf-convert"><config>PDFtoXHTML_exact.cfg</config><image-output>false</image-...</options>) -- Conversion failed due to abnormal process termination: -1.
Loading configuration...
Parsing macros...
Macro synth-bookmarks='true'
Macro image-output='true'
Macro text-output='true'
Macro zones='false'
Macro ignore-text='true'
Macro remove-overprint='false'
Macro illustrations='true'
Macro line-breaks='true'
Macro image-quality='75'
Macro page-start=''
Macro page-end=''
Macro document-start=''
Macro document-end=''
Macro image-output='false'
Macro illustrations='false'
features='160004'
Processing...
Analysing '/var/opt/MarkLogic/Temp/db397b7505bb4bf0/conv.pdf'
Pages 1 to 30
Processing page 1 Processing page 2 Processing page 3 Processing page 4 Processing page 5 Processing page 6 Processing page 7 Processing page 8 Processing page 9 Processing page 10 Processing page 11 Processing page 12 Processing page 13 Processing page 14 Processing page 15 Processing page 16 Processing page 17 Processing page 18 Processing page 19 Processing page 20 Processing page 21 Processing page 22 Processing page 23 Processing page 24 Processing page 25 Processing page 26</error:format-string>
<error:retryable>false</error:retryable>
<error:expr>xdmp:pdf-convert(document{binary{"255044462d312e360d25e2e3cfd30d0a3135342030206f626a0d3c3c2f4d61726b496e666f3c3c2f4d61726b656420747275653e3e2f4d657461646174612031..."}}, "RR:D04-1015", <options xmlns:tidy="xdmp:tidy" xmlns="xdmp:pdf-convert"><config>PDFtoXHTML_exact.cfg</config><image-output>false</image-...</options>)</error:expr>

I can open it in Adobe Reader and Preview, and scroll through all the pages. Is there some way to check if the PDF is bad, or if this is a conversion bug?

Thanks,
Chris Hamlin
