Hello,

I'm converting about 1000 PDFs to XHTML for some extraction.

One file throws an error:


<error:error xsi:schemaLocation="http://marklogic.com/xdmp/error error.xsd"
xmlns:error="http://marklogic.com/xdmp/error"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<error:code>ICN-FAILED</error:code>
<error:name/>
<error:xquery-version>1.0-ml</error:xquery-version>
<error:message>Conversion failed due to abnormal process termination</error:message>
<error:format-string>ICN-FAILED:
xdmp:pdf-convert(document{binary{"255044462d312e360d25e2e3cfd30d0a3135342030206f626a0d3c3c2f4d61726b496e666f3c3c2f4d61726b656420747275653e3e2f4d657461646174612031..."}},
"RR:D04-1015", &lt;options xmlns:tidy="xdmp:tidy"
xmlns="xdmp:pdf-convert"&gt;&lt;config&gt;PDFtoXHTML_exact.cfg&lt;/config&gt;&lt;image-output&gt;false&lt;/image-...&lt;/options&gt;)
-- Conversion failed due to abnormal process termination: -1.
Loading configuration...
Parsing macros...
Macro synth-bookmarks='true'
Macro image-output='true'
Macro text-output='true'
Macro zones='false'
Macro ignore-text='true'
Macro remove-overprint='false'
Macro illustrations='true'
Macro line-breaks='true'
Macro image-quality='75'
Macro page-start=''
Macro page-end=''
Macro document-start=''
Macro document-end=''
Macro image-output='false'
Macro illustrations='false'
features='160004'
Processing...
Analysing '/var/opt/MarkLogic/Temp/db397b7505bb4bf0/conv.pdf'
Pages 1 to 30
Processing page 1 Processing page 2 Processing page 3 Processing page 4
Processing page 5 Processing page 6 Processing page 7 Processing page 8
Processing page 9 Processing page 10 Processing page 11 Processing page 12
Processing page 13 Processing page 14 Processing page 15 Processing page 16
Processing page 17 Processing page 18 Processing page 19 Processing page 20
Processing page 21 Processing page 22 Processing page 23 Processing page 24
Processing page 25 Processing page 26</error:format-string>
<error:retryable>false</error:retryable>
<error:expr>xdmp:pdf-convert(document{binary{"255044462d312e360d25e2e3cfd30d0a3135342030206f626a0d3c3c2f4d61726b496e666f3c3c2f4d61726b656420747275653e3e2f4d657461646174612031..."}},
"RR:D04-1015", &lt;options xmlns:tidy="xdmp:tidy"
xmlns="xdmp:pdf-convert"&gt;&lt;config&gt;PDFtoXHTML_exact.cfg&lt;/config&gt;&lt;image-output&gt;false&lt;/image-...&lt;/options&gt;)</error:expr>



I can open the file in Adobe Reader and Preview, and scroll through all the pages.


Is there some way to check whether the PDF itself is bad, or whether this is a conversion bug?
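
For what it's worth, here is a very rough sanity check, sketched in Python. It decodes the first bytes of the hex dump from the error above and looks for the `%PDF-x.y` header, and it can also look for the `%%EOF` marker near the end of a file on disk. The function names (`pdf_version`, `has_eof_marker`, `check_pdf`) and the 1024-byte search window are my own assumptions, not anything from MarkLogic. Passing this check only means the file looks superficially like a PDF. It says nothing about whether the page where the log stops (after "Processing page 26" of 30) is well formed.

```python
import re

# First 16 bytes of the failing file, copied from the hex dump in the error.
HEX_PREFIX = "255044462d312e360d25e2e3cfd30d0a"


def pdf_version(data: bytes):
    """Return the version from a leading '%PDF-x.y' header, or None if absent."""
    match = re.match(rb"%PDF-(\d+\.\d+)", data)
    return match.group(1).decode("ascii") if match else None


def has_eof_marker(data: bytes, window: int = 1024) -> bool:
    """Report whether '%%EOF' appears in the last `window` bytes (assumed size)."""
    return b"%%EOF" in data[-window:]


def check_pdf(path: str) -> dict:
    """Run both superficial checks on a file on disk (hypothetical helper)."""
    with open(path, "rb") as f:
        data = f.read()
    return {"version": pdf_version(data), "has_eof": has_eof_marker(data)}


if __name__ == "__main__":
    # Decode the prefix quoted in the error and report the header version.
    print(pdf_version(bytes.fromhex(HEX_PREFIX)))  # prints 1.6
```

So the bytes MarkLogic received do start with a PDF 1.6 header. Running `check_pdf` on the original file would at least rule out a truncated upload, but not a conversion bug.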



Thanks,


Chris Hamlin