[ https://issues.apache.org/jira/browse/PDFBOX-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872279#comment-13872279 ]
Johan van der Knijff commented on PDFBOX-1812:
----------------------------------------------

Thanks again Andreas and Guillaume, I just did a quick test with the latest build and it all seems to work for me as well!

> Illegal characters in XML output
> --------------------------------
>
>                 Key: PDFBOX-1812
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1812
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Preflight
>    Affects Versions: 2.0.0
>         Environment: Bug reproduced under Win 7, Ubuntu
>            Reporter: Johan van der Knijff
>            Assignee: Andreas Lehmkühler
>              Labels: characters, utf-8, xml
>             Fix For: 1.8.4, 2.0.0
>
>         Attachments: 013814.pdf, 013814.xml, 013814_old.xml,
> 1812-additionalPDFs09012014.zip, 598659.pdf, 598659.xml, 598659_old.xml,
> 600111.pdf, 600111.xml, 600111_old.xml, preflight-app.jar
>
>
> When running Preflight in XML mode, the latest Preflight version (I used the
> JAR from build #747) sometimes produces output that contains characters that
> are illegal in XML. This can cause unexpected behavior if such files are
> further processed with tools that expect well-formed XML. See attached PDFs,
> which all result in illegal characters in the description of a 1.0 Syntax
> error, Error: Expected a long type. Output of older versions of Preflight
> didn't contain these illegal characters; instead they would give something
> like *actual='/O'*, *actual='Pages'*. etc. So I suppose this must have been
> caused by a fairly recent change.
> See attachments below.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)