[ https://issues.apache.org/jira/browse/PDFBOX-3471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15434456#comment-15434456 ]

Maruan Sahyoun commented on PDFBOX-3471:
----------------------------------------

I did it locally but wanted to run it past you to get your feedback.

As for the TODO: I included it because I haven't fully thought that through and wanted to make sure it is captured. IMHO there is a difference between an empty text node and a text node that contains only whitespace.

Going forward, I'd prefer not to change the XMP while parsing, so that serializing gives back the same content. Having said that, XMPBox is not a general XMP handling library; in its current state it is targeted at validating XMP as part of PDF/A, so the changes are (currently) OK.

> XMP parsing fails if XMP contain comments
> -----------------------------------------
>
>                 Key: PDFBOX-3471
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3471
>             Project: PDFBox
>          Issue Type: Bug
>          Components: XmpBox
>    Affects Versions: 2.0.2
>            Reporter: Petras
>         Attachments: PDFBOX-3471_XmpParsingIgnoringComments.patch
>
>
> DomXmpParser parser fails with such correct XMP:
> {code:xml}
> <?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
> <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.1.0-jc003">
>   <!-- PDF/A standarto versija (1 ar 2) ir suderinamumo lygmuo (A, B ar U) -->
>   <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
>     <rdf:Description rdf:about = ""
>         xmlns:pdfaid = "http://www.aiim.org/pdfa/ns/id/">
>       <pdfaid:part>1</pdfaid:part>
>       <pdfaid:conformance>B</pdfaid:conformance>
>     </rdf:Description>
>   </rdf:RDF>
> </x:xmpmeta>
> <?xpacket end="w"?>
> {code}
> DomXmpParser finds comment node and fails:
> {code}
> org.apache.xmpbox.xml.XmpParsingException: More than one element found in x:xmpmeta
> 	at org.apache.xmpbox.xml.DomXmpParser.findDescriptionsParent(DomXmpParser.java:750)
> 	at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:183)
> 	at org.apache.xmpbox.xml.DomXmpParser.parse(DomXmpParser.java:111)
> 	...
> {code}
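
For illustration only, here is a minimal sketch of the non-mutating approach mentioned in the comment: comment nodes and whitespace-only text nodes are skipped while the children of an element are collected, and the DOM itself is left untouched so that serializing it would still give back the original content. This is not the attached patch and not XMPBox code. The class `XmpChildFilterSketch` and its methods `parse` and `elementChildren` are hypothetical names, and rejecting non-whitespace text is an assumption of the sketch. Only standard JAXP/DOM calls are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of XMPBox.
class XmpChildFilterSketch {

    // Parses an XML string into a namespace-aware DOM document.
    // Comments and whitespace text are kept in the DOM (the JAXP defaults).
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns the element children of 'parent' without modifying the DOM.
    // Comment nodes and whitespace-only text nodes are skipped.
    // Assumption of this sketch: text with non-whitespace content is an error.
    static List<Element> elementChildren(Element parent) {
        List<Element> result = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            switch (child.getNodeType()) {
                case Node.ELEMENT_NODE:
                    result.add((Element) child);
                    break;
                case Node.COMMENT_NODE:
                    // Skipped, but deliberately not removed from the tree.
                    break;
                case Node.TEXT_NODE:
                case Node.CDATA_SECTION_NODE:
                    if (!child.getNodeValue().trim().isEmpty()) {
                        throw new IllegalArgumentException(
                                "Unexpected text in " + parent.getNodeName());
                    }
                    break;
                default:
                    // Other node types (e.g. processing instructions) are skipped.
                    break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<x:xmpmeta xmlns:x='adobe:ns:meta/'>\n"
                + "  <!-- a comment -->\n"
                + "  <rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'/>\n"
                + "</x:xmpmeta>";
        Element root = parse(xml).getDocumentElement();
        List<Element> elements = elementChildren(root);
        System.out.println("raw child nodes: " + root.getChildNodes().getLength());
        System.out.println("element children: " + elements.size());
    }
}
```

With a filter like this, a "more than one element" check could count only the returned elements, so a comment next to `rdf:RDF` would no longer trip it, while the comment stays in the tree for later serialization.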