Dear list members,
 
I need to create PDF/A-3B compliant documents with special references to an 
embedded document (ZUGFeRD compliant invoices). Currently, this fails in 
VeraPDF with some very vague and rather misleading validation errors:
* "Specification: ISO 19005-3:2012, Clause: 6.6.2.1, Test number: 5
All metadata streams present in the PDF shall conform to the XMP Specification. 
The XMP package must be encoded as UTF-8"
* "Specification: ISO 19005-3:2012, Clause: 6.6.4, Test number: 1
The PDF/A version and conformance level of a file shall be specified using the 
PDF/A Identification extension schema"
* "Specification: ISO 19005-3:2012, Clause: 6.6.2.1, Test number: 4
All metadata streams present in the PDF shall conform to the XMP Specification. 
All content of all XMP packets shall be well-formed, as defined by Extensible 
Markup Language (XML) 1.0 (Third Edition), 2.1, and the RDF/XML Syntax 
Specification (Revised)"
 
I finally tracked this down to FOP, which drops some required namespace 
declarations from the generated XMP stream. This can be reproduced with the 
corresponding example from the FOP homepage 
(https://xmlgraphics.apache.org/fop/2.10/metadata.html). 
For example, here is the FO:

<fo:simple-page-master master-name="simple">
  <fo:region-body/>
  <pdf:page page-numbers="*">
    <x:xmpmeta xmlns:x="adobe:ns:meta/">
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:abc="http://www.abc.de/abc/">
        <rdf:Description rdf:about="" abc:def="val"/>
        <rdf:Description rdf:about=""
                         xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/"
                         xmlns:pdfaSchema="http://www.aiim.org/pdfa/ns/schema#"
                         xmlns:pdfaProperty="http://www.aiim.org/pdfa/ns/property#">
          <pdfaExtension:schemas>
            <rdf:Bag>
              <rdf:li rdf:parseType="Resource">
                <pdfaSchema:property>
                  <rdf:Seq>
                    <rdf:li rdf:parseType="Resource">
                      <pdfaProperty:name>split</pdfaProperty:name>
                    </rdf:li>
                  </rdf:Seq>
                </pdfaSchema:property>
              </rdf:li>
            </rdf:Bag>
          </pdfaExtension:schemas>
        </rdf:Description>
      </rdf:RDF>
    </x:xmpmeta>
  </pdf:page>
</fo:simple-page-master>

And this is returned when I run pdfinfo -meta example.pdf:
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?><x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
     <rdf:RDF xmlns:abc="http://www.abc.de/abc/" abc:def="val" rdf:about=""/>
     <rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="">
        <dc:format>application/pdf</dc:format>
        <dc:language>
           <rdf:Bag>
              <rdf:li>x-unknown</rdf:li>
           </rdf:Bag>
        </dc:language>
        <dc:date>
           <rdf:Seq>
              <rdf:li>2024-11-05T11:46:27+01:00</rdf:li>
           </rdf:Seq>
        </dc:date>
     </rdf:RDF>
     <rdf:RDF xmlns:pdf="http://ns.adobe.com/pdf/1.3/" rdf:about="">
        <pdf:Producer>Apache FOP Version 2.10</pdf:Producer>
        <pdf:PDFVersion>1.4</pdf:PDFVersion>
     </rdf:RDF>
     <rdf:RDF xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/" rdf:about="">
        <pdfaExtension:schemas>
           <rdf:Bag>
              <rdf:li rdf:parseType="Resource">
                 <pdfaSchema:property>
                    <rdf:Seq>
                       <rdf:li rdf:parseType="Resource">
                          <pdfaProperty:name>split</pdfaProperty:name>
                       </rdf:li>
                    </rdf:Seq>
                 </pdfaSchema:property>
              </rdf:li>
           </rdf:Bag>
        </pdfaExtension:schemas>
     </rdf:RDF>
     <rdf:RDF xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/" rdf:about="">
        <pdfaid:conformance>B</pdfaid:conformance>
        <pdfaid:part>3</pdfaid:part>
     </rdf:RDF>
     <rdf:RDF xmlns:xmp="http://ns.adobe.com/xap/1.0/" rdf:about="">
        <xmp:MetadataDate>2024-11-05T11:46:27+01:00</xmp:MetadataDate>
        <xmp:CreateDate>2024-11-05T11:46:27+01:00</xmp:CreateDate>
     </rdf:RDF>
  </rdf:RDF>
</x:xmpmeta>
<?xpacket end="r"?>

As you can see, the two namespace declarations 
xmlns:pdfaSchema="http://www.aiim.org/pdfa/ns/schema#" and 
xmlns:pdfaProperty="http://www.aiim.org/pdfa/ns/property#" 
are dropped from the metadata, while their prefixes are still used. Thus, the 
XML is not well-formed with respect to namespaces, and so the PDF/A is invalid.
This is pretty unfortunate, as I don't have any workaround for this. I am using 
Apache FOP 2.10 in the context of Apache Camel 4.8.1. Any help would be greatly 
appreciated.
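
To confirm the symptom independently of veraPDF, the extracted packet can be checked for prefixes that are used but never declared. The following is only a minimal Python sketch and has nothing to do with FOP itself: the function names and the two shortened sample packets are hypothetical, and the regular expressions ignore namespace scoping and attribute prefixes, so it is a rough diagnostic rather than a real validator.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, shortened packets modelled on the pdfinfo output above.
BROKEN = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    '<rdf:Description rdf:about=""'
    ' xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/">'
    '<pdfaExtension:schemas><pdfaSchema:property/></pdfaExtension:schemas>'
    '</rdf:Description></rdf:RDF>'
)
# Same packet with the missing declaration put back.
FIXED = BROKEN.replace(
    ' xmlns:pdfaExtension=',
    ' xmlns:pdfaSchema="http://www.aiim.org/pdfa/ns/schema#"'
    ' xmlns:pdfaExtension=',
)


def is_namespace_well_formed(xml_text):
    """True if a namespace-aware parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # An undeclared prefix is reported by the parser as a parse error.
        return False


def undeclared_prefixes(xml_text):
    """Element-name prefixes that have no xmlns declaration anywhere.

    Simplification: scoping is ignored and only element names are scanned.
    """
    declared = set(re.findall(r'xmlns:([A-Za-z_][\w.-]*)\s*=', xml_text))
    used = set(re.findall(r'</?([A-Za-z_][\w.-]*):', xml_text))
    return used - declared - {"xml"}


if __name__ == "__main__":
    for label, packet in (("broken", BROKEN), ("fixed", FIXED)):
        print(label, is_namespace_well_formed(packet),
              sorted(undeclared_prefixes(packet)))
```

Running something like this on the output of pdfinfo -meta should make it easier to see which declarations went missing, which may be handy when comparing FOP versions.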
 

With kind regards,

   Jörn Willhöft
