Jochen Stärk created PDFBOX-6062:
------------------------------------

             Summary: XMPMetadata can only be parsed if xmp:CreateDate does not contain fractions of seconds
                 Key: PDFBOX-6062
                 URL: https://issues.apache.org/jira/browse/PDFBOX-6062
             Project: PDFBox
          Issue Type: Bug
          Components: XmpBox
    Affects Versions: 3.0.5 PDFBox
            Reporter: Jochen Stärk
         Attachments: invoice.gs.pdf, invoice.pdf

Parsing the XMP metadata of the attached invoice.pdf with {{xmpParser.parse}}
(snippet below) raises

{{org.apache.xmpbox.xml.XmpParsingException: Failed to instantiate DateType property with value 2025-09-03T15:43:47.989082+00:00 in xmp:CreateDate}}
{{at org.apache.xmpbox.xml.DomXmpParser.createProperty(DomXmpParser.java:373)}}

The file has an xmp:CreateDate of 2025-09-03T15:43:47.989082+00:00. I ran the
PDF through Ghostscript (attached invoice.gs.pdf); that file parses without
error, and the snippet prints the XMP packet and the PDF/A version as
instructed. Notably, the creation date of the Ghostscript version is only
2025-09-04T10:34:57+02:00, i.e. without fractional seconds.
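
For reference, both timestamps are well-formed ISO 8601 offset date-times and
differ only in the fractional part. The following is a minimal sketch using
java.time, not XmpBox's own date conversion; the class and method names are
hypothetical and exist only for this illustration. It shows that the failing
value carries six fractional digits (microsecond precision), while the
Ghostscript value carries none.

```java
import java.time.OffsetDateTime;

// Hypothetical helper for illustration only; not part of PDFBox or XmpBox.
public class XmpDateCheck {

    // Returns the sub-second part of an ISO 8601 offset date-time, in nanoseconds.
    static int fractionNanos(String xmpDate) {
        return OffsetDateTime.parse(xmpDate).getNano();
    }

    // True if the timestamp has a non-zero fractional-seconds component.
    static boolean hasFractionalSeconds(String xmpDate) {
        return fractionNanos(xmpDate) != 0;
    }

    public static void main(String[] args) {
        // Value from invoice.pdf (fails in DomXmpParser according to this report)
        System.out.println(fractionNanos("2025-09-03T15:43:47.989082+00:00")); // 989082000
        // Value from invoice.gs.pdf (parses fine according to this report)
        System.out.println(fractionNanos("2025-09-04T10:34:57+02:00")); // 0
    }
}
```

A check like this can help confirm, outside of PDFBox, that a given
xmp:CreateDate value is well-formed before suspecting the PDF itself.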
 
Snippet:
 
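{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
import org.apache.pdfbox.pdmodel.common.PDMetadata;
import org.apache.xmpbox.XMPMetadata;
import org.apache.xmpbox.schema.PDFAIdentificationSchema;
import org.apache.xmpbox.xml.DomXmpParser;
import org.apache.xmpbox.xml.XmpParsingException;

public class XmpCreateDateRepro {
    public static void main(String[] args) {
        // try-with-resources closes the document even if parsing throws
        try (PDDocument document = Loader.loadPDF(
                Files.readAllBytes(Paths.get("invoice.pdf")))) {
            PDDocumentCatalog catalog = document.getDocumentCatalog();
            PDMetadata metadata = catalog.getMetadata();

            if (metadata != null) {
                DomXmpParser xmpParser = new DomXmpParser();
                // Dump the raw XMP packet for inspection
                System.out.println(metadata.getCOSObject().toTextString());
                // Throws XmpParsingException for invoice.pdf
                XMPMetadata xmp = xmpParser.parse(metadata.createInputStream());

                PDFAIdentificationSchema pdfaSchema = xmp.getPDFAIdentificationSchema();
                if (pdfaSchema != null) {
                    // PDF/A part, e.g. 1, 2 or 3
                    System.out.println(pdfaSchema.getPart());
                }
            }
        } catch (XmpParsingException | IOException e) {
            e.printStackTrace();
        }
    }
}
{code}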
 



