Jochen Stärk created PDFBOX-6062:
------------------------------------

             Summary: XMPMetadata can only be parsed if xmp:CreateDate does not contain fractions of seconds
                 Key: PDFBOX-6062
                 URL: https://issues.apache.org/jira/browse/PDFBOX-6062
             Project: PDFBox
          Issue Type: Bug
          Components: XmpBox
    Affects Versions: 3.0.5 PDFBox
            Reporter: Jochen Stärk
         Attachments: invoice.gs.pdf, invoice.pdf

The attached invoice.pdf raises

{{org.apache.xmpbox.xml.XmpParsingException: Failed to instantiate DateType property with value 2025-09-03T15:43:47.989082+00:00 in xmp:CreateDate}}
{{at org.apache.xmpbox.xml.DomXmpParser.createProperty(DomXmpParser.java:373)}}

when the metadata is parsed using {{xmpParser.parse}} (snippet below). Its xmp:CreateDate is 2025-09-03T15:43:47.989082+00:00.

I ran the PDF file through Ghostscript (attached invoice.gs.pdf). That file parses, and the snippet prints the XMP packet and the PDF/A part as expected. Notably, the creation date of the Ghostscript version is only 2025-09-04T10:34:57+02:00, i.e. without fractional seconds (the failing value carries six fractional digits).

Snippet:

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
import org.apache.pdfbox.pdmodel.common.PDMetadata;
import org.apache.xmpbox.XMPMetadata;
import org.apache.xmpbox.schema.PDFAIdentificationSchema;
import org.apache.xmpbox.xml.DomXmpParser;
import org.apache.xmpbox.xml.XmpParsingException;

public class ParseInvoiceXmp {
    public static void main(String[] args) {
        // try-with-resources closes the document even if parsing fails
        try (PDDocument document = Loader.loadPDF(Files.readAllBytes(Paths.get("invoice.pdf")))) {
            PDDocumentCatalog catalog = document.getDocumentCatalog();
            PDMetadata metadata = catalog.getMetadata();
            if (metadata != null) {
                DomXmpParser xmpParser = new DomXmpParser();
                // Dump the raw XMP packet for inspection
                System.out.println(metadata.getCOSObject().toTextString());
                // Throws XmpParsingException for invoice.pdf
                XMPMetadata xmp = xmpParser.parse(metadata.createInputStream());
                PDFAIdentificationSchema pdfaSchema = xmp.getPDFAIdentificationSchema();
                if (pdfaSchema != null) {
                    System.out.println(pdfaSchema.getPart());
                }
            }
        } catch (XmpParsingException | IOException e) {
            e.printStackTrace();
        }
    }
}
{code}
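
To illustrate the difference between the two timestamps, here is a minimal sketch using only `java.time`. It is not XmpBox code and says nothing about how XmpBox parses dates internally. The class `CreateDateFraction` and its helper `withoutFraction` are hypothetical names introduced for this example. Truncating the value before handing the metadata to the parser is an untested assumption about a possible workaround, and it discards precision.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

public class CreateDateFraction {

    // Seconds-only form with a numeric offset, e.g. 2025-09-04T10:34:57+02:00
    private static final DateTimeFormatter SECONDS_ONLY =
            DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ssxxx");

    // Parses an ISO 8601 offset date-time; the fraction of a second is optional.
    public static OffsetDateTime parse(String value) {
        return OffsetDateTime.parse(value, DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    }

    // Hypothetical helper (not part of XmpBox): drops the fractional seconds.
    public static String withoutFraction(String value) {
        return parse(value).truncatedTo(ChronoUnit.SECONDS).format(SECONDS_ONLY);
    }

    public static void main(String[] args) {
        String failing = "2025-09-03T15:43:47.989082+00:00"; // invoice.pdf
        String working = "2025-09-04T10:34:57+02:00";        // invoice.gs.pdf

        // Both values are valid ISO 8601; only the first has a fraction.
        System.out.println(parse(failing).getNano()); // 989082000
        System.out.println(parse(working).getNano()); // 0

        // The failing value rewritten in the shape Ghostscript produced.
        System.out.println(withoutFraction(failing)); // 2025-09-03T15:43:47+00:00
    }
}
```

Both strings are accepted by `DateTimeFormatter.ISO_OFFSET_DATE_TIME`, so the only structural difference between them is the six-digit fraction of a second.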