Jochen Stärk created PDFBOX-6062:
------------------------------------
Summary: XMPMetadata can only be parsed if xmp:CreateDate does not
contain fractions of seconds
Key: PDFBOX-6062
URL: https://issues.apache.org/jira/browse/PDFBOX-6062
Project: PDFBox
Issue Type: Bug
Components: XmpBox
Affects Versions: 3.0.5 PDFBox
Reporter: Jochen Stärk
Attachments: invoice.gs.pdf, invoice.pdf
The attached invoice.pdf raises
{{org.apache.xmpbox.xml.XmpParsingException: Failed to instantiate DateType
property with value 2025-09-03T15:43:47.989082+00:00 in xmp:CreateDate}}
{{at org.apache.xmpbox.xml.DomXmpParser.createProperty(DomXmpParser.java:373)}}
when its metadata is parsed using {{xmpParser.parse}} (snippet below). Its
xmp:CreateDate is 2025-09-03T15:43:47.989082+00:00. I ran the PDF file through
Ghostscript (attached invoice.gs.pdf); that file parses, and the snippet prints
the XMP and the PDF/A part as expected. Notably, the creation date of the
Ghostscript version is only 2025-09-04T10:34:57+02:00, i.e. without fractional
seconds.
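For comparison, here is a minimal sketch using only the JDK's java.time API (not xmpbox). It suggests that both timestamps are well-formed ISO 8601 offset date-times, so the fractional seconds alone should not make the value invalid. The class name FractionalSecondsCheck and its parse helper are hypothetical and exist only for this illustration.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

public class FractionalSecondsCheck {

    // Parses an ISO 8601 date-time with offset; the fraction of a second is optional.
    public static OffsetDateTime parse(String value) {
        return OffsetDateTime.parse(value, DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    }

    public static void main(String[] args) {
        String[] values = {
            "2025-09-03T15:43:47.989082+00:00", // from invoice.pdf (fails in xmpbox)
            "2025-09-04T10:34:57+02:00"         // from invoice.gs.pdf (works in xmpbox)
        };
        for (String value : values) {
            OffsetDateTime parsed = parse(value);
            System.out.println(value + " -> nanos of second: " + parsed.getNano());
        }
    }
}
```

Both values parse here without an exception, which points at the date handling in XmpBox rather than at the file.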
Snippet:
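    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import org.apache.pdfbox.Loader;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
    import org.apache.pdfbox.pdmodel.common.PDMetadata;
    import org.apache.xmpbox.XMPMetadata;
    import org.apache.xmpbox.schema.PDFAIdentificationSchema;
    import org.apache.xmpbox.xml.DomXmpParser;
    import org.apache.xmpbox.xml.XmpParsingException;

    // try-with-resources closes the document even if parsing fails
    try (PDDocument document =
            Loader.loadPDF(Files.readAllBytes(Paths.get("invoice.pdf")))) {
        PDDocumentCatalog catalog = document.getDocumentCatalog();
        PDMetadata metadata = catalog.getMetadata();
        if (metadata != null) {
            DomXmpParser xmpParser = new DomXmpParser();
            // Print the raw XMP packet
            System.out.println(metadata.getCOSObject().toTextString());
            try (InputStream in = metadata.createInputStream()) {
                // Throws XmpParsingException for invoice.pdf
                XMPMetadata xmp = xmpParser.parse(in);
                PDFAIdentificationSchema pdfaSchema = xmp.getPDFAIdentificationSchema();
                if (pdfaSchema != null) {
                    System.out.println(pdfaSchema.getPart());
                }
            }
        }
    } catch (XmpParsingException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }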
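Since parsing works when the date has no fractional seconds, one conceivable interim workaround is to strip the fraction from the XMP text before handing it to the parser. The following is an untested sketch under that assumption, using only the JDK. XmpDateNormalizer and stripFractionalSeconds are hypothetical names, not part of PDFBox or XmpBox. The replacement applies to every matching timestamp in the text and discards sub-second precision.

```java
import java.util.regex.Pattern;

public class XmpDateNormalizer {

    // Matches a time of day such as "T15:43:47" followed by a fraction like ".989082".
    private static final Pattern FRACTION =
            Pattern.compile("(T\\d{2}:\\d{2}:\\d{2})\\.\\d+");

    // Removes the fractional seconds and keeps everything else, including the offset.
    public static String stripFractionalSeconds(String text) {
        return FRACTION.matcher(text).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(stripFractionalSeconds("2025-09-03T15:43:47.989082+00:00"));
        System.out.println(stripFractionalSeconds("2025-09-04T10:34:57+02:00"));
    }
}
```

The normalized text could then be converted back to bytes and wrapped in a ByteArrayInputStream for {{xmpParser.parse}}. Whether that avoids the exception for invoice.pdf has not been verified here.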