Andrea Vacondio created PDFBOX-6119:
---------------------------------------

             Summary: DateConverter fails on valid date
                 Key: PDFBOX-6119
                 URL: https://issues.apache.org/jira/browse/PDFBOX-6119
             Project: PDFBox
          Issue Type: Bug
          Components: XmpBox
    Affects Versions: 3.0.7 PDFBox
            Reporter: Andrea Vacondio
         Attachments: xmp-date-patch-1.diff, xmp-date.xml

I think there is an issue with the DateConverter in XMPBox. Consider the 
following:

{code:xml}
      <xmp:CreateDate>2024-04-09T14:41:38</xmp:CreateDate>
      <xmp:ModifyDate>2015-02-02T16:37:19.192Europe/Berlin</xmp:ModifyDate>
{code}


The first is a valid date. The XMP spec, chapter 8.2.1.2, says: "The time 
zone designator need not be present in XMP. When not present, the time zone is 
unknown, and an XMP processor should not assume anything about the missing time 
zone."

The second is invalid. The spec defines "TZD = time zone designator (Z or 
+hh:mm or -hh:mm)", so a zone name such as Europe/Berlin is not allowed.

Adobe XMPCore parses the first one as if it were UTC and fails on the second, 
while XMPBox fails on the first and parses the second.

The DateConverter::fromISO8601 method is responsible for parsing the date 
string. It is quite complicated and error-prone, probably because it is very 
old code. In my opinion it could be greatly simplified by using what the JDK 
provides, and its behavior fixed so that it parses the first date and fails on 
the second, as XMPCore does.
The only caveat is that the first date will be treated as UTC, because a 
Calendar always has a time zone. So we can't comply with "should not assume 
anything about the missing time zone", but XMPCore does the same.
I attached a patch and updated the tests; all tests pass.
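
To illustrate the idea (this is a minimal sketch, not the attached patch), a 
java.time based parser could look roughly like the following. The class name 
XmpDateSketch and the method parse are hypothetical, and the sketch only covers 
the full date-time form, not the truncated forms (year only, year-month, date 
only) that XMP also allows. A missing time zone is treated as UTC, which is an 
assumption that mirrors what XMPCore does.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.TemporalAccessor;
import java.util.Calendar;
import java.util.GregorianCalendar;

// Hypothetical helper, for illustration only; not part of XMPBox.
final class XmpDateSketch {

    // Local date-time followed by an optional TZD: "Z", "+hh:mm" or "-hh:mm".
    // Region names such as "Europe/Berlin" are not part of the pattern.
    private static final DateTimeFormatter XMP_DATE_TIME = new DateTimeFormatterBuilder()
            .append(DateTimeFormatter.ISO_LOCAL_DATE_TIME)
            .optionalStart()
            .appendOffset("+HH:MM", "Z")
            .optionalEnd()
            .toFormatter();

    // Throws java.time.format.DateTimeParseException if the text does not match.
    static Calendar parse(String text) {
        TemporalAccessor parsed =
                XMP_DATE_TIME.parseBest(text, OffsetDateTime::from, LocalDateTime::from);
        ZonedDateTime zoned;
        if (parsed instanceof OffsetDateTime) {
            // An explicit TZD was present, keep its offset.
            zoned = ((OffsetDateTime) parsed).toZonedDateTime();
        } else {
            // Assumption: no TZD means UTC, because Calendar needs some zone.
            zoned = ((LocalDateTime) parsed).atZone(ZoneOffset.UTC);
        }
        return GregorianCalendar.from(zoned);
    }
}
```

With this sketch, XmpDateSketch.parse("2024-04-09T14:41:38") returns a Calendar 
in UTC, while the "2015-02-02T16:37:19.192Europe/Berlin" value is rejected with 
a DateTimeParseException because the trailing zone name is left unparsed.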

If you prefer to keep the current behavior, it should at least be modified to 
support the first valid date string.

I also attached the original xmp stream found in the document.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

