[ https://issues.apache.org/jira/browse/TIKA-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12885604#action_12885604 ]
Jukka Zitting commented on TIKA-451:
------------------------------------
See page 11 of http://www.adobe.com/devnet/xmp/pdfs/XMPSpecificationPart2.pdf
for the ISO 8601 subset used by XMP. I think that matches our needs pretty well.
One of my forward-looking ideas behind introducing the Property class was to
use it for these kinds of type-safe value conversions. We could add
Property.setDate(Metadata, Date) and Property.getDate(Metadata) methods that
could also take advantage of the static value type information included in the
Property constants. For example, an integer property constant could throw an
exception (or use some predefined conversion rule) when you attempt to get its
value as a date. For added compile-time type-safety we could even add explicit
DateProperty, IntegerProperty, etc. subclasses for specific kinds of metadata
properties.
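
To make the idea concrete, here is a rough sketch of what such typed accessors
might look like. It is not the actual Tika API: the Metadata stand-in, the
ValueType enum, the Property constructor and the choice of a UTC-only
"yyyy-MM-dd'T'HH:mm:ss'Z'" pattern (one form allowed by the XMP subset) are all
assumptions made for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TimeZone;

class TypedPropertySketch {

    /** Hypothetical stand-in for Metadata: a plain name-to-string map. */
    static class Metadata {
        private final Map<String, String> values = new HashMap<>();
        void set(String name, String value) { values.put(name, value); }
        String get(String name) { return values.get(name); }
    }

    /** Value types a property constant may declare (names are assumptions). */
    enum ValueType { DATE, INTEGER, TEXT }

    /** Hypothetical Property that carries static value type information. */
    static class Property {
        private final String name;
        private final ValueType type;

        Property(String name, ValueType type) {
            this.name = name;
            this.type = type;
        }

        /** SimpleDateFormat is not thread-safe, so build one per call. */
        private static SimpleDateFormat isoFormat() {
            SimpleDateFormat format =
                new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
            format.setTimeZone(TimeZone.getTimeZone("UTC"));
            return format;
        }

        /** Stores the date as an ISO 8601 string in UTC. */
        void setDate(Metadata metadata, Date date) {
            requireDate();
            metadata.set(name, isoFormat().format(date));
        }

        /** Returns the parsed date, or null if the property is not set. */
        Date getDate(Metadata metadata) {
            requireDate();
            String value = metadata.get(name);
            if (value == null) {
                return null;
            }
            try {
                return isoFormat().parse(value);
            } catch (ParseException e) {
                throw new IllegalArgumentException("Not an ISO 8601 date: " + value, e);
            }
        }

        /** Rejects date access on properties declared with another type. */
        private void requireDate() {
            if (type != ValueType.DATE) {
                throw new IllegalStateException(name + " is not a date property");
            }
        }
    }

    public static void main(String[] args) {
        Metadata metadata = new Metadata();
        Property created = new Property("Creation-Date", ValueType.DATE);
        created.setDate(metadata, new Date(0L));
        System.out.println(metadata.get("Creation-Date")); // 1970-01-01T00:00:00Z

        Property pages = new Property("Page-Count", ValueType.INTEGER);
        try {
            pages.getDate(metadata);
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

The runtime check in requireDate() corresponds to the "throw an exception"
option; dedicated DateProperty and IntegerProperty subclasses would move the
same check to compile time.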
> Inconsistent date format for Metadata.CREATION_DATE and Metadata.LAST_MODIFIED
> ------------------------------------------------------------------------------
>
> Key: TIKA-451
> URL: https://issues.apache.org/jira/browse/TIKA-451
> Project: Tika
> Issue Type: Improvement
> Components: metadata, parser
> Affects Versions: 0.7
> Reporter: Nick Burch
> Priority: Minor
>
> Currently, the PDF parser calls calendar.getTime().toString(), which means
> dates end up in your local timezone and are hard to parse.
> The OpenDocument parsers output in ISO 8601 format, which avoids these two
> problems.
> The POI OLE2-based parsers also output in date.toString() format, with the
> same timezone/parsing problems.
> We should probably select one format and update all the parsers to output in
> it.
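
The difference described in the issue can be reproduced with a small
standalone sketch. The class and method names below are made up for
illustration and are not taken from any Tika parser; the timestamp is an
arbitrary example value.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DateFormatComparison {

    /** Formats a date as ISO 8601 in UTC, independent of the JVM default timezone. */
    static String iso8601Utc(Date date) {
        SimpleDateFormat format =
            new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(date);
    }

    public static void main(String[] args) {
        Date date = new Date(1278000000000L); // arbitrary instant in July 2010

        // Date.toString() renders in the JVM default timezone, so the same
        // instant produces different strings on differently configured machines.
        TimeZone.setDefault(TimeZone.getTimeZone("America/New_York"));
        System.out.println(date.toString());
        TimeZone.setDefault(TimeZone.getTimeZone("Asia/Tokyo"));
        System.out.println(date.toString());

        // The ISO 8601 form is the same regardless of the default timezone.
        System.out.println(iso8601Utc(date)); // 2010-07-01T16:00:00Z
    }
}
```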