[ https://issues.apache.org/jira/browse/TIKA-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898794#action_12898794 ]
Staffan Olsson commented on TIKA-451:
-------------------------------------
Converted DublinCore.DATE to Property.internalDate in
http://github.com/solsson/tika/commit/2d637712053a758e7a6d5940c1a635615913056e
This affects the DcXML, Mbox, OOXML and image parsers.
This patch relies on refactoring I did to get better access to the Metadata
Extractor API, for example to getDate(tagType). I'll post those changes as a
new ticket shortly.
> Inconsistent date format for Metadata.CREATION_DATE and Metadata.LAST_MODIFIED
> ------------------------------------------------------------------------------
>
> Key: TIKA-451
> URL: https://issues.apache.org/jira/browse/TIKA-451
> Project: Tika
> Issue Type: Improvement
> Components: metadata, parser
> Affects Versions: 0.7
> Reporter: Nick Burch
> Assignee: Nick Burch
> Priority: Minor
>
> Currently, the PDF parser calls calendar.getTime().toString(), which means
> dates end up in your local timezone and are hard to parse.
> The Open Document parsers output in ISO 8601 format, which avoids these two
> problems.
> The POI OLE2-based parsers also output in date.toString() format, with the
> same timezone/parsing problems.
> We should probably select one format and update all the parsers to output in
> it.
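
To illustrate the difference the issue describes, here is a minimal sketch in plain Java. It is not Tika code: the class name DateFormatDemo and the helper toIso8601Utc are hypothetical, and the choice of UTC with a literal "Z" suffix is an assumption about what a common format could look like, not what the project settled on.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class DateFormatDemo {

    // Hypothetical helper: format a Date as ISO 8601 in UTC, so the result
    // does not depend on the JVM's default timezone or locale.
    static String toIso8601Utc(Date date) {
        SimpleDateFormat fmt =
                new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(date);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);

        // Date.toString() renders in the JVM's default timezone, so the same
        // instant prints differently on differently configured machines.
        System.out.println(epoch.toString());

        // The ISO 8601 form is the same everywhere and easy to parse back.
        System.out.println(toIso8601Utc(epoch)); // 1970-01-01T00:00:00Z
    }
}
```

The first line printed varies with the machine's timezone setting, which is the parsing problem reported for the PDF and OLE2 parsers. The second does not.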