[jira] Commented: (TIKA-451) Inconsistent date format for Metadata.CREATION_DATE and Metadata.LAST_MODIFIED

Staffan Olsson (JIRA) Tue, 03 Aug 2010 05:38:48 -0700

    [ https://issues.apache.org/jira/browse/TIKA-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894877#action_12894877 ]


Staffan Olsson commented on TIKA-451:
-------------------------------------

The JPEG parser (TiffExtractor.handleCommonImageTags and JpegParserTest) has 
the same issue.

The test asserts a date format that is not ISO 8601. The javadoc for the field 
(DublinCore.DATE) says ISO 8601, so the test is clearly wrong. There is a 
"TODO Make me a Date Property" on it. I have code for parsing Metadata 
Extractor's dates to ISO 8601, so I could fix this, but which field should we 
use? This issue discusses MSOffice.CREATION_DATE, but I think DublinCore makes 
more sense for images. However, Tika will be easier to use if there is only 
one creation date field.
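
For illustration only, a minimal sketch of the kind of conversion mentioned 
above. It assumes the date arrives as a string in the EXIF layout 
"yyyy:MM:dd HH:mm:ss" and uses java.time (Java 8+), which postdates this 
discussion. The class and method names are hypothetical and are not part of 
Tika or Metadata Extractor. EXIF dates carry no timezone, so the result is an 
ISO 8601 local date-time without an offset.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper; not part of Tika or Metadata Extractor.
final class ExifDateToIso {
    // Assumed input layout: the EXIF date/time string, e.g. "2009:08:11 09:09:45".
    private static final DateTimeFormatter EXIF =
            DateTimeFormatter.ofPattern("uuuu:MM:dd HH:mm:ss");

    // Returns an ISO 8601 local date-time (no offset, since the input has none),
    // or null if the input is null or not in the assumed EXIF layout.
    static String toIso8601(String exifDate) {
        if (exifDate == null) {
            return null;
        }
        try {
            LocalDateTime parsed = LocalDateTime.parse(exifDate.trim(), EXIF);
            return DateTimeFormatter.ISO_LOCAL_DATE_TIME.format(parsed);
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(toIso8601("2009:08:11 09:09:45"));
    }
}
```

Returning null for unparseable input is just one possible choice here; a 
parser could equally keep the raw string or skip the field.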

> Inconsistent date format for Metadata.CREATION_DATE and Metadata.LAST_MODIFIED
> ------------------------------------------------------------------------------
>
>                 Key: TIKA-451
>                 URL: https://issues.apache.org/jira/browse/TIKA-451
>             Project: Tika
>          Issue Type: Improvement
>          Components: metadata, parser
>    Affects Versions: 0.7
>            Reporter: Nick Burch
>            Assignee: Nick Burch
>            Priority: Minor
>
> Currently, the PDF parser calls calendar.getTime().toString(), which means 
> dates end up in your local timezone and are hard to parse.
> The OpenDocument parsers output ISO 8601 format, which avoids these two 
> problems.
> The POI OLE2-based parsers also output in date.toString() format, with the 
> same timezone/parsing problems.
> We should probably select one format and update all the parsers to output 
> in it.
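
To make the difference described in the issue concrete, here is a small 
sketch contrasting Date.toString() with an ISO 8601 rendering in UTC. It is 
not code from any Tika parser: the class and method names are hypothetical, 
and it assumes Java 8+ for java.time.

```java
import java.time.format.DateTimeFormatter;
import java.util.Date;

// Hypothetical demo class; not taken from Tika.
final class DateFormatDemo {
    // Formats the instant as ISO 8601 in UTC, independent of the JVM's default timezone.
    static String toIsoUtc(Date date) {
        return DateTimeFormatter.ISO_INSTANT.format(date.toInstant());
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);
        // Date.toString() uses the JVM's default timezone, so this line
        // differs from machine to machine.
        System.out.println(epoch);
        // The ISO 8601 form is the same everywhere.
        System.out.println(toIsoUtc(epoch));
    }
}
```

Whether to normalize to UTC or keep the original offset is a separate design 
decision; the sketch uses UTC only because it is the simplest unambiguous 
choice.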

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
