[ https://issues.apache.org/jira/browse/TIKA-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886116#action_12886116 ]
Nick Burch commented on TIKA-451:
---------------------------------
Well, there are two validation steps. First, for integers, we have a pair of
asserts that check, when you call set(property, int), that the property is both
simple and int-based. Those could certainly be replaced with a test that throws
PropertyTypeException. (We'll want the same for getDate(property) on non-date
property definitions.)

Then there's the get when the stored string value is of the wrong type (e.g. it
should be a date but isn't in the right format). That could throw
PropertyValidationException or similar. Or we could use the same exception for
both cases for now?
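
A rough sketch of how the two steps might look, assuming two separate
exceptions. Only the exception names come from the comment above; the
MetadataSketch class, its Property and ValueType types, the setRaw helper and
the use of java.time are all hypothetical simplifications, not Tika's actual
API.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeParseException;
import java.util.HashMap;
import java.util.Map;

class MetadataSketch {
    // Hypothetical value types, much simpler than a real property model.
    enum ValueType { INTEGER, DATE, TEXT }

    // Hypothetical property definition: a name, a value type, and a "simple" flag.
    static final class Property {
        final String name;
        final ValueType type;
        final boolean simple;

        Property(String name, ValueType type, boolean simple) {
            this.name = name;
            this.type = type;
            this.simple = simple;
        }
    }

    // Step 1 failure: the property definition does not match the accessor used.
    static class PropertyTypeException extends IllegalArgumentException {
        PropertyTypeException(String message) { super(message); }
    }

    // Step 2 failure: the stored string cannot be read as the declared type.
    static class PropertyValidationException extends RuntimeException {
        PropertyValidationException(String message, Throwable cause) { super(message, cause); }
    }

    private final Map<String, String> values = new HashMap<>();

    // Untyped setter, standing in for a parser that writes raw strings.
    void setRaw(String name, String value) {
        values.put(name, value);
    }

    // Test + throw instead of a pair of asserts.
    void set(Property property, int value) {
        if (!property.simple || property.type != ValueType.INTEGER) {
            throw new PropertyTypeException(property.name + " is not a simple integer property");
        }
        values.put(property.name, Integer.toString(value));
    }

    OffsetDateTime getDate(Property property) {
        // Step 1: the definition must be date based.
        if (property.type != ValueType.DATE) {
            throw new PropertyTypeException(property.name + " is not a date property");
        }
        String raw = values.get(property.name);
        if (raw == null) {
            return null;
        }
        // Step 2: the stored string must actually be in the expected date format.
        try {
            return OffsetDateTime.parse(raw);
        } catch (DateTimeParseException e) {
            throw new PropertyValidationException(property.name + " has a malformed date: " + raw, e);
        }
    }
}
```

Keeping the two exceptions separate lets a caller tell a programming error
(wrong accessor for the property) apart from bad data in a document; merging
them for now would only mean collapsing the two classes into one.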
> Inconsistent date format for Metadata.CREATION_DATE and Metadata.LAST_MODIFIED
> ------------------------------------------------------------------------------
>
> Key: TIKA-451
> URL: https://issues.apache.org/jira/browse/TIKA-451
> Project: Tika
> Issue Type: Improvement
> Components: metadata, parser
> Affects Versions: 0.7
> Reporter: Nick Burch
> Assignee: Nick Burch
> Priority: Minor
>
> Currently, the PDF parser calls calendar.getTime().toString(), which means
> dates end up in your local timezone and are hard to parse.
> The OpenDocument parsers output in ISO 8601 format, which avoids both
> problems.
> The POI OLE2-based parsers also output in date.toString() format, with the
> same timezone and parsing problems.
> We should probably select one format and update all the parsers to output
> it.
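
To illustrate the difference described in the issue, here is a small sketch
contrasting the two output styles. The DateFormatSketch class and its method
names are hypothetical, and using java.time for the ISO 8601 side is an
assumption made for brevity, not a statement about how any Tika parser formats
dates.

```java
import java.time.format.DateTimeFormatter;
import java.util.Calendar;

class DateFormatSketch {
    // The style the issue complains about: the result depends on the JVM's
    // default timezone and is awkward to parse back.
    static String legacy(Calendar calendar) {
        return calendar.getTime().toString();
    }

    // ISO 8601 in UTC: the same instant always yields the same string,
    // whatever the local timezone is.
    static String iso8601(Calendar calendar) {
        return DateTimeFormatter.ISO_INSTANT.format(calendar.toInstant());
    }
}
```

A string produced by iso8601 can be read back with java.time.Instant.parse,
whereas the legacy form needs a pattern that also handles the timezone
abbreviation.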
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.