[ 
https://issues.apache.org/jira/browse/TIKA-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886116#action_12886116
 ] 

Nick Burch commented on TIKA-451:
---------------------------------

Well, there are two validation steps. First, for integers, we have a pair of 
asserts that check, when you call set(property, int), that the property is both 
simple and int-based. Those could certainly be replaced with a test that throws 
PropertyTypeException. (We'll want the same check in getDate(property) for 
properties that aren't defined as dates.)
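
As a rough sketch of the "test + throw" idea (the class and field names here 
are made up for illustration and are not the actual Tika API), the setter 
could check the property definition up front instead of asserting:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout: a sketch of the idea, not Tika's real classes.
class PropertyTypeException extends IllegalArgumentException {
    PropertyTypeException(String message) {
        super(message);
    }
}

enum ValueType { TEXT, INTEGER, DATE }

// Minimal stand-in for a property definition: a name, whether it is a
// simple (single-valued) property, and the type of value it holds.
final class Property {
    final String name;
    final boolean simple;
    final ValueType valueType;

    Property(String name, boolean simple, ValueType valueType) {
        this.name = name;
        this.simple = simple;
        this.valueType = valueType;
    }
}

class TypedMetadata {
    private final Map<String, String> values = new HashMap<>();

    // Replaces the pair of asserts: the check always runs, even when
    // assertions are disabled, and fails with a specific exception.
    void set(Property property, int value) {
        requireType(property, ValueType.INTEGER);
        values.put(property.name, Integer.toString(value));
    }

    String get(Property property) {
        return values.get(property.name);
    }

    private static void requireType(Property property, ValueType expected) {
        if (!property.simple) {
            throw new PropertyTypeException(
                    property.name + " is not a simple property");
        }
        if (property.valueType != expected) {
            throw new PropertyTypeException(
                    property.name + " is " + property.valueType
                            + ", expected " + expected);
        }
    }
}
```

A getDate(property) along the same lines would call the same helper with 
ValueType.DATE before reading the value.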

Then there's the get when the stored string value is of the wrong type (e.g. 
it should be a date but isn't in the right format). That could throw 
PropertyValidationException or similar. Or we could make them both the same 
exception for now?
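
For the second case, something like the following might do. This is only a 
sketch: PropertyValidationException and DateValues are hypothetical names, 
and it assumes the stored value is meant to be an ISO 8601 date-time with an 
offset, parsed here with java.time:

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.format.DateTimeParseException;

// Hypothetical exception for a stored value that does not match the
// format its property definition requires.
class PropertyValidationException extends RuntimeException {
    PropertyValidationException(String message, Throwable cause) {
        super(message, cause);
    }
}

final class DateValues {
    private DateValues() {
    }

    // Parses a stored string as an ISO 8601 date-time with an offset,
    // e.g. "2010-07-07T12:34:56Z". Returns null if no value is stored.
    static Instant parseDate(String propertyName, String raw) {
        if (raw == null) {
            return null;
        }
        try {
            return OffsetDateTime.parse(raw).toInstant();
        } catch (DateTimeParseException e) {
            // The value is present but malformed, which is a different
            // failure from asking for a date on a non-date property.
            throw new PropertyValidationException(
                    "Value of " + propertyName + " is not an ISO 8601 date: " + raw, e);
        }
    }
}
```

A value written with Date.toString(), such as "Wed Jul 07 12:34:56 BST 2010", 
would be rejected here. If we go with a single exception for now, this could 
simply throw the same type as the property type check.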

> Inconsistent date format for Metadata.CREATION_DATE and Metadata.LAST_MODIFIED
> ------------------------------------------------------------------------------
>
>                 Key: TIKA-451
>                 URL: https://issues.apache.org/jira/browse/TIKA-451
>             Project: Tika
>          Issue Type: Improvement
>          Components: metadata, parser
>    Affects Versions: 0.7
>            Reporter: Nick Burch
>            Assignee: Nick Burch
>            Priority: Minor
>
> Currently, the PDF parser calls calendar.getTime().toString(), which means 
> dates end up in your local timezone and are hard to parse.
> The OpenDocument parsers output in ISO 8601 format, which avoids these two 
> problems.
> The POI OLE2-based parsers also output in date.toString() format, with the 
> same timezone/parsing problems.
> We should probably select one format and update all the parsers to output in 
> it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
