[ https://issues.apache.org/jira/browse/TIKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025287#comment-14025287 ]
Tim Allison commented on TIKA-1325:
-----------------------------------
Doh! Same issue for those of us in non-standard land. :)
Failed tests: testTTFParsing(org.apache.tika.parser.font.FontParsersTest):
expected:<1904-01-01T0[0]:00:00Z> but was:<1904-01-01T0[5]:00:00Z>
As of now, FontBox is setting the Calendar to my default time zone (US Eastern):
1904-01-01T00:00:00 (local time, UTC-5 on that date)
When formatDate calls setTimeZone(UTC), the calendar is converted to
UTC and the value becomes: 1904-01-01T05:00:00Z
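To make the shift concrete, here is a minimal standalone sketch. It is not the Tika or FontBox code: the class and both helper methods are hypothetical names made up for illustration, and "America/New_York" is only my assumption for the default zone on a US Eastern machine. It builds midnight 1904-01-01 as wall-clock time in a given zone and then renders that instant in UTC, which is roughly what happens between FontBox and formatDate.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

public class TimeZoneShiftDemo {

    // Hypothetical stand-in for a UTC-normalizing formatter (not Tika's formatDate).
    public static String formatUtc(Calendar cal) {
        SimpleDateFormat fmt =
                new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(cal.getTime());
    }

    // Midnight 1904-01-01 interpreted as wall-clock time in the given zone.
    public static Calendar midnight1904In(String zoneId) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone(zoneId), Locale.ROOT);
        cal.clear(); // reset all fields, including milliseconds
        cal.set(1904, Calendar.JANUARY, 1, 0, 0, 0);
        return cal;
    }

    public static void main(String[] args) {
        // Calendar created in UTC: the instant is already midnight UTC.
        System.out.println(formatUtc(midnight1904In("UTC")));
        // Calendar created in an Eastern default zone: same wall-clock fields,
        // but a different instant, so the UTC rendering moves by the zone offset.
        System.out.println(formatUtc(midnight1904In("America/New_York")));
    }
}
```

The wall-clock fields are identical in both cases; only the zone attached to the Calendar differs, and that is what leaks into the normalized value.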
I like the addition of formatDate(Calendar ...), and I like that it
converts/normalizes to UTC.
For this one test case, though, I think we need to add some modifications to
the test case until PDFBOX-2122 is fixed. One simple thing we could do (given
that we know the source of the issue) is to set the default time zone to UTC
before parser.parse:
{code}
// Until PDFBOX-2122 is fixed, we need to set a common default
// for the sake of this test.
TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
try {
    parser.parse(stream, handler, metadata, context);
} finally {
    stream.close();
}
{code}
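One caveat with this approach: TimeZone.setDefault changes the default for the whole JVM, so other tests running afterwards would see UTC as well. A possible variation, sketched below with a hypothetical helper (the class and method names are mine, not anything in Tika), saves the previous default and restores it in a finally block.

```java
import java.util.TimeZone;
import java.util.function.Supplier;

public class DefaultTimeZoneGuard {

    // Hypothetical helper: run the body with UTC as the JVM default time zone,
    // then put the previous default back even if the body throws.
    public static <T> T callInUtc(Supplier<T> body) {
        TimeZone saved = TimeZone.getDefault();
        TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
        try {
            return body.get();
        } finally {
            TimeZone.setDefault(saved);
        }
    }
}
```

In the test, the parser.parse call would go inside the body passed to callInUtc. Since parse throws checked exceptions, a real version would probably need a Callable or a plain try/finally in the test method instead of a Supplier.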
> Move the font metadata definitions to properties
> ------------------------------------------------
>
> Key: TIKA-1325
> URL: https://issues.apache.org/jira/browse/TIKA-1325
> Project: Tika
> Issue Type: Improvement
> Components: metadata, parser
> Affects Versions: 1.5, 1.6
> Reporter: Nick Burch
> Attachments: TIKA-1325_TimeZone.patch
>
>
> As noticed while working on TIKA-1182, the AFM font parser has a bunch of
> hard-coded strings it uses as metadata keys, while the TTF font parser
> doesn't have many.
> We should switch these to being proper Properties, with definitions from a
> well-known standard (+ compatibility fallbacks), and have both use largely
> the same set.