[ 
https://issues.apache.org/jira/browse/TIKA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025287#comment-14025287
 ] 

Tim Allison commented on TIKA-1325:
-----------------------------------

Doh!  Same issue for those of us in non-standard land. :)

Failed tests:   testTTFParsing(org.apache.tika.parser.font.FontParsersTest): 
expected:<1904-01-01T0[0]:00:00Z> but was:<1904-01-01T0[5]:00:00Z>

As of now, FontBox is setting the Calendar to my default timezone:
1904-01-01T00:00:00 (EDT)

When setTimeZone(UTC) is called in formatDate, the calendar is converted to 
UTC, which is five hours ahead of my local time on that date, and the value 
becomes: 1904-01-01T05:00:00Z 
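
To make the shift concrete, here is a minimal sketch that reproduces it outside of Tika and FontBox. The names formatUtc and midnight1904 are hypothetical stand-ins (not the actual formatDate implementation), and the fixed offset "GMT-05:00" is an assumption standing in for my local zone on that date:

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

final class TimeZoneShiftDemo {
    // Hypothetical stand-in for a formatDate(Calendar) helper:
    // renders the calendar's instant as an ISO-style string in UTC.
    static String formatUtc(Calendar cal) {
        SimpleDateFormat fmt =
                new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(cal.getTime());
    }

    // Builds 1904-01-01 00:00:00 as wall-clock time in the given zone.
    static Calendar midnight1904(TimeZone zone) {
        Calendar cal = new GregorianCalendar(zone, Locale.ROOT);
        cal.clear();
        cal.set(1904, Calendar.JANUARY, 1, 0, 0, 0);
        return cal;
    }

    public static void main(String[] args) {
        // Calendar created in UTC: formatting in UTC leaves the hour alone.
        System.out.println(formatUtc(midnight1904(TimeZone.getTimeZone("UTC"))));
        // prints 1904-01-01T00:00:00Z

        // Calendar created five hours behind UTC: the same wall-clock
        // midnight is a later instant, so the UTC rendering moves to 05:00.
        System.out.println(formatUtc(midnight1904(TimeZone.getTimeZone("GMT-05:00"))));
        // prints 1904-01-01T05:00:00Z
    }
}
```

The second line of output matches the value in the failed assertion, which is consistent with the Calendar having been created in the local default zone rather than in UTC.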

I like the addition of formatDate(Calendar ...), and I like that it 
converts/normalizes to UTC.

For this one test case, though, I think we need to modify the test until 
PDFBOX-2122 is fixed.  One simple thing we could do (given that we know the 
source of the issue) is to set the default time zone to UTC before 
parser.parse:

{code}
        // Until PDFBOX-2122 is fixed, we need to set a common default
        // time zone for the sake of this test.
        TimeZone.setDefault(TimeZone.getTimeZone("UTC"));

        try {
            parser.parse(stream, handler, metadata, context);
        } finally {
            stream.close();
        }
{code}
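
One caveat: TimeZone.setDefault changes the default for the whole JVM, so any test that runs afterwards in the same JVM would also see UTC. A minimal sketch of saving and restoring the previous default follows; DefaultTimeZoneGuard and callInUtc are hypothetical names for illustration, not existing Tika test utilities:

```java
import java.util.TimeZone;
import java.util.concurrent.Callable;

final class DefaultTimeZoneGuard {
    // Hypothetical helper: runs the body with the JVM default time zone
    // set to UTC, then restores whatever default was in place before.
    static <T> T callInUtc(Callable<T> body) throws Exception {
        TimeZone original = TimeZone.getDefault();
        TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
        try {
            return body.call();
        } finally {
            // Restore even if the body throws, so later tests are unaffected.
            TimeZone.setDefault(original);
        }
    }

    public static void main(String[] args) throws Exception {
        String before = TimeZone.getDefault().getID();
        String during = callInUtc(() -> TimeZone.getDefault().getID());
        System.out.println(during);                                        // prints UTC
        System.out.println(before.equals(TimeZone.getDefault().getID())); // prints true
    }
}
```

In the test itself, the parser.parse call would go where the lambda is; the same save/restore could equally be written inline with a local variable and a finally block.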

> Move the font metadata definitions to properties
> ------------------------------------------------
>
>                 Key: TIKA-1325
>                 URL: https://issues.apache.org/jira/browse/TIKA-1325
>             Project: Tika
>          Issue Type: Improvement
>          Components: metadata, parser
>    Affects Versions: 1.5, 1.6
>            Reporter: Nick Burch
>         Attachments: TIKA-1325_TimeZone.patch
>
>
> As noticed while working on TIKA-1182, the AFM font parser has a bunch of 
> hard-coded strings it uses as metadata keys, while the TTF font parser 
> doesn't have many.
> We should switch these to being proper Properties, with definitions from a 
> well-known standard (+ compatibility fallbacks), and have both use largely 
> the same set.



