[ https://issues.apache.org/jira/browse/TIKA-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019351#comment-14019351 ]
Matthias Krueger commented on TIKA-1182:
----------------------------------------
I retested with the current Tika trunk and can confirm that FontBox now
handles this file properly (by throwing a java.io.EOFException). We should revert
https://github.com/apache/tika/commit/bbd065b7070651d939a84e043b4f6f22f80269d9
to remove the temporary workaround, and then this ticket can be resolved.
The revert is trivial and worth doing, as the workaround has issues of its own:
{code}
Font.createFont(Font.TRUETYPE_FONT, tis.getFile());
{code}
I'm not 100% sure, but I think the JDK's FontManager permanently keeps a file
handle open for any font created that way.
{code}
tis.mark(0);
Font.createFont(Font.TRUETYPE_FONT, stream);
tis.reset();
{code}
This never really worked: mark(0) sets a read limit of zero, so the stream is
not required to remember anything read after the mark, and a reset() following
such reads will throw an IOException.
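The mark/reset behaviour can be reproduced outside of Tika. The following is a
minimal sketch, not the Tika code: it assumes a plain java.io.BufferedInputStream
with a deliberately tiny buffer in place of the TikaInputStream, and the class and
method names (MarkZeroDemo, resetSucceeds) are made up for illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class; stands in for the TikaInputStream case above.
class MarkZeroDemo {

    /**
     * Marks the stream with the given read limit, reads bytesToRead bytes
     * one at a time, then calls reset(). Returns true if reset() succeeded
     * and false if it threw an IOException.
     */
    static boolean resetSucceeds(int readLimit, int bytesToRead) {
        byte[] data = new byte[64];
        // A 4-byte buffer forces a refill after only a few reads.
        try (InputStream in =
                 new BufferedInputStream(new ByteArrayInputStream(data), 4)) {
            in.mark(readLimit);
            for (int i = 0; i < bytesToRead; i++) {
                in.read();
            }
            try {
                in.reset();
                return true;
            } catch (IOException e) {
                // The mark was invalidated by reading past the read limit.
                return false;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resetSucceeds(0, 16));  // prints "false"
        System.out.println(resetSucceeds(16, 16)); // prints "true"
    }
}
```

With a read limit at least as large as the number of bytes consumed, reset()
works. Exactly when a too-small limit starts to fail depends on the stream
implementation and its buffer size, so the workaround may have appeared to work
for small inputs.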
> Out of memory exception when parsing TTF file
> ---------------------------------------------
>
> Key: TIKA-1182
> URL: https://issues.apache.org/jira/browse/TIKA-1182
> Project: Tika
> Issue Type: Bug
> Affects Versions: 1.4
> Environment: Ubuntu
> java version "1.7.0_40"
> Java(TM) SE Runtime Environment (build 1.7.0_40-b43)
> Java HotSpot(TM) 64-Bit Server VM (build 24.0-b56, mixed mode)
> Reporter: Erik Hetzner
> Attachments: 16A4FF_8.ttf, TIKA-1182-fix1.patch, TIKA_1182.java
>
>
> When parsing attached file using tika-app-1.4.jar, CPU usage is high and
> it never seems to finish.
> When parsing using attached java code, I get an out of memory exception.
> Let me know what other information I can provide.
> Thank you!