[ 
https://issues.apache.org/jira/browse/TIKA-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13795539#comment-13795539
 ] 

Erik Hetzner commented on TIKA-1182:
------------------------------------

The PDFBox project suggests that, because this is a bad font, we should
precheck it with java.awt.Font.createFont before parsing it. I have modified
TrueTypeParser.java to do this. With the change, all tests pass and the
attached font no longer causes an infinite loop or exhausts memory, but I
don't know whether this approach will work for Tika. Patch attached.
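
For illustration, a precheck of this kind might look roughly like the sketch
below. This is not the code in the attached patch: the class name
FontPrecheck and the method isLoadableTrueType are hypothetical, and the
sketch assumes that rejecting any font the JDK's own loader cannot read is
acceptable. The only library call it relies on is
java.awt.Font.createFont(int, InputStream), which throws FontFormatException
or IOException when the font data cannot be loaded.

```java
import java.awt.Font;
import java.awt.FontFormatException;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not the contents of TIKA-1182-fix1.patch.
class FontPrecheck {

    // Returns true only if the JDK font loader accepts the stream as a
    // TrueType font. Any load failure is treated as "do not parse".
    static boolean isLoadableTrueType(InputStream stream) {
        try {
            Font.createFont(Font.TRUETYPE_FONT, stream);
            return true;
        } catch (FontFormatException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // The precheck consumes the stream, so buffer the bytes first and
        // give the real parser its own stream over the same data afterwards.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        if (isLoadableTrueType(new ByteArrayInputStream(data))) {
            System.out.println("font accepted; hand the bytes to the parser");
        } else {
            System.out.println("font rejected; skipping parse");
        }
    }
}
```

Because the precheck reads the stream to the end, a caller would need to
buffer the input (as main does above) or otherwise reopen it before the
actual metadata extraction.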

> Out of memory exception when parsing TTF file
> ---------------------------------------------
>
>                 Key: TIKA-1182
>                 URL: https://issues.apache.org/jira/browse/TIKA-1182
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.4
>         Environment: Ubuntu
> java version "1.7.0_40"
> Java(TM) SE Runtime Environment (build 1.7.0_40-b43)
> Java HotSpot(TM) 64-Bit Server VM (build 24.0-b56, mixed mode)
>            Reporter: Erik Hetzner
>         Attachments: 16A4FF_8.ttf, TIKA-1182-fix1.patch, TIKA_1182.java
>
>
>   When parsing attached file using tika-app-1.4.jar, CPU usage is high and it 
> never seems to finish.
> When parsing using attached java code, I get an out of memory exception.
> Let me know what other information I can provide.
> Thank you!



--
This message was sent by Atlassian JIRA
(v6.1#6144)
