[ 
https://issues.apache.org/jira/browse/PDFBOX-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400820#comment-15400820
 ] 

John Hewson edited comment on PDFBOX-3442 at 7/30/16 8:45 PM:
--------------------------------------------------------------

The font cache only handles indirect objects, because only an object's (id, gen) 
pair is consistent across pages. Direct objects are indeed unusual and can't be 
cached across pages, but what we should probably be doing here is caching the 
direct object for the given page. Currently, each call to getFont, even within 
the same page, results in a new PDFont.

We could extend ResourceCache with methods to cache direct objects keyed by 
(type, name, page) and then have PDResources use that cache if the object is 
direct.
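
As a rough illustration of the keying idea above, here is a minimal, self-contained sketch. It is not PDFBox code: the class DirectObjectCache, its methods getOrCreate and evictPage, and the use of a page index as the page identifier are all hypothetical names chosen for this example. It only shows how a (type, name, page) key lets repeated lookups on the same page reuse one instance while keeping pages separate.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Supplier;

/** Hypothetical per-page cache for direct resource objects; not part of PDFBox. */
final class DirectObjectCache<V> {

    /** Composite key: resource type (e.g. "Font"), resource name (e.g. "F1"), page index. */
    private static final class Key {
        private final String type;
        private final String name;
        private final int pageIndex;

        Key(String type, String name, int pageIndex) {
            this.type = type;
            this.name = name;
            this.pageIndex = pageIndex;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return pageIndex == other.pageIndex
                    && type.equals(other.type)
                    && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(type, name, pageIndex);
        }
    }

    private final Map<Key, V> entries = new HashMap<>();

    /** Returns the cached value for the key, building it with the loader on first use. */
    V getOrCreate(String type, String name, int pageIndex, Supplier<V> loader) {
        return entries.computeIfAbsent(new Key(type, name, pageIndex), k -> loader.get());
    }

    /** Drops every entry belonging to one page, e.g. once that page has been processed. */
    void evictPage(int pageIndex) {
        entries.keySet().removeIf(k -> k.pageIndex == pageIndex);
    }

    /** Number of cached entries across all pages. */
    int size() {
        return entries.size();
    }
}
```

With a cache like this, two lookups of font "F1" on the same page would share one object, while the same name on another page would get its own entry. Evicting by page is an assumption of the sketch, added so that direct objects do not accumulate over a long document.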


was (Author: jahewson):
The font cache only handles indirect objects as only object (id, gen) is 
consistent across pages. Direct objects are indeed unusual, and can't be cached 
across pages, but what we should probably be doing here is caching the direct 
object for the given page.

We could extend ResourceCache with methods to cache direct objects keyed by 
(type, name, page) and then have PDResources use that cache if the object is 
direct.

> OOM for single page pdf file
> ----------------------------
>
>                 Key: PDFBOX-3442
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3442
>             Project: PDFBox
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Minor
>
> On TIKA-2045, a user posted a single page document that leads to OOM with 
> -Xmx1g.  I confirmed this with PDFBox's ExtractText.
> Might be a memory leak with the fonts?  See 
> [this|https://issues.apache.org/jira/browse/TIKA-2045?focusedCommentId=15399583&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15399583]
>  for some diagnostics I did.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
