[ 
https://issues.apache.org/jira/browse/PDFBOX-765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883906#action_12883906
 ] 

Arjohn Kampman commented on PDFBOX-765:
---------------------------------------

The performance degradation seems to be related to files that cannot be found. 
For example, with some PDF files, pdfbox tries to load 
org/apache/pdfbox/resources/afm/MicrosoftSansSerif.afm over and over again. 
Normally, the results of such load operations are cached in 
PDTrueTypeFont.afmObjects, but not when the result is null.
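
One possible way to avoid the repeated lookups is to cache the "not found" 
outcome as well. The sketch below is not PDFBox code: NegativeCache and its 
loader function are hypothetical names used only to illustrate the idea, and 
it assumes the loader returns null when the resource is missing.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper, not part of PDFBox: remembers failed lookups
// as well as successful ones, so a missing resource is loaded only once.
final class NegativeCache<V> {
    // Optional.empty() marks a name that was looked up and not found.
    private final Map<String, Optional<V>> cache = new ConcurrentHashMap<>();

    // Assumed to return null when the resource cannot be found.
    private final Function<String, V> loader;

    NegativeCache(Function<String, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, or null if the resource is known to be missing.
    V get(String name) {
        return cache
                .computeIfAbsent(name, n -> Optional.ofNullable(loader.apply(n)))
                .orElse(null);
    }
}
```

With a cache like this, a second request for a missing .afm file would be 
answered from the map instead of reaching the resource loader again.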

Here's (one of?) the relevant stack trace(s):

ResourceLoader.loadResource(String) line: 54    
PDTrueTypeFont(PDFont).getAFM() line: 305       
PDTrueTypeFont(PDSimpleFont).getFontHeight(byte[], int, int) line: 119  
PDFTextStripper(PDFStreamEngine).processEncodedText(byte[]) line: 402   
ShowTextGlyph.process(PDFOperator, List<COSBase>) line: 61      
PDFTextStripper(PDFStreamEngine).processOperator(PDFOperator, List) line: 567   
PDFTextStripper(PDFStreamEngine).processSubStream(PDPage, PDResources, COSStream) line: 250
PDFTextStripper(PDFStreamEngine).processStream(PDPage, PDResources, COSStream) line: 208
PDFTextStripper.processPage(PDPage, COSStream) line: 378        
PDFTextStripper.processPages(List<COSObjectable>) line: 302     
PDFTextStripper.writeText(PDDocument, Writer) line: 258 
PDFTextStripper.getText(PDDocument) line: 184   



> Performance regression in PDFBox 1.2.0
> --------------------------------------
>
>                 Key: PDFBOX-765
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-765
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 1.2.0
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>            Priority: Critical
>
> Arjohn Kampman reported a notable performance drop in PDFBox 1.2.0, possibly 
> caused by PDFBOX-754.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
