OutOfMemoryError with PDFTextStripper
-------------------------------------

                 Key: PDFBOX-899
                 URL: https://issues.apache.org/jira/browse/PDFBOX-899
             Project: PDFBox
          Issue Type: Bug
          Components: Text extraction
    Affects Versions: 1.3.1
         Environment: java version "1.6.0_22"
Java(TM) SE Runtime Environment (build 1.6.0_22-b04)
Java HotSpot(TM) Client VM (build 17.1-b03, mixed mode)
            Reporter: Alexander Veit
            Priority: Critical


PDFBox 1.3.1 has high memory demands when stripping text from PDF files.

Extracting text from http://www.unicode.org/Public/5.1.0/charts/CodeCharts.pdf
even crashes an application server, requiring an estimated additional 300 MB
or more of heap memory. The heap dump suggests that
PDFStreamEngine#documentFontCache might be the root that retains the leaked
objects.

PDFBox 1.0.0 did not show this behaviour. 
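
To compare heap growth between the two versions, a rough measurement such as
the following sketch could be used. It is not part of PDFBox: the class
HeapDelta and its methods are hypothetical names, and the workload in main is
a placeholder allocation standing in for the text extraction call. The figures
are approximate, because System.gc() is only a hint to the JVM.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: estimates how much heap a task leaves retained. */
final class HeapDelta {

    /** Result of a measurement: the task's value and the heap growth. */
    static final class Measurement<T> {
        final T value;
        final long deltaBytes;

        Measurement(T value, long deltaBytes) {
            this.value = value;
            this.deltaBytes = deltaBytes;
        }
    }

    /** Used heap in bytes, after asking the JVM to collect garbage. */
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        rt.gc(); // only a hint, so the result is an estimate
        return rt.totalMemory() - rt.freeMemory();
    }

    /**
     * Runs the task and reports the difference in used heap before and
     * after. The task's return value is kept reachable while the second
     * reading is taken, so memory it retains is counted.
     */
    static <T> Measurement<T> measure(Callable<T> task) {
        long before = usedHeap();
        T value;
        try {
            value = task.call();
        } catch (Exception e) {
            throw new IllegalStateException("measured task failed", e);
        }
        long after = usedHeap();
        return new Measurement<T>(value, after - before);
    }

    public static void main(String[] args) {
        // Placeholder workload: in the reported scenario this would be the
        // text extraction of the PDF, returning the object under suspicion.
        Measurement<byte[]> m = measure(new Callable<byte[]>() {
            public byte[] call() {
                return new byte[16 * 1024 * 1024];
            }
        });
        System.out.println("Retained heap (approx.): "
                + (m.deltaBytes / (1024 * 1024)) + " MB");
    }
}
```

Running the same measurement against PDFBox 1.0.0 and 1.3.1 with the same
input file would show whether the additional heap demand is tied to the
newer version.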

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
