Tilman Hausherr commented on PDFBOX-3606:

That is a 28 MB file. That's not "small". It also uses tesseract but I don't 
see the library.

A few things I noticed:
This code
BufferedImage bufferedImage = ImageIO.read(new FileInputStream(file));
does not close its input stream. You can pass the file directly to 
ImageIO.read() instead.
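
A minimal sketch of the two usual ways to avoid the leaked stream. The class 
and method names (ImageLoading, readFromFile, readFromStream) are made up for 
illustration and are not part of the attached project; only ImageIO.read() and 
the standard java.io / java.nio calls are real APIs.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import javax.imageio.ImageIO;

public class ImageLoading {
    // Hypothetical helper: pass the File itself, so no stream is opened by the caller.
    static BufferedImage readFromFile(File file) throws IOException {
        return ImageIO.read(file);
    }

    // Hypothetical helper: if a stream is really needed, try-with-resources
    // closes it even when decoding throws. ImageIO.read(InputStream) does not
    // close the stream it is given.
    static BufferedImage readFromStream(File file) throws IOException {
        try (InputStream in = Files.newInputStream(file.toPath())) {
            return ImageIO.read(in);
        }
    }
}
```

Note that ImageIO.read() returns null when no registered reader can decode the 
input, so the caller should check for that.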

The t5 file does not decode. Maybe the exception handling is bad?

Your thread pool starts as many threads as there are pages. It should start a 
maximum number of threads that depends on your CPU core count. From what I 
see, you have several thread pools, so a very high number of threads will be 
started.
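
One possible way to bound the thread count, as a sketch only: a single fixed 
pool sized from Runtime.availableProcessors(), shared by all pages. The class 
BoundedPageProcessing and the method processPage are placeholders for whatever 
the per-page work is in the test project; they are assumptions, not code from 
the attachment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BoundedPageProcessing {
    // Placeholder for the real per-page work (rendering, OCR, ...).
    static int processPage(int pageIndex) {
        return pageIndex * 2;
    }

    static List<Integer> processAll(int pageCount)
            throws InterruptedException, ExecutionException {
        // One pool for all pages, sized by core count rather than page count.
        int threads = Math.max(1, Runtime.getRuntime().availableProcessors());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < pageCount; i++) {
                final int page = i;
                Callable<Integer> task = () -> processPage(page);
                // Extra tasks wait in the pool's queue instead of spawning threads.
                futures.add(pool.submit(task));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                results.add(future.get()); // results come back in page order
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

With this shape a 500-page document still runs on at most as many threads as 
there are cores, and the number of pages being held in memory at once stays 
bounded too.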

I'm also missing an explanation of how to configure this, and a pom file for 
maven (I'd like to build this project without using your jar files, for 
security reasons). Please also explain what terrible things are going to 
happen, and what should happen instead.

Please improve your test project so that it gets smaller, but still 
reproduces the effect.

> FontBox Incrise memory usage on multythread
> -------------------------------------------
>                 Key: PDFBOX-3606
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3606
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 2.0.3
>         Environment: Ubuntu 15.10
>            Reporter: Dmitri Russu
> Hi, I have a problem with memory usage for "org.apache.fontbox.cmap.CMap".
> My app is multithreaded. How can I avoid the increase in memory used by 
> FontBox?
> App: https://drive.google.com/open?id=0B9izTHWJQ7xlUm5wckQ3a0t2dVU
> https://drive.google.com/open?id=0B9izTHWJQ7xlY1M0UEZVbTY5bDg
> https://drive.google.com/open?id=0B9izTHWJQ7xleHd6WUNubUhDTkE
> https://drive.google.com/open?id=0B9izTHWJQ7xlTTE0TmZ1eWN0OFE
