Memory Consumption in PDDocument.load

Gabriel Pessoa Thu, 20 Apr 2017 06:01:41 -0700

Hello.

Recently at our company we started to worry about how much memory was being used during our PDF signing process. We are using version 1.8.13 now, mostly because the loading time on 2.0.x got longer (I actually asked about it some six months ago and Tilman explained the reason why).

I think this question on StackOverflow cleared up some doubts I had about how PDFBox works: http://stackoverflow.com/questions/22340674/performance-itext-vs-pdfbox

The main point being: PDFBox parses ALL the objects in the PDF and keeps them loaded. So, complex objects will use a lot of memory. Am I correct?

If that is the case, I understand that this is necessary for PDF manipulation, but is it necessary for PDF signing? Looking at the structure of a signed PDF, it looks like only the Root entry (to update the AcroForm entry) and the signed page entry (to update the Annots entry) are really needed for signing.

So would I be too wrong in suggesting a new load method that would be used only for signing, and that would load only those necessary entries and not things like images, fonts, tables, etc.?

If not that, could something akin to "lazy loading" be done, with the PDF objects only being parsed and loaded when they are actually accessed? The load would only map all the references in that case.
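To make the "lazy loading" idea concrete, here is a minimal sketch in plain Java. It is not PDFBox code: the class LazyObjectTable, its methods, and the stand-in parser are hypothetical names used only for illustration. It assumes the load step records nothing but the byte offset of each object (as found in the cross-reference table) and parses an object only on first access.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

/** Hypothetical illustration of lazy object loading; not part of PDFBox. */
public class LazyObjectTable {
    // Object number -> byte offset in the file, filled in at load time.
    private final Map<Integer, Long> offsets = new HashMap<>();
    // Object number -> parsed object, filled in only on first access.
    private final Map<Integer, String> parsed = new HashMap<>();
    // Stand-in for a real parser that would read the object at an offset.
    private final LongFunction<String> parser;
    private int parseCount = 0;

    public LazyObjectTable(LongFunction<String> parser) {
        this.parser = parser;
    }

    /** Load step: only the reference is mapped, nothing is parsed. */
    public void register(int objectNumber, long offset) {
        offsets.put(objectNumber, offset);
    }

    /** Access step: parse on first use, then serve from the cache. */
    public String get(int objectNumber) {
        String cached = parsed.get(objectNumber);
        if (cached != null) {
            return cached;
        }
        Long offset = offsets.get(objectNumber);
        if (offset == null) {
            throw new IllegalArgumentException("Unknown object " + objectNumber);
        }
        parseCount++;
        String value = parser.apply(offset);
        parsed.put(objectNumber, value);
        return value;
    }

    public int getParseCount() {
        return parseCount;
    }

    public static void main(String[] args) {
        LazyObjectTable table = new LazyObjectTable(offset -> "object@" + offset);
        table.register(1, 15L);    // e.g. the Root entry
        table.register(2, 120L);   // e.g. the page being signed
        table.register(3, 4096L);  // e.g. a large image, never touched when signing
        table.get(1);
        table.get(2);
        table.get(1);              // second access is served from the cache
        System.out.println(table.getParseCount()); // prints 2
    }
}
```

In this sketch the image object is registered but never parsed, which is the memory saving I have in mind for the signing case.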

If either of those two options is possible but you don't have anyone currently available to work on it, I could try to develop that solution. I would only need to know whether it would be better to use the 2.0.6 branch or the 3.0.0 trunk.


Thank you very much for your time.

--
Atenciosamente,

Gabriel Pessoa
Analista
BRy Tecnologia
Rua Lauro Linhares, 2123 Torre B - 3º andar
88036-002 - Florianópolis - SC - Brasil
+55 (48) 3234 6696


