On 08/02/2022 at 19:17, Tilman Hausherr wrote:
On 08.02.2022 at 14:35, Marton Róbert wrote:
Dear PDFBox developers,
We are working on an application that is parsing and verifying the
quality of PDF files. We wanted to extract the text of the pages
using multiple threads to speed up the process but eventually we
found that PDFBox 2 is not thread safe.
Do you plan to make it thread safe in PDFBox 3 at least for text
extraction?
I don't know (But maybe yes? Due to PDFBOX-5214).
Hello,
Do you have any details regarding this thread-safety issue?
(Sorry if this is not the right place for this discussion;
I ask because we have a strange Heisenbug in production.)
Andreas Lehmkühler wrote in
https://issues.apache.org/jira/browse/PDFBOX-5214#comment-17398118 :
«After some debugging I've found the reason. The static instances of the
14 standard fonts within PDType1Font aren't threadsafe. Those fonts are
backed by a COSDictionary which isn't supposed to be stable. »
I do not understand which field of the PDType1Font object is mutable.
(Yes, some fields are lazily calculated, but they use a thread-safe Map etc.,
so we should be safe?)
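
To make the question concrete, here is a minimal sketch in plain Java. It is
not PDFBox code: SketchFont, its two fields and ThreadSafetySketch are
hypothetical stand-ins, and I am only assuming that the real class has a
similar shape (a plain mutable dictionary next to thread-safe lazy caches).
It shows that a thread-safe cache does not protect an unsynchronized backing
map on a shared static instance, and that giving each task its own instance
avoids the problem.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a font object; NOT the real PDType1Font.
final class SketchFont {
    // Stand-in for the backing dictionary: plain, mutable, unsynchronized.
    private final Map<String, String> dict = new HashMap<>();
    // Lazily filled cache that is thread safe on its own.
    private final Map<Integer, Integer> widthCache = new ConcurrentHashMap<>();

    SketchFont(String baseName) {
        dict.put("BaseFont", baseName);
    }

    // Safe on a shared instance: only touches the concurrent cache.
    // The width formula is made up for illustration.
    int width(int code) {
        return widthCache.computeIfAbsent(code, c -> 500 + (c % 10));
    }

    // Unsafe on a shared instance: writes to the HashMap without any lock.
    void put(String key, String value) {
        dict.put(key, value);
    }

    String get(String key) {
        return dict.get(key);
    }
}

final class ThreadSafetySketch {
    // Shared static instance: calling put() on it from several threads
    // would be a data race, whatever the cache implementation is.
    static final SketchFont SHARED = new SketchFont("Helvetica");

    // One fresh instance per task, so no mutable state crosses threads.
    static List<String> runIsolated(int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final int id = i;
                futures.add(pool.submit(() -> {
                    SketchFont font = new SketchFont("Helvetica");
                    font.put("Task", Integer.toString(id));
                    return font.get("BaseFont") + ":" + font.get("Task")
                            + ":" + font.width(65);
                }));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With this shape, ThreadSafetySketch.runIsolated(8) always yields one entry per
task, because no SketchFont is visible to more than one thread. Whether the
standard font instances in PDFBox are mutated in a comparable way is exactly
what I would like to understand.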
MM.