On 13.02.2022 at 10:19, Mo Maison wrote:
On 08/02/2022 at 19:17, Tilman Hausherr wrote:
On 08.02.2022 at 14:35, Marton Róbert wrote:
Dear PDFBox developers,
We are working on an application that is parsing and verifying the
quality of PDF files. We wanted to extract the text of the pages
using multiple threads to speed up the process but eventually we
found that PDFBox 2 is not thread safe.
Do you plan to make it thread safe in PDFBox 3 at least for text
extraction?
I don't know (but maybe yes, due to PDFBOX-5214).
Hello,
Do you have any details regarding this thread safety issue?
The problems that occurred were that fonts couldn't be read, returned
wrong values, or hit EOF. I never really understood what happened.
Problems like these are difficult to fix.
Tilman
(sorry if this is not the right place for this discussion;
I ask because we have a strange Heisenbug in production)
Andreas Lehmkühler wrote in
https://issues.apache.org/jira/browse/PDFBOX-5214#comment-17398118 :
«After some debugging I've found the reason. The static instances of
the 14 standard fonts within PDType1Font aren't threadsafe. Those
fonts are backed by a COSDictionary which isn't supposed to be stable. »
I do not understand which field is mutable in the PDType1Font object.
(Yes, some fields are lazily calculated, but they use a thread-safe Map
etc., so we should be safe?)
MM.
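
To make the question above concrete, here is a minimal sketch in plain
Java. It is not PDFBox code and does not claim to reproduce the actual
bug: SharedFontSketch, its fields, and the made-up width formula are all
hypothetical stand-ins. It only illustrates the general point that a
lazily filled cache built on ConcurrentHashMap can be safe under
concurrent access while a plain HashMap held by the same shared object
(standing in here for the backing dictionary) remains mutable shared
state if a live reference to it is handed out.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a shared static font instance; NOT PDFBox code.
final class SharedFontSketch {
    // Lazily filled cache. ConcurrentHashMap makes this part thread safe.
    private final Map<Integer, Float> widthCache = new ConcurrentHashMap<>();
    // Stand-in for the backing dictionary: a plain, mutable HashMap.
    private final Map<String, String> dictionary = new HashMap<>();
    // Counts how often a width was actually computed.
    private final AtomicInteger computations = new AtomicInteger();

    SharedFontSketch(String baseFont) {
        dictionary.put("BaseFont", baseFont);
    }

    // computeIfAbsent applies the function at most once per key.
    float getWidth(int code) {
        return widthCache.computeIfAbsent(code, c -> {
            computations.incrementAndGet();
            return 500f + c; // made-up width, for illustration only
        });
    }

    int computationCount() {
        return computations.get();
    }

    // Leaky accessor: callers get the live map and can change shared state.
    Map<String, String> liveDictionary() {
        return dictionary;
    }

    // Read-only view: writes through this view are rejected.
    Map<String, String> readOnlyDictionary() {
        return Collections.unmodifiableMap(dictionary);
    }

    // Requests the same widths from several threads and returns how many
    // computations actually ran.
    static int fillConcurrently(SharedFontSketch font, int threads, int codes) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int code = 0; code < codes; code++) {
                    font.getWidth(code);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("worker threads timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return font.computationCount();
    }

    public static void main(String[] args) {
        SharedFontSketch font = new SharedFontSketch("Helvetica");
        // 8 threads ask for the same 256 codes; each is computed once.
        System.out.println(fillConcurrently(font, 8, 256)); // prints 256
        // The live map lets any caller alter what every other thread sees.
        font.liveDictionary().put("BaseFont", "Changed");
        System.out.println(font.readOnlyDictionary().get("BaseFont")); // prints Changed
    }
}
```

In this sketch the lazy fields are indeed safe, which matches the
reasoning in the question. The remaining risk is in the second map: any
code that obtains the live reference can modify it for every thread at
once, and concurrent writes to a plain HashMap are not safe. Whether
this is what happens inside PDType1Font is exactly the open question.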
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org