Maybe the best approach would be to add an early check, e.g. detecting this
invalid table, throwing an exception there, and waiting until it happens
again, in the hope that this brings a clue about what to do next.
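
A minimal sketch of what such an early check could look like, in plain Java.
This is not existing PDFBox code: the class name TableMapCheck and its methods
are hypothetical, and where the check would be called from (for example right
after the table directory has been parsed) is an assumption. It only relies on
the OpenType rule that a table tag is four printable ASCII characters.

```java
import java.util.Map;

// Hypothetical helper, not part of PDFBox.
final class TableMapCheck {

    /** True if the tag is exactly four printable ASCII characters (0x20..0x7E). */
    static boolean isPlausibleTag(String tag) {
        if (tag == null || tag.length() != 4) {
            return false;
        }
        for (int i = 0; i < 4; i++) {
            char c = tag.charAt(i);
            if (c < 0x20 || c > 0x7E) {
                return false;
            }
        }
        return true;
    }

    /** Renders non-printable characters as \\uXXXX so the log line is readable. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c < 0x20 || c > 0x7E) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Fails fast if any key of the tables map is not a plausible table tag. */
    static void checkTables(Map<String, ?> tables, String fontName) {
        for (String tag : tables.keySet()) {
            if (!isPlausibleTag(tag)) {
                throw new IllegalStateException("Invalid table tag '"
                        + escape(String.valueOf(tag)) + "' in font " + fontName
                        + " (map size " + tables.size() + ")");
            }
        }
    }
}
```

Throwing at this point would put the stack trace of the thread that built the
bad map into the log, instead of the later failure in whichever thread happens
to use the font.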
Tilman
On 22.12.2021 at 18:48, Maison Mo wrote:
Hello Tilman,
Thank you for your answer.
Unfortunately, I can't try your workaround; for now we can only increase the
PDFBox log level.
Given our server logs, it does not seem to be related to parallel font
loading, because the problem last appeared 12 hours after server startup (and
given the number of PDFs processed during this time, the Arial fonts had
surely been loaded by then; they may have been GCed, though).
I tried hard to reproduce it with a stress unit test that launches a new
process, without luck (although that triggered a bug in the JDK ColorSpace).
The puzzle is that the heap dump shows a TrueTypeFont with an invalid 'tables'
map; this 'tables' map is loaded by TTFParser, which is also the object that
opens the .ttf file (to create the RAFDataStream).
Even if 2 threads were loading the same ttf file at the same time, they would
get different RAFDataStream instances and thus should not step on each other
(and such parallel loading is impossible anyway, since FSFontInfo.getFont() is
synchronized now).
The RAFDataStream in the heap dump contains a BufferedRandomAccessFile with
its 16 kB buffer, containing exactly the .ttf file content, and the current
pointer bufpos = end of the table headers. So as far as I can see, everything
is consistent except the tables map.
How is it possible that all table directory entries have been read with a
table tag = "\u0000" * 4?
Given the pointer position, it looks like all tables were read with such a
tag and then put into the 'tables' map, so only the last table read remains in
the map (as key=tag); this would also explain why the HashMap has modCount=1.
But the tag is read by calling
    String tag = raf.readString(4);
and why would that fail?
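
To illustrate the hypothesis above, here is a small stand-alone simulation in
Java. It is not the TTFParser code: the class ZeroTagDemo is hypothetical, the
12-byte font header is skipped, and each directory entry is simplified to a
16-byte record whose first 4 bytes are the tag. It only shows that if every
tag is read as four NUL bytes, the map ends up with a single entry holding the
last table read.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical simulation, not PDFBox code.
final class ZeroTagDemo {

    /**
     * Reads numTables simplified 16-byte directory records from data and
     * stores the record index under its 4-byte tag, as a parser keyed by
     * tag would do.
     */
    static Map<String, Integer> readDirectory(byte[] data, int numTables) {
        Map<String, Integer> tables = new HashMap<>();
        for (int i = 0; i < numTables; i++) {
            int pos = i * 16;
            // Decode the 4 tag bytes one-to-one into characters.
            String tag = new String(data, pos, 4, StandardCharsets.ISO_8859_1);
            // Same key on every iteration: the previous value is overwritten.
            tables.put(tag, i);
        }
        return tables;
    }
}
```

Calling readDirectory(new byte[16 * 20], 20) on an all-zero buffer returns a
map of size 1 whose only key is four NUL characters, mapped to index 19. In
OpenJDK's HashMap, replacing the value of an existing key is not a structural
modification, which would fit the modCount=1 seen in the heap dump; it does
not explain why the bytes read would be zero in the first place.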
Until we can reproduce the problem, I am afraid we can't go any further.
I'll try to get some more debug info.
Regards,
MM
On Saturday, 18 December 2021 at 14:59:10 UTC+1, Tilman Hausherr
<thaush...@t-online.de> wrote:
Hi,
Yes, I suspect this is a parallel access problem, probably parallel
initialization of a standard 14 font. This has caused occasional trouble
for years (despite our attempts to solve it), which is why it was
modified in 3.0.
Please try this workaround: [...]