[ https://issues.apache.org/jira/browse/PDFBOX-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Florent Guillaume updated PDFBOX-1622:
--------------------------------------
    Attachment: PDFBOX-1622.patch.txt

> TextNormalize init not thread-safe, may lead to infinite loop
> -------------------------------------------------------------
>
>                 Key: PDFBOX-1622
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1622
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 1.0.0
>            Reporter: Florent Guillaume
>             Fix For: 1.8.3, 2.0.0
>
>         Attachments: PDFBOX-1622.patch.txt
>
>
> TextNormalize fills a static HashMap (DIACHASH) from a method (populateDiacHash) called by the TextNormalize constructor.
> If the constructor is called from two different threads at the same time, then the HashMap may be written by two concurrent threads, which may and will cause infinite loops.
> We see the CPU at 100% and jstack shows 4 threads all stuck at:
> "Thread-2" prio=10 tid=0x00007f6e94499000 nid=0x347 runnable [0x00007f6e925d6000]
>    java.lang.Thread.State: RUNNABLE
>       at java.util.HashMap.put(HashMap.java:391)
>       at org.apache.pdfbox.util.TextNormalize.populateDiacHash(TextNormalize.java:82)
>       at org.apache.pdfbox.util.TextNormalize.<init>(TextNormalize.java:41)
>       at org.apache.pdfbox.util.PDFTextStripper.<init>(PDFTextStripper.java:193)
> A patch to fix this is attached; it just moves the initialization to a static block.
> Please apply to the 1.8.3 and 2.0.0 branches.
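
For readers without the attachment, the following is a minimal sketch of the approach the report describes: populating the shared map in a static initializer instead of the constructor. It is not the attached patch. The class name TextNormalizeSketch, the lookup method, the key and value types, and the two map entries are all assumptions made for illustration; only the DIACHASH name comes from the report.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for TextNormalize; not the real PDFBox class.
final class TextNormalizeSketch {

    // Shared lookup table, named after DIACHASH in the report.
    private static final Map<Integer, String> DIACHASH;

    static {
        // The JVM runs a class's static initializer once, under the
        // class-initialization lock, so concurrent constructor calls
        // can no longer write to the map at the same time.
        Map<Integer, String> m = new HashMap<Integer, String>();
        m.put(0x0060, "\u0300"); // illustrative entry (assumption)
        m.put(0x00B4, "\u0301"); // illustrative entry (assumption)
        // Wrap the map so that later code cannot modify it by accident.
        DIACHASH = Collections.unmodifiableMap(m);
    }

    TextNormalizeSketch() {
        // The constructor no longer touches the shared map.
    }

    // Hypothetical read-only accessor, used here only to show the lookup.
    static String lookup(int codePoint) {
        return DIACHASH.get(codePoint);
    }
}
```

With this layout, constructing any number of instances from any number of threads performs no writes to the map, and reads of a fully built, never-modified HashMap are safe once class initialization has completed.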