[ https://issues.apache.org/jira/browse/PDFBOX-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andreas Lehmkühler resolved PDFBOX-1622.
----------------------------------------
Resolution: Fixed
I've applied the patch in revision 1512433 as proposed, with some slight
changes.
Thanks for the contribution!
Please be more patient in the future, as we all try to do our best.
> TextNormalize init not thread-safe, may lead to infinite loop
> -------------------------------------------------------------
>
> Key: PDFBOX-1622
> URL: https://issues.apache.org/jira/browse/PDFBOX-1622
> Project: PDFBox
> Issue Type: Bug
> Components: Utilities
> Affects Versions: 1.0.0
> Reporter: Florent Guillaume
> Assignee: Andreas Lehmkühler
> Fix For: 1.8.3, 2.0.0
>
> Attachments: PDFBOX-1622.patch.txt
>
>
> TextNormalize fills a static HashMap (DIACHASH) from a method
> (populateDiacHash) called by the TextNormalize constructor.
> If the constructor is called from two different threads at the same time,
> then the HashMap may be written concurrently by both threads, which can
> corrupt it and cause infinite loops.
> We see the CPU at 100% and jstack shows 4 threads all stuck at:
> "Thread-2" prio=10 tid=0x00007f6e94499000 nid=0x347 runnable [0x00007f6e925d6000]
> java.lang.Thread.State: RUNNABLE
> at java.util.HashMap.put(HashMap.java:391)
> at org.apache.pdfbox.util.TextNormalize.populateDiacHash(TextNormalize.java:82)
> at org.apache.pdfbox.util.TextNormalize.<init>(TextNormalize.java:41)
> at org.apache.pdfbox.util.PDFTextStripper.<init>(PDFTextStripper.java:193)
> A patch to fix this is attached; it just moves the initialization to a static
> block.
> Please apply to the 1.8.3 and 2.0.0 branches.
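For readers without the attachment at hand, the fix described in the report (moving the map population out of the constructor and into a static block) might look roughly like the sketch below. This is not the committed patch: the class name DiacriticTable, the lookup and constructConcurrently methods, and the three map entries are hypothetical stand-ins chosen for illustration. The sketch relies on the JVM guarantee that a class's static initializer runs exactly once, before any thread can use the class.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for TextNormalize; names and entries are illustrative only. */
final class DiacriticTable {

    // Populated once during class initialization and never written again.
    private static final Map<Integer, String> DIAC_HASH;

    static {
        Map<Integer, String> m = new HashMap<>();
        m.put(0x0060, "\u0300"); // illustrative entry: grave accent -> combining grave
        m.put(0x00B4, "\u0301"); // illustrative entry: acute accent -> combining acute
        m.put(0x005E, "\u0302"); // illustrative entry: circumflex -> combining circumflex
        DIAC_HASH = Collections.unmodifiableMap(m);
    }

    DiacriticTable() {
        // The constructor no longer touches shared state, so concurrent
        // construction cannot race on the map.
    }

    String lookup(int codePoint) {
        return DIAC_HASH.get(codePoint);
    }

    /** Constructs one instance per thread and counts the successful lookups. */
    static int constructConcurrently(int threadCount) {
        AtomicInteger hits = new AtomicInteger();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                if ("\u0300".equals(new DiacriticTable().lookup(0x0060))) {
                    hits.incrementAndGet();
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return hits.get();
    }

    public static void main(String[] args) {
        System.out.println(constructConcurrently(4) + " of 4 threads saw a fully built map");
    }
}
```

Wrapping the map with Collections.unmodifiableMap is an extra precaution added in this sketch, not something the report mentions; it makes any later write fail fast instead of silently reintroducing the race.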