[ https://issues.apache.org/jira/browse/PDFBOX-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Florent Guillaume updated PDFBOX-1622:
--------------------------------------

    Attachment: PDFBOX-1622.patch.txt
    
> TextNormalize init not thread-safe, may lead to infinite loop
> -------------------------------------------------------------
>
>                 Key: PDFBOX-1622
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1622
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 1.0.0
>            Reporter: Florent Guillaume
>             Fix For: 1.8.3, 2.0.0
>
>         Attachments: PDFBOX-1622.patch.txt
>
>
> TextNormalize fills a static HashMap (DIACHASH) from a method
> (populateDiacHash) called by the TextNormalize constructor.
> If the constructor is called from two threads at the same time, the HashMap
> is written concurrently without synchronization, which can corrupt it and
> cause infinite loops.
> We see the CPU at 100%, and jstack shows 4 threads all stuck at:
> "Thread-2" prio=10 tid=0x00007f6e94499000 nid=0x347 runnable [0x00007f6e925d6000]
>    java.lang.Thread.State: RUNNABLE
>       at java.util.HashMap.put(HashMap.java:391)
>       at org.apache.pdfbox.util.TextNormalize.populateDiacHash(TextNormalize.java:82)
>       at org.apache.pdfbox.util.TextNormalize.<init>(TextNormalize.java:41)
>       at org.apache.pdfbox.util.PDFTextStripper.<init>(PDFTextStripper.java:193)
> A patch to fix this is attached; it simply moves the initialization to a
> static block.
> Please apply to the 1.8.3 and 2.0.0 branches.
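
For readers without the attachment, here is a minimal sketch of the kind of
change described above: populating the shared map in a static block instead
of in the constructor. It is not the attached patch. The class name
DiacriticTable, the lookup and size methods, and the three map entries are
hypothetical stand-ins chosen for illustration; only the DIACHASH name comes
from the report.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for TextNormalize; an illustration, not the attached patch.
final class DiacriticTable {

    // Shared lookup table, assigned exactly once during class initialization.
    private static final Map<Integer, String> DIACHASH;

    static {
        // The JVM runs a static initializer once per class, under the class
        // initialization lock, so no two threads can populate the map at once.
        Map<Integer, String> table = new HashMap<Integer, String>();
        // Illustrative entries only (assumed, not copied from the PDFBox source).
        table.put(0x0060, "\u0300");
        table.put(0x00B4, "\u0301");
        table.put(0x005E, "\u0302");
        // Wrap the map so later code cannot modify it by accident.
        DIACHASH = Collections.unmodifiableMap(table);
    }

    DiacriticTable() {
        // The constructor no longer writes to the shared map.
    }

    // Returns the mapped string, or null if the code point has no entry.
    String lookup(int codePoint) {
        return DIACHASH.get(codePoint);
    }

    // Number of entries in the shared table.
    static int size() {
        return DIACHASH.size();
    }
}
```

With this shape, constructing instances from several threads only reads the
map, which is safe for a HashMap that is never modified after it has been
published through a static final field.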

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
