[ 
https://issues.apache.org/jira/browse/PDFBOX-2862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630583#comment-14630583
 ] 

John Hewson commented on PDFBOX-2862:
-------------------------------------

OK, I think I've figured it out. I don't think this was a static 
initialization issue at all. While a GlyphList is conceptually read-only, 
nameToUnicode was actually being written to as a cache by toUnicode(code). 
So if one thread is copying a GlyphList via the copy constructor while 
another is writing to the cache during a toUnicode(code) call, the HashMap 
copy iterates over a map that is being modified, which produces the 
ConcurrentModificationException shown in the report.

This problem can be solved without synchronization by introducing a separate, 
non-shared map for per-instance caching.
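
As a rough illustration of that idea (not the actual PDFBox change), the sketch below keeps the lookup table read-only after construction and puts lazily computed results in a separate map that the copy constructor never touches. The class name GlyphListSketch, the helper resolveUniName, and the simplified "uniXXXX" rule are all hypothetical. Using a ConcurrentHashMap for the cache is also an assumption on my part, added in case a single instance is itself used from several threads; the comment above only calls for a separate, non-shared map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for GlyphList; not the PDFBox source. */
public class GlyphListSketch {
    // Filled only inside constructors and never written afterwards,
    // so another thread can safely iterate it while copying.
    private final Map<String, String> nameToUnicode;

    // Per-instance cache for names resolved at lookup time.
    // It is never copied. ConcurrentHashMap is an assumption here,
    // guarding the case where one instance is used by several threads.
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    public GlyphListSketch(Map<String, String> entries) {
        this.nameToUnicode = Collections.unmodifiableMap(new HashMap<>(entries));
    }

    /** Copy constructor: reads only the immutable table of the source list. */
    public GlyphListSketch(GlyphListSketch base, Map<String, String> extra) {
        Map<String, String> merged = new HashMap<>(base.nameToUnicode);
        merged.putAll(extra);
        this.nameToUnicode = Collections.unmodifiableMap(merged);
    }

    public String toUnicode(String name) {
        String known = nameToUnicode.get(name);
        if (known != null) {
            return known;
        }
        String cached = cache.get(name);
        if (cached != null) {
            return cached;
        }
        String computed = resolveUniName(name);
        if (computed != null) {
            cache.put(name, computed); // writes go to the cache, never to nameToUnicode
        }
        return computed;
    }

    // Simplified, hypothetical rule: only "uni" followed by four hex digits.
    private static String resolveUniName(String name) {
        if (name.length() == 7 && name.startsWith("uni")) {
            try {
                int codePoint = Integer.parseInt(name.substring(3), 16);
                return new String(Character.toChars(codePoint));
            } catch (IllegalArgumentException e) {
                return null; // not valid hex, or not a valid code point
            }
        }
        return null;
    }

    /** Exposed only so the sketch can be inspected. */
    public int cacheSize() {
        return cache.size();
    }
}
```

With this layout, a toUnicode call on one list can no longer disturb a copy being made from it, because the only map the copy constructor iterates is never modified after construction.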

> GlyphList doesn't appear to be thread safe in trunk...or user error?
> --------------------------------------------------------------------
>
>                 Key: PDFBOX-2862
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2862
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Tim Allison
>            Assignee: John Hewson
>             Fix For: 2.0.0
>
>         Attachments: batch-process-warn-first10klines.log.bz2
>
>
> This could be user error, but I'm getting the following when running trunk in 
> a multithreaded environment.
> {noformat}
> Caused by: java.util.ConcurrentModificationException
>         at java.util.HashMap$HashIterator.nextEntry(HashMap.java:922)
>         at java.util.HashMap$EntryIterator.next(HashMap.java:962)
>         at java.util.HashMap$EntryIterator.next(HashMap.java:960)
>         at java.util.HashMap.putAllForCreate(HashMap.java:554)
>         at java.util.HashMap.<init>(HashMap.java:298)
>         at org.apache.pdfbox.pdmodel.font.encoding.GlyphList.<init>(GlyphList.java:114)
>         at org.apache.pdfbox.text.PDFTextStreamEngine.<init>(PDFTextStreamEngine.java:103)
>         at org.apache.pdfbox.text.PDFTextStripper.<init>(PDFTextStripper.java:196)
>         at org.apache.tika.parser.pdf.PDF2XHTML.<init>(PDF2XHTML.java:106)
>         at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:133)
>         at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:132)
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:281)
>         ... 16 more
> {noformat}


