[
https://issues.apache.org/jira/browse/OPENNLP-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15169432#comment-15169432
]
Tristan Nixon commented on OPENNLP-808:
---------------------------------------
A simple way to deal with this is to wrap your parsers in a ThreadLocal<Parser>
instance, like so:
    private ThreadLocal<Parser> tlParser = new ThreadLocal<Parser>();
    ...
    // Lazily create one parser per thread on first use.
    if( tlParser.get() == null )
        tlParser.set( ParserFactory.create( parserModel ) );
    Parse[] parsed = ParserTool.parseLine( sentencestr, tlParser.get(), 1 );
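For anyone who wants to try the per-thread pattern without OpenNLP on the classpath, here is a minimal, self-contained sketch. ScratchParser, PerThreadParsing and parseConcurrently are hypothetical names made up for illustration; ScratchParser only stands in for a parser that keeps mutable scratch state in a field. The sketch overrides initialValue() instead of checking for null, which should also work on the Java 7 runtime listed in the issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a parser that keeps mutable scratch state in a
// field and is therefore unsafe to share between threads.
class ScratchParser {
    private final List<String> scratch = new ArrayList<String>();

    String[] parse(String sentence) {
        scratch.clear();
        for (String token : sentence.split("\\s+")) {
            scratch.add(token);
        }
        return scratch.toArray(new String[0]);
    }
}

class PerThreadParsing {
    // initialValue() runs once per thread, on that thread's first get().
    private static final ThreadLocal<ScratchParser> TL_PARSER =
            new ThreadLocal<ScratchParser>() {
                @Override
                protected ScratchParser initialValue() {
                    // Assumption: with OpenNLP this would be
                    // ParserFactory.create( parserModel ).
                    return new ScratchParser();
                }
            };

    static String[] parse(String sentence) {
        return TL_PARSER.get().parse(sentence);
    }

    // Parses the same sentence from several pool threads and returns the
    // token count seen by each task.
    static List<Integer> parseConcurrently(final String sentence, int threads, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<Future<Integer>>();
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(new Callable<Integer>() {
                    @Override
                    public Integer call() {
                        return parse(sentence).length;
                    }
                }));
            }
            List<Integer> counts = new ArrayList<Integer>();
            for (Future<Integer> f : futures) {
                counts.add(f.get());
            }
            return counts;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Each worker thread ends up with its own parser instance, so the memory cost scales with the number of threads rather than the number of calls. In a long-lived thread pool the instances stay alive as long as the threads do.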
> Parser is not thread safe
> -------------------------
>
> Key: OPENNLP-808
> URL: https://issues.apache.org/jira/browse/OPENNLP-808
> Project: OpenNLP
> Issue Type: Bug
> Components: Parser
> Affects Versions: tools-1.5.3, 1.6.0
> Environment: Ubuntu 14.04.3 LTS
> java version "1.7.0_55"
> Java(TM) SE Runtime Environment (build 1.7.0_55-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)
> Reporter: Fergal Monaghan
> Attachments: fix_thread_safety_bottomupparser.diff,
> fix_thread_safety_contextcache.diff, test_thread_safety_bug.diff
>
>
> I'm actually not sure if this is really a "Major" "Bug" as I have listed it;
> perhaps it is by design. Even in that case, though, this issue should possibly
> be listed as an "Improvement".
> Steps to recreate:
> 1. Run 2 or more threads simultaneously which make calls to the same parser
> object with the same piece of text.
> 2. One of a couple of things happens:
> (a) Either: line 281 of opennlp.tools.parser.AbstractBottomUpParser throws a
> java.util.ConcurrentModificationException from java.util.ArrayList iterator
> due to the `odh` field being global/shared in the object and not local to the
> method.
> (b) Or: the opennlp.tools.postag.DefaultPOSContextGenerator.getContext method
> throws a NullPointerException from line 77 of the
> opennlp.tools.util.Cache.clear method, since the underlying
> opennlp.tools.util.DoubleLinkedListElement is altered out from underneath it.
> Unless there are serious memory reasons for doing so, I would propose that
> such fields could be made local to the method since thread safety may take
> precedence over the memory saved in this case. As is, any code that calls the
> parser has to be enclosed in a giant synchronized block, and all applications
> using the parser have serious performance issues/cannot make use of modern
> hardware. I could be way off the mark here, though, if there is a method to the
> madness :)
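The proposal in the description, moving a shared scratch field into the method, can be sketched roughly as follows. This is only an illustration of the general idea under assumed names: SharedScratch, LocalScratch, buffer and collect are hypothetical and are not taken from the OpenNLP sources or from the attached diffs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical "before": the scratch list is a field, so two threads inside
// collect() on the same instance can clear or grow it while the other one is
// iterating, which is how a ConcurrentModificationException can arise.
class SharedScratch {
    private final List<String> buffer = new ArrayList<String>();

    List<String> collect(String[] tokens) {
        buffer.clear();
        for (String t : tokens) {
            buffer.add(t);
        }
        List<String> out = new ArrayList<String>();
        for (String t : buffer) {
            out.add(t.toUpperCase(Locale.ROOT));
        }
        return out;
    }
}

// Hypothetical "after": the scratch list is local to the call, so concurrent
// callers never see each other's intermediate state.
class LocalScratch {
    List<String> collect(String[] tokens) {
        List<String> buffer = new ArrayList<String>(tokens.length);
        for (String t : tokens) {
            buffer.add(t);
        }
        List<String> out = new ArrayList<String>();
        for (String t : buffer) {
            out.add(t.toUpperCase(Locale.ROOT));
        }
        return out;
    }
}
```

The trade-off is one extra short-lived allocation per call in exchange for not needing a synchronized block around every caller.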
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)