I meant: a) Instantiate the components in a local scope, so that the references to them live on the call (thread) stack.
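
To make that concrete, here is a minimal sketch of the pattern. Model, Tagger and TaggingService are hypothetical stand-ins that I made up for illustration; they are not OpenNLP classes, and the toy "classification" rule is only there so the example runs. The assumption is that the model is immutable once loaded (so it can be shared), while the component keeps per-call state in a member variable (so it must not be shared).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for a model: immutable after construction, so it can
// be shared between threads. It is created once, behind a singleton factory.
final class Model {
    private static final Model INSTANCE = new Model();

    private Model() { }

    static Model getInstance() {
        return INSTANCE;
    }

    // Toy rule standing in for real classification.
    String classify(String token) {
        return token.length() > 3 ? "LONG" : "SHORT";
    }
}

// Hypothetical stand-in for a component that is NOT thread safe: it stores
// the probabilities of the last tag() call in a member variable.
final class Tagger {
    private final Model model;
    private double[] lastProbs = new double[0];

    Tagger(Model model) {
        this.model = model;
    }

    String[] tag(String[] tokens) {
        String[] tags = new String[tokens.length];
        double[] probs = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            tags[i] = model.classify(tokens[i]);
            probs[i] = 1.0; // placeholder confidence value
        }
        lastProbs = probs;
        return tags;
    }

    double[] probs() {
        return lastProbs.clone();
    }
}

final class TaggingService {
    // Callable from any thread: the Tagger reference is a method-local
    // variable, so every call works on its own instance.
    static String tagSentence(String sentence) {
        Tagger tagger = new Tagger(Model.getInstance());
        String[] tags = tagger.tag(sentence.split(" "));
        // Reading probs() here is safe: no other thread can see this tagger.
        double[] probs = tagger.probs();
        return String.join(" ", tags);
    }
}

final class Demo {
    public static void main(String[] args) {
        List<String> sentences = Arrays.asList("a quick test", "to be");
        // The caller does not control which threads run tagSentence here.
        List<String> tagged = sentences.parallelStream()
                .map(TaggingService::tagSentence)
                .collect(Collectors.toList());
        System.out.println(tagged);
    }
}
```

The cost of this approach is one component instance per call, so it only makes sense if constructing the component from an already loaded model is cheap compared to the work it does.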

On Wed, Jan 11, 2017 at 8:33 PM, Cohan Sujay Carlos <[email protected]> wrote:

> Control over threading is not required to "share the model between
> threads and create one instance of the component per thread".
>
> One could use a scope where variable references are guaranteed to be
> stored in the call stack (say method-local variables in Java).
>
> You could then:
>
> a) Instantiate the components on the call stack.
> b) Instantiate the models in constructors or the factory methods of a
> singleton.
>
> If one were using OpenNLP in a Tomcat webapp, for instance, one could, I
> believe, use this method.
>
> Cohan Sujay Carlos
>
>
> On Wed, Jan 11, 2017 at 7:08 PM, Thilo Goetz <[email protected]> wrote:
>
>> Correct me if I'm wrong, but that approach only works if you control the
>> thread creation yourself. In my case, for example, I was using Scala's
>> parallel collection API, and had no control over the threading. I will
>> usually want to create one service that does tokenization or POS tagging or
>> whatever, which can be accessed by many threads. I don't want to have to
>> mess around with an object pool, or thread locals, or anything like that.
>> Especially since there is really no good reason IMHO. You could very easily
>> just return the probabilities together with the spans, and whoever doesn't
>> need them can ignore them. Or have two methods, one with probabilities, one
>> without. Maybe it's just where I'm coming from, but I fail to see the
>> advantages of the current approach.
>>
>> --Thilo
>>
>>
>> On 11/01/2017 13:58, Joern Kottmann wrote:
>>
>>> Hello Thilo,
>>>
>>> I am interested in your opinion about how this is done currently.
>>> We say: "Share the model between threads and create one instance of the
>>> component per thread".
>>>
>>> Wouldn't that work well in your use case?
>>>
>>> Jörn
>>>
>>>
>>> On Wed, Jan 11, 2017 at 11:05 AM, Thilo Goetz <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> in a recent project, I was using SentenceDetectorME, TokenizerME and
>>>> POSTaggerME. It turns out that none of those is thread safe. This is
>>>> because the classification probabilities for the last tag() call (for
>>>> example) are stored in a member variable and can be retrieved by a
>>>> separate API call.
>>>>
>>>> I'm planning to build thread safe versions for myself, and I'd be happy
>>>> to contribute a patch if there is interest. This could be done as a
>>>> conservative extension with an additional method such as tagReentrant,
>>>> where the old API calls would continue to work as before and would still
>>>> not be thread safe. Alternatively, one could remodel the API so that
>>>> everything was thread safe, but that would break backwards
>>>> compatibility.
>>>>
>>>> Final question: if I do this for the classes mentioned above, are there
>>>> other tools that should be made thread safe while we're at it?
>>>>
>>>> Opinions?
>>>>
>>>> --Thilo
>>>>
>>
>
