Got it Rupert. Thanks.
-harish

On Sat, Mar 2, 2013 at 6:56 AM, Rupert Westenthaler <[email protected]> wrote:
> On Sat, Mar 2, 2013 at 2:23 PM, harish suvarna <[email protected]> wrote:
> > Rupert,
> > Who creates the one instance per thread, specifically one OpenNLP
> > tokenizer/POS tagger per thread? Is it commons.opennlp, or does
> > Stanbol have its own code?
>
> You must have misunderstood me.
>
> * Models are singletons that are used by all threads. SentenceModel,
>   TokenizerModel, POSModel, ChunkerModel and TokenNameFinderModel are
>   all singletons. They need a lot of memory, so it is good to have
>   them as singletons.
> * SentenceDetectors, Tokenizers, POSTaggers, Chunkers and
>   TokenNameFinders are created for each request (on top of the
>   singleton models). Those are lightweight components, so reusing
>   them would not bring much of an advantage.
>
> The code for loading and managing the singleton models is part of the
> org.apache.stanbol.commons.opennlp module (see
> org.apache.stanbol.commons.opennlp.OpenNLP for details). But this
> class is mainly about
>
> * OSGi integration
> * using the Stanbol DataFileProvider [1] infrastructure for loading
>   model files
>
> and not about working around OpenNLP concurrency issues. Actually, the
> way OpenNLP deals with concurrency seems just fine to me. I had much
> more trouble with concurrency when integrating Freeling [2] and
> Talismane [3] with Stanbol.
>
> best
> Rupert
>
> [1] http://stanbol.apache.org/docs/trunk/utils/datafileprovider
> [2] https://github.com/insideout10/stanbol-freeling
> [3] https://github.com/westei/stanbol-talismane
>
> > -harish
> >
> > On Sat, Mar 2, 2013 at 5:14 AM, Rupert Westenthaler <[email protected]> wrote:
> >
> >> Hi
> >>
> >> Stanbol uses a single instance of each model (e.g. POSModel). They
> >> are loaded and managed by the OpenNLP service (commons.opennlp
> >> module). Stanbol does not reuse the OpenNLP Tagger, Finder, ...
> >> objects built on top of the models (e.g. a POSTagger on top of the
> >> POSModel), so each request creates a new instance. This is exactly
> >> because POSTaggers, Tokenizers ... are not thread safe (as stated
> >> by the documentation). As the documentation also mentions that
> >> those objects are rather lightweight, caching them in
> >> ResourcePools or ThreadLocal variables was not considered.
> >>
> >> best
> >> Rupert
> >>
> >> On Sat, Mar 2, 2013 at 1:23 PM, harish suvarna <[email protected]> wrote:
> >> > The OpenNLP documentation says the POS tagger, tokenizer etc. are
> >> > not thread safe. A couple of Internet posts and OpenNLP
> >> > discussion forums also indicate this. How does Stanbol use
> >> > OpenNLP in a thread-safe way? Do you use Java synchronized,
> >> > thread-local, or any Java locking to make it thread safe?
> >> > I have not run into these thread-safety issues in Stanbol yet.
> >> > The OpenNLP guy says to create one instance of the OpenNLP
> >> > components per thread.
> >> >
> >> > http://grokbase.com/t/opennlp/dev/1176mzaen1/thread-safety-or-lack-thereof
> >> > --
> >> > Thanks
> >> > Harish
> >>
> >> --
> >> | Rupert Westenthaler [email protected]
> >> | Bodenlehenstraße 11 ++43-699-11108907
> >> | A-5500 Bischofshofen
> >
> > --
> > Thanks
> > Harish

--
Thanks
Harish
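
For anyone reading this thread later, the pattern Rupert describes (one shared model, one lightweight component per request) could be sketched roughly as below. This is not Stanbol or OpenNLP code: `Model`, `Tagger`, `MODEL` and `handleRequest` are hypothetical stand-ins invented for illustration, with `Model` playing the role of a heavy read-only object such as a POSModel and `Tagger` the role of a cheap, stateful, not thread-safe component built on top of it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SharedModelSketch {

    /** Hypothetical stand-in for a heavy, read-only model (immutable, so safe to share). */
    static final class Model {
        private final Map<String, String> tags;

        Model(Map<String, String> tags) {
            this.tags = Map.copyOf(tags); // defensive immutable copy
        }

        String tagFor(String token) {
            return tags.getOrDefault(token, "UNK");
        }
    }

    /** Hypothetical stand-in for a lightweight component that is NOT thread safe. */
    static final class Tagger {
        private final Model model;
        // Mutable per-instance state: the reason one instance must not be shared across threads.
        private final List<String> buffer = new ArrayList<>();

        Tagger(Model model) {
            this.model = model;
        }

        List<String> tag(String sentence) {
            buffer.clear();
            for (String token : sentence.trim().split("\\s+")) {
                buffer.add(token + "/" + model.tagFor(token));
            }
            return new ArrayList<>(buffer);
        }
    }

    // Created once and shared by all threads (the "singleton model").
    static final Model MODEL = new Model(Map.of("dogs", "NNS", "bark", "VBP"));

    // Every request builds its own Tagger on top of the shared model.
    static List<String> handleRequest(String sentence) {
        return new Tagger(MODEL).tag(sentence);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<List<String>>> results = new ArrayList<>();
        Callable<List<String>> task = () -> handleRequest("dogs bark");
        for (int i = 0; i < 8; i++) {
            results.add(pool.submit(task));
        }
        for (Future<List<String>> result : results) {
            System.out.println(result.get());
        }
        pool.shutdown();
    }
}
```

Because the per-request object is cheap to construct, this avoids both locking and the bookkeeping of a pool or a ThreadLocal cache, which matches the reasoning given in the thread.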
