Re: Integrating NLP into Lucene Analysis Chain

2022-11-19 Thread Robert Muir
https://github.com/apache/lucene/pull/11955

On Sat, Nov 19, 2022 at 10:43 PM Robert Muir wrote:
>
> Hi,
>
> Is this 'synchronized' really needed?
>
> 1. Lucene tokenstreams are only used by a single thread. If you index
> with 10 threads, 10 tokenstreams are used.
> 2. These OpenNLP Factories

Re: Integrating NLP into Lucene Analysis Chain

2022-11-19 Thread Robert Muir
Hi,

Is this 'synchronized' really needed?

1. Lucene tokenstreams are only used by a single thread. If you index with 10 threads, 10 tokenstreams are used.
2. These OpenNLP Factories make a new *Op for each tokenstream that they create, so there's no thread hazard.
3. If I remove 'synchronized'
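To illustrate points 1 and 2, here is a minimal sketch of thread confinement in plain Java. It does not use the Lucene or OpenNLP APIs: `CountingOp` and `OpFactory` are hypothetical stand-ins for an *Op and its factory, assumed only for this example. Each worker thread asks the factory for its own instance and never shares it, so the unsynchronized counter still yields the right total.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerThreadOps {

    // Hypothetical stand-in for an NLP "Op": mutable state, no locking.
    static final class CountingOp {
        private int processed; // plain field, deliberately unsynchronized

        void process(String token) {
            processed++;
        }

        int processed() {
            return processed;
        }
    }

    // Hypothetical stand-in for a factory that gives each stream a fresh Op.
    static final class OpFactory {
        CountingOp create() {
            return new CountingOp();
        }
    }

    // Runs `threads` workers; each creates and uses its own Op exclusively.
    static int countWithConfinedOps(int threads, int tokensPerThread) {
        OpFactory factory = new OpFactory();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                results.add(pool.submit(() -> {
                    CountingOp op = factory.create(); // confined to this task
                    for (int i = 0; i < tokensPerThread; i++) {
                        op.process("token" + i);
                    }
                    return op.processed();
                }));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get(); // Future.get() safely publishes each result
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(countWithConfinedOps(10, 1000)); // prints 10000
    }
}
```

If the same `CountingOp` were instead shared across the workers, the unsynchronized increment would be a data race and locking (or an atomic) would be needed. Whether that applies to the real OpenNLP classes depends on how their instances are actually shared, which this sketch does not model.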

Integrating NLP into Lucene Analysis Chain

2022-11-19 Thread Luke Kot-Zaniewski (BLOOMBERG/ 919 3RD A)
Greetings, I would greatly appreciate anyone sharing their experience doing NLP/lemmatization and am also very curious to gauge the opinion of the Lucene community regarding OpenNLP. I know there are a few other libraries out there, some of which can’t be directly included in the Lucene