Hi Benoit,
Thanks for the reply and link! My application is English-focused, so I have
the benefit of working with a language that has little inflection. This, along
with a few other reasons, pushed me towards an index-heavy approach that doesn't
have the complexities involved with synonyms of different posit
Hi Guan,
I think I've confused everyone a little bit, including myself. When I
initially went down the rabbit hole of understanding the synchronization of
these wrapping methods I kept an eye out for all potential thread safety
issues within OpenNLP. I ended up finding issues unrelated to the
sy
Hi Luke,
As for what you've described as a "bug" in NLPPOSTaggerOp, I do agree with you
that there could be a more elegant solution than simply synchronizing the
entire method. That being said, IMHO, I don't see a thread-safety
issue. Lucene TokenFilters are not supposed to be shared am
Hello, Benoit.
I just came across
https://lucene.apache.org/core/8_0_0/analyzers-common/org/apache/lucene/analysis/miscellaneous/TypeAsSynonymFilterFactory.html
It sounds similar to what you are asking, but it watches TypeAttribute only.
Also, spans are superseded by intervals
https://lucene.apache
Hi Luke,
Thank you for your work and information sharing. From my point of view,
lemmatization is just a use case of text token annotation. I have been
working with Lucene since 2006 to index lexicographic and linguistic
data, and I always miss the fact that (1) token attributes are not
search
https://github.com/apache/lucene/pull/11955
On Sat, Nov 19, 2022 at 10:43 PM Robert Muir wrote:
>
> Hi,
>
> Is this 'synchronized' really needed?
>
> 1. Lucene tokenstreams are only used by a single thread. If you index
> with 10 threads, 10 tokenstreams are used.
> 2. These OpenNLP Factories mak
Hi,
Is this 'synchronized' really needed?
1. Lucene tokenstreams are only used by a single thread. If you index
with 10 threads, 10 tokenstreams are used.
2. These OpenNLP Factories make a new *Op for each tokenstream that
they create, so there's no thread hazard.
3. If I remove 'synchronized' ke
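
To illustrate points 1 and 2 above, here is a minimal sketch of thread
confinement in plain Java. It does not use the Lucene or OpenNLP APIs:
CountingOp, newOpForStream and runWorkers are hypothetical stand-ins for an
*Op wrapper, a factory call and a set of indexing threads. The assumption is
only that every worker gets its own op instance, as described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerStreamOpSketch {

    // Hypothetical stand-in for an *Op wrapper: it keeps mutable,
    // unsynchronized state, so it is only safe when one thread owns it.
    static final class CountingOp {
        private int calls = 0; // deliberately neither synchronized nor volatile

        String tag(String token) {
            calls++;
            return token + "/TAG";
        }

        int calls() {
            return calls;
        }
    }

    // Hypothetical stand-in for a factory: every stream gets a fresh op.
    static CountingOp newOpForStream() {
        return new CountingOp();
    }

    // Each worker creates its own op and is the only thread that touches it.
    // Returns the per-worker call counts.
    static List<Integer> runWorkers(int threads, int tokensPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                futures.add(pool.submit(() -> {
                    CountingOp op = newOpForStream();
                    for (int i = 0; i < tokensPerThread; i++) {
                        op.tag("token" + i);
                    }
                    return op.calls();
                }));
            }
            List<Integer> counts = new ArrayList<>();
            for (Future<Integer> f : futures) {
                counts.add(f.get());
            }
            return counts;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 10 workers, 10 independent ops, no shared mutable state.
        System.out.println(runWorkers(10, 1000));
    }
}
```

Because no op is ever visible to more than one thread, the unsynchronized
counter cannot lose updates. The picture would change only if a single op
instance were shared between threads, which is a separate question from the
one raised above.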