[GitHub] [lucene] rmuir commented on pull request #24: LUCENE-9852: Make Hunspell thread-safe

GitBox Thu, 18 Mar 2021 11:11:12 -0700


rmuir commented on pull request #24:
URL: https://github.com/apache/lucene/pull/24#issuecomment-802174939



   I guess my question is: why does the stemmer need to be thread-safe?

   Stemmers in Lucene aren't thread-safe; we use a thread-local model for the
   analysis chain. Tokenizers, stemmers, etc. are cached per thread and
   maintain some buffers to avoid creating tons of garbage.

   For example, the way the Analyzer class works, if you are indexing with 8
   threads you have 8 HunspellStemFilters, each with its own HunspellStemmer,
   so there are no thread-safety issues. Previously the idea was that only the
   "large" thing (Dictionary) needed to be thread-safe, since we don't want to
   instantiate it all the time anyway.
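
   To make the per-thread model concrete, here is a minimal, self-contained
   sketch in plain Java. It does not use Lucene's actual classes:
   SharedDictionary, Stemmer, and the suffix-stripping logic are hypothetical
   stand-ins, assumed only for illustration. The point is the shape: one
   immutable "large" object shared by all threads, and one cheap, stateful
   stemmer per thread obtained through a ThreadLocal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ThreadLocalStemmerSketch {

    // Hypothetical stand-in for the "large" object (the role Dictionary plays):
    // immutable after construction, so it can be shared across threads.
    static final class SharedDictionary {
        private final List<String> suffixes;

        SharedDictionary(List<String> suffixes) {
            this.suffixes = List.copyOf(suffixes);
        }

        List<String> suffixes() {
            return suffixes;
        }
    }

    // Hypothetical stand-in for a stemmer: cheap to create, reuses a mutable
    // buffer between calls, and is therefore NOT safe to share between threads.
    static final class Stemmer {
        private final SharedDictionary dictionary;
        private final StringBuilder buffer = new StringBuilder();

        Stemmer(SharedDictionary dictionary) {
            this.dictionary = dictionary;
        }

        String stem(String word) {
            buffer.setLength(0);
            buffer.append(word);
            // Strip the first matching suffix; a toy rule, not Hunspell.
            for (String suffix : dictionary.suffixes()) {
                if (word.length() > suffix.length() && word.endsWith(suffix)) {
                    buffer.setLength(word.length() - suffix.length());
                    break;
                }
            }
            return buffer.toString();
        }
    }

    // One dictionary for the whole process.
    static final SharedDictionary DICTIONARY =
            new SharedDictionary(List.of("ing", "ed", "s"));

    // One stemmer per thread, created lazily on first use by that thread.
    static final ThreadLocal<Stemmer> STEMMER =
            ThreadLocal.withInitial(() -> new Stemmer(DICTIONARY));

    public static String stem(String word) {
        return STEMMER.get().stem(word);
    }

    // Starts the given number of threads and counts how many distinct
    // Stemmer instances they ended up using.
    public static int distinctStemmers(int threadCount) throws InterruptedException {
        Set<Stemmer> seen = ConcurrentHashMap.newKeySet();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(() -> {
                stem("indexing");
                seen.add(STEMMER.get());
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return seen.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(stem("indexing"));
        System.out.println(distinctStemmers(8));
    }
}
```

   With 8 threads, each thread gets its own Stemmer and its own buffer, while
   all of them read the same SharedDictionary, so the stemmer itself needs no
   synchronization.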
   


