Hello!

I'm new to Elasticsearch, so forgive me if the question is a basic one.

I have implemented a custom analysis plugin that uses a custom lemmatizer and a 
tokenizer. The simplified class chain is:

AnalysisMorphologyPlugin -> MorphologyAnalysisBinderProcessor -> SemanticAnalyzerTwitterLemmatizerProvider -> RussianLemmatizingTwitterAnalyzer

In the RussianLemmatizingTwitterAnalyzer constructor I load the custom 
lemmatization object (my own code, unrelated to Lucene/ES) in a singleton 
fashion, inside a synchronized block; a simplified sketch of that pattern is 
shown after the list below.
Then, when creating 14 indices in the same JVM, I see:
 14 instances of RussianLemmatizingTwitterAnalyzer,
 4 instances of SemanticAnalyzerTwitterLemmatizerProvider,
 4 instances of MorphologyAnalysisBinderProcessor,
 30 instances of the custom lemmatizer (each RussianLemmatizingTwitterAnalyzer should hold only one, so I expected 14),
 1 instance of AnalysisMorphologyPlugin.
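
To make the "singleton fashion" part concrete, here is a simplified sketch of the loading pattern (the real lemmatizer is my own class, unrelated to Lucene/ES; "Lemmatizer" and "LemmatizerHolder" are placeholder names, not the real ones):

// Placeholder for my real lemmatization class (unrelated to Lucene/ES).
class Lemmatizer {
    Lemmatizer() {
        // in the real class: loads dictionaries into RAM (expensive)
    }
}

// The RussianLemmatizingTwitterAnalyzer constructor calls LemmatizerHolder.get();
// the intent is one Lemmatizer per JVM, shared by all analyzer instances.
final class LemmatizerHolder {
    private static volatile Lemmatizer instance;

    private LemmatizerHolder() {}

    static Lemmatizer get() {
        if (instance == null) {
            synchronized (LemmatizerHolder.class) {
                if (instance == null) {
                    instance = new Lemmatizer();
                }
            }
        }
        return instance;
    }
}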

My questions are: can the RussianLemmatizingTwitterAnalyzer object be shared 
between indices, or is it by design that each index must load its own? And what 
could be wrong in the code that produces 30 instances of the supposedly 
singleton lemmatizer instead of 14?

The current situation is that *with* the plugin the JVM reserves 100 MB of RAM 
with no data indexed, while *without* the plugin it reserves 2 MB. 
Versions: Elasticsearch 1.3.2, Lucene 4.9.0.

Regards,

Dmitry Kan
