Using the minimal_english stemmer <http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-stemmer-tokenfilter.html>, acronym tokens like "irs" and "nps" get stemmed to "ir" and "np". I can use the keyword marker token filter <http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-keyword-marker-tokenfilter.html> to specify a list of acronyms to protect, but I do not know them all in advance, so I would be constantly tweaking the list and reindexing.
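For reference, here is a minimal sketch of the kind of setup I am describing, written as a Python dict that serializes to the index settings JSON. The filter and analyzer names (protect_acronyms, minimal_english_stemmer, my_english) and the choice of the standard tokenizer plus lowercase filter are placeholders for illustration, not my exact configuration. The keyword marker has to come before the stemmer in the filter chain so that protected tokens are skipped.

```python
import json

# Acronyms known so far; this is the list that would need constant tweaking.
PROTECTED_ACRONYMS = ["irs", "nps"]


def build_analysis_settings(protected_words):
    """Build index analysis settings that protect the given words from stemming.

    All filter/analyzer names here are hypothetical placeholders.
    """
    return {
        "analysis": {
            "filter": {
                # Marks listed tokens as keywords so stemmers leave them alone.
                "protect_acronyms": {
                    "type": "keyword_marker",
                    "keywords": list(protected_words),
                },
                # The stemmer token filter configured for minimal_english.
                "minimal_english_stemmer": {
                    "type": "stemmer",
                    "language": "minimal_english",
                },
            },
            "analyzer": {
                "my_english": {
                    "type": "custom",
                    "tokenizer": "standard",
                    # Order matters: keyword marker before the stemmer.
                    "filter": [
                        "lowercase",
                        "protect_acronyms",
                        "minimal_english_stemmer",
                    ],
                }
            },
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_analysis_settings(PROTECTED_ACRONYMS), indent=2))
```

Every new acronym means editing PROTECTED_ACRONYMS, updating the settings, and reindexing, which is the part I would like to avoid.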
Ideally, I would like to be able to either tell the keyword marker to protect tokens 1-4 characters in length, or tell the minimal english stemmer to ignore tokens shorter than 5 characters. Are either of those options possible?
