I think it would be nice to support protecting tokens based on their length. Maybe you can open an issue about it?
On Wed, Jan 22, 2014 at 5:10 PM, Loren <[email protected]> wrote:

> Using the minimal_english stemmer
> <http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-stemmer-tokenfilter.html>,
> acronym tokens like "irs" and "nps" get stemmed to "ir" and "np". I can use
> the keyword marker token filter
> <http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-keyword-marker-tokenfilter.html>
> to specify a list of acronyms to protect, but I do not know them all in
> advance so I will be constantly tweaking the list and reindexing.
>
> Ideally, I would like to be able to either tell the keyword marker to
> protect tokens 1-4 characters in length, or tell the minimal english
> stemmer to ignore tokens shorter than 5 characters.
>
> Are either of those options possible?
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/e385b457-6eed-4a98-975d-9cf19375c39f%40googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.

--
Adrien Grand
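
For reference, the explicit-list workaround described in the question could look roughly like the sketch below. It only builds the index settings as a Python dict; the filter and analyzer names (`protect_acronyms`, `minimal_english_stemmer`, `acronym_safe_english`), the helper function, and the choice of the `standard` tokenizer plus `lowercase` filter are assumptions made for illustration and do not come from the thread. The keyword marker has to sit before the stemmer in the filter chain, otherwise the tokens are stemmed before they can be marked.

```python
import json


def build_analysis_settings(protected_acronyms):
    """Return index settings that shield the given tokens from stemming.

    Hypothetical helper: all names below are illustrative, not from the thread.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Marks the listed tokens as keywords so stemmers skip them.
                    "protect_acronyms": {
                        "type": "keyword_marker",
                        "keywords": sorted(set(protected_acronyms)),
                    },
                    "minimal_english_stemmer": {
                        "type": "stemmer",
                        "language": "minimal_english",
                    },
                },
                "analyzer": {
                    "acronym_safe_english": {
                        "type": "custom",
                        "tokenizer": "standard",
                        # Order matters: mark keywords before stemming.
                        "filter": [
                            "lowercase",
                            "protect_acronyms",
                            "minimal_english_stemmer",
                        ],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent when creating the index.
    print(json.dumps(build_analysis_settings(["irs", "nps"]), indent=2))
```

This still has the drawback raised in the question: every newly discovered acronym means editing the list and reindexing, which is why length-based protection would be a useful addition.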
