Hi there,

I was wondering whether it would be possible to add a new feature to the indexing engine (or somehow simulate it) that does EXACTLY the opposite of Lucy::Analysis::SnowballStopFilter. In other words, instead of blocking a list of stopwords, the indexing engine would index ONLY the phrases supplied in a user list, matched exactly. Or, even better, it would prioritize them for indexing: match the user list first, and then use the Lucy analyzer for the words that are not in the list.
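To make the request concrete, here is a rough sketch of the behavior in plain Python. It is not Lucy code and does not use any Lucy API: the function names, the `protected_only` flag, and the lowercase-and-split fallback (a stand-in for whatever analyzer the index would normally use) are all hypothetical, chosen only for illustration.

```python
import re


def build_protected_pattern(terms):
    """Compile one regex that matches any protected term exactly (case-sensitive)."""
    # Longest terms first, so a long term wins over a shorter one at the same position.
    ordered = sorted(set(terms), key=len, reverse=True)
    alternation = "|".join(re.escape(t) for t in ordered)
    # The lookarounds stop a term from matching inside a larger word ("Ca" in "Cat").
    return re.compile(r"(?<!\w)(?:" + alternation + r")(?!\w)")


def fallback_analyze(text):
    """Hypothetical stand-in for the regular analyzer: lowercase word tokens."""
    return [w.lower() for w in re.findall(r"\w+", text)]


def tokenize(text, protected_terms, protected_only=False):
    """Emit protected terms verbatim; analyze the remaining text normally.

    With protected_only=True, only the protected terms are emitted,
    which is the "exact opposite of a stop filter" mode.
    """
    if not protected_terms:
        return [] if protected_only else fallback_analyze(text)

    pattern = build_protected_pattern(protected_terms)
    tokens = []
    pos = 0
    for match in pattern.finditer(text):
        if not protected_only:
            # Text between protected terms goes through the normal analyzer.
            tokens.extend(fallback_analyze(text[pos:match.start()]))
        tokens.append(match.group(0))  # kept as-is: no lowercasing, no stemming
        pos = match.end()
    if not protected_only:
        tokens.extend(fallback_analyze(text[pos:]))
    return tokens


terms = ["[Hg(CN)2]", "NH4+"]
text = "The complex [Hg(CN)2] reacts with NH4+ ions."
print(tokenize(text, terms))
# ['the', 'complex', '[Hg(CN)2]', 'reacts', 'with', 'NH4+', 'ions']
print(tokenize(text, terms, protected_only=True))
# ['[Hg(CN)2]', 'NH4+']
```

The point of the sketch is only the ordering: the user list is matched first, and the regular analysis sees just the text that is left over.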
Why would this be useful? In chemistry, for example, it is simply impossible to create a rule that will index chemical names correctly (e.g. NH4+/H+K+/NH4+(H+), [Hg(CN)2], Ca(.-), to name just a few of thousands). Also, in biomedical text, some seemingly common words can represent a gene or protein name, which should not be stemmed. To summarize, this feature would allow one to create a correct index (or indexes) of specialized terms.

Alex
