On Jan 2, 2010, at 7:46 AM, Robert Muir wrote:

>> I also want backward compatibility. Or at least control over it. That is, I
>> need for indexes to work fully but want an easy path to upgrade/replace an
>> index with better analyzer/filter combos. This stemmer is not backward
>> compatible.
>
> But the Analyzers can be (we can have the old stemmer available also),
> and if we create an analyzers/sv or whatever that uses the
> SmartSwedishStemmer or whatever its named, its not a back compat break
> as long as you can still use SnowballAnalyzer("Swedish") and get the
> old one, right?

Right. New code is not a backward compatibility break. Replacing SnowballAnalyzer("Swedish") with this one would be.

The problem I have with names like "SmartSwedishStemmer" is what happens when a better solution comes along. What then? "SmarterSwedishStemmer", "BrilliantSwedishStemmer", "BetterSwedishStemmer"? Likewise for any other descriptive name. I guess I'd want OutOfTheBoxBestLatestGreatestAndMostHighlyRecommendedSwedishStemmer to stay the out-of-the-box best, latest, greatest and most highly recommended Swedish stemmer by Version ;)

>
> For I think a first example of improving analyzers with Version, check
> out the modifications to CzechAnalyzer, with that one we added a
> stemmer where there was not one before, but the stemming only takes
> place with Version >= 3.1 by default. In my opinion we should exploit
> Version to improve analyzers, based upon relevance testing or
> published relevance results if at all possible, of course.

This is good and should be pursued. Under this scheme only one name is needed.

However, this solves only half of the problem. The index does not carry metadata describing what it requires to maintain backward compatibility. To have full backward compatibility, one needs to know that a particular index was built with:

    A particular JRE or Unicode version.
    A particular tokenizer (i.e. a specific version of it).
    A particular stop word list.
    A particular ordered chain of filters.
    ...

And then searches need to use that same software blend. There is incipient support for storing this in the index, but it is the responsibility of users to determine what metadata their indexes need and to create a custom representation of it.

I think Marvin mentioned a methodology in Lucy to capture that info and to use it to build the analyzer.

-- DM
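
To make the metadata idea concrete, here is a minimal sketch of how the items listed above might be recorded alongside an index and checked before searching. It is an illustration under assumptions, not Lucene's API: the AnalysisManifest class, its fields, the property keys and the sample values are all hypothetical, and filter names are assumed not to contain commas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

/** Hypothetical record of the analysis setup an index was built with. */
final class AnalysisManifest {
    final String unicodeVersion;   // Unicode (or JRE) version used at index time
    final String tokenizer;        // tokenizer name plus its version
    final String stopWordsId;      // identifier of the stop word list, e.g. a checksum
    final List<String> filters;    // filter chain, in application order

    AnalysisManifest(String unicodeVersion, String tokenizer,
                     String stopWordsId, List<String> filters) {
        this.unicodeVersion = unicodeVersion;
        this.tokenizer = tokenizer;
        this.stopWordsId = stopWordsId;
        this.filters = new ArrayList<String>(filters);
    }

    /** Flatten the manifest into key/value pairs that could be stored with the index. */
    Properties toProperties() {
        Properties p = new Properties();
        p.setProperty("unicode.version", unicodeVersion);
        p.setProperty("tokenizer", tokenizer);
        p.setProperty("stopwords.id", stopWordsId);
        p.setProperty("filters", String.join(",", filters));
        return p;
    }

    /** Rebuild a manifest from stored key/value pairs; missing keys become empty values. */
    static AnalysisManifest fromProperties(Properties p) {
        String joined = p.getProperty("filters", "");
        List<String> filters = joined.isEmpty()
                ? new ArrayList<String>()
                : Arrays.asList(joined.split(","));
        return new AnalysisManifest(
                p.getProperty("unicode.version", ""),
                p.getProperty("tokenizer", ""),
                p.getProperty("stopwords.id", ""),
                filters);
    }

    /** List every setting in which the other manifest differs; filter order matters. */
    List<String> differences(AnalysisManifest other) {
        List<String> diffs = new ArrayList<String>();
        if (!unicodeVersion.equals(other.unicodeVersion)) diffs.add("unicode.version");
        if (!tokenizer.equals(other.tokenizer)) diffs.add("tokenizer");
        if (!stopWordsId.equals(other.stopWordsId)) diffs.add("stopwords.id");
        if (!filters.equals(other.filters)) diffs.add("filters");
        return diffs;
    }

    public static void main(String[] args) {
        // Made-up values standing in for what was used when the index was built.
        AnalysisManifest indexTime = new AnalysisManifest(
                "5.1", "StandardTokenizer/3.0", "stop-v1",
                Arrays.asList("lowercase", "stop", "stem"));
        // Round trip through the stored form, as a searcher would on opening the index.
        AnalysisManifest stored = fromProperties(indexTime.toProperties());
        // The search-time setup applies the same filters in a different order.
        AnalysisManifest searchTime = new AnalysisManifest(
                "5.1", "StandardTokenizer/3.0", "stop-v1",
                Arrays.asList("lowercase", "stem", "stop"));
        System.out.println("Mismatched settings: " + stored.differences(searchTime));
    }
}
```

The comparison is deliberately strict: the same filters in a different order count as a mismatch, since the chain is ordered. What a searcher does on a mismatch (refuse, warn, or rebuild the old analyzer from the stored description) is a policy choice this sketch leaves open.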