[ https://issues.apache.org/jira/browse/SOLR-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802020#action_12802020 ]
Mark Miller commented on SOLR-1677:
-----------------------------------

In my opinion this should be real simple. Having to specify a Lucene version for each component is not simple - it's beyond most users. I think it's beyond me (laugh as you see fit). Having to accept Lucene 2.4 behavior by default because of Solr back compat issues is also "weak".

A new user should get all the bug fixes of the latest Lucene with minimal effort. Hopefully no effort. Older users should be able to get the newest with minimal effort as well - not having to go one by one through each component and upgrade it. I can't imagine juggling all these versions for each component - that's ugly enough in Lucene - it shouldn't infect Solr for the average case.

Personally, I do think there should be a global default. And I think right next to it, it should say: if you change this, you must reindex. No worries about action at a distance. The action is to get the latest and greatest Lucene has to offer rather than older buggy or back compat behavior. Reindex, get the latest and greatest. Don't reindex and you're on your own. Solr might rip your head off.

We should also offer a per-component setting for real experts, but I wouldn't be meddling that way myself unless in a bind. Solr should be real simple about this - the latest Solr should use the latest bug fixes from Lucene, with previous configs out there defaulting to 2.4 compatibility.
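As a rough illustration of the lookup order proposed in this comment (a per-component setting wins, otherwise the global default applies, and old configs that set neither stay on 2.4 behavior), here is a minimal Java sketch. It is not code from the patch: the class, the method, the stand-in Version enum, and the use of "luceneMatchVersion" as an init-arg key are all assumptions made for the example.

```java
import java.util.Map;

public class MatchVersionSketch {
    // Hypothetical stand-in for org.apache.lucene.util.Version.
    public enum Version { LUCENE_24, LUCENE_29, LUCENE_30 }

    // Fallback for existing configs that set nothing (2.4 compatibility).
    public static final Version LEGACY_DEFAULT = Version.LUCENE_24;

    /**
     * Resolve the version for one analysis component.
     *
     * @param componentArgs the factory's init args; the "luceneMatchVersion"
     *                      key is an assumed name for the expert override
     * @param globalDefault the config-wide setting, or null if absent
     */
    public static Version resolve(Map<String, String> componentArgs, String globalDefault) {
        String perComponent = componentArgs.get("luceneMatchVersion");
        if (perComponent != null) {
            return Version.valueOf(perComponent);   // expert override
        }
        if (globalDefault != null) {
            return Version.valueOf(globalDefault);  // one switch for everything
        }
        return LEGACY_DEFAULT;                      // old config, old behavior
    }

    public static void main(String[] args) {
        System.out.println(resolve(Map.of(), null));         // LUCENE_24
        System.out.println(resolve(Map.of(), "LUCENE_30"));  // LUCENE_30
        System.out.println(resolve(
            Map.of("luceneMatchVersion", "LUCENE_29"), "LUCENE_30")); // LUCENE_29
    }
}
```

With this ordering, changing the single global value moves every component that has no explicit override, which is why that setting is the one that would need the "you must reindex" warning next to it.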
I abbreviated the heck out of my arguments and thinking, but damn it, that's what I think :)

> Add support for o.a.lucene.util.Version for BaseTokenizerFactory and BaseTokenFilterFactory
> -------------------------------------------------------------------------------------------
>
>                 Key: SOLR-1677
>                 URL: https://issues.apache.org/jira/browse/SOLR-1677
>             Project: Solr
>          Issue Type: Sub-task
>          Components: Schema and Analysis
>            Reporter: Uwe Schindler
>         Attachments: SOLR-1677.patch, SOLR-1677.patch, SOLR-1677.patch, SOLR-1677.patch
>
>
> Since Lucene 2.9, a lot of analyzers use a Version constant to keep backwards
> compatibility with old indexes created using older versions of Lucene. The
> most important example is StandardTokenizer, which changed its behaviour with
> posIncr and incorrect host token types in 2.4 and also in 2.9.
> In Lucene 3.0 this matchVersion ctor parameter is mandatory and in 3.1, with
> much more Unicode support, almost every Tokenizer/TokenFilter needs this
> Version parameter. In 2.9, the deprecated old ctors without Version take
> LUCENE_24 as default to mimic the old behaviour, e.g. in StandardTokenizer.
> This patch adds basic support for the Lucene Version property to the base
> factories. Subclasses can then use the luceneMatchVersion decoded enum (in
> 3.0) / Parameter (in 2.9) for constructing TokenStreams. The code currently
> contains a helper map to decode the version strings, but in 3.0 it can be
> replaced by Version.valueOf(String), as Version is a subclass of Java5
> enums. The default value is Version.LUCENE_24 (as this is the default for the
> no-version ctors in Lucene).
> This patch also removes unneeded conversions to CharArraySet from
> StopFilterFactory (now done by Lucene since 2.9). The generics are also fixed
> to match Lucene 3.0.
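The issue description mentions two ways of turning a version string into a Version: a helper map (needed in 2.9) and Version.valueOf(String) (possible in 3.0, where Version is an enum). The sketch below contrasts the two under stated assumptions. It uses a stand-in enum rather than the real Lucene class, the method names are hypothetical, and it does not reproduce the patch's actual helper map.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class VersionDecodeSketch {
    // Hypothetical stand-in for org.apache.lucene.util.Version.
    public enum Version { LUCENE_24, LUCENE_29, LUCENE_30 }

    // Map-based decoding: an explicit lookup table from string to constant.
    private static final Map<String, Version> BY_NAME = new HashMap<>();
    static {
        BY_NAME.put("LUCENE_24", Version.LUCENE_24);
        BY_NAME.put("LUCENE_29", Version.LUCENE_29);
        BY_NAME.put("LUCENE_30", Version.LUCENE_30);
    }

    public static Version decodeWithMap(String s) {
        if (s == null) {
            return Version.LUCENE_24; // default named in the issue description
        }
        Version v = BY_NAME.get(s.trim().toUpperCase(Locale.ROOT));
        if (v == null) {
            throw new IllegalArgumentException("Unknown Lucene version: " + s);
        }
        return v;
    }

    // Enum-based decoding: valueOf replaces the table entirely and already
    // throws IllegalArgumentException for names that do not exist.
    public static Version decodeWithEnum(String s) {
        if (s == null) {
            return Version.LUCENE_24;
        }
        return Version.valueOf(s.trim().toUpperCase(Locale.ROOT));
    }
}
```

Both variants return the same constant for the same input; the enum version just has no table to keep in sync when a new constant is added.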