On Thu, May 21, 2009 at 05:19:43PM -0400, Michael McCandless wrote:
> Marvin, which solution would you prefer?

Between the two, I'd prefer settings constructor arguments, though I would be inclined to have settings classes that are specific to individual classes rather than Lucene-wide. At least that scheme gets locality right. The global actsAsVersion variable violates that principle and has the potential to saddle a small number of users who have done absolutely nothing wrong with bugs that are very, very hard to hunt down. That's unfair.

As far as analyzers and token streams, the theoretical answer is making indexes self-describing via serializable schemas, as discussed on the Lucy dev list, and as implemented in KinoSearch svn trunk. With versioning metadata attached to the index, there is no longer any worry about upgrading analysis modules, provided that those modules handle their own versioning correctly.

For instance, in KS the Stopalizer always embeds the complete stoplist in the schema file, so even if we update the "English" stoplist, we don't get invalid search results for indexes which were created with the old stoplist. Similarly, it may not be possible to keep around multiple variants of Snowball, but at least we can fail catastrophically instead of subtly if we detect that the Snowball version has changed.

Full-on schema serialization isn't feasible for Lucene, but attaching an actsAsVersion variable to an index and feeding that to your analyzers would be a decent start.

Lastly, I think a major Java Lucene release is justified already. Won't this discussion die down somewhat if you can get 3.0 out? If there are issues that are half done, how about rolling back whatever's in the way?

Marvin Humphrey
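
To make the self-describing index idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the KinoSearch schema format or any Lucene API: the names `IndexSchema`, `StopFilterSpec`, `IncompatibleIndexError`, and the JSON layout are all hypothetical. It shows the two behaviors described above: the complete stoplist travels with the index, and a stemmer version mismatch fails loudly at load time rather than silently producing wrong results.

```python
import json
from dataclasses import dataclass


class IncompatibleIndexError(Exception):
    """Raised when the index needs analysis code we can no longer reproduce."""


@dataclass(frozen=True)
class StopFilterSpec:
    # Hypothetical: the full stoplist is stored, not a name like "English",
    # so later changes to a bundled list cannot affect an existing index.
    stopwords: tuple

    def filter(self, tokens):
        stop = set(self.stopwords)
        return [t for t in tokens if t not in stop]


@dataclass(frozen=True)
class IndexSchema:
    # Per-index version stamp (the local analogue of a global actsAsVersion).
    acts_as_version: str
    # Version of the stemmer that was used when the index was built.
    stemmer_version: str
    stop_filter: StopFilterSpec

    def dump(self) -> str:
        """Serialize the schema so it can be stored alongside the index."""
        return json.dumps(
            {
                "acts_as_version": self.acts_as_version,
                "stemmer_version": self.stemmer_version,
                "stopwords": list(self.stop_filter.stopwords),
            },
            sort_keys=True,
        )

    @classmethod
    def load(cls, text: str, installed_stemmer_version: str) -> "IndexSchema":
        """Rebuild the schema, refusing to continue on a stemmer mismatch."""
        data = json.loads(text)
        if data["stemmer_version"] != installed_stemmer_version:
            raise IncompatibleIndexError(
                "index built with stemmer %s, but %s is installed"
                % (data["stemmer_version"], installed_stemmer_version)
            )
        return cls(
            acts_as_version=data["acts_as_version"],
            stemmer_version=data["stemmer_version"],
            stop_filter=StopFilterSpec(tuple(data["stopwords"])),
        )
```

In this sketch, search-time analysis is rebuilt from the stored schema (`IndexSchema.load(stored_text, installed_stemmer_version=...)`) rather than from whatever defaults the installed library currently ships, which is what keeps the versioning decision local to each index.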