Re: Lucene's default settings & back compatibility

DM Smith Tue, 19 May 2009 05:26:54 -0700


On May 19, 2009, at 7:45 AM, Michael McCandless wrote:

On Tue, May 19, 2009 at 6:47 AM, DM Smith <dmsmith...@gmail.com> wrote:
It is common in my application, a Bible program, that indexes each
verse (think of a verse as a numbered sentence) as a separate document.
We index everything, including words that are typically stop words as
those might be important to our end users. Besides this, the top 280
word roots represent 90% of the occurrences.
And on searches, we return everything in book order, unless the user
wants to score the result. In that case, we return a small, user
configurable amount of hits ordered by score.
The ability to turn off scoring when sorting by field, new in 2.9,
should be a good performance boost for your use case (if performance
is important).
And we are using Lucene out of the box for the most part. We've deviated
only to incrementally solve performance problems.
Right, my impression is most people will stick w/ Lucene's defaults,
incrementally changing only limited settings they come across, which
is why selecting good defaults is vital to Lucene's growth/adoption
(new users especially simply start w/ our defaults).
But we can't pick good defaults when we're so heavily bound by back-compat.
Which is why I find the Settings approach so appealing :)  Suddenly,
on all improvements to Lucene, we have the freedom to change our
defaults so a new user sees all such improvements.


From my perspective as a user:
Backward compatibility is important, but it is not a be-all and end-all.

To me, if I can drop in the new jar and get bug fixes that's great. My expectation is that searches against an existing index will still return the same or, in the case of bug fixes, better results.

What I need to know is when that is not the case. Today, we use a naming convention of the Lucene jars to indicate whether that is true. I'd be just as happy if there were a compatibility level that I could check (I'm having to do that in our code as I change our analyzers frequently enough to be embarrassed).
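
A minimal sketch of the kind of compatibility-level check described above, as an application might do it today. Every name here (IndexCompatibility, the "compat.level" key, the level numbers) is hypothetical and is not part of Lucene; it only assumes the application stores a small properties record next to its index.

```java
import java.util.Properties;

// Hypothetical helper; none of these names come from Lucene.
final class IndexCompatibility {
    // Level the application's current analyzer chain produces (assumed value).
    static final int CURRENT_LEVEL = 3;
    // Assumed key under which the level is stored alongside the index.
    static final String KEY = "compat.level";

    // Record the level when the index is built.
    static Properties stamp(int level) {
        Properties p = new Properties();
        p.setProperty(KEY, Integer.toString(level));
        return p;
    }

    // True when an index built at the recorded level can be searched as-is.
    static boolean isCompatible(Properties indexMeta, int currentLevel) {
        String recorded = indexMeta.getProperty(KEY);
        if (recorded == null) {
            return false; // unknown level: treat as incompatible
        }
        try {
            return Integer.parseInt(recorded.trim()) == currentLevel;
        } catch (NumberFormatException e) {
            return false; // unreadable level: treat as incompatible
        }
    }

    public static void main(String[] args) {
        Properties meta = stamp(2); // index built by an older analyzer setup
        if (!isCompatible(meta, CURRENT_LEVEL)) {
            System.out.println("Index built at a different compatibility level; reindex.");
        }
    }
}
```

Bumping the level whenever an analyzer change alters what gets indexed would let the application decide between searching the old index and rebuilding it.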

The problem, which might be addressed in the "fixing" of core vs contrib, is that we use lots of contrib (analyzers, snowball, highlighting) and want it to maintain backward compatibility too. (I'm happy that has been the case!) So, perhaps a compatibility level per contribution.

The packagers for jpackage consider nearly every release of Lucene to break backward compatibility, because they treat Lucene as a whole. Perhaps that is the same with other Linux distributions. But because backward compatibility does not apply to contrib in a strict fashion, one cannot reliably use Lucene from distributions unless such a policy is the case.

In any case, I don't think anyone should just drop in a new jar without some testing. At a minimum, they should compile with deprecations turned on.


Regarding deprecations, I'd also be just as happy if a method was marked

@deprecated This behavior <b>has</b> changed with this release, 2.4.3.

That is, as a warning of changed behavior.

And then on the 3.0 release the warning could be removed.
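
As a rough illustration of that idea, here is a sketch of a method carrying such a Javadoc tag. The class, the method and the described behavior change are all made up for the example; only the wording of the tag follows the suggestion above. The @Deprecated annotation assumes Java 5 or later; on older source levels the Javadoc tag alone would carry the warning.

```java
// Hypothetical class; the method and its behavior change are illustrative only.
final class ExampleTokenizer {
    /**
     * Splits text on runs of whitespace.
     *
     * @deprecated This behavior <b>has</b> changed with this release, 2.4.3:
     *             leading and trailing whitespace is now ignored
     *             (an invented change, used here only as an example).
     */
    @Deprecated
    static String[] split(String text) {
        String trimmed = text.trim();
        // An all-whitespace input yields no tokens.
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }
}
```

Callers in other classes that are compiled with deprecation warnings enabled would then be pointed at the note, and the tag could simply be dropped in the following major release.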

But then again, my use of Lucene, while very important to my application, is very simple and easy to change.


-- DM



