On May 19, 2009, at 7:45 AM, Michael McCandless wrote:

On Tue, May 19, 2009 at 6:47 AM, DM Smith <dmsmith...@gmail.com> wrote:

It is common in my application, a Bible program that indexes each verse (think of a verse as a numbered sentence) as a separate document. We index everything, including words that are typically stop words, as those might be important to our end users. Besides this, the top 280 word roots represent 90% of the occurrences.
And on searches, we return everything in book order, unless the user wants to score the result. In that case, we return a small, user-configurable number of hits ordered by score.

The ability to turn off scoring when sorting by field, new in 2.9,
should be a good performance boost for your use case (if performance
is important).

And we are using Lucene out of the box for the most part. We've deviated
only to incrementally solve performance problems.

Right, my impression is most people will stick w/ Lucene's defaults,
incrementally changing only limited settings they come across, which
is why selecting good defaults is vital to Lucene's growth/adoption
(new users especially simply start w/ our defaults).

But we can't pick good defaults when we're so heavily bound by back-compat.

Which is why I find the Settings approach so appealing :)  Suddenly,
on all improvements to Lucene, we have the freedom to change our
defaults so a new user sees all such improvements.

From my perspective as a user:
Backward compatibility is important, but it is not a be-all and end-all.

To me, if I can drop in the new jar and get bug fixes, that's great. My expectation is that searches against an existing index will still return the same or, in the case of bug fixes, better results.

What I need to know is when that is not the case. Today, we use a naming convention of the Lucene jars to indicate whether that is true. I'd be just as happy if there were a compatibility level that I could check (I'm having to do that in our code as I change our analyzers frequently enough to be embarrassed).

The problem, which might be addressed in the "fixing" of core vs contrib, is that we use lots of contrib (analyzers, snowball, highlighting) and want it to maintain backward compatibility too. (I'm happy that has been the case!) So, perhaps a compatibility level per contribution.
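To make the idea of a per-contribution compatibility level concrete, here is a minimal sketch of the kind of check an application could run. Everything in it is an assumption for illustration: the class name, the component names used as map keys, and the integer levels are hypothetical and are not part of Lucene. It assumes the application records a level per component when it builds an index and compares that record against the levels of the code currently on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper; none of these names exist in Lucene. */
final class CompatibilityCheck {

    /**
     * Returns the components whose level recorded with the index differs
     * from the level of the code now in use. An empty list means the
     * existing index should behave as before; otherwise a reindex
     * (or at least a closer look) is called for.
     */
    static List<String> changedComponents(Map<String, Integer> recordedWithIndex,
                                          Map<String, Integer> currentCode) {
        List<String> changed = new ArrayList<>();
        // Iterate in sorted key order so the report is stable.
        for (Map.Entry<String, Integer> e : new TreeMap<>(currentCode).entrySet()) {
            Integer recorded = recordedWithIndex.get(e.getKey());
            // A component missing from the index record is treated as changed.
            if (recorded == null || !recorded.equals(e.getValue())) {
                changed.add(e.getKey());
            }
        }
        return changed;
    }
}
```

With a record of {core=1, snowball=1} and current code at {core=1, snowball=2, highlighter=1}, the check would report highlighter and snowball, leaving core alone. That is the point of tracking a level per contribution rather than one for Lucene as a whole.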

The packagers for jpackage consider nearly every release of Lucene to break backward compatibility, because they treat Lucene as a whole. Perhaps that is the same with other Linux distributions. But because backward compatibility does not apply to contrib in a strict fashion, one cannot reliably use Lucene from distributions unless such a policy is in place.

In any case, I don't think anyone should just drop in a new jar without some testing. At a minimum, they should compile with deprecations turned on.

Regarding deprecations, I'd also be just as happy if a method was marked
@deprecated This behavior <b>has</b> changed with this release, 2.4.3.
That is, as a warning of changed behavior.

And then on the 3.0 release the warning could be removed.
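As a rough illustration of that convention, the sketch below marks a method as deprecated purely to flag a behavior change, not to announce its removal. The class, the method, and the described change are all hypothetical and do not come from Lucene; only the javadoc wording follows the suggestion above.

```java
/** Hypothetical example class; not part of Lucene. */
final class VerseTokenizer {

    /**
     * Splits a verse into whitespace-separated tokens.
     *
     * @deprecated This behavior <b>has</b> changed with this release, 2.4.3:
     *             blank input now yields an empty array (hypothetical change).
     *             The method itself is not going away; the tag is only a
     *             warning and could be removed in 3.0.
     */
    @Deprecated
    static String[] tokenize(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            // The (hypothetical) new behavior being warned about.
            return new String[0];
        }
        return trimmed.split("\\s+");
    }
}
```

Any caller compiled with `javac -Xlint:deprecation` then gets a warning at each call site, which is exactly the prompt to go and read what changed.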

But then again, my use of Lucene, while very important to my application, is very simple and easy to change.

-- DM



