For extreme examples like this, couldn't the stopword list be encapsulated in a single class that the Lucene defaults class uses?

That way, if you folks released an update to mostly static content like a stopword list, new and old users alike could pick it up easily with a simple drop-in fix.
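
Roughly what I have in mind, as a minimal sketch in plain Java rather than the real Lucene API. The class names EnglishStopWords and AnalyzerDefaults and the word list itself are made up for illustration:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical holder: the only place the default English stopwords live.
// Updating the list means replacing this one class.
final class EnglishStopWords {
    private static final Set<String> WORDS = Collections.unmodifiableSet(
            new HashSet<String>(Arrays.asList("a", "an", "and", "of", "the", "to")));

    private EnglishStopWords() {}

    static Set<String> get() {
        return WORDS;
    }
}

// Hypothetical defaults class: it delegates to the holder instead of
// keeping its own copy of the list.
final class AnalyzerDefaults {
    private AnalyzerDefaults() {}

    static Set<String> stopWords() {
        return EnglishStopWords.get();
    }
}
```

Anything that asks AnalyzerDefaults.stopWords() for the list would see an updated list as soon as the holder class is swapped out. Indexes built with the old list would still need a re-index to match.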

Just my two cents.

Matt

Michael McCandless wrote:
On Thu, May 21, 2009 at 12:19 PM, Robert Muir <rcm...@gmail.com> wrote:
Even something as simple as changing the default stopword list for some analyzer
could be an issue, if the user doesn't re-index in response to that change.

OK, right.

So say we forgot to include "the" in the default English stopwords
list (yes, an extreme example...).

Under the proposed changes 1 & 2 to back-compat policy, we would add
"the" to the default stopword list, so new users get the fix, but
still keep the the-less list accessible (deprecated).  We'd add an
entry in CHANGES.txt saying this happened, and then show code on how
to get back to the the-less stopword list.

New users using that StopFilter would properly see "the" filtered out.
Users who upgraded would need to fix their code to switch back to the
deprecated the-less list.
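
To make the scenario concrete, here is a rough sketch in plain Java. It is not the actual StopFilter code; StopWordLists, its two constants, and the filter helper are hypothetical names, and the word lists are abbreviated for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical container for the fixed default and the old list.
final class StopWordLists {
    /** Old default without "the", kept so existing indexes keep matching. */
    @Deprecated
    static final Set<String> ENGLISH_WITHOUT_THE = setOf("a", "an", "and", "of", "to");

    /** Fixed default: new users get "the" filtered out. */
    static final Set<String> ENGLISH = setOf("a", "an", "and", "of", "the", "to");

    private StopWordLists() {}

    private static Set<String> setOf(String... words) {
        return Collections.unmodifiableSet(new HashSet<String>(Arrays.asList(words)));
    }

    // Returns the tokens that are not in the given stopword set, in order.
    static List<String> filter(List<String> tokens, Set<String> stopWords) {
        List<String> kept = new ArrayList<String>();
        for (String token : tokens) {
            if (!stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

An upgraded application that cannot re-index would pass ENGLISH_WITHOUT_THE explicitly, so its query-time analysis keeps matching what is already in the index.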

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



--
Matthew Hall
Software Engineer
Mouse Genome Informatics
mh...@informatics.jax.org
(207) 288-6012
