For extreme examples like this, couldn't the stopword list be
encapsulated in a single class that the Lucene defaults class uses?
That way, if you folks released updates to mostly static content like a
stopword list, new and existing users could pick it up with a simple
drop-in fix.
Just my two cents.
Matt
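A minimal sketch of the suggestion above, in plain Java with no Lucene dependency. The class names DefaultStopwords and AnalyzerDefaults and the word list itself are hypothetical, chosen only for illustration; they are not Lucene APIs. The point is only that the defaults delegate to one holder class, so an updated list means replacing that one class.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical holder: the only place the default English stopwords live.
final class DefaultStopwords {
    private static final Set<String> ENGLISH = Collections.unmodifiableSet(
        new LinkedHashSet<>(Arrays.asList("a", "an", "and", "of", "the", "to")));

    private DefaultStopwords() {}

    // Returns the shared, read-only default list.
    static Set<String> english() {
        return ENGLISH;
    }
}

// Hypothetical stand-in for a "defaults" class: it delegates to the holder
// instead of keeping its own copy of the list.
final class AnalyzerDefaults {
    private AnalyzerDefaults() {}

    static Set<String> stopwords() {
        return DefaultStopwords.english();
    }
}
```

Note that a drop-in replacement of this kind changes the list for every caller at once, which is exactly the case where existing indexes may need re-indexing.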
Michael McCandless wrote:
On Thu, May 21, 2009 at 12:19 PM, Robert Muir <rcm...@gmail.com> wrote:
Even something as simple as changing the default stopword list for some
analyzer could be an issue if the user doesn't re-index in response to
that change.
OK, right.
So say we forgot to include "the" in the default English stopwords
list (yes, an extreme example...).
Under the proposed changes 1 & 2 to back-compat policy, we would add
"the" to the default stopword list, so new users get the fix, but
still keep the the-less list accessible (deprecated). We'd add an
entry in CHANGES.txt saying this happened, and then show code on how
to get back to the the-less stopword list.
New users using that StopFilter would properly see "the" filtered out.
Users who upgraded would need to fix their code to switch back to the
deprecated the-less list.
Mike
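The scheme described above (fix the default, keep the old list available but deprecated) might look roughly like the following. This is a plain-Java sketch, not Lucene code: EnglishStopwords, LEGACY_WITHOUT_THE, DEFAULT, and filter are hypothetical names, and the word list is made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class EnglishStopwords {
    /** Old default that was missing "the"; kept so existing indexes still match. */
    @Deprecated
    static final Set<String> LEGACY_WITHOUT_THE = Collections.unmodifiableSet(
        new HashSet<>(Arrays.asList("a", "an", "and", "of", "to")));

    /** Fixed default: the legacy list plus "the". */
    static final Set<String> DEFAULT = withExtra(LEGACY_WITHOUT_THE, "the");

    private EnglishStopwords() {}

    // Builds a read-only copy of base with one extra word added.
    private static Set<String> withExtra(Set<String> base, String word) {
        Set<String> copy = new HashSet<>(base);
        copy.add(word);
        return Collections.unmodifiableSet(copy);
    }

    // Simplified stand-in for a stop filter: drops tokens found in stopwords.
    static List<String> filter(List<String> tokens, Set<String> stopwords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!stopwords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

With tokens "the", "quick", "fox", passing DEFAULT yields [quick, fox], while an upgraded user who passes LEGACY_WITHOUT_THE keeps the old behavior and gets [the, quick, fox].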
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
--
Matthew Hall
Software Engineer
Mouse Genome Informatics
mh...@informatics.jax.org
(207) 288-6012