Sorry, I wasn't quite sure what to call this new class you guys have been talking about.

I was referring to the class that's being discussed to encapsulate all of the defaults for a given Lucene release (its caching strategies, etc.).

I'm just not certain that something like a static list of words belongs in a higher-level defaults class like the one you guys are talking about, especially considering that anyone using a stop-enabled analyzer really should be familiar with this list, and often needs to override it.

Meh, now that I'm actually typing it out, perhaps I'm incorrect here. Assuming the class you guys are describing will be well advertised and documented, maybe it will actually make it easier for end developers to twiddle with this list, or at least make them more aware that it's something they have the ability to change.

Matt

Michael McCandless wrote:
What is the "lucene defaults class"?

Mike

On Thu, May 21, 2009 at 12:37 PM, Matthew Hall
<mh...@informatics.jax.org> wrote:
For extreme examples like this, couldn't the stopword list be encapsulated
into a single class that's used by the Lucene defaults class?

That way, if you folks released updates to mostly static content like a
stopword list, new or old users could get it easily with a simple drop-in
fix.

Just my two cents.

Matt

Michael McCandless wrote:
On Thu, May 21, 2009 at 12:19 PM, Robert Muir <rcm...@gmail.com> wrote:

even as simple as changing default stopword list for some analyzer could be
an issue, if the user doesn't re-index in response to that change.

OK, right.

So say we forgot to include "the" in the default English stopwords
list (yes, an extreme example...).

Under the proposed changes 1 & 2 to back-compat policy, we would add
"the" to the default stopword list, so new users get the fix, but
still keep the the-less list accessible (deprecated).  We'd add an
entry in CHANGES.txt saying this happened, and then show code on how
to get back to the the-less stopword list.

New users using that StopFilter would properly see "the" filtered out.
 Users who upgraded would need to fix their code to switch back to the
deprecated the-less list.
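
As a rough illustration of the scenario above (not actual Lucene code), the sketch below assumes a hypothetical StopwordDefaults class holding a corrected ENGLISH_STOP_WORDS set alongside a deprecated ENGLISH_STOP_WORDS_WITHOUT_THE set. Both names, the tiny word lists, and the removeStopwords helper are made up for the example; a real StopFilter would be wired up differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical holder for default stopword lists; this is not a Lucene class. */
final class StopwordDefaults {

    /** The old default which, in the extreme example, forgot "the". Kept so upgraders can opt back in. */
    @Deprecated
    static final Set<String> ENGLISH_STOP_WORDS_WITHOUT_THE =
        Collections.unmodifiableSet(new LinkedHashSet<>(Arrays.asList("a", "an", "and", "of")));

    /** The corrected default that new users pick up automatically. */
    static final Set<String> ENGLISH_STOP_WORDS = withThe();

    private static Set<String> withThe() {
        Set<String> words = new LinkedHashSet<>(ENGLISH_STOP_WORDS_WITHOUT_THE);
        words.add("the");
        return Collections.unmodifiableSet(words);
    }

    /** Stand-in for a stop filter: drops every token found in the given set (case-insensitive). */
    static List<String> removeStopwords(List<String> tokens, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!stopWords.contains(token.toLowerCase(Locale.ROOT))) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "quick", "fox");
        // New users: the corrected default filters out "the".
        System.out.println(removeStopwords(tokens, ENGLISH_STOP_WORDS));
        // Upgraded users with an existing index: explicitly switch back to the old list.
        System.out.println(removeStopwords(tokens, ENGLISH_STOP_WORDS_WITHOUT_THE));
    }
}
```

An upgraded application that has not re-indexed would pass the deprecated set explicitly, so that filtering keeps matching what is already in its index.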

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org


--
Matthew Hall
Software Engineer
Mouse Genome Informatics
mh...@informatics.jax.org
(207) 288-6012


