Re: [PATCH] Bug on BrazilianAnalyzer

Rafael Cunha de Almeida Fri, 21 Nov 2008 14:16:29 -0800

On Fri, 21 Nov 2008 16:46:30 -0200
Rafael Cunha de Almeida <[EMAIL PROTECTED]> wrote:


> On Mon, 17 Nov 2008 19:58:47 -0800
> "Adriano Crestani" <[EMAIL PROTECTED]> wrote:
> 
> > Hi Rafael,
> > 
> > What is your scenario?
> > 
> > Maybe it was defined this way so that it does not filter uppercase stop
> > words. For example, the lowercase word "se" is a stop word, but the uppercase
> > "SE" stands for "Sergipe", a Brazilian state, so it should not be filtered.
> 
> Suppose you are right, but passing it through the LowerCaseFilter can
> be useful too, especially if you don't care much about those corner
> cases (the GermanAnalyzer, for instance, passes through
> LowerCaseFilter first). Because the class is final, one cannot inherit
> from it and make the changes when needed, which is unfortunate :-(.
> 
> I would like to see a change in the whole stemmer and language
> analyzer API in order to make it more flexible and extensible. As it
> stands, you have to use them in that predetermined way.
> 
> It would be nice if there were only one StemFilter and a Stemmer
> interface that all Stemmers implemented. Then the StemFilter would
> get its Stemmer as a constructor parameter. I see no reason for
> BrazilianAnalyzer to be public.

To be final, sorry. I was a bit tired when I wrote all that.

> Are you interested in these kinds of changes? Do you agree with them?
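
To make the proposal quoted above concrete, here is a minimal sketch of a
single StemFilter that receives its Stemmer as a constructor parameter. It
is plain Java and deliberately does not use the Lucene TokenStream API: the
token stream is modelled as an Iterator<String>, and every name except
"Stemmer" and "StemFilter" (ToyPluralStemmer, StemFilterSketch, analyze) is
hypothetical. The toy stemmer only strips a trailing "s" and is not a real
Portuguese stemmer.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical driver class; collects the filtered tokens into a list. */
final class StemFilterSketch {
    static List<String> analyze(List<String> tokens, Stemmer stemmer) {
        List<String> out = new ArrayList<>();
        Iterator<String> filtered = new StemFilter(tokens.iterator(), stemmer);
        while (filtered.hasNext()) {
            out.add(filtered.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("casas", "livros", "mar");
        System.out.println(analyze(tokens, new ToyPluralStemmer()));
    }
}

/** One stemming contract that every language-specific stemmer implements. */
interface Stemmer {
    String stem(String term);
}

/** Toy stemmer (assumption, for illustration only): drops a trailing "s". */
final class ToyPluralStemmer implements Stemmer {
    @Override
    public String stem(String term) {
        return term.length() > 3 && term.endsWith("s")
                ? term.substring(0, term.length() - 1)
                : term;
    }
}

/** The only filter needed: the language-specific part is injected. */
final class StemFilter implements Iterator<String> {
    private final Iterator<String> input;
    private final Stemmer stemmer;

    StemFilter(Iterator<String> input, Stemmer stemmer) {
        this.input = input;
        this.stemmer = stemmer;
    }

    @Override
    public boolean hasNext() {
        return input.hasNext();
    }

    @Override
    public String next() {
        // Delegate the actual stemming to whichever Stemmer was supplied.
        return stemmer.stem(input.next());
    }
}
```

With this shape, supporting another language would mean writing another
Stemmer implementation rather than another filter class, and an analyzer
could choose its own filter order (for example, lowercasing first or not).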

