On Fri, 21 Nov 2008 16:46:30 -0200 Rafael Cunha de Almeida <[EMAIL PROTECTED]> wrote:
> On Mon, 17 Nov 2008 19:58:47 -0800
> "Adriano Crestani" <[EMAIL PROTECTED]> wrote:
>
> > Hi Rafael,
> >
> > What is your scenario?
> >
> > Maybe it was defined this way so it does not filter uppercased stop words.
> > For example, the lowercased word "se" is a stop word, but the uppercased
> > "SE" stands for "Sergipe", a Brazilian state, so it should not be filtered.
>
> Suppose you are right, but passing it through the LowerCaseFilter can
> be useful too, especially if you don't care much about those corner
> cases (the GermanAnalyzer, for instance, passes through
> LowerCaseFilter first). The class being final doesn't allow one to
> inherit from it and make the changes if one needs to, which is
> unfortunate :-(.
>
> I would like to see a change in this whole stemmer and language
> analyzer API in order to make it more flexible and extensible. The
> way it is, you have to use them in that predetermined way.
>
> It would be nice if there were only one StemFilter, a Stemmer interface,
> and all Stemmers were implementations of that. Then, the StemFilter should
> get its Stemmer as a constructor parameter. I see no reason for
> BrazilianAnalyzer to be public.

To be final, sorry. I was a bit tired when I wrote all that.

> Are you interested in those kinds of changes? Do you agree with them?
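
For illustration, the "one StemFilter plus a Stemmer interface" idea could be sketched roughly as below. This is plain Java over an Iterator<String>, not Lucene's actual TokenStream/TokenFilter API. Stemmer, StemFilter, LowerCaseStep, TrailingSStemmer and SketchAnalyzer are hypothetical names made up for this sketch, and the toy stemmer is not a real Portuguese stemmer.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Hypothetical interface: maps one term to its stem.
interface Stemmer {
    String stem(String term);
}

// Toy stemmer for illustration only: strips a trailing "s" from longer terms.
final class TrailingSStemmer implements Stemmer {
    @Override
    public String stem(String term) {
        return term.length() > 3 && term.endsWith("s")
                ? term.substring(0, term.length() - 1)
                : term;
    }
}

// The single, language-agnostic filter: the Stemmer is a constructor parameter.
final class StemFilter implements Iterator<String> {
    private final Iterator<String> input;
    private final Stemmer stemmer;

    StemFilter(Iterator<String> input, Stemmer stemmer) {
        this.input = input;
        this.stemmer = stemmer;
    }

    @Override public boolean hasNext() { return input.hasNext(); }
    @Override public String next() { return stemmer.stem(input.next()); }
}

// Optional lower-casing step, so callers decide whether "SE" becomes "se".
final class LowerCaseStep implements Iterator<String> {
    private final Iterator<String> input;

    LowerCaseStep(Iterator<String> input) { this.input = input; }

    @Override public boolean hasNext() { return input.hasNext(); }
    @Override public String next() { return input.next().toLowerCase(Locale.ROOT); }
}

// Assumed helper that wires the chain together; not final, so it can be extended.
class SketchAnalyzer {
    static List<String> analyze(List<String> tokens, Stemmer stemmer, boolean lowerCase) {
        Iterator<String> chain = tokens.iterator();
        if (lowerCase) {
            chain = new LowerCaseStep(chain);
        }
        chain = new StemFilter(chain, stemmer);

        List<String> out = new ArrayList<>();
        while (chain.hasNext()) {
            out.add(chain.next());
        }
        return out;
    }
}
```

With this shape, SketchAnalyzer.analyze(tokens, new TrailingSStemmer(), false) leaves "SE" untouched, while passing true lowercases it first. Swapping in another language only means passing a different Stemmer, with no need to subclass an analyzer.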