[PATCH] Bug on BrazilianAnalyzer

Rafael Cunha de Almeida Mon, 17 Nov 2008 15:40:28 -0800

Following is the patch for what I think is a bug in the
BrazilianAnalyzer. The default stopword list is all lowercase, so
it only works if the LowerCaseFilter comes first or if the
StopFilter is set to ignore case.


Since the LowerCaseFilter is instantiated anyway, I just changed its
position in the chain. If there is a problem with that order, then
please consider setting the StopFilter to ignore case instead.
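
To illustrate why the order matters, here is a minimal sketch in plain
Java. It does not use the Lucene classes: StopOrderDemo, its stop list
and its two helper methods are made up for illustration, and they only
mimic a case-sensitive stop filter and a lowercasing step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the analyzer chain; not Lucene code.
final class StopOrderDemo {
    // Assumed stop list, all lowercase like the analyzer's default list.
    static final Set<String> STOP = new HashSet<>(Arrays.asList("a", "de", "o"));

    // Mimics a lowercasing step: lowercases every token.
    static List<String> lowerCase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(t.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Mimics a case-sensitive stop filter: drops exact matches only.
    static List<String> removeStops(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (!STOP.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("A", "casa", "De", "Pedra");
        // Old order: stop filter first, so "A" and "De" slip through.
        System.out.println(lowerCase(removeStops(tokens)));
        // Patched order: lowercase first, so the stop list matches.
        System.out.println(removeStops(lowerCase(tokens)));
    }
}
```

With the old order the capitalized stopwords survive and are only
lowercased afterwards; with the patched order they are removed.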

Index: BrazilianAnalyzer.java
===================================================================
--- BrazilianAnalyzer.java      (revision 718407)
+++ BrazilianAnalyzer.java      (working copy)
@@ -131,10 +131,9 @@
        public final TokenStream tokenStream(String fieldName, Reader reader) {
                TokenStream result = new StandardTokenizer( reader );
                result = new StandardFilter( result );
+               result = new LowerCaseFilter( result );
                result = new StopFilter( result, stoptable );
                result = new BrazilianStemFilter( result, excltable );
-               // Convert to lowercase after stemming!
-               result = new LowerCaseFilter( result );
                return result;
        }
 }
