Right way to make analyzer

2005-02-03 Thread Owen Densmore
Is this the right way to make a Porter analyzer using the standard 
tokenizer?  I'm not sure about the order of the filters.

Owen
import java.io.Reader;
import org.apache.lucene.analysis.*;
import org.apache.lucene.analysis.standard.*;

class MyAnalyzer extends Analyzer {
  public TokenStream tokenStream(String fieldName, Reader reader) {
    // StandardTokenizer produces the raw tokens; StandardFilter, LowerCaseFilter,
    // StopFilter, and PorterStemFilter are then applied in that order.
    return new PorterStemFilter(
        new StopFilter(
            new LowerCaseFilter(
                new StandardFilter(
                    new StandardTokenizer(reader))),
            StopAnalyzer.ENGLISH_STOP_WORDS));
  }
}



Re: Right way to make analyzer

2005-02-03 Thread Erik Hatcher
On Feb 3, 2005, at 9:26 AM, Owen Densmore wrote:
Is this the right way to make a Porter analyzer using the standard 
tokenizer?  I'm not sure about the order of the filters.

Owen
class MyAnalyzer extends Analyzer {
  public TokenStream tokenStream(String fieldName, Reader reader) {
    return new PorterStemFilter(
        new StopFilter(
            new LowerCaseFilter(
                new StandardFilter(
                    new StandardTokenizer(reader))),
            StopAnalyzer.ENGLISH_STOP_WORDS));
  }
}
Yes, that is correct.
Analysis starts with a tokenizer and chains its output through each 
successive filter.
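For example, the analyzer gets handed to IndexWriter, which calls 
tokenStream() for every tokenized field as documents are added.  Here is 
a rough sketch (the class name, index path, field name, and sample text 
are just placeholders, and it assumes the 1.4 API):

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class IndexWithMyAnalyzer {
  public static void main(String[] args) throws Exception {
    // IndexWriter asks MyAnalyzer for a TokenStream for each tokenized field.
    IndexWriter writer = new IndexWriter("/tmp/myindex", new MyAnalyzer(), true);

    Document doc = new Document();
    // Field.Text creates a stored, analyzed field.
    doc.add(Field.Text("contents", "The quick brown foxes jumped over the lazy dogs"));
    writer.addDocument(doc);

    writer.close();
  }
}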

I strongly recommend, as you start tinkering with custom analysis, 
using a little bit of code (like the sketch below) to see how your 
analyzer works on some text.  The Lucene Intro article I wrote for 
java.net has some code you can borrow to do this, as does Lucene in 
Action's source code.  Luke, a tool I also highly recommend, has this 
capability as well.
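For instance, something along these lines dumps the tokens your analyzer 
produces (the class name and sample text are just placeholders; termText() 
is how you get at the term in the 1.4 API):

import java.io.StringReader;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;

public class ShowTokens {
  public static void main(String[] args) throws Exception {
    String text = "The quick brown foxes jumped over the lazy dogs";

    // Run the text through the analyzer and print each token it emits.
    TokenStream stream = new MyAnalyzer().tokenStream("contents", new StringReader(text));
    for (Token token = stream.next(); token != null; token = stream.next()) {
      System.out.println("[" + token.termText() + "]");
    }
  }
}

Running that makes it easy to see that the stop words get dropped and the 
remaining terms come out lowercased and stemmed.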

Erik