Re: Right way to make analyzer

Erik Hatcher Thu, 03 Feb 2005 06:48:37 -0800


On Feb 3, 2005, at 9:26 AM, Owen Densmore wrote:

Is this the right way to make a porter analyzer using the standard tokenizer? I'm not sure about the order of the filters.

Owen

    import java.io.Reader;
    import org.apache.lucene.analysis.*;
    import org.apache.lucene.analysis.standard.*;

    class MyAnalyzer extends Analyzer {
      public TokenStream tokenStream(String fieldName, Reader reader) {
        return new PorterStemFilter(
            new StopFilter(
                new LowerCaseFilter(
                    new StandardFilter(
                        new StandardTokenizer(reader))),
                StopAnalyzer.ENGLISH_STOP_WORDS));
      }
    }


Yes, that is correct.

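Analysis starts with a tokenizer, and each filter consumes the output of the stage before it. In your code the innermost constructor runs first, so tokens flow through StandardTokenizer, StandardFilter, LowerCaseFilter, StopFilter, and finally PorterStemFilter. Lowercasing before the stop filter matters because the stop word list is lowercase, and removing stop words before stemming means they are matched in their unstemmed form.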

As you start tinkering with custom analysis, I strongly recommend using a little bit of code to see how your analyzer works on some text. The Lucene Intro article I wrote for java.net has some code you can borrow to do this, as does Lucene in Action's source code. Luke, a tool I also highly recommend, has this capability as well.
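
To make the chaining concrete, here is a rough sketch in plain Java that does not use Lucene at all. Every name in it (AnalyzerChainSketch, tokenize, lowerCase, removeStopWords, crudeStem, and the four-word stop list) is hypothetical and exists only for illustration. In particular, crudeStem just strips a trailing "s" and is not the Porter algorithm. It only shows how each stage consumes the previous stage's output, and how dumping the tokens after each stage lets you check the filter order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AnalyzerChainSketch {

    // Hypothetical stop list for this sketch only; not Lucene's list.
    public static final Set<String> STOP_WORDS =
        new HashSet<>(Arrays.asList("the", "a", "is", "of"));

    // Stand-in for a tokenizer: splits on runs of non-letter characters.
    public static List<String> tokenize(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.split("[^A-Za-z]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Stand-in for a lowercasing filter.
    public static List<String> lowerCase(List<String> in) {
        List<String> out = new ArrayList<>();
        for (String t : in) {
            out.add(t.toLowerCase());
        }
        return out;
    }

    // Stand-in for a stop filter: case-sensitive, so it expects lowercase input.
    public static List<String> removeStopWords(List<String> in, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        for (String t : in) {
            if (!stopWords.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    // Crude stand-in for a stemmer: drops one trailing "s" from longer tokens.
    // This is NOT the Porter algorithm.
    public static List<String> crudeStem(List<String> in) {
        List<String> out = new ArrayList<>();
        for (String t : in) {
            boolean strip = t.length() > 3 && t.endsWith("s");
            out.add(strip ? t.substring(0, t.length() - 1) : t);
        }
        return out;
    }

    // Same nesting shape as the analyzer above: the innermost call runs first.
    public static List<String> analyze(String text) {
        return crudeStem(
            removeStopWords(
                lowerCase(
                    tokenize(text)),
                STOP_WORDS));
    }

    public static void main(String[] args) {
        String text = "The order of the Filters matters";
        // Dump the tokens after each stage to see what every step contributes.
        List<String> tokens = tokenize(text);
        System.out.println("tokenize:        " + tokens);
        tokens = lowerCase(tokens);
        System.out.println("lowerCase:       " + tokens);
        tokens = removeStopWords(tokens, STOP_WORDS);
        System.out.println("removeStopWords: " + tokens);
        tokens = crudeStem(tokens);
        System.out.println("crudeStem:       " + tokens);
    }
}
```

With the sample text, the last line printed is the list [order, filter, matter]. If you swap the first two filters and remove stop words before lowercasing, the capitalized "The" slips past the stop list and comes out as "the". That is the kind of mistake a token dump makes obvious.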

        Erik

