=?iso-8859-1?Q?Ren=E9_Ferr=E9ro?= writes:
>  --- Morus Walter <[EMAIL PROTECTED]> a
> �crit : > > 
> 
> > Hmm.
> > Isn't
> > org.apache.lucene.analysis.snowball.SnowballFilter
> > already
> > what you described?
> 
> Thanks for your information.
> Where can I get this source code ?
> 
> I don't find it in the lucene 1.2 or 1.3 source
> distribution, neither in the snowball one.
> 
SnowballFilter is part of snowball.
It's described in the API-Docs and the source is 
...snowball-0.1/src/java/org/apache/lucene/analysis/snowball/SnowballFilter.java

The line I quoted stems from a small extension I made to one of lucenes
analyzers (quick and dirty). The full source for a small program that shows
the effect of the analyzer for text read from a file or stdin is

=====
import java.io.*;
import java.util.*;
import org.apache.lucene.analysis.*;
import org.apache.lucene.analysis.de.GermanAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.snowball.SnowballFilter;

class SnowballAnalyzer extends Analyzer {

    private Hashtable stopTable;

    /** An array containing some common English words that are usually not
        useful for searching. */
    public static final String[] STOP_WORDS = {
    };

    /** Builds an analyzer. */
    public SnowballAnalyzer() {
        this(STOP_WORDS);
    }

    /** Builds an analyzer with the given stop words. */
    public SnowballAnalyzer(String[] stopWords) {
        stopTable = StopFilter.makeStopTable(stopWords);
    }

    public final TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream result = new StandardTokenizer(reader);
        result = new StandardFilter(result);
        result = new SnowballFilter(result, "German");
        result = new LowerCaseFilter(result);
        result = new StopFilter(result, stopTable);
        return result;
    }
}

class TestToken {
    static void usage() {
        System.out.println("java TestToken (std|german|snowball) [<file>]");
        System.exit(1);
    }

    public static void main(String[] args) {
        try {

            String stop[] = { };
            Analyzer myAnalyzer = null;

            if ( args[0].equals("std") ) {
                myAnalyzer = new StandardAnalyzer(stop);
            }
            else if ( args[0].equals("german") ) {
                myAnalyzer = new GermanAnalyzer(stop);
            }
            else if ( args[0].equals("snowball") ) {
                myAnalyzer = new SnowballAnalyzer(stop);
            }
            else {
                usage();
            }

            java.io.Reader reader;
            if ( args.length == 2 ) {
                reader = new FileReader(args[1]);
            }
            else {
                reader = new InputStreamReader(System.in);
            }

            final TokenStream in = myAnalyzer.tokenStream(null,reader);
            for(;;) {
                final Token token = in.next();
                if (token == null) {
                    break;
                }
                System.out.println("[" + token.termText() + "]");
            }
        }
        catch (Exception e) {
            System.out.println(" caught a " + e.getClass() +
                               "\n with message: " + e.getMessage());
        }
    }
}

=====

I guess you're interested in the SnowballAnalyzer class.

HTH
        Morus

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to