René Ferréro writes:
> --- Morus Walter <[EMAIL PROTECTED]> wrote:
>
> > Hmm.
> > Isn't org.apache.lucene.analysis.snowball.SnowballFilter
> > already what you described?
>
> Thanks for the information.
> Where can I get this source code?
>
> I can't find it in the Lucene 1.2 or 1.3 source
> distribution, nor in the snowball one.
>
SnowballFilter is part of snowball.
It's described in the API docs, and the source is in
...snowball-0.1/src/java/org/apache/lucene/analysis/snowball/SnowballFilter.java
The line I quoted stems from a small extension I made to one of Lucene's
analyzers (quick and dirty). Here is the full source of a small program that
shows the effect of an analyzer on text read from a file or stdin:
=====
import java.io.*;
import java.util.*;
import org.apache.lucene.analysis.*;
import org.apache.lucene.analysis.de.GermanAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.snowball.SnowballFilter;
class SnowballAnalyzer extends Analyzer {
    private Hashtable stopTable;

    /** Default stop words: empty, so nothing is filtered unless a list
        is passed to the constructor. */
    public static final String[] STOP_WORDS = {
    };

    /** Builds an analyzer. */
    public SnowballAnalyzer() {
        this(STOP_WORDS);
    }

    /** Builds an analyzer with the given stop words. */
    public SnowballAnalyzer(String[] stopWords) {
        stopTable = StopFilter.makeStopTable(stopWords);
    }

    public final TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream result = new StandardTokenizer(reader);
        result = new StandardFilter(result);
        // Lower-case before stemming: the Snowball stemmers expect lower-case input.
        result = new LowerCaseFilter(result);
        result = new SnowballFilter(result, "German");
        // Runs after stemming, so stop words must be given in stemmed form.
        result = new StopFilter(result, stopTable);
        return result;
    }
}
class TestToken {
    static void usage() {
        System.out.println("java TestToken (std|german|snowball) [<file>]");
        System.exit(1);
    }

    public static void main(String[] args) {
        // Without this check, args[0] below throws when no analyzer name is given.
        if ( args.length == 0 ) {
            usage();
        }
        try {
            String stop[] = { };
            Analyzer myAnalyzer = null;
            if ( args[0].equals("std") ) {
                myAnalyzer = new StandardAnalyzer(stop);
            }
            else if ( args[0].equals("german") ) {
                myAnalyzer = new GermanAnalyzer(stop);
            }
            else if ( args[0].equals("snowball") ) {
                myAnalyzer = new SnowballAnalyzer(stop);
            }
            else {
                usage();
            }
            // Read from the named file if given, otherwise from stdin.
            java.io.Reader reader;
            if ( args.length == 2 ) {
                reader = new FileReader(args[1]);
            }
            else {
                reader = new InputStreamReader(System.in);
            }
            final TokenStream in = myAnalyzer.tokenStream(null,reader);
            // Print each token the analyzer produces, one per line.
            for(;;) {
                final Token token = in.next();
                if (token == null) {
                    break;
                }
                System.out.println("[" + token.termText() + "]");
            }
        }
        catch (Exception e) {
            System.out.println(" caught a " + e.getClass() +
                               "\n with message: " + e.getMessage());
        }
    }
}
=====
I guess you're interested in the SnowballAnalyzer class.
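In case the chaining in tokenStream() isn't obvious: each filter wraps the
stream before it and pulls tokens from it on demand, so the order of the
"result = new ...Filter(result)" lines is the order in which a token is
processed. Below is a rough, Lucene-free sketch of that pattern only. All
names in it (MiniStream, WhitespaceSource, LowerCaseStage, ToySuffixStage,
ChainSketch) are made up for illustration and are not Lucene classes, and the
"strip a trailing s" stage is a toy placeholder, not a real stemmer.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for a token stream: next() returns null when exhausted. */
interface MiniStream {
    String next();
}

/** Source stage: splits the input text on whitespace. */
class WhitespaceSource implements MiniStream {
    private final Iterator<String> words;

    WhitespaceSource(String text) {
        List<String> list = new ArrayList<String>();
        for (String w : text.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                list.add(w);
            }
        }
        words = list.iterator();
    }

    public String next() {
        return words.hasNext() ? words.next() : null;
    }
}

/** Filter stage: lower-cases whatever the wrapped stream delivers. */
class LowerCaseStage implements MiniStream {
    private final MiniStream input;

    LowerCaseStage(MiniStream input) {
        this.input = input;
    }

    public String next() {
        String t = input.next();
        return t == null ? null : t.toLowerCase(Locale.ROOT);
    }
}

/** Filter stage: strips one trailing "s" (toy placeholder for a stemmer). */
class ToySuffixStage implements MiniStream {
    private final MiniStream input;

    ToySuffixStage(MiniStream input) {
        this.input = input;
    }

    public String next() {
        String t = input.next();
        if (t == null) {
            return null;
        }
        return (t.length() > 1 && t.endsWith("s")) ? t.substring(0, t.length() - 1) : t;
    }
}

class ChainSketch {
    /** Builds the chain source -> lower-case -> toy suffix stage and drains it. */
    static List<String> analyze(String text) {
        MiniStream result = new WhitespaceSource(text);
        result = new LowerCaseStage(result);
        result = new ToySuffixStage(result);
        List<String> out = new ArrayList<String>();
        for (String t = result.next(); t != null; t = result.next()) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        for (String t : analyze("Two Cats sleep")) {
            System.out.println("[" + t + "]");
        }
    }
}
```

Swapping the two filter lines in analyze() changes what the suffix stage
sees (mixed-case instead of lower-case tokens), which is the same reason the
order of the filters matters in a real analyzer.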
HTH
Morus