On Jan 6, 2009, at 7:26 PM, 이지홍 wrote:
Thanks for your answers.
I'm sorry, my English writing is not good.
I was talking about the Lucene Sandbox analyzers.
You can find them at the following URL:
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/analyzers/src/java/org/apache/lucene/analysis/
There you can find GermanAnalyzer and FrenchAnalyzer.
Let me ask again:
How are the Lucene Sandbox analyzers different from SnowballAnalyzer?
I would suggest looking at the code. I have never investigated them
at a low level. If I had to guess, I bet they just take different
approaches to stemming. Chances are neither is right or wrong, and
there is no such thing as a perfect stemmer.
If I were you, I would set up a small program that takes some number
of Strings from your documents in each of the languages and then runs
them through each Analyzer, printing out the tokens. I have a
_SAMPLE_ of this in my Lucene Boot Camp training code:
http://www.lucenebootcamp.com/LuceneBootCamp/training/src/test/java/com/lucenebootcamp/training/basic/AnalyzerTest.java
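To make that suggestion concrete, here is a minimal sketch of such a
comparison harness in plain Java. It does not use Lucene at all: the class
name AnalyzerComparison and the two "analyzers" (lowercaseWhitespace and
crudeSuffixStrip) are hypothetical stand-ins invented for illustration, so
that the sketch runs without any libraries. In a real test you would replace
each stand-in with a function that feeds the text through a Lucene Analyzer
and collects its tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

class AnalyzerComparison {

    // Toy stand-in (NOT a Lucene class): lowercase, then split on whitespace.
    static List<String> lowercaseWhitespace(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Toy stand-in (NOT a real stemmer): additionally strips a trailing
    // "ing" or "s". Deliberately crude, to show how outputs can differ.
    static List<String> crudeSuffixStrip(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : lowercaseWhitespace(text)) {
            if (token.length() > 5 && token.endsWith("ing")) {
                token = token.substring(0, token.length() - 3);
            } else if (token.length() > 3 && token.endsWith("s")) {
                token = token.substring(0, token.length() - 1);
            }
            tokens.add(token);
        }
        return tokens;
    }

    // Runs every sample through every analyzer and builds a side-by-side report.
    static String compare(Map<String, Function<String, List<String>>> analyzers,
                          List<String> samples) {
        StringBuilder report = new StringBuilder();
        for (String sample : samples) {
            report.append("Input: ").append(sample).append('\n');
            for (Map.Entry<String, Function<String, List<String>>> entry
                    : analyzers.entrySet()) {
                report.append("  ").append(entry.getKey()).append(" -> ")
                      .append(entry.getValue().apply(sample)).append('\n');
            }
        }
        return report.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the analyzers in insertion order in the report.
        Map<String, Function<String, List<String>>> analyzers = new LinkedHashMap<>();
        analyzers.put("whitespace", AnalyzerComparison::lowercaseWhitespace);
        analyzers.put("suffix-strip", AnalyzerComparison::crudeSuffixStrip);
        System.out.print(compare(analyzers,
                Arrays.asList("Running dogs", "Die Hunde laufen")));
    }
}
```

Running it prints the tokens from each stand-in under every input line,
which makes differences between stemming approaches easy to spot by eye.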
I don't know that.
Which one is best?
Best for what? It's going to depend.
Are you sure that the Snowball analyzer covers the English language?
Yes.
Analyzer analyzer = new SnowballAnalyzer("English");
Please teach me.
Please have a look through more of the documentation and try some
things out.
A simple:
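import java.io.StringReader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.snowball.SnowballAnalyzer;

// Substitute whichever Analyzer you want to inspect.
Analyzer analyzer = new SnowballAnalyzer("English");
TokenStream stream = analyzer.tokenStream("foo", new StringReader("Test String Goes here"));
Token token = new Token();
// next(Token) returns null once the stream is exhausted.
while ((token = stream.next(token)) != null) {
  System.out.println("Token: " + token);
}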
will go a long way toward helping you understand how these Analyzers work.
I am doing Lucene Boot Camp at ApacheCon in Amsterdam, Netherlands in
March. If you can't make that, I suggest you buy the most excellent
"Lucene in Action" by Erik, Otis and Mike M.
(http://www.manning.com/hatcher3). Otherwise, there are plenty of
tutorials and articles on using
Lucene at http://wiki.apache.org/lucene-java/Resources and on the Wiki
itself: http://wiki.apache.org/lucene-java/ which will cover how to
use an analyzer.
You might also check out Solr's Admin UI, which has a built-in way of
outputting tokens to the screen given some user input in a text box.
See the Solr project for more on that.
Good luck,
Grant