
Re: Spliting of words

2005-09-13 Thread Paul Libbrecht

Madhu, Analyzer is the magic word here. Lucene's StandardAnalyzer has a whole grammar for splitting text into tokens. There are many more analyzers, most of which are language-specific (e.g. based on the Snowball or Porter stemmers; see the contribs or the javadoc of the core). For which language do you wish to u

RE: Spliting of words

2005-09-13 Thread Kunemann Frank

This depends on the analyzer you are using. You can find the standard analyzers in the package org.apache.lucene.analysis. To find out what they do, I recommend the example called "AnalyzerDemo" in section 4.2.3 of Lucene in Action. If you don't have the book, you can also download the examples from http://www.manning
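
To give a rough feel for what "splitting words into tokens" means, here is a minimal sketch in plain Java. It does not use Lucene and is not how StandardAnalyzer works internally; the class name SimpleAnalyzerSketch and the method tokenize are made up for illustration. It only assumes a very simple rule: break the text at every character that is not a letter or digit, and lowercase what remains.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not part of Lucene and not its algorithm.
class SimpleAnalyzerSketch {

    // Splits text at every non-alphanumeric character and lowercases the pieces.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                // Still inside a token: collect the lowercased character.
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A separator ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        // Flush the last token if the text did not end with a separator.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [splitting, of, words, xy, z, corporation, 2005, 09, 13]
        System.out.println(tokenize("Splitting of words: XY&Z Corporation, 2005-09-13"));
    }
}
```

A real analyzer will usually split differently from this sketch, which is why the choice of analyzer decides how your words end up in the index.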