You can easily string together your own tokenizer and any number of filters to create an analyzer that does exactly what you need. Lucene In Action shows an example for creating your own analyzer by assembling the standard parts....
Best Erick On Mon, Apr 18, 2011 at 3:08 AM, Clemens Wyss <clemens...@mysign.ch> wrote: > What is the best way to "avoid" the lowercasing (and still being able to > exclude stop words)? > >> -----Ursprüngliche Nachricht----- >> Von: Simon Willnauer [mailto:simon.willna...@googlemail.com] >> Gesendet: Freitag, 15. April 2011 08:56 >> An: java-user@lucene.apache.org >> Betreff: Re: German*Filter, Analyzer "cutting" off letters from (french) >> words... >> >> On Fri, Apr 15, 2011 at 8:48 AM, Clemens Wyss <clemens...@mysign.ch> >> wrote: >> > Does the StandardAnalyzer lowercase its terms? >> yes! >> >> simon >> > >> >> -----Ursprüngliche Nachricht----- >> >> Von: Clemens Wyss [mailto:clemens...@mysign.ch] >> >> Gesendet: Mittwoch, 13. April 2011 13:34 >> >> An: java-user@lucene.apache.org >> >> Betreff: AW: German*Filter, Analyzer "cutting" off letters from >> >> (french) words... >> >> >> >> This is what I was looking for, thanks >> >> >> >> > -----Ursprüngliche Nachricht----- >> >> > Von: Robert Muir [mailto:rcm...@gmail.com] >> >> > Gesendet: Mittwoch, 13. April 2011 12:11 >> >> > An: java-user@lucene.apache.org >> >> > Betreff: Re: German*Filter, Analyzer "cutting" off letters from >> >> > (french) words... >> >> > >> >> > If you only want to ignore german stopwords, you don't need to use >> >> > the german analyzer with german stemming. you can just use >> >> > StandardAnalyzer with your own stopwords set! >> >> > >> >> > On Wed, Apr 13, 2011 at 3:51 AM, Clemens Wyss >> >> <clemens...@mysign.ch> >> >> > wrote: >> >> > > What I really want to do is ignore german stop words such as >> >> > > "der", "die", >> >> > "das", "ein",... >> >> > > >> >> > >> -----Ursprüngliche Nachricht----- >> >> > >> Von: Robert Muir [mailto:rcm...@gmail.com] >> >> > >> Gesendet: Dienstag, 12. April 2011 17:03 >> >> > >> An: java-user@lucene.apache.org >> >> > >> Betreff: Re: German*Filter, Analyzer "cutting" off letters from >> >> > >> (french) words... >> >> > >> >> >> > >> On Tue, Apr 12, 2011 at 8:46 AM, Clemens Wyss >> >> > <clemens...@mysign.ch> >> >> > >> wrote: >> >> > >> > Why so? Where have the e's gone? >> >> > >> > >> >> > >> >> >> > >> the e is being stemmed as its a german suffix... all of the >> >> > >> german stemming algorithms remove final -e, as do all the french >> >> > >> stemming >> >> > algorithms. >> >> > >> >> >> > >> so i don't understand your problem. >> >> > >> >> >> > >> ---------------------------------------------------------------- >> >> > >> --- >> >> > >> -- To unsubscribe, e-mail: >> >> > >> java-user-unsubscr...@lucene.apache.org >> >> > >> For additional commands, e-mail: >> >> > >> java-user-h...@lucene.apache.org >> >> > > >> >> > > >> >> > >> >> > ------------------------------------------------------------------- >> >> > -- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> >> > For additional commands, e-mail: java-user-h...@lucene.apache.org >> > >> > >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org