On Sun, Feb 6, 2011 at 3:28 PM, Georger Araujo <georger.ara...@gmail.com> wrote:
> Hi,
> I started using Lucene a few weeks ago, and I must say I'm amazed. Hats off
> to the developers and the community!
> I'd like to write a custom analyzer whose only difference to
> org.apache.lucene.analysis.br.BrazilianAnalyzer is that I want it to discard
> numeric tokens from the input. I've looked at the code and also at the
> discussion in [1], but I'm lost about what is the simplest/cleanest way to
> go.
> What do you think?

Hi, in general the supplied analyzers are general-purpose examples. So I would write your own analyzer, modelled on BrazilianAnalyzer but using a tokenizer that discards numbers (such as LowerCaseTokenizer) instead of StandardTokenizer: something like LowerCaseTokenizer + BrazilianStemFilter + the Brazilian stopwords in a StopFilter.
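
To illustrate why a letter-based tokenizer such as LowerCaseTokenizer drops numbers, here is a minimal plain-Java sketch of the behaviour: split the text at every non-letter character and lowercase what remains. It does not use the Lucene API, and the class and method names (LetterTokenizerSketch, tokenize) are hypothetical, chosen only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a letter-based, lowercasing tokenizer.
// This is NOT Lucene code; it only mimics the splitting rule.
class LetterTokenizerSketch {

    // Splits text at every non-letter character and lowercases each token.
    // Digits are non-letters, so purely numeric tokens never appear.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.isLetter(cp)) {
                current.appendCodePoint(Character.toLowerCase(cp));
            } else if (current.length() > 0) {
                // A non-letter ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Pedido 42 enviado em 2011")); // [pedido, enviado, em]
        System.out.println(tokenize("modelo abc123def"));          // [modelo, abc, def]
    }
}
```

One consequence of this rule: a mixed token such as abc123def is split into abc and def rather than discarded whole. If that matters for your data, a custom TokenFilter that drops numeric tokens after tokenization may be a better fit.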