Hello,
I want to change the input text before tokenizing. I think I just need
to use some characters as word separators, and maybe remove some others
completely.
I was planning to use MappingCharFilterFactory to replace some characters
with " " and others with "", but I feel like I'm not on the right track.
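To make the goal concrete, here is a rough sketch in plain Java (no Lucene
involved) of the transformation I am after. The class name CharMappingSketch,
the preprocess method, and the specific characters in the map are placeholders
made up for illustration; the real character set is still undecided.

```java
import java.util.HashMap;
import java.util.Map;

class CharMappingSketch {
    // Placeholder mappings, for illustration only:
    // a character maps to " " to act as a word separator, or to "" to be dropped.
    private static final Map<Character, String> MAPPINGS = new HashMap<>();
    static {
        MAPPINGS.put('-', " ");   // treat as a word separator
        MAPPINGS.put('_', " ");   // treat as a word separator
        MAPPINGS.put('\'', "");   // remove completely
    }

    // Applies the mappings character by character; unmapped characters pass through.
    static String preprocess(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String replacement = MAPPINGS.get(c);
            if (replacement == null) {
                out.append(c);
            } else {
                out.append(replacement);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(preprocess("foo-bar_baz's"));
    }
}
```

With these placeholder mappings, "foo-bar_baz's" would become "foo bar bazs"
before it reaches the tokenizer.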
First, I've implemented a custom analyzer that uses my custom tokenizer. My
idea was to inherit from StandardTokenizer and, inside setReader, call
MappingCharFilterFactory.create(reader) to wrap the incoming reader.
However, setReader is final, so I can't override it.
Is there a better way to do this?
In any case, how should I use MappingCharFilter if I really do need it?