Re: tokenizer to strip a set of characters

2013-11-21 Thread Jack Krupansky
The word delimiter filter accepts a table that specifies the type of each character: http://lucene.apache.org/core/4_5_1/analyzers-common/org/apache/lucene/analysis/miscellaneous/WordDelimiterFilter.html
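
To make the idea of a per-character type table concrete, here is a minimal, dependency-free sketch in plain Java. It does not use the Lucene API: the class `EdgeStripper`, the constants `KEEP` and `STRIP`, and the methods `buildTable` and `stripEdges` are hypothetical names invented for this illustration. It also assumes ASCII-only strip characters and, unlike a word delimiter filter, trims only at the edges of a token and never splits in the middle.

```java
public class EdgeStripper {
    // Hypothetical character types for this sketch (not Lucene constants).
    public static final byte KEEP = 0;
    public static final byte STRIP = 1;

    // Build a lookup table indexed by character code (ASCII range only).
    public static byte[] buildTable(String stripChars) {
        byte[] table = new byte[128]; // every entry defaults to KEEP (0)
        for (char c : stripChars.toCharArray()) {
            if (c < table.length) {
                table[c] = STRIP;
            }
        }
        return table;
    }

    private static boolean isStrip(byte[] table, char c) {
        return c < table.length && table[c] == STRIP;
    }

    // Remove STRIP-typed characters from the start and end of a token,
    // leaving any such characters in the middle untouched.
    public static String stripEdges(String token, byte[] table) {
        int start = 0;
        int end = token.length();
        while (start < end && isStrip(table, token.charAt(start))) {
            start++;
        }
        while (end > start && isStrip(table, token.charAt(end - 1))) {
            end--;
        }
        return token.substring(start, end);
    }

    public static void main(String[] args) {
        byte[] table = buildTable("()[],.\"");
        // Whitespace splitting stands in for a whitespace tokenizer here.
        for (String token : "(foo-bar), \"baz.\" a.b".split("\\s+")) {
            System.out.println(stripEdges(token, table));
        }
    }
}
```

In a real analysis chain the same logic would live inside a token filter placed after a whitespace tokenizer; the table lookup is what keeps the per-character check cheap.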

tokenizer to strip a set of characters

2013-11-21 Thread Stephane Nicoll
Hi, I am using Lucene 3.6 and I am looking for a tokenizer that would remove certain characters when they are present at the beginning or at the end of a token. I initially used the StandardAnalyzer and switched to the WhitespaceAnalyzer because the StandardAnalyzer was too aggressive for my use case. A few