Chris Hostetter wrote:
: I need to tokenize my field on whitespaces, html, punctuation, apostrophe
: but if I use HTMLStripStandardTokenizerFactory it strips only html
: but no apostrophes
you might consider using one of the HTML Tokenizers, and then use a
PatternReplaceFilterFactory ... or if you know Java write
Hi all,
I need to tokenize my field on whitespaces, html, punctuation, apostrophe
but if I use HTMLStripStandardTokenizerFactory it strips only html but no
apostrophes
If I use PatternTokenizerFactory I don't know if I can create a pattern to
tokenize all of these characters...(html, apo
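
On the question of whether a single pattern can cover all of these delimiters, here is a minimal plain-Java sketch (not Solr code) for trying the regex outside the analyzer chain. The class name TokenSplitSketch and the method tokenize are hypothetical, and the tag matcher is deliberately naive: it does not handle entities such as &amp;, comments, or scripts. The character class [\s\p{Punct}]+ matches runs of whitespace and ASCII punctuation, which includes the apostrophe. The same class could presumably be tried as the pattern attribute of PatternTokenizerFactory, but that is an untested assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for experimenting with the delimiter regex; not part of Solr.
final class TokenSplitSketch {
    // Naive tag matcher: anything from '<' to the next '>'. Not a real HTML parser.
    private static final Pattern TAGS = Pattern.compile("<[^>]*>");
    // Runs of whitespace or ASCII punctuation; \p{Punct} includes the apostrophe.
    private static final Pattern DELIMS = Pattern.compile("[\\s\\p{Punct}]+");

    static List<String> tokenize(String input) {
        // Replace tags with a space so words on either side of a tag stay separate.
        String text = TAGS.matcher(input).replaceAll(" ");
        List<String> tokens = new ArrayList<>();
        for (String t : DELIMS.split(text)) {
            // split() can yield an empty leading element when the text starts with a delimiter.
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("<p>l'amore, oggi</p>"));
    }
}
```

Replacing tags with a space rather than an empty string keeps text such as "a<br>b" from collapsing into a single token.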