Re: analyzers-common VS analyzers-icu

2016-06-08 Thread Daniel Bigham
Any other replies to this? Timothy's response was somewhat helpful, but it didn't answer authoritatively what the current status of these two different "forks" of language analyzers is. Surely there is some history here, and some high-level status for each of them? (perhaps I should look at git a

RE: analyzers-common VS analyzers-icu

2016-06-01 Thread Allison, Timothy B.
That package has an ICU tokenizer and the ICUFoldingFilter. The ICUFoldingFilter does advanced (well, Unicode-compliant) case folding/lowercasing/normalization and is critical for non-ASCII languages. You can use that in place of the ASCIIFoldingFilter and the LowerCaseFilter, and it should
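
To make "case folding/lowercasing/normalization" a bit more concrete, here is a rough, JDK-only sketch of that kind of folding. It is not the ICUFoldingFilter implementation and does not use Lucene at all; the class name FoldingSketch and the method unicodeFold are made up for illustration, and real ICU folding covers many more cases than this approximation does.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical illustration only: approximates Unicode-aware folding with the JDK.
// The real ICUFoldingFilter is more thorough than this sketch.
public class FoldingSketch {

    // Fold a string for matching purposes:
    // 1. NFKD decomposition splits accented letters into base letter + combining mark
    //    and rewrites compatibility forms (ligatures, fullwidth letters) as plain letters.
    // 2. Combining marks (Unicode category M) are removed.
    // 3. The result is lowercased in a locale-independent way.
    static String unicodeFold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFKD);
        String withoutMarks = decomposed.replaceAll("\\p{M}+", "");
        return withoutMarks.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file ASCII-only.
        System.out.println(unicodeFold("\u00C9COLE")); // E with acute accent + "COLE"
        System.out.println(unicodeFold("\uFB01n"));    // "fi" ligature + "n"
        System.out.println(unicodeFold("\uFF21BC"));   // fullwidth "A" + "BC"
    }
}
```

The point of the sketch is only that accents, ligatures and width variants all have to be handled for non-ASCII text to match reliably, which is more than an ASCII-oriented filter chain is designed to do.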