You have it right up to 2.
3, 4, and 5 are a single character.
Thx
PM
On Tue, Apr 22, 2008 at 12:01 PM, Steven A Rowe <[EMAIL PROTECTED]> wrote:
> Hi Prashant,
>
> On 04/22/2008 at 2:23 PM, Prashant Malik wrote:
> > We have been observing the following problem while
Hi,
We have been observing the following problem while tokenizing with
Lucene's StandardAnalyzer: the tokens we get differ from machine to
machine. I suspect it has something to do with the locale settings on
the individual machines.
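A minimal Java sketch of one way a locale-dependent default encoding could produce this kind of difference. It assumes the text is stored as UTF-8 and is decoded as ISO-8859-1 on some machines; that is only a guess at the cause. The class and method names are hypothetical, and none of this is Lucene code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Lucene.
class CharsetDemo {
    // Encode the text as UTF-8, then decode those bytes as ISO-8859-1,
    // imitating a reader that uses the wrong charset.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // 'C', e-acute (U+00E9), 's', 'a', 'r'
        String word = "C\u00e9sar";

        // The platform default charset usually follows the locale settings,
        // so it can differ from machine to machine.
        System.out.println("default charset: " + Charset.defaultCharset());

        // The accented letter is two bytes in UTF-8, so decoding with a
        // single-byte charset turns it into two characters.
        System.out.println("original length:   " + word.length());
        System.out.println("misdecoded length: " + misdecode(word).length());
    }
}
```

If the lengths differ on a given input, passing an explicit charset when reading the text (for example to `InputStreamReader`) rather than relying on the default would rule this cause in or out.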
For example
the word 'CÃ(c)sar' is split as 'CÃ