Hi László,

I found out why ZWSP (U+200B, ZERO WIDTH SPACE) does not work as a word boundary in ICU. They decided that it is a format character rather than a spacing character, so it is not treated as a word boundary. I have submitted a patch for the ICU used in OOo that reverts it to a spacing character.
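
To illustrate the classification, here is a minimal Python sketch. It uses only the standard unicodedata module, not ICU, so it shows how the Unicode data classifies the character and not how ICU's break rules act on it. The helper name describe is made up for this example.

```python
import unicodedata

ZWSP = "\u200b"  # ZERO WIDTH SPACE


def describe(ch):
    """Return (Unicode general category, str.isspace() result) for one character.

    Hypothetical helper for illustration only; it is not part of ICU or OOo.
    """
    return unicodedata.category(ch), ch.isspace()


# An ordinary space is a space separator (Zs); ZWSP is a format character (Cf).
print(describe(" "))    # ('Zs', True)
print(describe(ZWSP))   # ('Cf', False)

# Whitespace-based splitting therefore does not break at ZWSP.
print("ab\u200bcd".split() == ["ab\u200bcd"])  # True
```

Any tokenizer that keys on "is this a space?" will behave the same way, which is consistent with what I see in ICU.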

We are developing a dictionary-based breakiterator like the one ICU has for Thai. The question is: does OpenOffice have special code for tokenization in Thai, or is the ICU breakiterator enough?

I want to know whether, besides writing the ICU breakiterator, I also have to do something in OOo.

Thanks,

Javier






