Peter Kirk <peter dot r dot kirk at ntlworld dot com> wrote:

> Thank you, Ken. Well, you make it sound as if the problems are
> minimal, and that version I can just about accept. But if Philippe is
> correct about what he says about UAX#29 and UAX#14, there are some
> more serious problems. It is certainly highly inappropriate for
> non-spacing diacritics to be considered word boundaries.

Non-spacing diacritics had better not be word boundaries, otherwise a string like Québec (spelled with U+0301, as here) would be considered two words.

I don't have time right now to look up the relevant properties and UAX's, but I sincerely hope this is just another "Philippe mistake" and not a general misinterpretation that anyone might make.

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/
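
The Québec example can be checked with a rough Python sketch. This is not the UAX#29 algorithm, only a toy splitter; the function `split_words` and its `marks_extend_word` flag are hypothetical names made up for illustration. It assumes a "word" is a run of letters, and optionally lets combining marks (general categories Mn, Mc, Me) attach to the letter before them.

```python
import unicodedata

def split_words(text, marks_extend_word):
    """Toy word splitter: a word is a run of letters (category L*).

    If marks_extend_word is True, a combining mark (category M*) that
    follows a letter stays inside the current word instead of ending it.
    """
    words, current = [], ""
    for ch in text:
        cat = unicodedata.category(ch)
        is_letter = cat.startswith("L")
        is_mark = cat.startswith("M")  # Mn, Mc, Me
        if is_letter or (marks_extend_word and is_mark and current):
            current += ch
        else:
            if current:
                words.append(current)
            current = ""
    if current:
        words.append(current)
    return words

# "Québec" spelled with e followed by U+0301 COMBINING ACUTE ACCENT
decomposed = "Que\u0301bec"

print(unicodedata.category("\u0301"))          # general category of U+0301
print(split_words(decomposed, False))          # mark treated as a boundary
print(split_words(decomposed, True))           # mark extends the word
```

With the mark treated as a boundary, the string falls apart into "Que" and "bec"; with the mark attached to the preceding letter, it stays a single word.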

