...I mentioned Thai because it is the only language I know of which does not use SPACE, U+0020. It also has at least some punctuation of its own. So a Thai text need not include any characters in the range U+00xx, which rules out one suggested heuristic method.
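To make the point concrete, here is a minimal sketch in Python. It assumes the heuristic in question amounts to "look for at least one character in U+0000..U+00FF"; the function name and the sample strings are mine, purely for illustration.

```python
def has_u00xx_char(text: str) -> bool:
    """Return True if any code point in text lies in U+0000..U+00FF."""
    return any(ord(ch) <= 0xFF for ch in text)


# Hypothetical sample: the Thai word for "Thai language" followed by
# U+0E2F THAI CHARACTER PAIYANNOI, a Thai punctuation mark.
# No SPACE and no ASCII punctuation appear anywhere in it.
thai_sample = "\u0e20\u0e32\u0e29\u0e32\u0e44\u0e17\u0e22\u0e2f"

# A typical English sample, for contrast.
english_sample = "Hello, world"

print(has_u00xx_char(thai_sample))
print(has_u00xx_char(english_sample))
```

Every code point in the Thai sample falls in the Thai block (U+0E00..U+0E7F), so a check of this kind finds nothing to latch on to, whereas almost any text in a Latin-script language trips it immediately.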
By the way, I still don't quite understand what's special about Thai. Could someone elaborate?
-- Peter Kirk [EMAIL PROTECTED] (personal) [EMAIL PROTECTED] (work) http://www.qaya.org/

