Peter Kirk
Tue, 13 Jan 2004 06:24:34 -0800
...I get plenty of them. But then I assume that they default to ASCII or Windows-1252. Is there in fact a formal default for e-mail, HTML etc without encoding declaration?
In this case (as in most other similar cases), you should rather blame the people who send you e-mail without encoding declaration.
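To make the assumption above concrete, here is a minimal sketch of decoding text that arrives with no encoding declaration: try strict ASCII first, then fall back to Windows-1252. The function name `decode_undeclared` is hypothetical, and the fallback order is only the working assumption described in this thread, not a formal default from any standard.

```python
def decode_undeclared(raw: bytes) -> tuple[str, str]:
    """Decode bytes that carry no charset label.

    Returns the decoded text and the name of the encoding that was assumed.
    The ASCII-then-Windows-1252 order is an assumption, not a standard.
    """
    try:
        # Pure 7-bit data decodes the same way under either assumption.
        return raw.decode("ascii"), "ascii"
    except UnicodeDecodeError:
        # Bytes that Windows-1252 leaves undefined become U+FFFD
        # instead of raising an error.
        return raw.decode("cp1252", errors="replace"), "cp1252"


if __name__ == "__main__":
    for sample in (b"plain text", b"caf\xe9"):
        text, assumed = decode_undeclared(sample)
        print(assumed, repr(text))
```

A guess like this cannot tell Windows-1252 apart from any other 8-bit encoding, which is why the missing declaration is the real problem.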
...In some English texts the combined frequency of digits, parentheses, exclamation marks, quotes and semicolons is minimal, so perhaps similarly for their Thai counterparts. Does Thai use the basic Latin hyphen as part of the spelling of common words? Apart from them there is no guarantee that any basic Latin characters will be used.
I don't think that Thai would be such a case. Thai normally uses European digits (the usage scope of Thai digits is probably similar to that of Roman numerals in Western languages), some European punctuation (parentheses, exclamation marks, hyphens, quotes), and spaces (although a Thai space has the strength -- and hence the frequency -- of a Western semicolon).
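The question of how often Basic Latin characters turn up in a given text can be checked directly. The sketch below counts the share of characters in the range U+0020 to U+007E (space through tilde); the function name `basic_latin_share` and the sample string are illustrative assumptions, not data from this thread.

```python
def basic_latin_share(text: str) -> float:
    """Return the fraction of characters in U+0020..U+007E (0.0 if empty)."""
    if not text:
        return 0.0
    # Count spaces, digits, Latin letters and ASCII punctuation.
    count = sum(1 for ch in text if "\u0020" <= ch <= "\u007e")
    return count / len(text)


if __name__ == "__main__":
    # Illustrative Thai sample with European digits, spaces and parentheses.
    sample = "ราคา 100 บาท (รวมภาษี)"
    print(f"{basic_latin_share(sample):.2f}")
```

Running this over a real corpus, rather than one short string, would be needed before drawing any conclusion about Thai in general.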
Yes, but has it caught on in some countries/languages/applications/OSs? And will it catch on in future? Anyway, some texts use very long paragraphs and so very few explicit line feeds etc.
As a minimum, all languages should use line feed and/or new line as line terminators, as Unicode's line and paragraph separators never caught on.
-- Peter Kirk [EMAIL PROTECTED] (personal) [EMAIL PROTECTED] (work) http://www.qaya.org/