On 13/01/2004 04:10, Marco Cimarosti wrote:

...

In this case (as in most other similar cases), you should rather blame the
people who send you e-mail without encoding declaration.



I get plenty of them. But then I assume that they default to ASCII or Windows-1252. Is there in fact a formal default for e-mail, HTML, etc. when no encoding is declared?
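
To make that assumption concrete, here is a minimal Python sketch of the fallback I have in mind. It is only an illustration of my own working guess, not a formal rule, and the function name decode_undeclared is hypothetical.

```python
def decode_undeclared(raw):
    """Decode bytes that arrived with no declared charset.

    Returns (text, encoding_used). Assumption: strict ASCII is tried
    first, then Windows-1252 (Python codec name "cp1252") as fallback.
    """
    try:
        return raw.decode("ascii"), "ascii"
    except UnicodeDecodeError:
        # A few byte values are undefined in cp1252; errors="replace"
        # substitutes U+FFFD for them instead of raising.
        return raw.decode("cp1252", errors="replace"), "cp1252"
```

A pure ASCII message comes back labelled "ascii"; anything with high bytes is read as Windows-1252, which may of course be the wrong guess for non-Western text.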

...

I don't think that Thai would be such a case. Thai normally uses European
digits (the usage scope of Thai digits is probably similar to that of Roman
numerals in Western languages), some European punctuation (parentheses,
exclamation marks, hyphens, quotes), and spaces (although a Thai space has
the strength -- and hence the frequency -- of a Western semicolon).



In some English texts the combined frequency of digits, parentheses, exclamation marks, quotes and semicolons is minimal, so the same may well be true of their Thai counterparts. Does Thai use the basic Latin hyphen as part of the spelling of common words? Apart from these characters, there is no guarantee that any basic Latin characters will be used.
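
The frequency question could be checked on real texts with something like the following Python sketch. The function name basic_latin_share is hypothetical, and counting U+0020 to U+007E as "basic Latin" (space included, control characters excluded) is my own assumption for illustration.

```python
def basic_latin_share(text):
    """Return the fraction of code points in text that fall in the
    printable Basic Latin range U+0020..U+007E (space included)."""
    if not text:
        return 0.0
    hits = sum(1 for ch in text if "\u0020" <= ch <= "\u007e")
    return hits / len(text)
```

Running this over a corpus of Thai text would show how often such characters actually occur, rather than relying on impressions.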

As a minimum, all languages should use line feed and/or new line as line
terminators, as Unicode's line and paragraph separators never caught on.



Yes, but have they caught on in some countries/languages/applications/OSs? And will they catch on in future? Anyway, some texts use very long paragraphs and so contain very few explicit line feeds etc.
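
For anyone wanting to test this on actual data, a rough Python sketch that tallies which line terminators a text uses is given below. The name count_line_terminators is hypothetical, and the set of terminators counted (CRLF, LF, CR, NEL, and Unicode's line and paragraph separators) is my own choice.

```python
import re
from collections import Counter

# CRLF is listed first so that it is counted once, not as CR plus LF.
_TERMINATORS = re.compile("\r\n|[\n\r\u0085\u2028\u2029]")

def count_line_terminators(text):
    """Return a Counter mapping each terminator string found in text
    to its number of occurrences."""
    return Counter(_TERMINATORS.findall(text))
```

A text with very long paragraphs would simply give small counts relative to its length, whichever terminator it uses.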


-- Peter Kirk [EMAIL PROTECTED] (personal) [EMAIL PROTECTED] (work) http://www.qaya.org/




