Thomas Dickey wrote:
Lynx assumes the document charset is ISO-8859-1 if it's not given. (That was the rule for some time - for HTML - perhaps we're not discussing HTML anymore).
It hasn't been the rule for around a decade; HTML 4.0 overrides HTTP/1.1 for the text/html media type. Failure to specify a charset is an error. Browsers must not assume a default, but may use heuristics. (In practice, one of the heuristics is to use a default!)
Without a charset, therefore, it is reasonable, but not required, for a browser to assume that something that starts with a UTF-* byte order mark is UTF-*.
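For illustration only, here is a minimal sketch of that kind of heuristic in Python. The function names (sniff_bom, guess_charset) and the fallback policy are my own assumptions, not what Lynx or any other browser actually does; the BOM constants come from the standard codecs module.

```python
import codecs

# Longer marks go first: the UTF-32 LE mark begins with the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return the encoding implied by a leading byte order mark, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def guess_charset(data, declared=None, fallback="iso-8859-1"):
    """Hypothetical policy: declared charset first, then BOM, then a default."""
    return declared or sniff_bom(data) or fallback
```

With no declared charset, a document starting with EF BB BF would be treated as UTF-8 here, and one with no mark at all would fall back to the default, which is exactly the "heuristic is to use a default" case.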
Also, outside of the USA and Western Europe, it became quite common practice to use tools that declared windows-1252, etc., but then actually sent the local encoding. People in those regions had no problems, as they locked their browsers into, say, GB2312 and ignored the declared charset completely. Nowadays there is a mix of UTF and GB2312, so that strategy may no longer work.
Setting that to UTF-8 makes it display properly. 0xFE is a valid ISO-8859-1 code, as your terminal emulator shows...
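As a quick check of the 0xFE point, a small Python sketch (the variable names are only for illustration): the byte decodes cleanly as ISO-8859-1, but it can never occur in well-formed UTF-8, so a strict UTF-8 decoder rejects it.

```python
raw = b"\xfe"

# In ISO-8859-1 every byte maps straight to the code point of the same value,
# so 0xFE becomes U+00FE (LATIN SMALL LETTER THORN).
as_latin1 = raw.decode("iso-8859-1")

# In UTF-8 the byte 0xFE is never valid, so strict decoding fails.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
```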
-- David Woolley