On 29/08/13 19:41, Zack Weinberg wrote:
> All the discussion of fallback character encodings has reminded me of an
> issue I've been meaning to bring up for some time: As a user of the
> en-US localization, nowadays the overwhelmingly most common situation
> where I see mojibake is when a site puts UTF-8 in its pages without
> declaring any encoding at all (neither via <meta charset> nor
> Content-Type). It is possible to distinguish UTF-8 from most legacy
> encodings heuristically with high reliability, and I'd like to suggest
> that we ought to do so, independent of locale.

That seems wise to me, on gut instinct. If the web is moving to UTF-8,
and we are trying to encourage that, then it seems we should assume
UTF-8 is what we are getting unless there are hints that we are wrong,
whether that's the TLD, the statistical profile of the characters, or
something else.

We don't want people to try to move to UTF-8, only to move back because
they haven't figured out how (or are technically unable) to label it
correctly and "it comes out all wrong".

Gerv
_______________________________________________
dev-platform mailing list
https://lists.mozilla.org/listinfo/dev-platform
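
For illustration only, here is a minimal Python sketch of the kind of
check discussed above: treat unlabelled bytes as UTF-8 when they decode
cleanly and contain at least one non-ASCII byte, and otherwise use a
fallback. The function names and the windows-1252 fallback are
assumptions made for this example, not anything Gecko actually does; a
real implementation would also have to cope with streamed input that
ends in the middle of a multi-byte sequence.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data is valid UTF-8 and has at least one non-ASCII byte.

    Hypothetical helper for illustration. Pure-ASCII input returns False
    because it carries no evidence either way.
    """
    try:
        data.decode("utf-8")  # strict decoding: raises on any malformed sequence
    except UnicodeDecodeError:
        return False
    return any(byte >= 0x80 for byte in data)


def guess_encoding(data: bytes, fallback: str = "windows-1252") -> str:
    """Guess the encoding of unlabelled bytes.

    The fallback value is an assumption standing in for whatever
    locale- or TLD-based default would otherwise apply.
    """
    return "utf-8" if looks_like_utf8(data) else fallback


if __name__ == "__main__":
    print(guess_encoding("café".encode("utf-8")))
    print(guess_encoding("café".encode("windows-1252")))
```

This works because legacy-encoded text with high bytes only rarely
happens to form well-formed UTF-8 sequences. It is not infallible: the
windows-1252 bytes for "Ã©", for instance, are also valid UTF-8, so
other hints would still be needed in borderline cases.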

