Fredrik Jervfors wrote:
> ...
> >
> > X writes a homepage in French, using either latin1 or utf8 encoding (but
> > mentions this encoding properly), and of course he uses all the French
> > letters, including e.g. è (e with grave accent).
> >
> > Y is sitting in Poland for example, using a system configured to use a
> > latin2 locale by default. Latin2 lacks e with grave accent. Y visits the
> > homepage of X with some popular graphical web browser.
> >
> > What should happen?
> >
> > Rich says that his browser must (or should?) think in latin2 and hence
> > drop the è letters, maybe replace them with unaccented e or question
> > marks or similar.
> >
> > I say that his browser must show è correctly, it doesn't matter what its
> > locale is.
>
> That depends on the configuration of the browser.
>
> The browser should by default (programmer's choice really) think in the
> encoding X used, since it's tagged with that encoding information.

In what sense do you mean "think in the encoding X used"?  (Are you
talking about internal browser operations like displaying text, or
external operations like saving files?)

For internal operations, shouldn't the browser "think" in terms of
characters?  Isn't that how HTML is defined (in terms of characters,
not byte encodings)?

That is, who cares if è (e with grave accent) doesn't exist in the
system's default Latin-2 encoding?  At least for internal operations,
the browser never has to encode the character into bytes in the
system's default encoding, does it?

The browser received an entity encoded in UTF-8, decoded the UTF-8 byte
sequence, and knows which character it represents.  The browser can get
an appropriate glyph displayed without using the system's
locale-specified encoding, right?  (It would use the encoding of the
font, not any system default encoding, right?)

Wouldn't it only be for external operations (e.g., a "Save Page Source
As" command, or loading files from the local system) where the browser
would care about the local system's encoding?

Daniel
--
Daniel Barclay
[EMAIL PROTECTED]

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
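
P.S.  To make the internal/external distinction concrete, here is a
minimal Python sketch.  It is only an illustration of the argument above,
not a description of how any particular browser is implemented.  The
helper names (decode_page, can_represent, save_in_locale_encoding) and the
choice of ISO-8859-2 as the local encoding are assumptions made for the
example.

```python
# Illustration only: hypothetical helpers, not any real browser's API.

def decode_page(raw_bytes, declared_charset):
    """Internal step: turn the received bytes into characters using the
    encoding the page declares, ignoring the local system's locale."""
    return raw_bytes.decode(declared_charset)


def can_represent(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def save_in_locale_encoding(text, locale_encoding="iso-8859-2"):
    """External step: only here does the locale's encoding matter.
    Characters it lacks are replaced with '?' (one possible policy)."""
    return text.encode(locale_encoding, errors="replace")


# Bytes for "è" as they would arrive from a page labelled charset=utf-8.
raw = b"\xc3\xa8"

# Internally the browser holds the character U+00E8, whatever the locale is.
text = decode_page(raw, "utf-8")

# Displaying needs a font with a glyph for U+00E8, not a Latin-2 byte value.
# Only saving in the locale's encoding runs into the missing character.
saved = save_in_locale_encoding(text)
```

With these definitions, text is the single character U+00E8 regardless of
the locale, while can_represent(text, "iso-8859-2") is False.  So the loss
would only show up in the external step, and only if the browser chose to
save in the locale's encoding rather than in the page's original one.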