* Clark Christensen <[EMAIL PROTECTED]> [2007-04-05 17:25]:
> I hate it when the CGI transaction clobbers characters. You
> can set the content-encoding in the HTML to UTF-8, and it might
> help, but I think the conversion from the urlencoded value is
> dependent on the web server platform's encoding (OS codepage,
> app platform settings, etc.)
This description of the overall behaviour is grossly wrong in a
number of ways, but I don’t have the stamina right now to drop
over to Google and peel back the layers on this onion. Suffice it
to say there are a terrible number of annoying niggly details, as
ever when both “web” and “charset” show up in a single sentence.
(The first place I’d look is HTML5; the WHATWG is doing a good
job of documenting actually implemented browser behaviour, so if
they’ve written any spec text about this, that is likely to be
a good summary of what real browsers really do.)
> Plus, you run the risk of a user forcing the browser's encoding
> to something other than what you intended.
You may want to take a look at this:
HEBCI: HTML Entity-Based Codepage Inference
http://www.joshisanerd.com/set/
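As I understand it, the idea is to put a few character entities in
a hidden form field and then look at the bytes that come back: the
browser encodes them in whatever charset it actually used for the
submission, so the bytes act as a fingerprint. Below is a rough
sketch in Python, not the HEBCI author's code. The probe characters,
the candidate list, and the function names are all my own
assumptions, picked for illustration.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical probe: e-acute and the euro sign. In the form this would
# be something like
#   <input type="hidden" name="probe" value="&eacute;&euro;">
# The real HEBCI page chooses its own set of entities.
PROBE_CHARS = "\u00e9\u20ac"

# Assumed candidate charsets; extend to whatever you expect to see.
CANDIDATES = ["utf-8", "cp1252", "iso-8859-15"]


def build_fingerprints(probe, candidates):
    """Map the byte sequence each charset produces for the probe to its name."""
    table = {}
    for name in candidates:
        try:
            encoded = probe.encode(name)
        except UnicodeEncodeError:
            # This charset cannot represent the probe, so it has no fingerprint.
            continue
        # If two charsets encode the probe identically, keep the first one.
        table.setdefault(encoded, name)
    return table


def infer_charset(raw_probe_value, fingerprints):
    """Guess the submission charset from the still percent-encoded probe field.

    Returns None when the bytes match no known fingerprint.
    """
    return fingerprints.get(unquote_to_bytes(raw_probe_value))
```

You have to feed it the raw, still percent-encoded value of the
field; once your CGI library has decoded the query string into
characters, the information is gone. If two candidates produce the
same bytes for the probe, the sketch cannot tell them apart, which
is why the choice of probe characters matters.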
Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>