On Thu, Apr 2, 2009 at 7:30 PM, Je suis la poubelle <laps...@gmail.com> wrote:
> On Fri, Mar 27, 2009 at 5:34 PM, Christopher Schultz
> <ch...@christopherschultz.net> wrote:
>
> Setting charset/encoding is to specify computerized information. It's
> not just a matter of language. If setting charset in META tag doesn't mean
> anything to you, the same argument applies to setting charset in HTTP
> header.

Well, this is the only argument I can agree with. But the encoding of
HTML/XML is a question of which came first, the chicken or the egg. I'll
give you an example based on our dreadful experiences with XML parsing.

Let's say we have a stream that looks like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <foo>bar</foo>

However, the whole stream is actually encoded in some weird encoding
you've never heard of. The parser needs to know the encoding /in advance/
to be able to read the encoding declaration from that very stream. See the
point? It's good practice to declare the encoding, but that's about it,
and the same goes for a META tag. On the web, the only thing a parser can
rely on is an HTTP header.

And it gets really nuts when it comes to UTF-8: are we talking about UTF-8
with or without a BOM? Even the specs are not clear about that.

In my opinion, the whole character-set business is a pain in the ass. I
personally wish the IETF would come up with a spec saying something like
"the first n bytes of any stream have to be encoded in ASCII and contain
the length and encoding type of the rest of the stream". I'm putting that
on my wishlist for Xmas.

Rgds
Gregor
--
just because you're paranoid, doesn't mean they're not after you...
gpgp-fp: 79A84FA526807026795E4209D3B3FE028B3170B2
gpgp-key available
@ http://pgpkeys.pca.dfn.de:11371
@ http://pgp.mit.edu:11371/
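
For anyone who wants to see the chicken-and-egg problem in bytes, here is a minimal Python sketch. It is an illustration only, not how Tomcat or any particular XML parser works. The helper names (sniff_bom, declaration_readable_as_ascii) are made up for this example, and cp037 (an EBCDIC code page) simply stands in for "some weird encoding". The same document is encoded several ways, and the sketch checks whether a reader that knows nothing in advance could even find the "<?xml" declaration, or a byte-order mark to guess from.

```python
import codecs

# The same document as in the example above; the declared label is just text.
XML_DOC = '<?xml version="1.0" encoding="UTF-8"?>\n<foo>bar</foo>\n'

# Known byte-order marks. UTF-32 comes first because the UTF-32-LE BOM
# starts with the same two bytes as the UTF-16-LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length) if data starts with a known BOM, else (None, 0)."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def declaration_readable_as_ascii(data: bytes) -> bool:
    """True if the XML declaration is visible when the bytes are read as ASCII."""
    return data.startswith(b"<?xml")


# Hypothetical sample streams: same characters, different bytes on the wire.
SAMPLES = {
    "utf-8, no BOM": XML_DOC.encode("utf-8"),
    "utf-8, with BOM": codecs.BOM_UTF8 + XML_DOC.encode("utf-8"),
    "utf-16-le, with BOM": codecs.BOM_UTF16_LE + XML_DOC.encode("utf-16-le"),
    "utf-16-le, no BOM": XML_DOC.encode("utf-16-le"),
    "cp037 (EBCDIC)": XML_DOC.encode("cp037"),
}

if __name__ == "__main__":
    for label, data in SAMPLES.items():
        encoding, bom_len = sniff_bom(data)
        # Skip the BOM (if any) before looking for the declaration.
        readable = declaration_readable_as_ascii(data[bom_len:])
        print(f"{label:22} bom={encoding!s:10} ascii-readable declaration={readable}")
```

In the UTF-16 and EBCDIC cases the declaration is there, but a reader that assumes ASCII cannot see it, which is the point made above: without a BOM or out-of-band information such as the HTTP header, the reader has to guess before it can read the label that would tell it what to guess.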