On Sunday 03 March 2002 21:21, Tod Harter wrote:
> On Sunday 03 March 2002 13:55, Robin Berjon wrote:
> > It ought to be, except that url-encoding can in fact encode any 8bit
> > charset, which means that it can also take most of the iso-8859-* family.
> > A number of browsers have that misbehaviour.
>
> Maybe it can in the sense that it is POSSIBLE to do so. The problem is it's
> definitely a boo-boo. I think a lot of what has happened with URL encoding
> is that vendors have gone and deployed file systems that use various
> encodings, then they seem incapable of making user agents that properly URL
> encode the resulting paths.

I don't think that's the case. I think that, quite simply, provision has been
made for proper url-encoding but that 1) updating user-agents takes time, and
2) POST data shouldn't use url-encoding at all but multipart/form-data instead,
because it makes a hell of a lot more sense nowadays.

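To make the ambiguity concrete, here is a minimal Python sketch (the sample
string is only an assumption for illustration, not something taken from a real
request). It shows that url-encoding itself says nothing about the charset: the
same text yields different escapes depending on which encoding the user agent
picked, and the receiver has to guess.

```python
from urllib.parse import quote, unquote

# Hypothetical sample text containing one non-ASCII character.
text = "café"

# The same string, percent-encoded under two different charsets.
as_utf8 = quote(text, encoding="utf-8")
as_latin1 = quote(text, encoding="iso-8859-1")

print(as_utf8)    # caf%C3%A9
print(as_latin1)  # caf%E9

# Decoding with the wrong charset silently yields garbage instead of an error.
misread = unquote(as_utf8, encoding="iso-8859-1")
print(misread)    # cafÃ©
```

Nothing in the escaped string tells the server which of the two it received,
which is why this only works when both ends agree on the charset out of band.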
> Realistically even UTF-8 is a hack. ALL the
> software and standards need to be updated, badly. Ideally all software
> should be able to deal with any incoming encoding, and really everything
> should be UTF-16 internally. At least then you have a fighting chance of
> representing an encoding in a consistent internal form. I'd give it about
> 40 years...

No, UTF-8 isn't a hack, it's a well thought out encoding that makes it
possible for a lot of text data to be forward compatible even when it wasn't
created to be so. It also allows a lot of text to not suddenly weigh twice as
much (which UTF-16 doesn't: ASCII-range text doubles in size). Finally, it is
on average (ie statistically) less prone to endian-ness problems than UTF-16
is, since it is a byte-oriented encoding with no byte order to get wrong. In
short, if the language(s) you use is/are within {US-ASCII | Latin-1} then by
all means do use UTF-8 as much as possible as it will -- in the end, or at
least one day -- make your life a *lot* easier. Otherwise use UTF-16
(generalising that further would suck, as it has problems with a lot of \n
terminals, as Java has shown before) [1].

Charset/encoding problems suck very hard, and even if you haven't been bitten
yet, restricting the number of those that are in use in your system will prove
very useful in the long run (if only because you won't notice any problems). I
know I've said that ten times over; I'll stop saying it only when those become
trivial things that one can safely ignore ;)

[1] this rule being subject to local common sense decisions, obviously.
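
For what it's worth, the size and byte-order points above can be checked with
a short Python sketch. The sample string is an assumption chosen for
illustration; the codec names are the standard Python ones.

```python
# Hypothetical pure-ASCII sample text.
ascii_text = "hello world"

utf8 = ascii_text.encode("utf-8")
utf16_le = ascii_text.encode("utf-16-le")  # little-endian, no BOM
utf16_be = ascii_text.encode("utf-16-be")  # big-endian, no BOM

# Forward compatibility: ASCII text is already valid UTF-8, byte for byte.
same_as_ascii = utf8 == ascii_text.encode("ascii")

# Weight: UTF-16 uses two bytes per character here, UTF-8 uses one.
print(len(utf8), len(utf16_le))  # 11 22

# Endian-ness: the two UTF-16 byte streams differ, and reading one with the
# other's byte order decodes without error into unrelated characters.
wrong_order = utf16_le.decode("utf-16-be")
print(same_as_ascii, utf16_le != utf16_be, wrong_order != ascii_text)
```

UTF-8 has only one byte sequence for a given string, so there is no equivalent
of the last failure mode.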
--
_______________________________________________________________________
Robin Berjon <[EMAIL PROTECTED]> -- CTO
k n o w s c a p e : // venture knowledge agency www.knowscape.com
-----------------------------------------------------------------------
Being schizophrenic is better than living alone.