RE: off-topic: handling non-ascii characters in URLs

Birte Glimm Fri, 05 Jan 2001 03:00:22 -0800
True,
it's the browser that encodes the special chars, I think. I sometimes had problems with 
unencoded URLs in Netscape, but IE always translates them correctly.
Birte Glimm

-----Original Message-----
From: Kitching Simon [mailto:[EMAIL PROTECTED]]
Sent: Friday, 5 January 2001 11:58
To: '[EMAIL PROTECTED]'
Subject: off-topic: handling non-ascii characters in URLs


Hi All,

While following a related thread (RE: a simple test to charset), 
a question occurred to me about charset encodings in URLs. 
This isn't really tomcat-related (more to do with HTTP standards) 
but thought someone here might be able to offer an answer.

When a webserver sends content to a browser, it can indicate
the character encoding (ASCII, Latin-1, UTF-8, etc.) in an HTTP
header. However, how is the character encoding specified for data
sent *by* a browser *to* a webserver (i.e. a GET or POST action)?
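
To illustrate the server-to-browser direction, here is a minimal Python sketch
that pulls the charset out of a Content-Type header value. The header values
and the helper name charset_from_content_type are made up for illustration;
they are not taken from any real server.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lowercases the value and returns None if absent.
    return msg.get_content_charset()


# Hypothetical header values, as a server might send them.
latin1 = charset_from_content_type("text/html; charset=ISO-8859-1")
missing = charset_from_content_type("text/html")

print(latin1)   # charset declared by the server
print(missing)  # no charset declared
```

When the parameter is missing, the receiver has to fall back on a default or
a guess, which is the same situation the URL case runs into.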

Andre Alves had an example where an e-accent character
was part of the URL. I saw that IE4 replaced this character
with %E9 when submitting a form using the GET method, but this
really assumes that the receiving webserver is using Latin-1.
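
The ambiguity can be shown with a short Python sketch. It percent-encodes the
e-accent character under two different charsets and then decodes the Latin-1
form with each one. The variable names are only illustrative, and this says
nothing about what any particular browser or server actually does.

```python
from urllib.parse import quote, unquote

char = "\u00e9"  # the e-accent character

# The same character gives different escapes depending on the charset.
as_latin1 = quote(char, encoding="latin-1")  # single byte 0xE9
as_utf8 = quote(char, encoding="utf-8")      # two bytes 0xC3 0xA9

# Decoding %E9 only round-trips if the receiver assumes Latin-1.
decoded_ok = unquote(as_latin1, encoding="latin-1")
# With UTF-8 the lone byte is invalid; unquote() replaces it with U+FFFD.
decoded_wrong = unquote(as_latin1, encoding="utf-8")

print(as_latin1, as_utf8)
print(decoded_ok == char, decoded_wrong == char)
```

Nothing in the escaped URL itself says which charset produced it, so the
receiver has to know or assume one.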

There is this thing called an "entity-header" defined in the HTTP
specs, which may contain a "Content-Type" entry with a charset
parameter. This seems to cover POST requests OK, as the POSTed data
is in an entity-body, and therefore an entity-header can be used to
define its encoding.

But the URLs themselves cannot have their encoding specified by 
an entity-header, because they are not in an entity-body. So does
this mean that all URLs should be restricted to ASCII, and forms
should not use the GET method unless their data content is guaranteed
to be all-ASCII? I remember seeing an article recently about domain
names now being available in Asian ideogram characters, which seems
to indicate otherwise...

Any comments?

Cheers,

Simon

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, email: [EMAIL PROTECTED]