On Oct 07, 2006 12:40 AM, Christer Olsson wrote:
> >>> In my initial message, I noted that entering the
> >>> following string into a web form:
> >>>
> >>> Mü! (second letter is umlauted 'u')
> >>>
> >>> Causes the web browser to encode it as follows:
> >>>
> >>> M%FC%21
> >>
> >> This doesn't look really correct. Mü! as a URL encoded (Hex
> >> bytes) string should read M%C3%BC%21.
> >
> >Indeed, the UTF-8 chart I am looking at lists %C3%BC as
> >the way to hex-encode the umlauted 'u'. However, every web
> >browser I've tried sends that character as %FC.
> >
> Are you sure you have set the character set for your form
> to UTF-8? I have tested Safari, Firefox and Opera, and they
> are all sending it as %C3%BC.
>
> >Even if the web browser sends %C3%BC for the character, though,
> >decodeURLComponent doesn't interpret it correctly... the
> >following command:
> >
> >msgBox decodeURLComponent("%C3%BC")
> >
> >...yields a dialog box containing a square root symbol and a degree
> >symbol (values C3 and BC from the MacRoman encoding).
>
> That's true. If you forget to tell decodeURLComponent what
> encoding you're using :-) Try decodeURLComponent("%C3%BC",
> encodings.UTF8).

Christer is right. If the page from which you submit your data does not
declare its charset as UTF-8, for example with
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
then you will probably get ISO-8859-1 if no other charset is
defined.

I tested this with two pages, each containing a form:
- one without any charset: data is encoded using ISO-8859-1.
- one with the UTF-8 charset: data is encoded using UTF-8.
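
The same difference can be reproduced outside a browser. Here is a minimal sketch in Python (not REALbasic, purely for illustration) that percent-encodes the test string under each charset. The helper name encode_form_value is made up for this example, and it only approximates what a browser does when submitting a form:

```python
from urllib.parse import quote

def encode_form_value(text, charset):
    """Percent-encode text using the given charset.

    Hypothetical helper for illustration only; real browsers apply
    additional form-encoding rules (for example, spaces become '+').
    """
    return quote(text, safe="", encoding=charset)

text = "M\u00fc!"  # "M", u-umlaut, "!"

latin1_result = encode_form_value(text, "iso-8859-1")
utf8_result = encode_form_value(text, "utf-8")

print(latin1_result)  # M%FC%21
print(utf8_result)    # M%C3%BC%21
```

The single byte FC is the ISO-8859-1 value of the umlauted 'u', while C3 BC is its two-byte UTF-8 form, which matches the two results seen in the thread.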

You can also do this test:
Go to Google, enter "Mü", and click Search. Then look at the URL and
you'll see "%C3%BC" in it, because Google uses the UTF-8 charset.
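
The decoding side discussed in the quoted messages behaves the same way: the bytes are identical, and only the encoding used to interpret them changes. A rough Python sketch (again only an illustration, not the REALbasic call itself; code_points is a hypothetical helper) decodes "%C3%BC" as UTF-8 and as MacRoman:

```python
from urllib.parse import unquote

def code_points(s):
    """Return the Unicode code points of s as 'U+XXXX' strings."""
    return [f"U+{ord(c):04X}" for c in s]

encoded = "%C3%BC"

# Correct interpretation: the two bytes form one UTF-8 character.
as_utf8 = unquote(encoded, encoding="utf-8")

# Wrong interpretation: each byte is treated as a separate MacRoman character.
as_macroman = unquote(encoded, encoding="mac_roman")

print(code_points(as_utf8))      # ['U+00FC']
print(code_points(as_macroman))  # ['U+221A', 'U+00BA']
```

U+221A is the square root sign, and U+00BA is the masculine ordinal indicator, which looks much like a degree sign. That is consistent with the two symbols reported in the dialog box when no encoding is passed.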

Carlos
