I am working on a CGI program and am having trouble when non-ASCII
characters (such as accented roman characters, for example) are
submitted as form data.
Let's say that the following string is submitted as part of a POST
query string:
Mü! (The second letter is a u with an umlaut, in case it doesn't
survive the e-mail gauntlet.)
The string is encoded in the query string as:
M%FC%21
Since query strings are supposed to be %xx encoded as UTF-8, that
looks fine so far (I checked <http://www.utf8-chartable.de/> and the
hex codes match up).
So the data is sent to my program, and I capture it in a variable,
say "s". If I then do the following:
t=decodeURLComponent(s)
...then t becomes:
M¸! (The second character here is a free-standing cedilla.)
So while the exclamation point was restored, the umlauted 'u' was
not. Also, the debugger lists t.encoding as nil.
I've tried both defining and converting the encoding of 's'
beforehand, setting it to US-ASCII, but no joy. I've tried defining
the encoding of 't' afterward, setting it to UTF-8, but no joy there
either.
Is this a flaw in decodeURLComponent, or am I using it incorrectly?
Is there a way to tell decodeURLComponent to interpret the %xx
entities as UTF-8 values? (And what lookup table is it using,
anyway... in UTF-8, the cedilla is %B8...?)
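For what it's worth, here is a small byte-level check of the string above. It is sketched in Python rather than REALbasic, purely to show the bytes involved. The variable names and the decode_form_value helper are hypothetical, and the ISO-8859-1 default is only an assumption about what the browser sent. It suggests that %FC is the single-byte ISO-8859-1 code for the umlauted 'u', that UTF-8 would instead use the two bytes C3 BC, and that the same byte FC read as MacRoman is exactly the free-standing cedilla.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

raw = "M%FC%21"  # the submitted value quoted in this post

# Undo only the %xx escaping, leaving raw bytes with no encoding applied.
payload = unquote_to_bytes(raw)

# The same three bytes, interpreted under two different single-byte encodings.
as_latin1 = payload.decode("iso-8859-1")   # 0xFC is u-umlaut here
as_macroman = payload.decode("mac_roman")  # 0xFC is a cedilla here

# What the escaped form would look like if the sender had used UTF-8.
utf8_escaped = quote("M\u00fc!", encoding="utf-8")


def decode_form_value(value: str, charset: str = "iso-8859-1") -> str:
    """Hypothetical helper: percent-decode, then interpret bytes as charset.

    The default charset is an assumption, not something the request states.
    """
    return unquote(value, encoding=charset, errors="strict")


if __name__ == "__main__":
    print(payload)       # raw bytes after percent-decoding
    print(as_latin1)     # reading the bytes as ISO-8859-1
    print(as_macroman)   # reading the bytes as MacRoman
    print(utf8_escaped)  # UTF-8 percent-encoded form, for comparison
    print(decode_form_value(raw))
```

If that reading is right, the data may be arriving as ISO-8859-1 and being displayed as MacRoman, in which case the thing to try would be declaring the decoded string as ISO-8859-1 (rather than US-ASCII or UTF-8), or arranging for the form to submit UTF-8 in the first place.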
Thanks in advance!
***************************************************
Toby W. Rush - [EMAIL PROTECTED]
Instructor of Music Theory
PVA Webmaster & Technical Operations Manager
University of Northern Colorado
"Omnia voluntaria est."
***************************************************