On Tue, 3 Oct 2006 01:15:59 +0300, Ahmad Al-Twaijiry wrote:
> Hi everyone
> in my PHP code I use the following command to set a cookie with
> non-english word (UTF-8) :
> @setcookie (UserName,$Check[1]);
> and in my html page I get this cookie using javascript :
[Snipped]
> but the result from writing the cookie using javascript is garbage, I
> don't get the right word !!
The problem is that JavaScript uses UTF-16, so you
either have to store the cookie as UTF-16 or do your
own UTF-8 decoding in JavaScript.
For example, consider the string åäö, containing
the three funny characters in the Swedish language
(&aring;&auml;&ouml;). These characters are encoded
as c3 a5 c3 a4 c3 b6 in UTF-8, and PHP stores these
in the cookie as:
%C3%A5%C3%A4%C3%B6
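To check that value without involving PHP, here is a small
JavaScript sketch (not part of the original example) that
percent-encodes the same three characters. It relies on
encodeURIComponent(), which escapes the UTF-8 octets of a string;
the variable names are only for illustration.

```javascript
// Sketch: percent-encode the UTF-8 octets of the example word.
var word = "\u00e5\u00e4\u00f6";          // åäö written as escapes
var encoded = encodeURIComponent(word);   // one %XX per UTF-8 octet
console.log(encoded);                     // %C3%A5%C3%A4%C3%B6
```

Each character takes two octets in UTF-8, hence six escapes for
three characters.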
Example:
--
<?php
setcookie ('UserName', "\xc3\xa5\xc3\xa4\xc3\xb6");
// setcookie ('UserName', "åäö");
header ('Content-Type: text/html; charset=utf-8');
?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<title>UTF-8 flavoured cookies</title>
<p>
<script type="text/javascript">
document.write(document.cookie);
</script>
--
The unescape() function in JavaScript converts each
percent-escape to a separate character, giving the code points
00c3 00a5 00c3 00a4 00c3 00b6 which, of course,
is not what you want.
Example:
--
<?php
setcookie ('UserName', "\xc3\xa5\xc3\xa4\xc3\xb6");
// setcookie ('UserName', "åäö");
header ('Content-Type: text/html; charset=utf-8');
?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<title>UTF-8 flavoured cookies</title>
<p>
<script type="text/javascript">
var s = unescape(document.cookie);
var t = "";
// Print ASCII characters as they are and everything else as hex.
for (var i = 0; i < s.length; i++) {
  var c = s.charCodeAt(i);
  t += c < 128 ? String.fromCharCode(c) : c.toString(16) + " ";
}
document.write(t);
</script>
--
While there are no doubt better ways to solve this,
you /could/ use the unescape() function to convert the
percent-encoded octets to Unicode code points, and
then write your own UTF-8 decoder to do the rest.
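One candidate for a "better way" may be the built-in
decodeURIComponent(), which reads percent-escapes as UTF-8 octets.
The sketch below is an assumption about how it could be applied
here, not something tested against the original poster's page.
readCookie is a hypothetical helper name, and a literal string
stands in for document.cookie so the code runs outside a browser.

```javascript
// Hypothetical helper: find one cookie in a "name=value; name=value"
// string and decode its percent-escapes as UTF-8.
function readCookie(cookieString, name) {
  var parts = cookieString.split("; ");
  for (var i = 0; i < parts.length; i++) {
    var eq = parts[i].indexOf("=");
    if (eq !== -1 && parts[i].substring(0, eq) === name) {
      // decodeURIComponent() throws URIError on malformed UTF-8.
      return decodeURIComponent(parts[i].substring(eq + 1));
    }
  }
  return null; // No cookie with that name.
}

// In a page you would pass document.cookie instead of this literal.
var value = readCookie("foo=bar; UserName=%C3%A5%C3%A4%C3%B6", "UserName");
console.log(value); // åäö
```

The hand-written route described above is what the next example
shows.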
Example:
(This is an old C function hammered into JavaScript
shape. It is likely to be a horrible implementation
in JavaScript. The error checking adds a bit of bloat.
Note that the utf_8_decode function accepts sequences for
the full Unicode range, while String.fromCharCode() only
handles 16-bit values, so code points above U+FFFF come
out wrong.)
--
<?php
setcookie ('UserName', "\xc3\xa5\xc3\xa4\xc3\xb6");
header ('Content-Type: text/html; charset=utf-8');
?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<title>UTF-8 flavoured cookies</title>
<p>
<script type="text/javascript">
function utf_8_decode (sin)
{
  // Length of the sequence that starts with octet c, or 0 if c
  // cannot start a sequence.
  function octet_count (c)
  {
    var octet_counts = [
      /* c0 */ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
      /* d0 */ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
      /* e0 */ 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
      /* f0 */ 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 0, 0
    ];
    return c < 128 ? 1 :
           c < 192 ? 0 : octet_counts [(c&255)-192];
  }
  // Payload bits of the first octet, indexed by sequence length.
  var octet0_masks = [ 0x00,0x7f,0x1f,0x0f,0x07,0x03,0x01 ];
  var sout = "";
  var add;
  for (var si = 0; si < sin.length; si += add) {
    var c = sin.charCodeAt(si);
    add = octet_count(c);
    if (si+add <= sin.length) {
      var u = c & octet0_masks[add];
      var ci;
      // Shift in six bits from each continuation octet (10xxxxxx).
      for (ci = 1; (ci < add) && ((sin.charCodeAt(si+ci)&0xc0) == 0x80);
           ci++)
        u = (u<<6) | (sin.charCodeAt(si+ci) & 0x3f);
      if (ci == add) {
        sout += String.fromCharCode (u);
      } else {
        // Invalid UTF-8 sequence. Should probably throw() instead.
        sout += "\ufffd"; // Replacement character.
        add = 1;
      }
    } else {
      // Invalid UTF-8 sequence. Should probably throw() instead.
      sout += "\ufffd"; // Replacement character.
      add = 1;
    }
  }
  return sout;
}
document.write (utf_8_decode(unescape(document.cookie)));
</script>
--
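On the Unicode range caveat: the decoder passes its result to
String.fromCharCode(), which works in 16-bit units, so a decoded
value above U+FFFF would be truncated. One possible workaround,
sketched here under the assumption that you swap it in for that
call, is to emit a UTF-16 surrogate pair. code_point_to_string is a
made-up helper name and not part of the example above.

```javascript
// Hypothetical helper: turn a Unicode code point into a JavaScript
// string, using a surrogate pair above U+FFFF.
function code_point_to_string (u)
{
  if (u < 0x10000)
    return String.fromCharCode(u);
  if (u > 0x10ffff)
    return "\ufffd"; // Outside Unicode: replacement character.
  u -= 0x10000;
  // High surrogate carries the top 10 bits, low surrogate the rest.
  return String.fromCharCode(0xd800 | (u >> 10), 0xdc00 | (u & 0x3ff));
}

var clef = code_point_to_string(0x1d11e); // U+1D11E MUSICAL SYMBOL G CLEF
console.log(clef.length); // 2
```

The result is one character to the reader but two UTF-16 units to
JavaScript, which is why its length is 2.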
> BTW:
> * I also tried the php function setrawcookie and I get the same problem
> * I use <META http-equiv="Content-Type" content="text/html;
> charset=utf-8"> in my page
The META thing might be good for storing pages
on disk, but on the web you should use real HTTP
headers.
--nfe
--
PHP General Mailing List (http://www.php.net/)