steve wrote:
But, the characters don't render correctly when viewed with MySQLcc - which
I'm now convinced is using utf-8. I can't find any config settings for
MySQLcc relating to encoding, so maybe it's something to do with KDE? I
If mysqlcc uses locales, just set the locale before launching it (from an xterm).
Under Linux, for a French locale, you can choose it via the LANG environment variable:
LANG=fr_BE.iso88591 mysqlcc
[EMAIL PROTECTED] mysqlcc
LANG=fr_FR.utf8 mysqlcc
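
The same thing can be scripted. Here is a minimal Python sketch, assuming a POSIX-like system; the helper names env_with_lang and run_with_lang are made up for illustration, and whether mysqlcc really honours the locale is still only a guess. It copies the current environment, sets LANG, and drops LC_ALL, which would otherwise take precedence over LANG.

```python
import os
import subprocess
import sys


def env_with_lang(lang, base=None):
    """Return a copy of the environment with LANG set to `lang`."""
    env = dict(os.environ if base is None else base)
    env["LANG"] = lang
    # LC_ALL takes precedence over LANG, so remove it if present.
    env.pop("LC_ALL", None)
    return env


def run_with_lang(lang, argv):
    """Run `argv` as a child process that sees LANG=`lang`."""
    return subprocess.run(argv, env=env_with_lang(lang),
                          capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in child process that only reports the LANG it was given.
    child = [sys.executable, "-c", "import os; print(os.environ['LANG'])"]
    print(run_with_lang("fr_FR.utf8", child).stdout.strip())
```

To launch the real program you would pass ["mysqlcc"] instead of the stand-in child.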
Sorry, I don't use KDE/Gnome very often. But I guess they both default
to utf-8 these days.
I'll be sticking with latin1 (or maybe iso-8859-15). I'll never produce a
site that uses more than English and French
As Tex pointed out,
"1) ISO 8859-1 does not have the Euro character so is not really suitable for
France or Europe, unless you never have or discuss commercial transactions."
and "(...) Greek (...) is also not covered by latin-1)"
About iso-8859-15 (aka latin9, aka latin0), from "man iso_8859-15":
"(...latin1...) lacks the EURO symbol and does not fully cover Finnish and
French.
ISO 8859-15 is a modification of ISO 8859-1 that covers these needs."
FYI, I made a diff between latin1 and latin9 (with man -7 and diff):

hex  iso-8859-1/latin1               iso-8859-15/latin9
---  ------------------------------  -------------------------------------
A4   CURRENCY SIGN                   EURO SIGN
A6   BROKEN BAR                      LATIN CAPITAL LETTER S WITH CARON
A8   DIAERESIS                       LATIN SMALL LETTER S WITH CARON
B4   ACUTE ACCENT                    LATIN CAPITAL LETTER Z WITH CARON
B8   CEDILLA                         LATIN SMALL LETTER Z WITH CARON
BC   VULGAR FRACTION ONE QUARTER     LATIN CAPITAL LIGATURE OE
BD   VULGAR FRACTION ONE HALF        LATIN SMALL LIGATURE OE
BE   VULGAR FRACTION THREE QUARTERS  LATIN CAPITAL LETTER Y WITH DIAERESIS
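
The same comparison can be reproduced without the man pages. This is only a sketch, assuming Python's built-in "iso-8859-1" and "iso-8859-15" codecs and the unicodedata module; the function name latin1_latin9_diff is invented here. It decodes every byte of the printable upper half with both codecs and keeps the ones that disagree.

```python
import unicodedata


def latin1_latin9_diff():
    """Return (byte, latin1 name, latin9 name) for every byte that differs."""
    rows = []
    for byte in range(0xA0, 0x100):  # printable upper half only
        raw = bytes([byte])
        c1 = raw.decode("iso-8859-1")
        c9 = raw.decode("iso-8859-15")
        if c1 != c9:
            rows.append((byte, unicodedata.name(c1), unicodedata.name(c9)))
    return rows


if __name__ == "__main__":
    for byte, name1, name9 in latin1_latin9_diff():
        print(f"{byte:02X}  {name1:<30}  {name9}")
```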
Be careful: some characters are undefined in latin1 (hex 80-9F, decimal 128-159).
You also need to take into account that Micro$oft, in its own little world,
has its own "latin1": cp1252 [1]. Windows users often use it, and it is
not the same as latin1: it adds a few characters to latin1 in the 0x80-0x9F range.
This means some translation must take place, whatever you choose
(latin1, latin9, utf-8).
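
To make the translation step concrete, here is a small Python sketch, assuming the input really is cp1252 (or whatever source encoding you pass in); to_utf8 and cp1252_extras are illustrative names, not part of any library. The first function transcodes bytes to UTF-8, the second lists what Python's cp1252 codec puts in the 0x80-0x9F range, with None for the bytes it leaves unassigned.

```python
def to_utf8(data, source_encoding):
    """Decode `data` from `source_encoding` and re-encode it as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


def cp1252_extras():
    """Map each byte in 0x80-0x9F to its cp1252 character, or None if unassigned."""
    extras = {}
    for byte in range(0x80, 0xA0):
        try:
            extras[byte] = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            extras[byte] = None  # Python's cp1252 codec rejects this byte
    return extras


if __name__ == "__main__":
    for byte, char in cp1252_extras().items():
        print(f"{byte:02X}", "unassigned" if char is None else f"U+{ord(char):04X}")
```

Decoding with the wrong source encoding does not fail for latin1 (every byte is accepted), so the result can be silently wrong; the source encoding has to be known, not guessed.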
Some facts are worth knowing. E.g. M$ cp1252 char A4 is 'currency sign' too.
But M$ fonts (e.g. Arial) really use the 'euro sign' for that char (even on
Windows 95, with the MS 'euro patches'). So the lack of a Euro sign can be dealt
with by simply stating (as MS does) that the A4 sign is the euro sign. Ugly, but it
works quite well: MS users are happy, but the problem remains for
Mac/Unix/old-Windows users.
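
For reference, here is where the euro sign itself sits at the encoding level, as opposed to what a font draws. This is a quick Python sketch relying on Python's own codec tables; euro_bytes is a made-up helper name. It tries to encode U+20AC in each encoding and records None where the encoding has no euro sign.

```python
def euro_bytes(encodings=("iso-8859-1", "iso-8859-15", "cp1252", "utf-8")):
    """Return {encoding: bytes for the euro sign, or None if unencodable}."""
    result = {}
    for name in encodings:
        try:
            result[name] = "\u20ac".encode(name)
        except UnicodeEncodeError:
            result[name] = None  # this encoding has no euro sign
    return result


if __name__ == "__main__":
    for name, raw in euro_bytes().items():
        print(name, "no euro sign" if raw is None else raw.hex().upper())
```

According to these tables, latin9 puts the euro at A4 and cp1252 puts it at 80, while byte A4 decodes to the currency sign in both latin1 and cp1252.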
Using iso-8859-15 also means not (really) using iso-8859-1, whose 256 code
points are the same as the first 256 Unicode code points. To prepare a Unicode
migration (utf-8 or other encodings), perhaps it's better to choose latin1.
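
That property is easy to check. A minimal Python sketch, assuming the built-in codecs; matches_unicode is just a name chosen for this example. It tests whether every byte value decodes to the Unicode code point with the same number.

```python
def matches_unicode(encoding):
    """True if every byte value decodes to the code point with the same number."""
    return all(ord(bytes([b]).decode(encoding)) == b for b in range(256))


if __name__ == "__main__":
    for name in ("iso-8859-1", "iso-8859-15"):
        print(name, matches_unicode(name))
```

It holds for iso-8859-1 and fails for iso-8859-15, which is the practical meaning of latin1 being the same as Unicode in the lower 8 bits.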
As Tex also said,
"you will have to either go thru the work to convert to utf-8 anyway"
Everyone is migrating to Unicode (often utf-8 encoding), to avoid
encoding problems/headaches. So you'll have to do it someday.
But not everyone is always up to date, on the cutting edge, etc.
E.g. many people still use Windows 98 (21% of Google users in mid 2004) [2],
not the newer XP. That said, there was already some Unicode support
as far back as MS Office 97.
Everyone is moving to Unicode; it's up to you to decide when you'll do it.
Personally, I think that, for very 'local' websites
(like only English/French/Dutch in Belgium/France),
latin1 is still an option, even if utf-8 will replace it
in the fairly near future -- I mean when (nearly) all "old"
software/web apps using latin1 have been upgraded to Unicode.
But yes, Unicode will be the only choice quite soon,
so being prepared seems a good idea.
Christophe
[1] cp1252
http://www.microsoft.com/typography/unicode/1252.htm
[2] Google Zeitgeist, April 2004
http://www.google.com/press/zeitgeist/zeitgeist-apr04.html
--
PHP Internationalization Mailing List (http://www.php.net/)