In the process of designing a web-based publishing system that is internationalized using gettext() calls, I've run into some odd problems getting text to display in the proper charset when there are multiple languages on a page. Here's the background:

My default charset (declared in the content-type META tag) is iso-8859-1. On the page, there is a language popup that allows users to switch to any of several languages, including Greek (iso-8859-7) and Turkish (iso-8859-9). Switching to a different language changes PHP's locale settings, and also changes the META tag charset. You're welcome to poke and prod it at http://dev.dadaimc.org/.

Users input text into a form for publication. The form declares an "accept-charset" attribute that lists iso-8859-1, windows-1252, utf-8, and then whatever charsets are used by the available languages (e.g. iso-8859-7 and iso-8859-9). The text is stored in a MySQL database (default charset iso-8859-1).

I can successfully input English, Turkish, and Greek text. And when a viewer selects "Turkish" from the language menu, the Turkish text displays fine (because the META tag is set to "iso-8859-9"). The same applies to English and Greek. However, on a page which displays all three texts -- one in each language -- only one of the three will display properly (whichever one corresponds to the currently selected language). Obviously, that's inconvenient.
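
To make the symptom concrete, here is a minimal sketch of what I think is going on. It assumes the database simply holds the raw bytes in whatever charset the page had when the form was submitted, with no record of which charset that was. The sample strings and the render() helper are made up for illustration, and I've used Python rather than PHP only to keep the sketch self-contained.

```python
# Hypothetical sample strings, not the actual articles from my site.
greek = "Καλημέρα"
turkish = "Günaydın"

# Assumption: the database holds the raw bytes in the charset of the
# page the form was submitted from, with no record of which one it was.
greek_bytes = greek.encode("iso-8859-7")
turkish_bytes = turkish.encode("iso-8859-9")


def render(stored: bytes, page_charset: str) -> str:
    """Decode stored bytes the way a browser would for the declared charset."""
    return stored.decode(page_charset, errors="replace")


if __name__ == "__main__":
    # Each declared charset gets at most one of the two texts right.
    for charset in ("iso-8859-1", "iso-8859-7", "iso-8859-9"):
        print(charset, "|", render(greek_bytes, charset), "|", render(turkish_bytes, charset))
```

If that assumption holds, the same byte values simply mean different letters under each iso-8859-x label, so a single META charset can only ever be right for one of the texts.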

I visited another site, http://www.indymedia.org/, which also displays multiple languages on the same page. It uses "utf-8" in the META tag (which makes sense, since it encompasses all the necessary characters). The publishing form it uses declares no accept-charset parameter. But it works!

When I tried using "charset=utf-8" in my META tag, the text displayed worse than before -- lots of unprintable characters.
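
My guess at why, as a rough sketch: changing only the META tag does not change the bytes that are sent, and bytes written as iso-8859-x are mostly not valid utf-8. The rows below are hypothetical, and the second case assumes I would know which charset each row was originally submitted in, which my current schema does not record. Again, this is Python rather than PHP purely for illustration.

```python
# Hypothetical rows: (bytes as stored, charset of the page they came from).
rows = [
    ("Good morning".encode("iso-8859-1"), "iso-8859-1"),
    ("Καλημέρα".encode("iso-8859-7"), "iso-8859-7"),
    ("Günaydın".encode("iso-8859-9"), "iso-8859-9"),
]

# Case 1: only the META tag changes; the stored bytes are sent untouched
# and the browser tries to read them as UTF-8.
untouched = [raw.decode("utf-8", errors="replace") for raw, _ in rows]

# Case 2: each row is transcoded to UTF-8 before output. This only works
# if the original charset of every row is known (an assumption here).
transcoded = [raw.decode(charset).encode("utf-8") for raw, charset in rows]
page = b" / ".join(transcoded).decode("utf-8")

if __name__ == "__main__":
    print(untouched)  # non-ASCII letters come out as U+FFFD replacement marks
    print(page)       # all three texts are readable on one page
```

In the first case only the plain ASCII text survives, which would match the unprintable characters I saw. The second case is just what I imagine a site like indymedia.org must be doing somewhere along the line; I don't know that for a fact.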

So I'm wondering if anyone knows the magic incantation that brings this all together -- how do I get my 3 texts in English, Greek, and Turkish to ALL display properly on the page at the same time?

It can't be a simple META tag set to utf-8...that didn't work.
Is the problem in the input method? The database storage method? The display method?
Do I need to declare accept-charset=utf-8 ONLY on my input form?
Does the charset of the page containing the input form need to be utf-8?
Does the database need to default to using utf-8 for storage? (That doesn't seem to be supported.)

Please help!

Cheers,
spud.

-------------------------------------------------------------------
a.h.s. boy
spud(at)nothingness.org "as yes is to if,love is to yes"
http://www.nothingness.org/
-------------------------------------------------------------------


--
PHP Internationalization Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
