I've been trying to internationalize a rather large PHP-based app that 
I'm working on. I implemented gettext() to cover some 650+ static 
strings in the code, and that aspect seems to work fine. I am now trying 
to handle issues of alternate character sets, but htmlentities() seems 
to go wonky when I add the charset parameter.

When I started the internationalization, I was (romanocentrically) 
focused on providing support for other "Latin 1" languages. The first 
alternate language request I received, however, was for Greek and 
Turkish support. Go figure.

To accommodate the non-Latin 1 characters, I made the following changes:

1) I modified my code so that switching displayed languages set the 
appropriate
      <meta http-equiv="content-type"...charset=...>
header for the page.

2) I modified my submission forms to have an "accept-charset" attribute 
that lists all the supported language character sets.

3) I changed a text scrubbing function (to clean up curly quotes, em 
dashes, etc) so that it wouldn't interfere with non-Latin 1 character 
sets.

4) I converted my calls to "htmlentities" to include the 3rd parameter, 
specifying the charset.
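
For concreteness, changes 1, 2 and 4 amount to something like the sketch below. It is not the app's actual code: the function names, the language codes, and the choice of ISO-8859-9 for Turkish are all assumptions made for illustration.

```php
<?php
// Hypothetical map from language code to page charset (illustrative only).
function language_charsets() {
    return array(
        'en' => 'ISO-8859-1',
        'el' => 'ISO-8859-7', // Greek
        'tr' => 'ISO-8859-9', // Turkish (assumed)
    );
}

// Charset for the currently displayed language, falling back to Latin 1.
function charset_for($lang) {
    $map = language_charsets();
    return isset($map[$lang]) ? $map[$lang] : 'ISO-8859-1';
}

// Change 1: the content-type declaration emitted for the page.
function meta_tag($charset) {
    return '<meta http-equiv="content-type" content="text/html; charset='
         . $charset . '">';
}

// Change 2: a form tag whose accept-charset lists every supported charset.
function form_open($action) {
    $accepted = implode(' ', array_unique(array_values(language_charsets())));
    return '<form method="post" action="' . $action
         . '" accept-charset="' . $accepted . '">';
}

// Change 4: pass the charset as the third argument when escaping output.
function display_body($body, $lang) {
    return nl2br(htmlentities($body, ENT_COMPAT, charset_for($lang)));
}
```

The charset passed in display_body() is the same one declared in the page header, so the two always agree for a given language.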

This last change, however, doesn't seem to be working. To complicate 
things, I'm attempting to test Greek language support from my Mac OS X 
machine, which isn't fully configured for Greek input. The OmniWeb 
browser I use supports display of Greek websites, so I've been cutting 
and pasting text from those sites into my submission form to test it.

If I submit the Greek text, it goes into my MySQL database just fine. To 
display it, I'm calling

    nl2br(htmlentities($body,ENT_COMPAT,'ISO-8859-7'));

and the result in my browser is

Υπάρχουν ιστορικά...

(Random Latin 1 accented characters, not Greek).

If, however, I change the htmlentities() call to htmlspecialchars(), 
still with the character set specified, it renders properly in my 
browser, in Greek.
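
For anyone trying to reproduce this, here is a minimal sketch of the symptom at the byte level. It assumes, and this is only a guess rather than anything confirmed, that htmlentities() ends up applying the Latin 1 entity table to the Greek bytes; the sketch forces that by passing 'ISO-8859-1' explicitly. The sample word is written as byte escapes so the encoding of the source file doesn't matter.

```php
<?php
// "Υπάρχουν" as ISO-8859-7 bytes.
$greek = "\xD5\xF0\xDC\xF1\xF7\xEF\xF5\xED";

// Assumption: the bytes are treated as Latin 1, so every high byte is
// replaced by the named entity for the Latin 1 character at that position.
$as_entities = htmlentities($greek, ENT_COMPAT, 'ISO-8859-1');

// htmlspecialchars() only rewrites &, <, > and double quotes, so the
// high bytes pass through untouched.
$as_special = htmlspecialchars($greek, ENT_COMPAT, 'ISO-8859-1');

echo $as_entities, "\n";
// prints: &Otilde;&eth;&Uuml;&ntilde;&divide;&iuml;&otilde;&iacute;

echo ($as_special === $greek) ? "bytes unchanged\n" : "bytes altered\n";
// prints: bytes unchanged
```

Once the bytes have become Latin 1 entities, the browser shows accented Latin letters no matter what charset the page declares, whereas the untouched bytes are interpreted according to the declared ISO-8859-7.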

Is this a bug in htmlentities? I'm using PHP 4.2.2, just released, so if 
it's a bug, it hasn't been fixed yet.

Can anyone else confirm this? Is anyone else attempting 
internationalized Greek support and attempting to use htmlentities()?

To see a page with both htmlentities() and htmlspecialchars() used, go to
http://alt.baltimoreimc.org/newswire/display/49/index.php

And feel free to attempt posting your own submissions...it's a 
development server, so no harm done. Use the "language" popup on the 
left-hand side to "switch" languages. The gettext translations aren't 
done, so that won't have the desired effect, but it does switch the 
charset declaration in the page headers.

You are especially welcome to try this if you're Greek, and can input 
text "in the way a normal Greek computer would". ;-)

Cheers,
spud.

-------------------------------------------------------------------
a.h.s. boy
spud(at)nothingness.org            "as yes is to if,love is to yes"
http://www.nothingness.org/
-------------------------------------------------------------------


--
PHP Internationalization Mailing List (http://www.php.net/)