php-i18n Digest 4 Apr 2004 12:44:10 -0000 Issue 223
Topics (messages 680 through 682):
Re: Where to Begin?
680 by: Leung WC
Script to convert utf-8 to html entities?
681 by: Amadeus
682 by: Hayk Chamyan
----------------------------------------------------------------------
--- Begin Message ---
Jamie wrote:
> Hi, I've been doing a little PHP programming for a while now, as well
> as interacting with Japanese data and MySQL.
> But now I need to rewrite my site to account for the possibility of
> multiple languages being used on the one page.
> What I had intended to do was to store everything in the database as
> UTF-8, and to output the HTML pages as UTF-8.
> I've read about the slashes issues and the internal encoding issues
> and I think I understand.
> But, if I have a page that requires input in Japanese and Hangul, what
> should I set the internal encoding as:
> mbstring.internal_encoding = UTF-8
> What I'm not too sure about is, if the user uses an IME with SJIS to
> input Japanese, and then another IME to input the Korean, does PHP
> convert this to UTF-8? And if I need to fill the form back in
> (mistakes or omissions) do I need to convert the text back to some
> other format?
No. The browser, not PHP, does the conversion: whichever IME is used, it
submits the form in the encoding of the page, which is UTF-8 here. If a
browser fails to do this (Opera especially), change your browser.
So you can forget about this issue in PHP.
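If you would rather verify that than trust every browser, something like the following may help. It is only a minimal sketch: the function name and sample strings are made up for illustration, and it assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical helper: accept a submitted form value only if it is
// well-formed UTF-8, whichever IME the user typed it with.
function accept_utf8_input(string $raw): ?string
{
    // mb_check_encoding() returns false for byte sequences that are
    // not valid in the named encoding.
    return mb_check_encoding($raw, 'UTF-8') ? $raw : null;
}

// Japanese and Hangul in the same field are fine once everything is UTF-8.
$mixed = "日本語 한국어";
var_dump(accept_utf8_input($mixed) === $mixed);   // bool(true)

// The SJIS bytes of a Japanese string are rejected instead of being stored.
$sjis = mb_convert_encoding("日本語", 'SJIS', 'UTF-8');
var_dump(accept_utf8_input($sjis));               // NULL
```

Rejecting bad input at the door keeps the database purely UTF-8, so nothing downstream has to guess an encoding.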
> If I pull UTF-8 text from a DB, how do I convert it (to SJIS for
> example) to fill in a form, or do I need to at all?
Again, forget it. Simply send the form in UTF-8 and the browser will
submit the answers in UTF-8.
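As a rough sketch of refilling a field without any conversion, assuming the page is served with a `Content-Type: text/html; charset=UTF-8` header. The function and field names are invented for the example.

```php
<?php
// Hypothetical helper: build an <input> element pre-filled with a
// previously submitted UTF-8 value. No charset conversion takes place.
function refill_field(string $name, string $value): string
{
    // Escape only the HTML-special characters. The third argument tells
    // htmlspecialchars() that the bytes are UTF-8, so multibyte
    // characters pass through untouched.
    $flags = ENT_QUOTES | ENT_SUBSTITUTE;
    return sprintf(
        '<input type="text" name="%s" value="%s">',
        htmlspecialchars($name, $flags, 'UTF-8'),
        htmlspecialchars($value, $flags, 'UTF-8')
    );
}

echo refill_field('comment', '日本語 "한국어"'), "\n";
// <input type="text" name="comment" value="日本語 &quot;한국어&quot;">
```

The text goes out exactly as it came from the database; only the quotes that would break the attribute are escaped.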
> Lastly, would it be advisable to give users the ability to output an
> HTML page in another character set (UTF default -> User wants SJIS) as
> long as there are only Japanese characters to display?
No. Popular browsers (IE and Mozilla) have absolutely no difficulty
displaying UTF-8. On the other hand, encoding conversions add
complexity to your program, and they are easy to get wrong.
Today I was reading a web diary whose writer had submitted an essay in
SJIS when it should have been BIG5. Nightmare...
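To see how easily this goes wrong, here is a small illustration (not taken from the diary in question; the helper name is made up). It assumes mbstring is loaded, and note that mbstring calls the encoding 'BIG-5'.

```php
<?php
// Hypothetical helper: decode raw bytes to UTF-8 using the given label.
function decode_as(string $bytes, string $label): string
{
    return mb_convert_encoding($bytes, 'UTF-8', $label);
}

$original  = "日本語";                                          // UTF-8 source text
$sjisBytes = mb_convert_encoding($original, 'SJIS', 'UTF-8');  // what an SJIS author submits

// Decoding with the correct label restores the text exactly.
var_dump(decode_as($sjisBytes, 'SJIS') === $original);    // bool(true)

// Decoding the same bytes as Big5 gives other characters or
// substitution marks, never the original text.
var_dump(decode_as($sjisBytes, 'BIG-5') === $original);   // bool(false)
```

The bytes themselves carry no label, which is why a single mislabelled conversion is enough to ruin the text.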
> Sorry if I don't exactly make much sense, I'm still trying to come to
> terms with dealing with multiple character sets and IMEs.
> Many Thanks
> Jamie
--- End Message ---
--- Begin Message ---
Hello,
I have been having a lot of problems with .po files in UTF-8 on Linux
systems...
A solution to incorrect displays would be to convert the .po content into
HTML entities (&#xxxx;).
I don't know a lot about this, but can anyone post a script snippet to do
this?
I.e., read a .po file and output the same with HTML entities?
I have tried sending gettext output to htmlentities() through PHP, but
gettext doesn't even return the UTF-8 content correctly in the first place.
I have seen a lot of javascript scripts which work fine, but am looking for
a command line script (perl/php etc.)
Thanks
Amadeus
--
[EMAIL PROTECTED]
SDF Public Access UNIX System - http://sdf.lonestar.org
--- End Message ---
--- Begin Message ---
Amadeus wrote:
A> Hello,
A> I have been having a lot of problems with .po files in UTF-8 on Linux
A> systems...
A> A solution to incorrect displays would be to convert the .po content into
A> HTML entities (&#xxxx;).
A> I don't know a lot about this, but can anyone post a script snippet to do
A> this?
A> I.e., read a .po file and output the same with HTML entities?
A> I have tried sending gettext output to htmlentities() through PHP, but
A> gettext doesn't even return the UTF-8 content correctly in the first place.
A> I have seen a lot of javascript scripts which work fine, but am looking for
A> a command line script (perl/php etc.)
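You can use mb_convert_encoding().
Here is an example:
<?php
// Read the whole UTF-8 encoded .po file into memory.
$data = file_get_contents('in.po');
if ($data === false) {
    exit("Cannot read in.po\n");
}
// Replace each non-ASCII character with an HTML entity; ASCII is unchanged.
$data = mb_convert_encoding($data, 'HTML-ENTITIES', 'UTF-8');
// Write the result in binary mode so line endings are left alone.
$fp = fopen('out.po', 'wb');
fwrite($fp, $data);
fclose($fp);
?>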
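One caveat for newer installations: the 'HTML-ENTITIES' pseudo-encoding is deprecated as of PHP 8.2. A possible alternative is mb_encode_numericentity(), sketched below. The function name is invented for the example, and unlike the script above it always emits numeric entities, never named ones.

```php
<?php
// Hypothetical helper: replace every non-ASCII character in a UTF-8
// string with a decimal numeric entity such as &#26085;.
function utf8_to_numeric_entities(string $utf8): string
{
    // Conversion map: [first code point, last code point, offset, mask].
    // It covers U+0080 through U+10FFFF, so ASCII (including the
    // quotes and backslashes of .po syntax) is left untouched.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($utf8, $map, 'UTF-8');
}

echo utf8_to_numeric_entities('msgstr "日本語"'), "\n";
// msgstr "&#26085;&#26412;&#35486;"
```

To convert a whole file, pass the result of file_get_contents('in.po') through the function and write it back out with fwrite() as in the script above.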
--
Best regards,
Hayk
mailto:[EMAIL PROTECTED]
--- End Message ---