php-i18n Digest 4 Apr 2004 12:44:10 -0000 Issue 223

Topics (messages 680 through 682):

Re: Where to Begin?
        680 by: Leung WC

Script to convert utf-8 to html entities?
        681 by: Amadeus
        682 by: Hayk Chamyan

Administrivia:

To subscribe to the digest, e-mail:
        [EMAIL PROTECTED]

To unsubscribe from the digest, e-mail:
        [EMAIL PROTECTED]

To post to the list, e-mail:
        [EMAIL PROTECTED]


----------------------------------------------------------------------
--- Begin Message ---
Jamie wrote:

Hi, i've been doing a little PHP programming for a while now, as well
as working with Japanese data and MySQL.

But now i need to rewrite my site to account for the possibility of
multiple languages being used on the one page.

What i had intended to do was to store everything in the database as
UTF-8, and to output the html pages as UTF-8.

I've read about the slashes issues and the internal encoding issues
and i think i understand.

But, if i have a page that requires input in Japanese and Hangul, what
should i set the internal encoding as:
mbstring.internal_encoding = UTF-8


What i'm not too sure about is, if the user uses an IME with SJIS to
input Japanese, and then another IME to input the Korean, does PHP
convert this to UTF-8? And if i need to fill the form back in
(mistakes or omissions) do i need to convert the text back to some
other format?

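No. Since your pages are served as UTF-8, the browser converts all of the input to UTF-8 before submitting it, whichever IME was used. If a browser fails to do this (Opera especially), switch to a different browser.
So you can forget about this issue in PHP.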

If i pull UTF-8 text from a DB, do i need to convert it (to SJIS, for example) to fill in a form, and if so, how?

Again... forget it. Simply send the form in UTF-8 and people's browsers will submit their answers in UTF-8.
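
As a rough sketch of what "send the form in UTF-8" could look like in PHP: the page declares UTF-8 in its Content-Type header, the form is refilled with the UTF-8 value as-is, and the submitted value is checked for well-formed UTF-8 before use. The helper names (is_utf8_input, render_form) and the field name "name" are made up for illustration, and the validity check is an extra precaution that the advice above does not mention.

```php
<?php
// Hypothetical helper: true only if the submitted value is well-formed UTF-8.
function is_utf8_input(string $value): bool
{
    return mb_check_encoding($value, 'UTF-8');
}

// Hypothetical helper: builds the form and refills the field with a UTF-8
// value. No charset conversion is done; the value is only HTML-escaped.
function render_form(string $name): string
{
    $safe = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
    return '<form method="post" accept-charset="UTF-8">'
         . '<input type="text" name="name" value="' . $safe . '">'
         . '<input type="submit">'
         . '</form>';
}

// Declare the page encoding so the browser submits the form as UTF-8.
if (PHP_SAPI !== 'cli') {
    header('Content-Type: text/html; charset=UTF-8');
}

// Assumed field name "name"; fall back to an empty field on bad input.
$name = $_POST['name'] ?? '';
if (!is_string($name) || !is_utf8_input($name)) {
    $name = '';
}
echo render_form($name);
```

The same escaped value can be written back into the form after a validation error, so no conversion to SJIS or any other charset is needed at any point.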

Lastly, would it be advisable to give users the ability to output a HTML page in another character set (UTF default -> User wants SJIS) as long as there are only Japanese characters to display?

No. Popular browsers (IE and Mozilla) have absolutely no difficulty with UTF-8. On the other hand, adding encoding conversions makes your program more complex (and they are easy to get wrong!).
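
One way to see how a conversion can go wrong, as a small sketch: text that is converted from UTF-8 to SJIS and back only survives if every character exists in SJIS. The function name survives_round_trip is hypothetical, and the behaviour for unrepresentable characters assumes mbstring's default substitute character.

```php
<?php
// Hypothetical helper: true if $text comes back unchanged after being
// converted from UTF-8 to $charset and back to UTF-8 again.
function survives_round_trip(string $text, string $charset): bool
{
    $legacy = mb_convert_encoding($text, $charset, 'UTF-8');
    $back   = mb_convert_encoding($legacy, 'UTF-8', $charset);
    return $back === $text;
}

var_dump(survives_round_trip('日本語', 'SJIS'));  // Japanese text only
var_dump(survives_round_trip('한국어', 'SJIS'));  // Hangul, which SJIS cannot represent
```

The Japanese string should come back intact, while the Hangul string should not, which is the kind of silent data loss a page offering SJIS output would have to guard against.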


Today I was reading a web diary whose writer had submitted an essay in SJIS when it should have been Big5. Nightmare...

Sorry if i don't exactly make much sense, i'm still trying to come to terms with dealing with multiple character sets and IMEs.

Many Thanks
Jamie

--- End Message ---
--- Begin Message ---
Hello,

I have been having a lot of problems with .po files in UTF-8 on Linux
systems...

A solution to incorrect displays would be to convert the .po content into
HTML entities (&#xxxx;).

I don't know a lot about this, but can anyone post a script snippet to do 
this?

I.e. read a .po file and output the same with HTML entities?

I have tried sending gettext output to htmlentities() through PHP, but
gettext doesn't even return the UTF-8 content correctly in the first place.

I have seen a lot of javascript scripts which work fine, but am looking for 
a command line script (perl/php etc.)

Thanks

Amadeus
-- 
[EMAIL PROTECTED]
SDF Public Access UNIX System - http://sdf.lonestar.org

--- End Message ---
--- Begin Message ---
Amadeus wrote:

A> Hello,

A> I have been having a lot of problems with .po files in UTF-8 on Linux
A> systems...

A> A solution to incorrect displays would be to convert the .po content into
A> HTML entities (&#xxxx;).

A> I don't know a lot about this, but can anyone post a script snippet to do
A> this?

A> I.e. read a .po file and output the same with HTML entities?

A> I have tried sending gettext output to htmlentities() through PHP, but
A> gettext doesn't even return the UTF-8 content correctly in the first place.

A> I have seen a lot of javascript scripts which work fine, but am looking for
A> a command line script (perl/php etc.)

You can use mb_convert_encoding().

Here is an example:
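<?php
// Read the UTF-8 encoded .po file into a string.
$data = file_get_contents('in.po');
if ($data === false) {
    die("Cannot read in.po\n");
}
// Convert non-ASCII characters to HTML entities; ASCII is left as-is.
$data = mb_convert_encoding($data, 'HTML-ENTITIES', 'UTF-8');
// Write the converted text to a new file.
$fp = fopen('out.po', 'wb');
fwrite($fp, $data);
fclose($fp);
?>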
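
A variant of the script above, as a rough sketch for command-line use: it takes the input and output file names as arguments and uses mb_encode_numericentity() to produce decimal numeric entities (&#NNNN;). This is a swapped-in technique, not the one shown above; it may be worth considering on newer PHP versions, where the 'HTML-ENTITIES' pseudo-encoding of mb_convert_encoding() is deprecated (as of PHP 8.2, as far as I know). The function name utf8_to_numeric_entities and the script name po2ent.php are made up for illustration.

```php
<?php
// Hypothetical helper: replaces every non-ASCII character with a decimal
// numeric entity and leaves ASCII (including < > & and quotes) untouched.
function utf8_to_numeric_entities(string $text): string
{
    // Map format: [first code point, last code point, offset, mask].
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($text, $map, 'UTF-8');
}

// Assumed invocation: php po2ent.php in.po out.po
if (PHP_SAPI === 'cli' && isset($argv[2])) {
    $data = file_get_contents($argv[1]);
    if ($data === false) {
        fwrite(STDERR, "Cannot read {$argv[1]}\n");
        exit(1);
    }
    file_put_contents($argv[2], utf8_to_numeric_entities($data));
}
```

If the underlying problem is gettext returning strings in the wrong encoding, PHP's bind_textdomain_codeset() may be worth trying before resorting to entities.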

-- 
Best regards,
 Hayk
 mailto:[EMAIL PROTECTED]

--- End Message ---
