php-i18n Digest 4 Aug 2003 10:07:26 -0000 Issue 186

Topics (messages 594 through 598):

Re: Encoding Japanese characters.
        594 by: David Powers
        595 by: Inwards
        596 by: Tony Laszlo
        597 by: Inwards

Making i18n work on all Unix webservers ?
        598 by: Chichi

Administrivia:

To subscribe to the digest, e-mail:
        [EMAIL PROTECTED]

To unsubscribe from the digest, e-mail:
        [EMAIL PROTECTED]

To post to the list, e-mail:
        [EMAIL PROTECTED]


----------------------------------------------------------------------
--- Begin Message ---
Inwards wrote:
>
> - I have a string containing a Japanese phrase.  mb_detect_encoding()
> reports it as SJIS.
> - I want to display this string in HTML without changing the encoding
> type (ISO-8859-1).

You can't. If you want to support Japanese in a web page, you have to
use an encoding that supports Japanese - UTF-8, Shift_JIS, EUC-JP, for
example.
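
For illustration, a minimal sketch of the "use an encoding that supports Japanese" route: re-encode the Shift_JIS text to UTF-8 and declare that charset for the page. The helper name to_utf8() and the sample string are assumptions made for this example, not something from the thread.

```php
<?php
// Hypothetical helper: re-encode text from a known source encoding to UTF-8.
function to_utf8($text, $from)
{
    return mb_convert_encoding($text, 'UTF-8', $from);
}

// Declare the page encoding before any output is sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

// Sample Shift_JIS input, built from a UTF-8 literal for illustration only.
$sjis = mb_convert_encoding('日本語', 'SJIS', 'UTF-8');
echo '<p>', to_utf8($sjis, 'SJIS'), '</p>', "\n";
```

This only helps if the whole page can be served as UTF-8; it does not apply when the page must stay ISO-8859-1.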

> - In order to accomplish this, I would like to encode the text as
> Unicode values.  I have seen a nice example of this in action here:
> http://www.animenewsnetwork.com/encyclopedia/anime.php?id=4 so I know
> that it is possible.

That page doesn't use Japanese, but HTML numeric entities:

&#31185;&#23398;&#24525;&#32773;&#38538;&#12460;&#12483;&#12481;&#12515;
&#12510;&#12531;

You need to find a method of converting Japanese to numeric entities if
that's the route you want to take. I have had it happen to me by
accident, but for anyone who can read Japanese, it's a complete
nightmare to work with, so it's unlikely to be a function in great
demand.

David Powers
*******************************************
No-nonsense reviews of computer books
http://japan-interface.co.uk/webdesign/books.html
Save 10% on TopStyle CSS Editor
*******************************************

--- End Message ---
--- Begin Message ---
Thanks for the quick response.  I suppose that I did not make myself very
clear.  I realize that this page's solution is to convert to Unicode
numeric values.  Since I want to do something very similar to this
particular page (i.e., I happen to have the Japanese name of something that I
would like to display, but the rest of the page is entirely English), this
seems to be the best solution.

What I can't seem to figure out is how to get the Unicode value for a
multibyte character.  I guess that I'm looking for the mb equivalent of
ord().  Any ideas?  Or is there a better/smarter way to accomplish what I'm
after?
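
One possible way to get an ord()-like value for a multibyte character, sketched under assumptions: convert the character to UCS-4BE, where every character is one 4-byte big-endian integer, and unpack that integer. The helper name codepoint_of() is hypothetical, and the sample character is built from a UTF-8 literal only so the example is self-contained.

```php
<?php
// Hypothetical helper: returns the Unicode code point of the first
// character in $char, which is assumed to be encoded as $encoding.
function codepoint_of($char, $encoding)
{
    // UCS-4BE stores each character as one 4-byte big-endian integer.
    $ucs4 = mb_convert_encoding($char, 'UCS-4BE', $encoding);
    $parts = unpack('N', substr($ucs4, 0, 4));
    return $parts[1];
}

// Build a Shift_JIS sample from a UTF-8 literal for illustration only.
$sjis = mb_convert_encoding('科', 'SJIS', 'UTF-8');
echo '&#' . codepoint_of($sjis, 'SJIS') . ';', "\n"; // prints &#31185;
```

The printed value matches the first numeric entity quoted from the page in question.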

"David Powers" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> Inwards wrote:
> >
> > - I have a string containing a Japanese phrase.  mb_detect_encoding()
> > reports it as SJIS.
> > - I want to display this string in HTML without changing the encoding
> > type (ISO-8859-1).
>
> You can't. If you want to support Japanese in a web page, you have to
> use an encoding that supports Japanese - UTF-8, Shift_JIS, EUC-JP, for
> example.
>
> > - In order to accomplish this, I would like to encode the text as
> > Unicode values.  I have seen a nice example of this in action here:
> > http://www.animenewsnetwork.com/encyclopedia/anime.php?id=4 so I know
> > that it is possible.
>
> That page doesn't use Japanese, but HTML numeric entities:
>
> &#31185;&#23398;&#24525;&#32773;&#38538;&#12460;&#12483;&#12481;&#12515;
> &#12510;&#12531;
>
> You need to find a method of converting Japanese to numeric entities if
> that's the route you want to take. I have had it happen to me by
> accident, but for anyone who can read Japanese, it's a complete
> nightmare to work with, so it's unlikely to be a function in great
> demand.
>
> David Powers
> *******************************************
> No-nonsense reviews of computer books
> http://japan-interface.co.uk/webdesign/books.html
> Save 10% on TopStyle CSS Editor
> *******************************************
>
>
>



--- End Message ---
--- Begin Message ---
On Thu, 31 Jul 2003, David Powers wrote:

> You need to find a method of converting Japanese to numeric entities if
> that's the route you want to take. 

Yudit and the accompanying package called uniconv can do that. 

> I have had it happen to me by
> accident, but for anyone who can read Japanese, it's a complete
> nightmare to work with, so it's unlikely to be a function in great
> demand.

Not necessarily, but I would also advise avoidance of these 
special characters, if possible. Murphy's Law states that they 
_will_ appear as themselves, not as Japanese characters, at some 
time or another. :)



-- 
Tony Laszlo
http://www.issho.org/modules.php?op=modload&name=phpWiki&file=index&pagename=La$
(going for the record - blog with the longest URL) 




--- End Message ---
--- Begin Message ---
Thanks for the tip.  I took a look at this but I finally managed to figure
out how to do this in PHP natively.

Just in case folks are interested:

// Detect the encoding once and reuse the result below.
$encoding = mb_detect_encoding($japanese_string);
if ($encoding == "SJIS") {
    // Map: first code point, last code point, offset, bitmask.
    $convmap = array(0x0000, 0xffff, 0, 0xffff);
    $str = mb_encode_numericentity($japanese_string, $convmap, $encoding);
    print $str;  // print encoded string
}
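
A variation on the snippet above, offered as a sketch rather than a recommendation: starting the conversion map at 0x80 leaves plain ASCII untouched, so only the Japanese characters become entities. The helper name to_entities() is hypothetical, and the sample is UTF-8 for convenience; for the Shift_JIS string in this thread, 'SJIS' would be passed as the encoding instead.

```php
<?php
// Hypothetical helper: encode only non-ASCII characters as decimal entities.
function to_entities($text, $encoding)
{
    // Map: first code point, last code point, offset, bitmask.
    // Starting at 0x80 leaves plain ASCII (letters, markup) untouched.
    $convmap = array(0x80, 0xffff, 0, 0xffff);
    return mb_encode_numericentity($text, $convmap, $encoding);
}

$encoded = to_entities('Gatchaman: 科学', 'UTF-8');
echo $encoded, "\n"; // prints Gatchaman: &#31185;&#23398;

// mb_decode_numericentity() reverses the mapping with the same map.
$decoded = mb_decode_numericentity($encoded, array(0x80, 0xffff, 0, 0xffff), 'UTF-8');
```

With a map that starts at 0x0000, ASCII characters are converted as well, which is harmless for display but makes the output longer.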

"Tony Laszlo" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> On Thu, 31 Jul 2003, David Powers wrote:
>
> > You need to find a method of converting Japanese to numeric entities if
> > that's the route you want to take.
>
> Yudit and the accompanying package called uniconv can do that.
>
> > I have had it happen to me by
> > accident, but for anyone who can read Japanese, it's a complete
> > nightmare to work with, so it's unlikely to be a function in great
> > demand.
>
> Not necessarily, but I would also advise avoidance of these
> special characters, if possible. Murphy's Law states that they
> _will_ appear as themselves, not as Japanese characters, at some
> time or another. :)
>
>
>
> --
> Tony Laszlo
> http://www.issho.org/modules.php?op=modload&name=phpWiki&file=index&pagename=La$
> (going for the record - blog with the longest URL)
>
>
>



--- End Message ---
--- Begin Message ---

With the following code to translate messages into French, we need to put the .mo files in a directory like

./local/xxx/LC_MESSAGES/messages.{mo,po}

        putenv("LANGUAGE=french");
        setlocale(LC_ALL, 'fr_BE');
        bindtextdomain("messages", "./local");
        textdomain("messages");
        echo '<br>' . _("Yes");

On one Linux web server, xxx must be 'fr' (and LANGUAGE must be set).
On another Linux web server, xxx must be 'french' (no variable to set).

How can I work out which xxx to use, so that my code works on
any web server? Is there a solution other than keeping three or more
directories like 'fr', 'french', 'fr_BE' and copying the files into each?
In fact, I'd like to make it work on Windows servers too...

And how can I work out which environment variables to set (or not), and in
which order (LC_ALL, LANG, LANGUAGE, ...)? Is there a way
to make the code work nearly everywhere?
I'll use a class to hide that complexity.

Can someone help? The PHP documentation is far from being
clear and precise enough in that particular field.
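
One partial approach to the question above, sketched under assumptions: setlocale() accepts an array of candidate names and returns the first one the system accepts, so the code can at least discover which spelling is installed. The helper name pick_locale() and the candidate list are hypothetical, and which directory name gettext then looks up still depends on the platform.

```php
<?php
// Hypothetical helper: try several spellings of the same locale and
// return the first one this system accepts, or false if none is installed.
function pick_locale(array $candidates)
{
    // setlocale() accepts an array and tries each entry in order.
    $chosen = setlocale(LC_ALL, $candidates);
    if ($chosen !== false) {
        // Export the result too; some gettext builds read the environment.
        putenv('LC_ALL=' . $chosen);
    }
    return $chosen;
}

// The candidate list is an assumption; adjust it to the locales you target.
$locale = pick_locale(array('fr_BE.UTF-8', 'fr_BE', 'fr_FR', 'fr', 'french'));
if ($locale === false) {
    echo "No French locale installed; messages stay untranslated.\n";
} elseif (function_exists('bindtextdomain')) {
    // Same calls as in the message above, run only if gettext is loaded.
    bindtextdomain('messages', './local');
    textdomain('messages');
    echo _('Yes'), "\n";
}
```

The returned name can be logged on each server to see which spelling that platform uses before deciding how to name the directories.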

---
Christophe Chisogne
Developper, Publicityweb sprl
http://www.publicityweb.com
Tel +32(0)61/27.14.80


--- End Message ---
