php-i18n Digest 4 Aug 2003 10:07:26 -0000 Issue 186
Topics (messages 594 through 598):
Re: Encoding Japanese characters.
594 by: David Powers
595 by: Inwards
596 by: Tony Laszlo
597 by: Inwards
Making i18n work on all Unix webservers ?
598 by: Chichi
----------------------------------------------------------------------
--- Begin Message ---
Inwards wrote:
>
> - I have a string containing a Japanese phrase. mb_detect_encoding()
> reports it as SJIS.
> - I want to display this string in HTML without changing the encoding
> type (ISO-8859-1).
You can't. If you want to support Japanese in a web page, you have to
use an encoding that supports Japanese - UTF-8, Shift_JIS, EUC-JP, for
example.
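A minimal sketch of that route, assuming a current PHP (7 or later) with the mbstring extension, and assuming the page can be served as UTF-8. The helper name sjis_to_utf8 and the sample bytes are illustrative and not from the thread.

```php
<?php
// Hypothetical helper: re-encode a Shift_JIS string as UTF-8.
function sjis_to_utf8(string $sjis): string
{
    // Name the source encoding explicitly instead of relying on detection.
    return mb_convert_encoding($sjis, 'UTF-8', 'SJIS');
}

// Sample input: the two characters U+65E5 U+672C as Shift_JIS bytes.
$sjis = "\x93\xFA\x96\x7B";
$utf8 = sjis_to_utf8($sjis);

// Declare the page encoding before any output is sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}
echo htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8'), "\n";
```

Once the whole page is UTF-8, English and Japanese text can sit side by side with no entity encoding.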
> - In order to accomplish this, I would like to encode the text as
> Unicode values. I have seen a nice example of this in action here:
> http://www.animenewsnetwork.com/encyclopedia/anime.php?id=4 so I know
> that it is possible.
That page doesn't use Japanese, but HTML numeric entities:
科学忍者隊ガッチャマン
You need to find a method of converting Japanese to numeric entities if
that's the route you want to take. I have had it happen to me by
accident, but for anyone who can read Japanese, it's a complete
nightmare to work with, so it's unlikely to be a function in great
demand.
David Powers
--- End Message ---
--- Begin Message ---
Thanks for the quick response. I suppose that I did not make myself very
clear. I realize that this page's solution is to convert to Unicode
numeric values. Since I want to do something very similar to this
particular page (i.e., I happen to have the Japanese name of something that I
would like to display, but the rest of the page is entirely English), this
seems to be the best solution.
What I can't seem to figure out is how to get the Unicode value for a
multibyte character. I guess that I'm looking for the mb equivalent of
ord(). Any ideas? Or is there a better/smarter way to accomplish what I'm
after?
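One possible answer, as a sketch only: convert the string to UCS-4BE, where every character is a single 4-byte big-endian integer, and unpack those integers. The helper name code_points and the sample bytes are assumptions for illustration; it needs the mbstring extension and PHP 7 or later.

```php
<?php
// Hypothetical helper: list the Unicode code points of a string.
function code_points(string $text, string $encoding): array
{
    // UCS-4BE stores each character as one 4-byte big-endian integer.
    $ucs4 = mb_convert_encoding($text, 'UCS-4BE', $encoding);
    // unpack() returns a 1-based array; reindex it from 0.
    return array_values(unpack('N*', $ucs4));
}

// Sample input: two kanji as Shift_JIS bytes (93 FA 96 7B).
$points = code_points("\x93\xFA\x96\x7B", 'SJIS');

// Print each code point in the usual U+XXXX notation.
$labels = array_map(function ($cp) {
    return sprintf('U+%04X', $cp);
}, $points);
echo implode(' ', $labels), "\n";
```

On PHP 7.2 or later, mb_ord() returns the code point of a single character directly.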
"David Powers" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> Inwards wrote:
> >
> > - I have a string containing a Japanese phrase. mb_detect_encoding()
> > reports it as SJIS.
> > - I want to display this string in HTML without changing the encoding
> > type (ISO-8859-1).
>
> You can't. If you want to support Japanese in a web page, you have to
> use an encoding that supports Japanese - UTF-8, Shift_JIS, EUC-JP, for
> example.
>
> > - In order to accomplish this, I would like to encode the text as
> > Unicode values. I have seen a nice example of this in action here:
> > http://www.animenewsnetwork.com/encyclopedia/anime.php?id=4 so I know
> > that it is possible.
>
> That page doesn't use Japanese, but HTML numeric entities:
>
> 科学忍者隊ガッチャ
> マン
>
> You need to find a method of converting Japanese to numeric entities if
> that's the route you want to take. I have had it happen to me by
> accident, but for anyone who can read Japanese, it's a complete
> nightmare to work with, so it's unlikely to be a function in great
> demand.
>
> David Powers
> *******************************************
> No-nonsense reviews of computer books
> http://japan-interface.co.uk/webdesign/books.html
> Save 10% on TopStyle CSS Editor
> *******************************************
>
>
>
--- End Message ---
--- Begin Message ---
On Thu, 31 Jul 2003, David Powers wrote:
> You need to find a method of converting Japanese to numeric entities if
> that's the route you want to take.
Yudit and the accompanying package called uniconv can do that.
> I have had it happen to me by
> accident, but for anyone who can read Japanese, it's a complete
> nightmare to work with, so it's unlikely to be a function in great
> demand.
Not necessarily, but I would also advise avoidance of these
special characters, if possible. Murphy's Law states that they
_will_ appear as themselves, not as Japanese characters, at some
time or another. :)
--
Tony Laszlo
http://www.issho.org/modules.php?op=modload&name=phpWiki&file=index&pagename=La$
(going for the record - blog with the longest URL)
--- End Message ---
--- Begin Message ---
Thanks for the tip. I took a look at this but I finally managed to figure
out how to do this in PHP natively.
Just in case folks are interested:
$encoding = mb_detect_encoding($japanese_string); // detect once and reuse
if ($encoding === "SJIS") {
    // Map every code point from U+0000 to U+FFFF to a numeric entity.
    $convmap = array(0x0000, 0xffff, 0, 0xffff);
    $str = mb_encode_numericentity($japanese_string, $convmap, $encoding);
    print $str; // print encoded string
}
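A variation on the same idea, offered as a sketch rather than a drop-in replacement: pass the source encoding explicitly and start the conversion map at U+0080, so plain ASCII (including any HTML markup) passes through untouched. The helper name to_numeric_entities and the sample string are hypothetical; PHP 7 or later with mbstring is assumed.

```php
<?php
// Hypothetical helper: turn non-ASCII characters into decimal entities.
function to_numeric_entities(string $text, string $encoding): string
{
    // Start at U+0080 so ASCII is left alone; the mask keeps all 21 bits.
    $convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($text, $convmap, $encoding);
}

// ASCII label followed by two kanji as Shift_JIS bytes.
$mixed = "Title: \x93\xFA\x96\x7B";
$encoded = to_numeric_entities($mixed, 'SJIS');
echo $encoded, "\n";
```

Because the result is pure ASCII, it can be embedded in an ISO-8859-1 page as-is.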
"Tony Laszlo" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> On Thu, 31 Jul 2003, David Powers wrote:
>
> > You need to find a method of converting Japanese to numeric entities if
> > that's the route you want to take.
>
> Yudit and the accompanying package called uniconv can do that.
>
> > I have had it happen to me by
> > accident, but for anyone who can read Japanese, it's a complete
> > nightmare to work with, so it's unlikely to be a function in great
> > demand.
>
> Not necessarily, but I would also advise avoidance of these
> special characters, if possible. Murphy's Law states that they
> _will_ appear as themselves, not as Japanese characters, at some
> time or another. :)
>
>
>
> --
> Tony Laszlo
>
http://www.issho.org/modules.php?op=modload&name=phpWiki&file=index&pagename
=La$
> (going for the record - blog with the longest URL)
>
>
>
--- End Message ---
--- Begin Message ---
With the following code to translate messages into French,
we need to put the .mo files in a directory like
./local/xxx/LC_MESSAGES/messages.{mo,po}
putenv("LANGUAGE=french");
setlocale(LC_ALL, 'fr_BE');
bindtextdomain("messages", "./local");
textdomain("messages");
echo '<br>' . _("Yes");
On one Linux webserver, xxx must be 'fr' (and LANGUAGE must be set).
On another Linux webserver, xxx must be 'french' (no variable to set).
How can I determine which xxx to use, so that my code works on
any webserver? Is there a solution other than keeping three or more
directories such as 'fr', 'french', and 'fr_BE', and copying the files
into each?
In fact, I'd like to make it work on Windows servers too...
And how can I tell which environment variables to set (or not), and in
which order (LC_ALL, LANG, LANGUAGE, ...)? Is there a way
to make the code work nearly everywhere?
I'll use a class to hide that complexity.
Can someone help? The PHP documentation is far from being
clear and precise enough in this particular area.
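One approach that may help, sketched under assumptions: setlocale() accepts an array of names and returns the first one the system recognizes, so the code can discover the accepted name instead of hard-coding it. The helper locale_candidates and its ordering are hypothetical, and this has not been checked against the servers described above. Which directory name gettext then searches still depends on the gettext implementation installed.

```php
<?php
// Hypothetical helper: candidate locale names, most specific first.
function locale_candidates(string $lang, string $territory, string $alias): array
{
    return [
        "{$lang}_{$territory}.UTF-8",
        "{$lang}_{$territory}",
        $lang,
        $alias, // long name such as "french", used by some systems
    ];
}

$candidates = locale_candidates('fr', 'BE', 'french');

// setlocale() tries each name in turn; it returns the accepted one or false.
$chosen = setlocale(LC_ALL, $candidates);

if ($chosen !== false) {
    // Export the accepted name so the gettext library sees the same value.
    putenv("LC_ALL=$chosen");
}

// Only call gettext functions when the extension is loaded.
if (function_exists('bindtextdomain')) {
    bindtextdomain('messages', './local');
    textdomain('messages');
}

echo 'setlocale chose: ', var_export($chosen, true), "\n";
```

Logging the chosen name on each server shows which value the system actually accepted, which is a starting point for deciding how to name the message directories.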
---
Christophe Chisogne
Developer, Publicityweb sprl
http://www.publicityweb.com
Tel +32(0)61/27.14.80
--- End Message ---