"a.h.s. boy" <[EMAIL PROTECTED]> wrote:

> I realized, somewhere in the midst of writing my last post, that UTF-8 
> was capable of displaying Japanese characters, but it doesn't seem to 
> be a particularly common choice. And as long as I was only expecting 
> single-byte languages, shouldn't Japanese input still cause potential 
> problems, especially with str functions?

That's the case, particularly for the mobile internet services offered 
in Japan, such as i-mode, J-SKY and EZweb. Recent PC browsers such as IE 
and Netscape, on the other hand, handle UTF-8 characters well enough; 
the exception is the old 4.x versions of Netscape Navigator.

There is also a database problem. In most cases you are unlikely to have 
trouble executing SQL statements that contain UTF-8 characters on a 
server configured for latin1 (ISO-8859-1) characters. But the string 
manipulation functions won't give correct results in that setup, because 
each byte is treated as a separate character. To solve this, you should 
use one of the Japanese encodings supported by MySQL, such as EUC-JP 
(ujis) or Shift_JIS (sjis).

In addition, since PHP's standard string functions treat every input 
string as a sequence of one-byte characters, you should use the mbstring 
functions to manipulate multi-byte strings. The mb_ functions can also 
handle various single-byte encodings.
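
To make the difference concrete, here is a minimal sketch. It assumes 
the mbstring extension is loaded; the sample string and the variable 
names are mine, chosen only for illustration. The Japanese text is 
written as escaped UTF-8 bytes so the result doesn't depend on how the 
source file is saved.

```php
<?php
// "Nihongo" (3 characters) as raw UTF-8 bytes, 3 bytes per character.
$text = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E";

// strlen() counts bytes; mb_strlen() counts characters in the given encoding.
$byteLength = strlen($text);
$charLength = mb_strlen($text, 'UTF-8');

// substr() cuts at a byte offset, so it can split a character in half.
$brokenPrefix = substr($text, 0, 2);
// mb_substr() cuts at a character offset and keeps characters intact.
$cleanPrefix = mb_substr($text, 0, 2, 'UTF-8');

// The mb_ functions work on single-byte encodings as well:
// "caf" followed by e-acute (0xE9) in ISO-8859-1 is 4 bytes and 4 characters.
$latinLength = mb_strlen("caf\xE9", 'ISO-8859-1');

echo $byteLength, "\n";                                      // 9
echo $charLength, "\n";                                      // 3
echo $latinLength, "\n";                                     // 4
var_dump(mb_check_encoding($brokenPrefix, 'UTF-8'));         // bool(false)
var_dump($cleanPrefix === "\xE6\x97\xA5\xE6\x9C\xAC");       // bool(true)
```

Passing the encoding explicitly on every call, as above, avoids 
depending on whatever internal encoding the server happens to be 
configured with.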

BTW, even if you want to make your websites ready for Japanese quickly, 
you should avoid mbstring's function overloading feature. It's likely to 
cause various hard-to-trace problems, because the overloaded functions 
differ slightly in their specs from the originals.
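
One kind of difference, as a rough sketch: code that relies on strlen() 
returning a byte count (for a Content-Length header, say) changes 
meaning once strlen() is overloaded to count characters. The example 
below runs without overloading and simply calls the function it means 
each time; the variable names are hypothetical, and it assumes mbstring 
is loaded.

```php
<?php
// Response body: 3 Japanese characters, 9 bytes in UTF-8.
$body = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E";

// Byte count, e.g. for a Content-Length header.
// Without overloading, strlen() always counts bytes.
$contentLength = strlen($body);

// The '8bit' encoding makes mb_strlen() count bytes explicitly,
// which states the intent regardless of configuration.
$contentLength8bit = mb_strlen($body, '8bit');

// Character count, e.g. for a "max 10 characters" input check.
$charCount = mb_strlen($body, 'UTF-8');

echo $contentLength, "\n";     // 9
echo $contentLength8bit, "\n"; // 9
echo $charCount, "\n";         // 3
```

Calling the mb_ functions explicitly where you want characters, and the 
plain functions where you want bytes, keeps each call's meaning visible 
in the code instead of hidden in php.ini.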

Moriyoshi


-- 
PHP Internationalization Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
