php-i18n Digest 1 Dec 2003 15:36:30 -0000 Issue 206

Topics (messages 641 through 644):

Re: new to internationalization programming
        641 by: Rasmus Lerdorf
        642 by: Moriyoshi Koizumi
        643 by: Moriyoshi Koizumi

GNU gettext support for PHP programs
        644 by: Bruno Haible


----------------------------------------------------------------------
--- Begin Message ---
Are you sure you will be using UTF8?  The native charset everyone uses in
Japan is EUC-JP and in Korea it is EUC-KR.  This of course doesn't mean
that UTF8 won't work, but you may want to doublecheck on that.

As far as PHP is concerned you can work in almost any character set you
want, including UTF8, in your PHP script itself.  User input can come in
from the browser in even more character sets and the output can be any of
a long list as well.
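
As a rough illustration of the point above (text arriving in one charset
and leaving in another), here is a minimal sketch. It is written in Python
rather than PHP only so that it runs without any extension; PHP's mbstring
extension offers mb_convert_encoding() for the same job. The helper name
convert() and the sample strings are made up for this example.

```python
def convert(data: bytes, from_charset: str, to_charset: str) -> bytes:
    """Re-encode raw bytes from one charset to another via Unicode."""
    return data.decode(from_charset).encode(to_charset)


# Pretend these octets came from browser forms submitted as EUC-KR / EUC-JP.
korean_in = "한국어".encode("euc_kr")
japanese_in = "日本語".encode("euc_jp")

# Normalise everything to UTF-8 before storing or printing it.
korean_utf8 = convert(korean_in, "euc_kr", "utf-8")
japanese_utf8 = convert(japanese_in, "euc_jp", "utf-8")

# Same characters, different octets: show both forms as hex.
print(korean_in.hex(), "->", korean_utf8.hex())
print(japanese_in.hex(), "->", japanese_utf8.hex())
```

The important part is that the source charset has to be known (or agreed
on) up front; guessing it from the octets alone is unreliable.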

For more info I suggest reading through http://php.net/mbstring

-Rasmus

On Fri, 28 Nov 2003, Ligaya Turmelle wrote:

> Hi I am just starting a project that will have to handle English, Japanese
> and Korean characters.  I have never programmed for an international group
> before and am unsure how to begin.  I will be using a HTML(UTF8)/PHP front
> end with a MySQL DB in the back.  I have read over the information about
> multi-byte string functions for PHP on PHP.net and still am very confused
> about what I have to do to start using these functions.  Also I am confused
> about what (if anything) I have to do with the MySQL DB.  This seems to be
> the only place to go to ask questions.  I'm sorry if I ask silly or stupid
> questions in the future and ask for your patience.
>
> Can anyone tell me where to go to get information, possibly view code
> snippets, or a forum I could join.  Where is a good place to start?
>
> --
> PHP Internationalization Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
>

--- End Message ---
--- Begin Message ---
First off, you can take a quick look over the archive of this list.
There should be a lot of useful information.

Moriyoshi

p.s. IIRC, the current stable version of MySQL doesn't natively
support UTF-8 encoding. It'd be better to also consider using
PostgreSQL if you really need to use UTF-8 in your web application.

On 2003/11/28, at 11:37, Ligaya Turmelle wrote:

> Hi I am just starting a project that will have to handle English, Japanese
> and Korean characters. I have never programmed for an international group
> before and am unsure how to begin. I will be using a HTML(UTF8)/PHP front
> end with a MySQL DB in the back. I have read over the information about
> multi-byte string functions for PHP on PHP.net and still am very confused
> about what I have to do to start using these functions. Also I am confused
> about what (if anything) I have to do with the MySQL DB. This seems to be
> the only place to go to ask questions. I'm sorry if I ask silly or stupid
> questions in the future and ask for your patience.
>
> Can anyone tell me where to go to get information, possibly view code
> snippets, or a forum I could join.  Where is a good place to start?


--- End Message ---
--- Begin Message ---
On 2003/11/28, at 17:45, Rasmus Lerdorf wrote:


> Are you sure you will be using UTF8? The native charset everyone uses in
> Japan is EUC-JP and in Korea it is EUC-KR. This of course doesn't mean
> that UTF8 won't work, but you may want to doublecheck on that.

Just to clarify, there are several standards for character sets and
encodings in Japan.

The most commonly used encoding is Shift_JIS, because it has been used
in numerous products since localized CP/M was introduced in Japan.
(In fact, it became popular when the first localized MS-DOS came out.)

However, the ISO-2022-JP encoding is often used for internet message
transport because RFC 1468 standardized it.
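
To see why ISO-2022-JP suits mail transport, the following minimal sketch
(Python, using its bundled iso2022_jp codec; the sample string is just an
example) checks that the encoded form contains only 7-bit octets and is
framed by the escape sequences that switch between ASCII and JIS X 0208.

```python
text = "日本語"
encoded = text.encode("iso2022_jp")

# Every octet is below 0x80, so the text survives 7-bit mail transport.
is_seven_bit = all(octet < 0x80 for octet in encoded)

# ESC $ B switches to JIS X 0208; ESC ( B switches back to ASCII.
starts_with_kanji_escape = encoded.startswith(b"\x1b$B")
ends_in_ascii = encoded.endswith(b"\x1b(B")

print(encoded, is_seven_bit, starts_with_kanji_escape, ends_in_ascii)
```

Because the encoding is stateful, such text cannot be cut at an arbitrary
octet without also looking at the escape sequence that precedes it.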

On *nix (or similar) platforms, unlike Windows, you may be able to
choose among Shift_JIS, EUC-JP and ISO-2022-JP for the system locale
charset. There, EUC-JP has generally been preferred, because the
characteristics of its encoding scheme often make it easier to port a
non-multilingual application to Japanese with EUC-JP than with the
others.

There are lots of non-i18n'ed open source products used in Japan.
PHP is no exception. It is actually incapable of handling several
encodings such as Shift_JIS (CP932), CP936 (often wrongly referred to
as GB2312) or CP949 (a Microsoft variant of EUC-KR) without magic [1].
This is because those encodings represent some characters as a lead
octet followed by an octet that equals an ASCII special character such
as "\" (0x5C), which has a particular meaning in PHP's syntax, and this
often leads PHP into unexpected behaviour [2].
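
The effect is easy to reproduce outside PHP. The sketch below is Python,
chosen only so it runs standalone. It uses a deliberately naive,
charset-unaware escaping helper (naive_addslashes() is a made-up name, not
PHP's implementation) and shows that it damages CP932 text while leaving
the same text in EUC-JP alone.

```python
def naive_addslashes(data: bytes) -> bytes:
    """Escape backslashes octet by octet, with no knowledge of the charset."""
    return data.replace(b"\\", b"\\\\")


text = "表示"
sjis = text.encode("cp932")
eucjp = text.encode("euc_jp")

# In CP932 the second octet of the first character is 0x5C, the ASCII "\".
print(sjis.hex(), eucjp.hex())

escaped = naive_addslashes(sjis)
damaged = escaped.decode("cp932")  # a stray backslash now sits inside the text

# EUC-JP never uses octets below 0x80 inside a multi-byte character.
untouched = naive_addslashes(eucjp) == eucjp

print(len(text), len(damaged), untouched)
```

A charset-aware routine has to walk the string character by character
instead of octet by octet, which is the kind of work the "magic" mentioned
above takes care of.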

On the other hand, lots of people tend to think Unicode is the
perfect solution for creating a fully internationalised application.
But that is hardly the case, because OS vendors use different mapping
tables between the native character sets and Unicode, which has led to
a fair amount of confusion.

Well, maybe enough said. If you want to know more, Ken Lunde's
CJKV Information Processing [3], published by O'Reilly, would
definitely help.

Moriyoshi

[1] The magic can now be enabled by passing --enable-zend-multibyte to
configure. Let's thank Masaki Fujimoto for his effort :)


[2] A typical example can be seen in
http://news.php.net/article.php?group=php.i18n&article=633

[3] You can reach Ken Lunde's homepage at http://www.praxagora.com/lunde/ .

> As far as PHP is concerned you can work in almost any character set you
> want, including UTF8, in your PHP script itself. User input can come in
> from the browser in even more character sets and the output can be any of
> a long list as well.
>
> For more info I suggest reading through http://php.net/mbstring
>
> -Rasmus
>
> On Fri, 28 Nov 2003, Ligaya Turmelle wrote:
>
> > Hi I am just starting a project that will have to handle English, Japanese
> > and Korean characters. I have never programmed for an international group
> > before and am unsure how to begin. I will be using a HTML(UTF8)/PHP front
> > end with a MySQL DB in the back. I have read over the information about
> > multi-byte string functions for PHP on PHP.net and still am very confused
> > about what I have to do to start using these functions. Also I am confused
> > about what (if anything) I have to do with the MySQL DB. This seems to be
> > the only place to go to ask questions. I'm sorry if I ask silly or stupid
> > questions in the future and ask for your patience.
> >
> > Can anyone tell me where to go to get information, possibly view code
> > snippets, or a forum I could join.  Where is a good place to start?




--- End Message ---
--- Begin Message ---
Hi,

The just-released GNU gettext 0.13 has improved support for PHP programs:

  * "xgettext --language=PHP" now supports the plural handling functions
    ngettext, dngettext, dcngettext (introduced in PHP 4.2.0).

  * An example demonstrating the use of GNU gettext with PHP is shipped and
    installed at $prefix/share/doc/gettext/examples/hello-php.
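
For readers who have not used the plural functions: ngettext takes a
singular msgid, a plural msgid and a count, and returns the form that
fits the count (PHP's ngettext() takes the same three arguments). Below
is a minimal sketch using Python's gettext module, with NullTranslations
standing in for a real catalogue so that it runs without any .mo file.
The message strings and the describe() helper are made up for this
example.

```python
import gettext

# NullTranslations stands in for a real message catalogue: it returns the
# untranslated English strings, choosing singular or plural from the count.
catalog = gettext.NullTranslations()


def describe(count: int) -> str:
    """Build a count-dependent message the way ngettext-based code does."""
    template = catalog.ngettext("%d message", "%d messages", count)
    return template % count


for n in (0, 1, 2):
    print(describe(n))
```

With a real catalogue, the plural rule comes from the translation file,
so languages with more than two plural forms work with the same call.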

Unfortunately, GNU gettext is not yet ready to be used in a multithreaded
environment where each thread may need to use a different locale/language.

URL: http://ftp.gnu.org/gnu/gettext/gettext-0.13.tar.gz

Enjoy!

                                Bruno

--- End Message ---
