Yes, Microsoft's documentation is often loose with respect to encodings. E.g., it claims code page 932 and Shift_JIS are the same when they aren't.
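The 932 vs. Shift_JIS point can be checked quickly. The following is a minimal sketch, assuming Python's built-in `shift_jis` and `cp932` codecs are acceptable stand-ins for the two charsets; the helper name `compare` is hypothetical and exists only for this illustration. It says nothing about how mbstring or ICU map the same bytes.

```python
# Sketch only: relies on Python's bundled codecs, not on mbstring or ICU.
def compare(raw):
    """Decode the same bytes as Shift_JIS and as Windows code page 932."""
    return raw.decode("shift_jis"), raw.decode("cp932")


# 0x81 0x60 is the wave-dash position in the JIS X 0208 part of both charsets,
# but the two codecs map it to different Unicode code points.
sjis_char, cp932_char = compare(b"\x81\x60")
print(f"shift_jis -> U+{ord(sjis_char):04X}, cp932 -> U+{ord(cp932_char):04X}")

# Vendor extension characters such as CIRCLED DIGIT ONE (U+2460) are
# encodable in cp932 but not in plain Shift_JIS.
circled_one = "\u2460"
cp932_bytes = circled_one.encode("cp932")
try:
    circled_one.encode("shift_jis")
    sjis_ok = True
except UnicodeEncodeError:
    sjis_ok = False
print("cp932 bytes:", cp932_bytes.hex(), "| encodable as shift_jis:", sjis_ok)
```

Either difference alone is enough to break a round trip when data labelled one way is handled with the other charset's table.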
I'll look for confirmation from Kitazaki-san that ISO_2022,locale=ja,version=3 works.

tex

-----Original Message-----
From: Moriyoshi Koizumi [mailto:m...@mozo.jp]
Sent: Tuesday, February 02, 2010 11:32 PM
To: Tex Texin
Cc: KITAZAKI Shigeru; php-i18n@lists.php.net
Subject: Re: [PHP-I18N] adding GB18030 support for mbstring

That is not correct. The .NET names here are also used internally in MS
products, as are the code pages, and they don't necessarily reflect the
actual codeset defined in the IANA charset registry, even if the names look
the same. Look at "additional information" for the differences.

Moriyoshi

On Wed, Feb 3, 2010 at 4:16 PM, Tex Texin <texte...@xencraft.com> wrote:
> Yes- 50220 is just normal ISO-2022-JP:
> http://msdn.microsoft.com/en-us/library/dd317756(VS.85).aspx
>
>
> -----Original Message-----
> From: Moriyoshi Koizumi [mailto:m...@mozo.jp]
> Sent: Tuesday, February 02, 2010 10:54 PM
> To: KITAZAKI Shigeru
> Cc: php-i18n@lists.php.net
> Subject: Re: [PHP-I18N] adding GB18030 support for mbstring
>
> It just turned out ISO_2022,locale=ja,version=3 is actually ISO-2022-JP-MS.
>
> Moriyoshi
>
> On Wed, Feb 3, 2010 at 10:22 AM, Moriyoshi Koizumi <m...@mozo.jp> wrote:
>> None of them can handle CP50220.
>>
>> Moriyoshi
>>
>> 2010/2/3 Tex Texin <texte...@xencraft.com>:
>>> icu has at least 5 versions of iso 2022-jp.
>>>
>>> http://demo.icu-project.org/icu-bin/convexp
>>>
>>> If the one you refer to is not one of these, send me the details and
>>> I'll log it with the icu team.
>>>
>>> tex
>>>
>>>
>>> -----Original Message-----
>>> From: KITAZAKI Shigeru [mailto:shigeru_kitaz...@cybozu.co.jp]
>>> Sent: Tuesday, February 02, 2010 4:43 AM
>>> To: Moriyoshi Koizumi
>>> Cc: php-i18n@lists.php.net
>>> Subject: Re: [PHP-I18N] adding GB18030 support for mbstring
>>>
>>> Koizumi-san
>>>
>>> Let me raise one concern about mbstring-ng.
>>> The current mbstring supports 'ISO-2022-JP-MS', which is different from
>>> 'ISO-2022-JP'. And the current implementation of ICU cannot convert
>>> between ISO-2022-JP-MS and Unicode correctly, I guess.
>>> For example, Japanese hankaku katakana, GA, A with a sonant mark.
>>>
>>> Although modifying ICU itself would be the better way, it would take a
>>> long time. What do you think of this?
>>>
>>> Moriyoshi Koizumi wrote:
>>>> BTW, I created an extension that is near-compatible with mbstring and
>>>> based on ICU that of course supports GB18030. See
>>>> http://github.com/moriyoshi/mbstring-ng for detail.
>>>>
>>>
>>> Regards,
>>> Shigeru
>>>
>>> --
>>> PHP Unicode & I18N Mailing List (http://www.php.net/)
>>> To unsubscribe, visit: http://www.php.net/unsub.php
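Kitazaki-san's hankaku katakana example can be illustrated roughly as follows. This is a minimal sketch, assuming Python's built-in `iso2022_jp` codec as a stand-in for plain ISO-2022-JP and reading "GA" as half-width KA followed by the half-width voiced sound mark; the helper `encodable` is hypothetical. The NFKC folding shown is just one possible workaround and is not claimed to be what mbstring, ICU, or any Microsoft code page actually does.

```python
import unicodedata

# Half-width KA (U+FF76) followed by the half-width voiced sound mark (U+FF9E).
HALFWIDTH_GA = "\uff76\uff9e"


def encodable(text, codec):
    """Return True if `text` can be encoded with the named Python codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Python's plain ISO-2022-JP codec has no half-width katakana repertoire.
plain_ok = encodable(HALFWIDTH_GA, "iso2022_jp")

# NFKC folds the half-width pair into the single full-width character GA.
folded = unicodedata.normalize("NFKC", HALFWIDTH_GA)
folded_ok = encodable(folded, "iso2022_jp")
roundtrip = folded.encode("iso2022_jp").decode("iso2022_jp")

print("half-width encodable:", plain_ok)
print("after NFKC: U+%04X, encodable: %s" % (ord(folded), folded_ok))
```

A converter that silently folds, rejects, or escapes such characters will produce different bytes in each case, which is why the exact variant behind a charset name matters here.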