Unrelated - We are looking for people who will contribute unit tests to PHP 5.3 for ext/mbstring, esp. input encoding conversion (Shift-JIS, etc.). Any volunteers, please contact internals@
Andi

> -----Original Message-----
> From: Tomas Kuliavas [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, March 25, 2008 8:49 PM
> To: php-i18n@lists.php.net
> Subject: Re: [PHP-I18N] Re: Problems with mime encoding of Japanese
> Characters in Subject and 'From:' etc. fields.
>
> >>> Hi,
> >>> I try to send messages written in Japanese (Kana/Kanji) with php.
> >>>
> >>> Everything works fine - only when the subject (or the name of the
> >>> sender) becomes longer, there seems to be something wrong with the
> >>> encoding: Neither my nor the mail reader of other Japanese friends is
> >>> able to decode the mime string. At the place of the Japanese
> >>> Characters, the mime string itself is displayed.
> >>>
> >>> As this doesn't happen for other Japanese emails with even long
> >>> subjects, I suppose I did something wrong...
> >>>
> >>> When using the corresponding php mb_* functions to decode the string
> >>> back, sometimes the correct original string and sometimes meaningless
> >>> characters are shown.
> >>>
> >>> Here how I convert the subject (the name is converted using the same
> >>> method and the sources are saved in UTF-8 using emacs):
> >>>
> >>> $subjectJIS = mb_convert_encoding($subject, "ISO-2022-JP", "AUTO");
> >>> $subjectMIME = mb_encode_mimeheader($subjectJIS, "ISO-2022-JP", "B");
> >>> ...snip...
> >>> mail($to, $subjectMIME, $bodyJIS, $headers);
> >>>
> >>> Here part of the message as it is displayed by my mail program:
> >>>
> >>> From:
> >>> =?ISO-2022-JP?B?GyRCJCskSjRBO3okKyRKNEE7eiQrJEo0QTt6JCskSjRBO3okKyRKNEE7?==?ISO-2022-JP?B?eiQrJEo0QTt6JCskSjRBO3okKyRKNEE7eiQrJEo0QTt6JCskSjRBO3ob?=(B
> >>> <[EMAIL PROTECTED]>
> >>> ...snip...
> >>> Subject:
> >>> =?ISO-2022-JP?B?GyRCJCskSjRBO3okKyRKNEE7eiQrJEo0QTt6JCskSjRBO3okKyRKNEE7?=
> >>> =?ISO-2022-JP?B?eiQrJEo0QTt6JCskSjRBO3okKyRKNEE7eiQrJEo0QTt6JCskSjRBO3ob?=
> >>> (B
> >> ...
> >>> If anybody can explain me the problem I would be most gratefull :)
> > I have seen this problem in a few mail clients My solution in the past
> > has been to merge the 2 encoding strings into a single encoding string
> > to avoid the client getting messed when it sees the second
> > "=?ISO-2022-JP" in the Header line. (this is really a big problem for
> > Apple iMail-I have seen it regardless of the programming language used)
>
> Again RFC2047.
> ---
> An 'encoded-word' may not be more than 75 characters long, including
> 'charset', 'encoding', 'encoded-text', and delimiters.
> ---
>
> >>
> >> You forgot to mention your PHP version, configure options related to
> >> mbstring and php mbstring configuration.
> >>
> >> Could you explain why Japanese are so obsessed with ISO-2022-JP? Why
> >> can't you just send it in Base64 encoded UTF-8?
> >>
> > Some brain dead ISPs/Mobile services here in Japan only support
> > ISO-2022-JP.
>
> Do they need another four black ships in order to change things?
>
> ISO-2022 texts can be encoded correctly, but it is harder to implement
> than iso-8859 or utf-8/utf-16 mime encoding. I suggest sending text in
> utf-8 and asking brain dead ISPs to fix their software. Even if it is
> DoCoMo. If Dietrich uses script in some html form, he does not know if
> text submitted in that form is Japanese.
>
> Instead of
> ----
> $subjectJIS = mb_convert_encoding($subject, "ISO-2022-JP", "AUTO");
> $subjectMIME = mb_encode_mimeheader($subjectJIS, "ISO-2022-JP", "B");
> ----
> do
> ----
> mb_internal_encoding('utf-8');
> $subjectMIME = mb_encode_mimeheader($subject, "utf-8", "B");
> ----
>
> --
> Tomas
>
> --
> PHP Unicode & I18N Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
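
For illustration, the RFC 2047 rule quoted above (an 'encoded-word' of at most 75 characters, delimiters included) can be sketched in plain PHP. This is not how mb_encode_mimeheader() works internally; encode_header_utf8() and its $maxWordLen parameter are hypothetical names introduced here, and the sketch assumes the input is valid UTF-8 and that Base64 ("B") encoding is wanted.

```php
<?php
// Hypothetical helper, NOT part of PHP or ext/mbstring: a minimal sketch that
// splits a UTF-8 string into RFC 2047 "B" encoded-words of limited length.
function encode_header_utf8(string $text, int $maxWordLen = 75): string
{
    $prefix = '=?UTF-8?B?';
    $suffix = '?=';

    // Base64 characters that fit in one encoded-word, rounded down to a
    // multiple of 4 so that every chunk is a complete Base64 unit.
    $maxB64 = intdiv($maxWordLen - strlen($prefix) - strlen($suffix), 4) * 4;
    // Raw bytes that those Base64 characters can carry (3 bytes per 4 chars).
    $maxBytes = intdiv($maxB64, 4) * 3;

    // Split on character boundaries so no multibyte sequence is cut in half.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    $words = [];
    $chunk = '';
    foreach ($chars as $char) {
        // Start a new encoded-word when the next character would not fit.
        if ($chunk !== '' && strlen($chunk) + strlen($char) > $maxBytes) {
            $words[] = $prefix . base64_encode($chunk) . $suffix;
            $chunk = '';
        }
        $chunk .= $char;
    }
    if ($chunk !== '') {
        $words[] = $prefix . base64_encode($chunk) . $suffix;
    }

    // Adjacent encoded-words are separated by folding whitespace (CRLF + space).
    return implode("\r\n ", $words);
}
```

A call such as encode_header_utf8($subject) yields one encoded-word per folded line, each decodable on its own. RFC 2047 also limits every header line containing an encoded-word to 76 characters, which this sketch does not track; passing a smaller $maxWordLen (for example 75 minus the length of "Subject: ") is one way to leave room for the header name.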