On Sun, Jan 05, 2003 at 03:37:47PM -0000, David Powers wrote:
> > On Sun, Jan 05, 2003 at 12:20:51AM -0000, David Powers wrote:
> >> This is a cut-down version of what I now have in my php.ini. As you
> >> will see, I have commented out the output_handler line. When
> >> enabled, all I got was mojibake.
> >>
> >> output_buffering = On
> >> ;output_handler = mb_output_handler
> >
> > I see no problem in this.
> 
> That brings me back to my original query. The PHP documentation says
> SJIS users should set output_handler to mb_output_handler. Doing so
> results in mojibake. Turning it off (by commenting it out with the
> semi-colon) is the only way I can get my pages to display correctly. So,
> either there is a mistake in the documentation or the explanation of
> SJIS users needs to be clarified.
(snip)
> This would seem to add an unnecessary level of complication. I am using
> PHP in combination with MySQL to provide an online database in both
> Japanese and English. All input is done through a browser interface over
> the internet, and most - if not all - users are on Windows. PHP seems to
> do an excellent job of conversion without adding a further layer.

As I said in my previous mail, the mojibake occurs because you are composing
your pages in Shift_JIS, whereas you are actually supposed to use EUC-JP.

In most cases PHP is likely to process Shift_JIS encoded pages without
problems, but sometimes it produces a buggy result that gives you little
indication of what went wrong. This is because several (not many) Shift_JIS
kanji characters consist of a lead byte of the double-byte character set
followed by '\' (byte 0x5C, backslash / yen sign), and '\' is also used to
form escape sequences in string literals enclosed in single or double
quotes. The same problem is known to occur with other East Asian (CJKV)
character sets such as CP936 (a superset of GB2312 adopted by Microsoft;
also known as GBK), GB18030 (a large character set defined as a Chinese
national standard), and BIG5 (used to represent traditional Chinese text).
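
To make the byte-level cause concrete, here is a minimal sketch in Python
(not PHP; the helper name chars_with_backslash_byte is made up purely for
illustration). It uses ASCII-only escapes rather than literal Japanese
characters, assumes Python's standard "shift_jis" and "euc_jp" codecs, and
reports which characters contain the byte 0x5C after the lead byte.

```python
def chars_with_backslash_byte(text, encoding):
    """Return the characters of `text` whose multi-byte form in `encoding`
    contains the byte 0x5C (ASCII backslash) after the lead byte."""
    hits = []
    for ch in text:
        encoded = ch.encode(encoding)
        # Only multi-byte characters matter; a real ASCII backslash is skipped.
        if len(encoded) > 1 and 0x5C in encoded[1:]:
            hits.append(ch)
    return hits


# Three sample characters written as escapes: U+8868, U+30BD, U+3042.
sample = "\u8868\u30bd\u3042"

sjis_hits = chars_with_backslash_byte(sample, "shift_jis")
eucjp_hits = chars_with_backslash_byte(sample, "euc_jp")

print("Shift_JIS:", [hex(ord(c)) for c in sjis_hits])
print("EUC-JP:   ", [hex(ord(c)) for c in eucjp_hits])
```

With this sample, the first two characters are flagged under Shift_JIS and
none under EUC-JP. A parser that treats 0x5C as an escape character can
therefore misread the Shift_JIS form of those characters inside a quoted
string literal.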

If you have never experienced such a "phenomenon", you have definitely been
lucky so far :-)

Unfortunately, I don't seem to be allowed to use Japanese characters on this
list, so I can't show you any examples with the actual characters in this
mail. I'll follow up with some if your mail client can display Japanese
mail.

> >> based on PHP 4.2.2 and PostgreSQL. Are there any major differences
> >> between 4.2.2 and 4.3.0 as far as Japanese is concerned?
> >
> > No significant changes have been made between these versions. All that
> > the mbstring developers did is bug fixing.
> 
> Again, this is where I get confused - or maybe I'm misunderstanding a
> vital element. The PHP documentation states that as of
> 4.3.0, --enable-mbstr-enc-trans has been eliminated. Under 4.2.2, I
> needed to use mb_convert_encoding($_POST['variable'], "SJIS") to gather
> variables submitted by a form. Now I don't need to.

Sorry for the confusion. I meant "no significant changes" from a technical
point of view. As for --enable-mbstr-enc-trans, this compile-time option was
removed for convenience and is now replaced by the
"mbstring.encoding_translation" runtime option. You can also use
mb_parse_str() in case it's turned off.

> Since Japanese is not my native language, it's not as easy for me to
> search for information in news groups and websites as it is in English.
> I intend to study the PDF files you recommended, but I see they were
> written before --enable-mbstr-enc-trans was eliminated, so any guidance
> on how this affects the handling of Japanese would be useful.

Hmm... English information about Japanese text handling with PHP is very
limited, since few developers who speak English fluently use Japanese or
other East Asian languages in their projects, and since I don't have much
time to add more explanation to the manual. I think all I can do for now is
fill this list's archive with (hopefully) detailed mails.

Moriyoshi

-- 
PHP Internationalization Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
