ID: 9365
User Update by: [EMAIL PROTECTED]
Status: Open
Bug Type: Scripting Engine problem
Description: Problem with multi-byte char code set (serious)
I tested with
Newer Japanese Charset Handling module.
=> the same result.
Without these modules
=> the same result.
I first compiled these Japanese character handling modules into PHP. I then left them
out of PHP and built them as individual *.so files instead. The same result.
If I use a plain ASCII HTML file for include(), it works, but not with an HTML file
that contains EUC.
I'll upgrade my glibc to see if that fixes it. (Please wait for feedback.)
FYI:
Code that causes this. I've tested with require()/include_once()/require_once(), the
same result.
Reminder - Include()/require() works fine except on this file.
------------------
// Show "registration complete" HTML
//die('DIED BEFORE INCLUDE'); // Dies as it should
//header('Location: http://www/'); // Just for testing
// HTML file contains EUC. Script is executed again from the beginning!!
// Can't even die at the beginning of the file.
//include('regist_finished.ihtml');
//include('test3.php'); // ASCII chars only. Works as expected.
// Another HTML file that contains EUC. Script is executed again from the
// beginning!! Can't even die at the beginning of the file.
include('cancel.ihtml');
die('DIED AFTER INCLUDE'); // Does NOT die, though it should.
--------------------
Previous Comments:
---------------------------------------------------------------------------
[2001-03-08 07:11:18] [EMAIL PROTECTED]
I thought I had included some code, but there is none....
Anyway, I found the line that causes the "goto"-like behaviour. It was the line that
includes the HTML file to show to users.
(Note: I still have code that always does that. I tried to make it simpler, so far with
no luck: if I make it simple, it starts working as expected.... I will try again after I
upgrade my glibc to see if that fixes the problem.)
I've tried to stop script execution as follows.
die('Die before include'); // Works as expected
include('some.html');
die('Die after include'); // This will never happen
Inside some.html
<?php
die('Die inside include'); // This will never happen
?>
at the beginning of the file.
My code has many include/require calls like this. This only happens in this one script,
but not in others. Other scripts work fine. All I can tell is that something happens when
the file is included, although the file can be included w/o problems from other scripts.
I'll post a follow-up, because I haven't tested the script since I upgraded the Japanese
character handling module. It might be gone. (I hope)
---------------------------------------------------------------------------
[2001-03-08 05:41:04] [EMAIL PROTECTED]
Could you please provide a short piece of code demonstrating the problem?
---------------------------------------------------------------------------
[2001-02-21 04:07:58] [EMAIL PROTECTED]
PHP4.0.4pl1 possibly has code that is unsafe for 8-bit char codesets. If that is the case,
any user who uses character codes from 128 to 255 may experience strange/unexpected PHP
behavior. (Another possibility is a bug in glibc....)
NOTE: It is very difficult to determine under what conditions the program goes wrong.
When the conditions are met, PHP ALWAYS behaves as described below. (I haven't figured out
the exact conditions yet, i.e. what combination/location of multi-byte characters causes
this behavior.) In most cases, I don't have this kind of problem at all. Therefore, I can't
reproduce this problem with a simple script, so I haven't put one here.
Anyway, it seems PHP4.0.4pl1 does this:
PHP4 behavior: Script is executed TWICE and the included file is not processed
1) PHP parses the script and starts executing.
- My script checks the username in the db. If the same username already exists, it
returns an error. If not, it inserts the new username into the db.
2) PHP calls the function that registers the new user.
3) PHP executes the code in the function that inserts data into the db. If the user can
be added, PHP possibly encounters 8-bit-unclean code somewhere near include() and
RESTARTS script execution from the beginning.
- The script is written to include() an HTML file on successful user registration.
PHP inserts the new username into the db on the 1st execution, then finds the same
username in the db and returns an error on the 2nd execution.
If I put die('died here') BEFORE include(), PHP stops execution and outputs 'died
here', but not if I put it AFTER include(). PHP does not stop execution inside the
included file either.
I was using 'ob_gzhandler', disabling it does not make any difference.
This happened when the user registration check/insert was done in a function defined in
another included file, which is included at the top of the script.
PHP does not log any errors when this happens. (E_ALL)
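Since PHP itself logs nothing here, one way to make the restart visible is to append a
checkpoint line to a file at a few points in the script. The following is only a minimal
sketch under assumptions: trace_checkpoint() and the log path are hypothetical names, not
part of the original script or of PHP, and it has not been run against the failing
setup.

```php
<?php
// Hypothetical helper (not part of PHP): append one line per checkpoint so that
// a restarted script shows up as a repeated "script start" entry in the log.
function trace_checkpoint($label, $logFile)
{
    // The process id helps tell a restart inside one request from a second request.
    $line = sprintf("%s pid=%d %s\n", date('Y-m-d H:i:s'), getmypid(), $label);

    $fp = fopen($logFile, 'a'); // append mode keeps earlier checkpoints
    if ($fp === false) {
        return false;
    }
    fwrite($fp, $line);
    fclose($fp);
    return true;
}

// Intended use in the failing script (the path is an assumption):
//   trace_checkpoint('script start', '/tmp/bug9365.log');
//   trace_checkpoint('before include', '/tmp/bug9365.log');
//   include('cancel.ihtml');
//   trace_checkpoint('after include', '/tmp/bug9365.log');
```

If 'script start' appears twice for a single request and 'after include' never appears,
that would match the behavior described above.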
PHP4 behavior: Script does not process the included file and outputs default HTML as if
I had not printed any output.
(This is a rewrite of the code I explained above.)
1) PHP parses the script and starts executing.
- This script does not use function calls, in contrast to the previous one.
2) PHP possibly encounters 8-bit-unclean code somewhere near include(), outputs the
default HTML for null output, and stops execution.
Therefore, I can see the output from die('died here') if I put it BEFORE include(), but
not AFTER include(). If I put die('died here') inside the included file, PHP does not die
either.
This happened when the user registration check/insert was done in the script itself w/o
using functions, i.e. I'm not using functions defined in an included file. The script
logic is identical to the first one except that it does not use any functions.
PHP does not log any errors. (E_ALL)
When I tested with a PLAIN ASCII HTML file as the included file, PHP WORKS as expected,
i.e. it shows the HTML file, and die/exit work in the script (before/inside/after
include()).
I use EUC (Extended Unix Code), EUC-JP to be specific, as the char code, which is
supposed to work well with 8-bit-clean programs.
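To narrow down which bytes in an included file matter, it may help to list where the
8-bit bytes (128 to 255) occur, since EUC-JP encodes its multi-byte characters with
bytes that have the high bit set. This is only a sketch: find_high_bytes() is a
hypothetical helper, and it uses file_get_contents(), which is not available in
PHP4.0.4pl1, so it would have to be run with a newer PHP.

```php
<?php
// Hypothetical helper (not part of PHP): return the byte offsets of all bytes
// >= 0x80 in a file, or false if the file cannot be read.
function find_high_bytes($path)
{
    $data = file_get_contents($path);
    if ($data === false) {
        return false;
    }

    $offsets = array();
    $length = strlen($data); // strlen() counts bytes, not characters
    for ($i = 0; $i < $length; $i++) {
        if (ord($data[$i]) >= 0x80) {
            $offsets[] = $i;
        }
    }
    return $offsets;
}
```

Comparing the offsets for a file that fails (e.g. cancel.ihtml) with one that works
could show whether the problem depends on where the 8-bit bytes sit in the file.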
[Environment]
OS: RedHat Linux7.0.1/ja(i386) FTP version (no glibc update)
Apache: Apache 1.3.17 w/ mod_ssl-2.8.0, mod_gzip-1.13.17a. built from source
PHP: PHP4.0.4pl1 w/ pgsql-7.0.3, gd-1.8.3, mhash, mcrypt and others. built from
source. (no debug option)
- EUC-JP for all html, php scripts
PHP Configure:
'./configure' '--with-apxs' '--disable-short-tags' '--enable-bcmath' '--with-zlib-dir'
'--enable-ftp' '--with-imap' '--with-mhash' '--with-mcrypt' '--with-pgsql'
'--with-swf' '--enable-sysvsem' '--enable-sysvshm' '--with-zlib' '--enable-iconv'
'--with-kakasi' '--enable-jstring' '--enable-mbregex' '--with-namazu'
'--with-gd=../gd-1.8.3/' '--with-jpeg-dir=/usr' '--with-xpm-dir=/usr/X11R6'
I cannot think of any reasonable explanation for this strange PHP4 behavior other than
the possibility that glibc has bugs (8-bit-unsafe code, etc.). I haven't researched my
exact glibc version or its bugs yet; so far I don't have any problems with anything other
than PHP4.
PS: I don't use EUC for var/function names, of course. I only use EUC in HTML or var
contents.
I really want this problem to be fixed. If you need to contact me, please do so. I'll
do the best I can.
Regards,
--
Yasuo Ohgaki
---------------------------------------------------------------------------
Full Bug description available at: http://bugs.php.net/?id=9365
--
PHP Development Mailing List <http://www.php.net/>