ID: 9365
User Update by: [EMAIL PROTECTED]
Status: Open
Bug Type: Scripting Engine problem
Description: Problem with multi-byte char code set (serious)

It seems to be related to returning by reference.

The registration is done in a function, and if there are errors the function returns 
an array containing the error messages. The array can be relatively large, so I 
returned it by reference. When I got rid of the reference, the script started working 
as it should.

i.e. 
function &register() <= script is executed again from the beginning.
function register() <= works as it should.
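
To make the two variants concrete, here is a minimal sketch in current PHP syntax. The 
function bodies and the name register_by_ref() are hypothetical stand-ins, not the 
code from the failing script. In current PHP versions arrays are copy-on-write, so 
returning a large array by value does not duplicate it unless it is modified, which 
is why the by-value form is normally sufficient.

```php
<?php
// Hypothetical stand-in for the registration routine described above.
// Returns an array of error messages (empty on success), by value.
function register(string $username): array
{
    $errors = [];
    if ($username === '') {
        $errors[] = 'Username is required.';
    }
    return $errors;
}

// The by-reference form, analogous to "function &register()" in the report.
function &register_by_ref(string $username): array
{
    $errors = register($username);
    return $errors; // a reference-returning function must return a variable
}

$byValue = register('');
$byRef   = &register_by_ref(''); // the caller also needs "&" to bind the reference
```

Both calls produce the same array contents; only the binding differs.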

Hope this info helps.


Previous Comments:
---------------------------------------------------------------------------

[2001-03-09 23:03:45] [EMAIL PROTECTED]
I tested with 

Newer Japanese Charset Handling module.
=> the same result.

Without these modules
=> the same result.

At first I had these Japanese char handling modules compiled into PHP. I then tried 
not compiling them into PHP and building them as individual *.so files instead. The 
same result.

If I use a plain ASCII HTML file for include(), it works, but not with an HTML file 
that contains EUC.
I'll upgrade my glibc to see if that fixes it. (Please wait for feedback.)

FYI:
Here is the code that causes this. I've tested with 
require()/include_once()/require_once(), with the same result.
Reminder - include()/require() works fine except on this file.

------------------
// Show the registration-complete HTML
//die('DIED BEFORE INCLUDE'); // Dies as it should
//header('Location: http://www/'); // Just for testing
// HTML file containing EUC. The script is executed again from the beginning!!
// Can't even die at the beginning of the included file.
//include('regist_finished.ihtml');
//include('test3.php'); // ASCII chars only. Works as expected.
// Another HTML file containing EUC. The script is executed again from the beginning!!
// Can't even die at the beginning of the included file.
include('cancel.ihtml');
die('DIED AFTER INCLUDE'); // Does NOT die, although it should.
--------------------



---------------------------------------------------------------------------

[2001-03-08 07:11:18] [EMAIL PROTECTED]
I thought I had posted some code, but it is not there....
Anyway, I found the line that causes the "goto"-like behaviour. It was the line that 
includes the HTML file shown to users.
(Note: I still have code that always does this. I tried to simplify it, but so far no 
luck: when I simplify it, it starts working as expected.... I will try again after I 
upgrade my glibc to see if that fixes the problem.)

I've tried to stop script execution as follows.

die('Die before include'); // Works as expected
include('some.html');
die('Die after include'); // This will never happen

Inside some.html
<?php
die('Die inside include'); // This will never happen
?>
at the beginning of the file.
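
A variation on the die() checks above, as a rough sketch only: instead of stopping 
the script, append a marker to a trace file at each point. If "script start" shows up 
twice for a single request, the script was restarted. The helper trace_marker(), the 
trace file location, and the generated stand-in for some.html are all assumptions 
made for illustration, and the code uses current PHP syntax rather than PHP 4.

```php
<?php
// Hypothetical helper: append the process ID and a label to a trace file.
function trace_marker(string $label, string $traceFile): void
{
    file_put_contents($traceFile, getmypid() . ' ' . $label . "\n", FILE_APPEND);
}

$traceFile = sys_get_temp_dir() . '/include_trace.log';   // assumed location
$included  = sys_get_temp_dir() . '/some_included.html';  // stand-in for some.html

// Start from a clean trace and create a small included file for the demo.
if (file_exists($traceFile)) {
    unlink($traceFile);
}
file_put_contents(
    $included,
    '<?php trace_marker(\'inside include\', $traceFile); ?>' . "\n<p>done</p>\n"
);

trace_marker('script start', $traceFile);
trace_marker('before include', $traceFile);
ob_start();          // keep the demo quiet; the HTML output is discarded
include $included;   // the included file shares the calling scope
ob_end_clean();
trace_marker('after include', $traceFile);

// One line per marker that was actually reached, in order.
$markers = file($traceFile, FILE_IGNORE_NEW_LINES);
```

With the real files substituted for the stand-in, missing "inside include" and 
"after include" lines would match the behaviour described in this report.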

My code has many include/require calls like this. The problem only happens in this 
script, not in others; other scripts work fine. All I can tell is that something 
happens when the file is included, although the same file can be included w/o problems 
from other scripts.

I'll post a follow-up, because I have not tested the script since I upgraded the 
Japanese character handling module. The problem might be gone. (I hope.)

---------------------------------------------------------------------------

[2001-03-08 05:41:04] [EMAIL PROTECTED]
Could you please provide a short piece of code demonstrating the problem?

---------------------------------------------------------------------------

[2001-02-21 04:07:58] [EMAIL PROTECTED]
PHP4.0.4pl1 possibly has code that is unsafe for 8 bit char code sets. If that is the 
case, any user who uses character codes from 128 to 255 may experience 
strange/unexpected PHP behavior. (Another possibility is bugs in glibc....)

NOTE: It is very difficult to determine under what conditions the program goes wrong. 
When the conditions are met, PHP ALWAYS behaves as described below. (I have not 
figured out the exact conditions yet, i.e. what combination/location of multi-byte 
characters causes this behavior.) In most cases, I don't have this kind of problem at 
all. Therefore, I can't reproduce this problem with a simple script, so I am not 
including one here.

Anyway, it seems PHP4.0.4pl1 does this:

PHP4 behavior: Script is executed TWICE and the included file is not processed
1) PHP parses the script and starts executing it.
   - My script checks the username in the db. If the same username exists, it returns 
an error. If not, it inserts the new username into the db.
2) PHP calls the function that registers the new user.
3) PHP executes the code in the function that inserts the data into the db, if the 
user can be added. PHP possibly encounters 8 bit unclean code somewhere near include() 
and RESTARTS script execution from the beginning.
   - The script is written to include() an HTML file on successful user registration.

PHP inserts the new username into the db during the 1st execution, then finds the same 
username in the db and returns an error during the 2nd execution. 
If I put die('died here') BEFORE include(), PHP stops execution and outputs 'died 
here', but not if I put it AFTER include(). PHP does not stop execution inside the 
included file, either. 
I was using 'ob_gzhandler'; disabling it does not make any difference.

This happened when the user registration check/insert was done in a function defined 
in another file that is included at the top of the script. 
PHP does not log any errors when this happens. (E_ALL)

PHP4 behavior: Script does not process the included file and outputs the default HTML, 
as if I had not printed any output.
(This is a rewrite of the code I explained above.)
1) PHP parses the script and starts executing it.
 - This script does not use function calls, in contrast to the previous one.
2) PHP possibly encounters 8 bit unclean code somewhere near include(), outputs the 
default HTML for null output, and stops execution.

Therefore, I can see the output from die('died here') if I put it BEFORE include(), 
but not AFTER include(). If I put die('died here') inside the included file, PHP does 
not die there either. 

This happened when the user registration check/insert was done in the script itself, 
w/o using functions, i.e. I'm not using functions defined in an included file. The 
script logic is identical to the first one except that it does not use any functions. 
PHP does not log any errors. (E_ALL)

When I tested with a PLAIN ASCII HTML file as the included file, PHP WORKS as 
expected, i.e. it shows the HTML file and dies/exits from the script. 
(before/inside/after include())

I use EUC (Extended Unix Code), EUC-JP to be specific,  for char code, which is 
supposed to work well with 8 bit char code clean programs.
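
To check which included files actually contain 8 bit characters, something like the 
following sketch could be used. Both helper names are hypothetical, and it only 
reports whether any byte in the range 0x80-0xFF is present; it does not validate 
EUC-JP.

```php
<?php
// Hypothetical helper: true if the data contains any byte in the range 0x80-0xFF.
function has_high_bytes(string $data): bool
{
    // Without the /u modifier, PCRE matches byte by byte.
    return preg_match('/[\x80-\xFF]/', $data) === 1;
}

// Hypothetical helper: return the paths whose contents include 8 bit bytes.
function files_with_high_bytes(array $paths): array
{
    $hits = [];
    foreach ($paths as $path) {
        $data = file_get_contents($path);
        if ($data !== false && has_high_bytes($data)) {
            $hits[] = $path;
        }
    }
    return $hits;
}
```

For example, has_high_bytes("\xA4\xA2") is true (0xA4 0xA2 is one two-byte EUC-JP 
character), while has_high_bytes('plain ASCII') is false.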

[Environment]
OS: RedHat Linux7.0.1/ja(i386) FTP version (no glibc update)
Apache: Apache 1.3.17 w/ mod_ssl-2.8.0, mod_gzip-1.13.17a. Built from source 
PHP: PHP4.0.4pl1 w/ pgsql-7.0.3, gd-1.8.3, mhash, mcrypt and others. Built from 
source. (no debug option)
 - EUC-JP for all html, php scripts
PHP Configure:
'./configure' '--with-apxs' '--disable-short-tags' '--enable-bcmath' '--with-zlib-dir' 
'--enable-ftp' '--with-imap' '--with-mhash' '--with-mcrypt' '--with-pgsql' 
'--with-swf' '--enable-sysvsem' '--enable-sysvshm' '--with-zlib' '--enable-iconv' 
'--with-kakasi' '--enable-jstring' '--enable-mbregex' '--with-namazu' 
'--with-gd=../gd-1.8.3/' '--with-jpeg-dir=/usr' '--with-xpm-dir=/usr/X11R6'

I cannot think of any reasonable explanation for this strange PHP4 behavior other than 
the possibility that glibc has bugs (8 bit char unsafe code, etc.). I haven't 
researched my exact glibc version or its bugs yet. So far I don't have any problems 
with anything other than PHP4.

PS: I don't use EUC for var/function names, of course. I only use EUC in HTML or var 
contents. 
I really want this problem to be fixed. If you need to contact me, please do so. I'll 
do the best I can.

Regards,
--
Yasuo Ohgaki

---------------------------------------------------------------------------

The remainder of the comments for this report are too long.  To view the rest of the 
comments, please view the bug report online.

Full Bug description available at: http://bugs.php.net/?id=9365

