ID: 17154
Comment by: [EMAIL PROTECTED]
Reported By: [EMAIL PROTECTED]
Status: Open
Bug Type: Recode related
Operating System: Linux2.2.19/Debian
PHP Version: 4.3.0-dev
New Comment:
this bug is for real!
just stumbled into it while writing a mail script.
recode _does_ stubbornly add somewhat random trailing garbage to
strings on my system. i made a test script to figure it out, so i might
as well post it here.
my php is 4.2.3, system is Debian. i also got some segfaults from my
mail script, but this was rare and might or might not be connected to
the trailing garbage bug
sample output first (wrong, clearly):
SNIP>
bash-2.05b$ php4 recodetest.php
X-Powered-By: PHP/4.2.3
Content-type: text/html
testing recode request ISO-8859-1..UTF-8
INPUT: "Some Hacker <[EMAIL PROTECTED]>"
OUTPUT:
"Some Hacker <[EMAIL PROTECTED]>"
"Some Hacker <[EMAIL PROTECTED]>"
"Some Hacker <[EMAIL PROTECTED]>&"
"Some Hacker <[EMAIL PROTECTED]>"
"Some Hacker <[EMAIL PROTECTED]>"
"Some Hacker <[EMAIL PROTECTED]>"
"Some Hacker <[EMAIL PROTECTED]>@"
"Some Hacker <[EMAIL PROTECTED]"
"Some Hacker <[EMAIL PROTECTED]>0u"
INPUT: "Some Hacker <[EMAIL PROTECTED] "
OUTPUT:
"Some Hacker <[EMAIL PROTECTED] 0u"
INPUT: "Some Hacker [EMAIL PROTECTED]>"
OUTPUT:
"Some Hacker [EMAIL PROTECTED]>0u"
INPUT: "Some Hacker <[EMAIL PROTECTED]>"
OUTPUT:
"Some Hacker <[EMAIL PROTECTED]>u"
INPUT: "Some Hacker <[EMAIL PROTECTED] "
OUTPUT:
"Some Hacker <[EMAIL PROTECTED] u"
INPUT: "Some Hacker [EMAIL PROTECTED]>"
OUTPUT:
"Some Hacker [EMAIL PROTECTED]>u"
INPUT: "Some Hacker <[EMAIL PROTECTED]> "
OUTPUT:
"Some Hacker <[EMAIL PROTECTED]> "
INPUT: "Some Hacker <[EMAIL PROTECTED] "
OUTPUT:
"Some Hacker <[EMAIL PROTECTED] "
INPUT: "Some Hacker [EMAIL PROTECTED]> "
OUTPUT:
"Some Hacker [EMAIL PROTECTED]> "
INPUT: "� B "
OUTPUT:
"�� B "
INPUT: "MAKE MONEY REALLY REALLY REALLY FAST"
OUTPUT:
"MAKE MONEY REALLY REALLY REALLY FASTY"
"MAKE MONEY REALLY REALLY REALLY FAST"
Tried 200 loops on 11 test(s).
<SNIP
and the code, so you can try too!
<?php
#try different encodings
$from='ISO-8859-1';
#$from='ascii';
$to='UTF-8';
#$to='HTML';
#$to='flat';
echo "testing recode request $from..$to\n";
$tests=array
(
'Some Hacker <[EMAIL PROTECTED]>',
'Some Hacker <[EMAIL PROTECTED] ',
'Some Hacker [EMAIL PROTECTED]>',
'Some Hacker <[EMAIL PROTECTED]>',
'Some Hacker <[EMAIL PROTECTED] ',
'Some Hacker [EMAIL PROTECTED]>',
'Some Hacker <[EMAIL PROTECTED]> ',
'Some Hacker <[EMAIL PROTECTED] ',
'Some Hacker [EMAIL PROTECTED]> ',
"\xA0 \x10 \x42 \x00",
'MAKE MONEY REALLY REALLY REALLY FAST',
);
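$tries=200;
# last output printed; set up front so the first comparison
# does not read an undefined variable
$old=null;
foreach ($tests as $t) {
    print "\nINPUT: \"$t\"\nOUTPUT:\n";
    for ($i=0;$i<$tries;$i++) {
        $output=recode("$from..$to",$t);
        # print a result only when it differs from the previous one
        if ($output!=$old) {
            print "\"$output\"\n";
            $old=$output;
        }
    }
}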
echo "\n\nTried $tries loops on ".sizeof($tests)." test(s).\n";
?>
hopefully this will give someone a chance to test on latest sources, or
at least a clue about the cause of the bug
Previous Comments:
------------------------------------------------------------------------
[2002-06-24 12:10:28] [EMAIL PROTECTED]
not exactly true what i said:
4.3.0-dev does not always segfault (mostly with a string-length of
96...) and it seems to behave like 4.2
chregu
------------------------------------------------------------------------
[2002-06-24 12:08:07] [EMAIL PROTECTED]
Same problem here: the same string lengths cause errors.
recode on the commandline does it perfectly right.
php 4.2 did add trailing garbage
php 4.3-dev segfaults
chregu
------------------------------------------------------------------------
[2002-06-06 18:03:32] [EMAIL PROTECTED]
Tests with PHP 4.3.0-dev and PHP 4.2.1 give the same (wrong) result: the
recoded string's length and the original string's length are not equal.
Recoding a 36-character string, for instance, returns a 40-byte
string, so the return value contains 4 additional bytes of 0x00 garbage
at the end. The string below is 37 characters long and is affected too:
recode ("utf8..latin2", "0123456789012345678901234567890123456");
The error is reproducible at several string lengths: 36-39, 96-99,
186-189, 321-324, 523-526, 826-829, 1281-1284, 1963-1966, 2986-2989,
4521-4524, 6823-6826, and so on... :)
(Operating on the result string causes random crashes.)
Please try the examples above and report whether it works correctly.
(Sorry if the previous description was confusing ;)
Thx
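A minimal sketch for re-checking the lengths listed above on another build. The helper names find_length_mismatches() and recode_utf8_to_latin2() are made up for this example and are not part of PHP or the recode extension; the only real recode call used is recode_string(), and the script assumes nothing about the cause of the bug. It reports every tested length at which the converted string is not the same length as the ASCII-only input:

```php
<?php
# Hypothetical helper: returns array(input length => output length) for
# every length at which $convert changes the size of an ASCII-only string.
function find_length_mismatches($convert, $lengths)
{
    $bad = array();
    foreach ($lengths as $len) {
        $in  = str_repeat('a', $len);
        $out = call_user_func($convert, $in);
        # a non-string result (e.g. false on failure) is recorded as -1
        $outLen = is_string($out) ? strlen($out) : -1;
        if ($outLen !== strlen($in)) {
            $bad[$len] = $outLen;
        }
    }
    return $bad;
}

# Hypothetical wrapper around the real recode_string() call from the report.
function recode_utf8_to_latin2($s)
{
    return recode_string('utf8..latin2', $s);
}

# first length of each affected range quoted in the report
$lengths = array(36, 96, 186, 321, 523, 826, 1281, 1963, 2986, 4521, 6823);

if (function_exists('recode_string')) {
    $bad = find_length_mismatches('recode_utf8_to_latin2', $lengths);
    foreach ($bad as $len => $outLen) {
        echo "length $len came back as $outLen\n";
    }
    echo count($bad) . " of " . count($lengths) . " lengths mismatched\n";
} else {
    echo "recode_string() is not available in this build\n";
}
?>
```

Passing the converter as a function name keeps the loop independent of recode itself, so the same helper can be pointed at any other string function for comparison.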
------------------------------------------------------------------------
[2002-06-04 04:25:07] [EMAIL PROTECTED]
Thank you for taking the time to report a problem with PHP.
Unfortunately your version of PHP is too old -- the problem
might already be fixed. Please download a new PHP
version from http://www.php.net/downloads.php
If you are able to reproduce the bug with one of the latest
versions of PHP, please change the PHP version on this bug report
to the version you tested and change the status back to "Open".
Again, thank you for your continued support of PHP.
------------------------------------------------------------------------
[2002-05-11 10:33:22] [EMAIL PROTECTED]
The recode function somehow miscalculates the length of the result string,
which causes (mostly) random segfaults. In this example, the FOR loop
stops at a different iteration count depending on the running mode
(apache module, cgi from shell, cgi from gdb) and on the operations
performed on the string before calling recode.
The recoded output written to the file is odd: at several places the
two strings' lengths are not equal (it looks like a buffer overflow problem).
PHP versions 4.0.6-4.1.2 (with recode 3.6) are all affected
(commandline recode works well).
<?php
// log every length at which the recoded string differs in size
$fp = fopen("ideni","w");
for ($i = 0; $i < 10240; $i++)
{
    echo "$i\n";
    $str = str_repeat("a",$i);
    // ASCII-only input, so utf8..latin2 should leave the length unchanged
    if (strlen($str) !=
        strlen(recode("utf8..latin2",$str)))
    {
        $fstr = "\n$i: $str";
        $rstr = "\n$i: " . recode("utf8..latin2",$str);
        fwrite($fp,$fstr);
        fwrite($fp,$rstr);
    }
}
fclose($fp);
?>
This backtrace made from cgi/gdb:
#0 0x4024ed28 in free () from /lib/libc.so.6
#1 0x4024ea0a in malloc () from /lib/libc.so.6
#2 0x4024e1e4 in malloc () from /lib/libc.so.6
#3 0x080f5a8f in _emalloc (size=6828, __zend_filename=0x81309c2
"recode.c", __zend_lineno=142, __zend_orig_filename=0x0,
__zend_orig_lineno=0) at zend_alloc.c:165
#4 0x080f61ed in _estrndup (s=0x81d64a8 'a' <repeats 200 times>...,
length=6827, __zend_filename=0x81309c2 "recode.c", __zend_lineno=142,
__zend_orig_filename=0x0, __zend_orig_lineno=0) at
zend_alloc.c:356
#5 0x0807d88a in zif_recode_string (ht=2, return_value=0x81d2384,
this_ptr=0x0, return_value_used=1) at recode.c:142
#6 0x0812594a in execute (op_array=0x81cddbc) at
./zend_execute.c:1590
#7 0x08107309 in zend_execute_scripts (type=8, retval=0x0,
file_count=3) at zend.c:814
#8 0x0805f411 in php_execute_script (primary_file=0xbffffd04) at
main.c:1307
#9 0x0805cc8c in main (argc=3, argv=0xbffffd94) at cgi_main.c:738
#10 0x401f96cf in __libc_start_main () from /lib/libc.so.6
(gdb) frame 6
#6 0x0812594a in execute (op_array=0x81cddbc) at
./zend_execute.c:1590
1590
((zend_internal_function *)
function_state.function)->handler(opline->extended_value,
Ts[opline->result.u.var].var.ptr, object.ptr, return_value_used
TSRMLS_CC);
Good luck!
------------------------------------------------------------------------