ID: 17154
Updated by: [EMAIL PROTECTED]
Reported By: k dot joe at freemail dot hu
-Status: Verified
+Status: Bogus
Bug Type: Recode related
Operating System: Linux2.2.19/Debian
PHP Version: 4.3.3-dev
New Comment:
I had a look at this, and the PHP side looks correct. For some reason
the recode library returns a string that is too long, with random
characters appended. This is not a bug in PHP: the extension does
everything the way the recode documentation says it should be done.
I used recode 3.6 for my tests, and it definitely does not behave as
documented.
Previous Comments:
------------------------------------------------------------------------
[2002-09-18 17:42:45] luka at mail dot ljudmila dot org
this bug is for real!
just stumbled into it while writing a mail script.
recode _does_ stubbornly add somewhat random trailing garbage to
strings on my system. i made a test script to figure it out, so i might
as well post it here.
my php is 4.2.3, system is Debian. i also got some segfaults from my
mail script, but this was rare and might or might not be connected to
the trailing garbage bug
sample output first (wrong, clearly):
SNIP>
bash-2.05b$ php4 recodetest.php
X-Powered-By: PHP/4.2.3
Content-type: text/html
testing recode request ISO-8859-1..UTF-8
INPUT: "Some Hacker <[EMAIL PROTECTED]>"
OUTPUT:
"Some Hacker <[EMAIL PROTECTED]>"
"Some Hacker <[EMAIL PROTECTED]>"
"Some Hacker <[EMAIL PROTECTED]>&"
"Some Hacker <[EMAIL PROTECTED]>"
"Some Hacker <[EMAIL PROTECTED]>"
"Some Hacker <[EMAIL PROTECTED]>"
"Some Hacker <[EMAIL PROTECTED]>@"
"Some Hacker <[EMAIL PROTECTED]"
"Some Hacker <[EMAIL PROTECTED]>0u"
INPUT: "Some Hacker <[EMAIL PROTECTED] "
OUTPUT:
"Some Hacker <[EMAIL PROTECTED] 0u"
INPUT: "Some Hacker [EMAIL PROTECTED]>"
OUTPUT:
"Some Hacker [EMAIL PROTECTED]>0u"
INPUT: "Some Hacker <[EMAIL PROTECTED]>"
OUTPUT:
"Some Hacker <[EMAIL PROTECTED]>u"
INPUT: "Some Hacker <[EMAIL PROTECTED] "
OUTPUT:
"Some Hacker <[EMAIL PROTECTED] u"
INPUT: "Some Hacker [EMAIL PROTECTED]>"
OUTPUT:
"Some Hacker [EMAIL PROTECTED]>u"
INPUT: "Some Hacker <[EMAIL PROTECTED]> "
OUTPUT:
"Some Hacker <[EMAIL PROTECTED]> "
INPUT: "Some Hacker <[EMAIL PROTECTED] "
OUTPUT:
"Some Hacker <[EMAIL PROTECTED] "
INPUT: "Some Hacker [EMAIL PROTECTED]> "
OUTPUT:
"Some Hacker [EMAIL PROTECTED]> "
INPUT: "� B "
OUTPUT:
"�� B "
INPUT: "MAKE MONEY REALLY REALLY REALLY FAST"
OUTPUT:
"MAKE MONEY REALLY REALLY REALLY FASTY"
"MAKE MONEY REALLY REALLY REALLY FAST"
Tried 200 loops on 11 test(s).
<SNIP
and the code, so you can try too!
<?php
#try different encodings
$from='ISO-8859-1';
#$from='ascii';
$to='UTF-8';
#$to='HTML';
#$to='flat';
echo "testing recode request $from..$to\n";
$tests=array
(
'Some Hacker <[EMAIL PROTECTED]>',
'Some Hacker <[EMAIL PROTECTED] ',
'Some Hacker [EMAIL PROTECTED]>',
'Some Hacker <[EMAIL PROTECTED]>',
'Some Hacker <[EMAIL PROTECTED] ',
'Some Hacker [EMAIL PROTECTED]>',
'Some Hacker <[EMAIL PROTECTED]> ',
'Some Hacker <[EMAIL PROTECTED] ',
'Some Hacker [EMAIL PROTECTED]> ',
"\xA0 \x10 \x42 \x00",
'MAKE MONEY REALLY REALLY REALLY FAST',
);
$tries=200;
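foreach ($tests as $t) {
    print "\nINPUT: \"$t\"\nOUTPUT:\n";
    // Reset per input so that the first result of every test is printed
    // and $old is never read while undefined.
    $old = null;
    for ($i = 0; $i < $tries; $i++) {
        $output = recode("$from..$to", $t);
        // Print a result only when it differs from the previous one;
        // strict comparison avoids loose type juggling.
        if ($output !== $old) {
            print "\"$output\"\n";
            $old = $output;
        }
    }
}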
echo "\n\nTried $tries loops on ".sizeof($tests)." test(s).\n";
?>
hopefully this will give someone a chance to test on latest sources, or
at least a clue about the cause of the bug
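One way to make the check in the test script mechanical is to compare each result against an independent conversion and report only the surplus bytes. This is a minimal sketch, assuming iconv() is available as the reference converter for the ISO-8859-1..UTF-8 request; trailing_garbage() is a hypothetical helper name, not part of PHP or recode.

```php
<?php
// Hypothetical helper: returns the surplus bytes when $actual is $expected
// plus something extra, '' when both match, and null when $actual differs
// in some other way (for example, when it was truncated).
function trailing_garbage(string $expected, string $actual): ?string
{
    if ($actual === $expected) {
        return '';
    }
    $n = strlen($expected);
    if (strlen($actual) > $n && substr($actual, 0, $n) === $expected) {
        return substr($actual, $n);
    }
    return null;
}

// Example using a value taken from the sample output above.
$input    = 'MAKE MONEY REALLY REALLY REALLY FAST';
$expected = iconv('ISO-8859-1', 'UTF-8', $input); // reference conversion
$garbage  = trailing_garbage($expected, $input . 'Y');
echo bin2hex($garbage), "\n"; // hex dump, so NUL bytes would be visible too
```

In the test loop, the second argument would be the value returned by recode() instead of the hard-coded string.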
------------------------------------------------------------------------
[2002-06-24 12:10:28] [EMAIL PROTECTED]
What I said was not exactly true:
4.3.0-dev does not always segfault (it mostly does with a string length
of 96...), and otherwise it seems to behave like 4.2.
chregu
------------------------------------------------------------------------
[2002-06-24 12:08:07] [EMAIL PROTECTED]
Same problem here, with the same string lengths causing errors.
recode on the command line handles these strings correctly.
PHP 4.2 added trailing garbage.
PHP 4.3-dev segfaults.
chregu
------------------------------------------------------------------------
[2002-06-06 18:03:32] k dot joe at freemail dot hu
Tests with PHP 4.3.0-dev and PHP 4.2.1 give the same (wrong) result: the
length of the recoded string differs from the length of the original.
For example, recoding a 36-character string returns a 40-byte string,
so the return value carries 4 extra bytes of 0x00 garbage at the end:
recode ("utf8..latin2", "0123456789012345678901234567890123456");
The error is reproducible at several string lengths: 36-39, 96-99,
186-189, 321-324, 523-526, 826-829, 1281-1284, 1963-1966, 2986-2989,
4521-4524, 6823-6826, and so on...:)
(Operations on the result string cause random crashes.)
Please try the examples above and report whether it works correctly.
(Sorry if the previous description was confusing ;)
Thx
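The lengths listed above look regular: each range ends one byte short of a value in the sequence 40, 100, 190, 325, ..., where every value is the previous one times 1.5 (rounded down) plus 40. That would be consistent with an output buffer that grows in such steps, but this is only an inference from the reported numbers and has not been checked against the recode sources. The sketch below regenerates the reported ranges under that assumption; affected_ranges() is a hypothetical helper.

```php
<?php
// Hypothetical helper: regenerate the reported length ranges, assuming they
// cover the last 4 bytes below sizes that grow as n -> floor(1.5 * n) + 40.
function affected_ranges(int $count): array
{
    $ranges = [];
    $size = 40; // first size, inferred from the reported 36-39 range
    for ($i = 0; $i < $count; $i++) {
        $ranges[] = [$size - 4, $size - 1];
        $size = intdiv($size * 3, 2) + 40;
    }
    return $ranges;
}

foreach (affected_ranges(11) as [$lo, $hi]) {
    echo "$lo-$hi\n";
}
```

With a count of 11 this prints the eleven ranges from 36-39 through 6823-6826, matching the list in the report.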
------------------------------------------------------------------------
[2002-06-04 04:25:07] [EMAIL PROTECTED]
Thank you for taking the time to report a problem with PHP.
Unfortunately your version of PHP is too old -- the problem
might already be fixed. Please download a new PHP
version from http://www.php.net/downloads.php
If you are able to reproduce the bug with one of the latest
versions of PHP, please change the PHP version on this bug report
to the version you tested and change the status back to "Open".
Again, thank you for your continued support of PHP.
------------------------------------------------------------------------
The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
http://bugs.php.net/17154