ID:               17154
 Updated by:       [EMAIL PROTECTED]
 Reported By:      [EMAIL PROTECTED]
-Status:           Open
+Status:           Verified
 Bug Type:         Recode related
 Operating System: Linux2.2.19/Debian
 PHP Version:      4.3.0-dev


Previous Comments:
------------------------------------------------------------------------

[2002-09-18 17:42:45] [EMAIL PROTECTED]

this bug is for real!
just stumbled into it while writing a mail script.
recode _does_ stubbornly add somewhat random trailing garbage to
strings on my system. i made a test script to figure it out, so i might
as well post it here. 

my php is 4.2.3, system is Debian. i also got some segfaults from my
mail script, but this was rare and might or might not be connected to
the trailing garbage bug



sample output first (wrong, clearly):

SNIP>
bash-2.05b$ php4 recodetest.php
X-Powered-By: PHP/4.2.3
Content-type: text/html

testing recode request ISO-8859-1..UTF-8

INPUT: "Some Hacker <[EMAIL PROTECTED]>"
OUTPUT:
"Some Hacker <[EMAIL PROTECTED]>"
"Some Hacker <[EMAIL PROTECTED]>"
"Some Hacker <[EMAIL PROTECTED]>&"
"Some Hacker <[EMAIL PROTECTED]>"
"Some Hacker <[EMAIL PROTECTED]>"
"Some Hacker <[EMAIL PROTECTED]>"
"Some Hacker <[EMAIL PROTECTED]>@"
"Some Hacker <[EMAIL PROTECTED]"
"Some Hacker <[EMAIL PROTECTED]>0u"

INPUT: "Some Hacker <[EMAIL PROTECTED] "
OUTPUT:
"Some Hacker <[EMAIL PROTECTED] 0u"

INPUT: "Some Hacker  [EMAIL PROTECTED]>"
OUTPUT:
"Some Hacker  [EMAIL PROTECTED]>0u"

INPUT: "Some Hacker  <[EMAIL PROTECTED]>"
OUTPUT:
"Some Hacker  <[EMAIL PROTECTED]>u"

INPUT: "Some Hacker  <[EMAIL PROTECTED] "
OUTPUT:
"Some Hacker  <[EMAIL PROTECTED] u"

INPUT: "Some Hacker   [EMAIL PROTECTED]>"
OUTPUT:
"Some Hacker   [EMAIL PROTECTED]>u"

INPUT: "Some Hacker <[EMAIL PROTECTED]>  "
OUTPUT:
"Some Hacker <[EMAIL PROTECTED]>  "

INPUT: "Some Hacker <[EMAIL PROTECTED]   "
OUTPUT:
"Some Hacker <[EMAIL PROTECTED]   "

INPUT: "Some Hacker  [EMAIL PROTECTED]>  "
OUTPUT:
"Some Hacker  [EMAIL PROTECTED]>  "

INPUT: "&#65533;  B "
OUTPUT:
"&#65533;&#65533;  B "

INPUT: "MAKE MONEY REALLY REALLY REALLY FAST"
OUTPUT:
"MAKE MONEY REALLY REALLY REALLY FASTY"
"MAKE MONEY REALLY REALLY REALLY FAST"


Tried 200 loops on 11 test(s).

<SNIP

and the code, so you can try too!

<?php

#try different encodings


$from='ISO-8859-1';
#$from='ascii';

$to='UTF-8'; 
#$to='HTML';
#$to='flat';

echo "testing recode request $from..$to\n";

$tests=array
(
 'Some Hacker <[EMAIL PROTECTED]>',
 'Some Hacker <[EMAIL PROTECTED] ',
 'Some Hacker  [EMAIL PROTECTED]>',

 'Some Hacker  <[EMAIL PROTECTED]>',
 'Some Hacker  <[EMAIL PROTECTED] ',
 'Some Hacker   [EMAIL PROTECTED]>',

 'Some Hacker <[EMAIL PROTECTED]>  ',
 'Some Hacker <[EMAIL PROTECTED]   ',
 'Some Hacker  [EMAIL PROTECTED]>  ',

 "\xA0 \x10 \x42 \x00",
 'MAKE MONEY REALLY REALLY REALLY FAST',
);


$tries=200;

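foreach ($tests as $t) {

  print "\nINPUT: \"$t\"\nOUTPUT:\n";
  $old=null; # reset per test so the first result of each test is always printed
  for ($i=0;$i<$tries;$i++) {
    $output=recode("$from..$to",$t);
    if ($output!==$old) { # strict comparison: print only when the result changes
      print "\"$output\"\n";
      $old=$output;
    }
  }
}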

echo "\n\nTried $tries loops on ".sizeof($tests)." test(s).\n";
?>

hopefully this will give someone a chance to test on latest sources, or
at least a clue about the cause of the bug
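
A minimal sketch for making the trailing garbage visible as hex, since stray bytes such as 0x00 do not show up in printed output. The helper name trailing_garbage_hex is hypothetical and not part of the report above. It assumes the caller already knows the expected result; for pure-ASCII input recoded ISO-8859-1..UTF-8, that is simply the input itself. It is written for current PHP (7.1 or later), not for 4.x.

```php
<?php
// Hypothetical helper (not from the original report): if $actual equals
// $expected followed by extra bytes, return those extra bytes as hex;
// otherwise return null.
function trailing_garbage_hex(string $expected, string $actual): ?string
{
    $n = strlen($expected);
    if (strlen($actual) > $n && strncmp($actual, $expected, $n) === 0) {
        return bin2hex(substr($actual, $n));
    }
    return null;
}

// Literal strings mimicking the sample output above; no recode call is made.
$expected = 'MAKE MONEY REALLY REALLY REALLY FAST';
var_dump(trailing_garbage_hex($expected, $expected . 'Y')); // string(2) "59"
var_dump(trailing_garbage_hex($expected, $expected));       // NULL
```

In the test script, the result of recode("$from..$to",$t) would be passed as $actual, with $t as $expected whenever $t is plain ASCII.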

------------------------------------------------------------------------

[2002-06-24 12:10:28] [EMAIL PROTECTED]

what i said was not exactly true:

4.3.0-dev does not always segfault (mostly with a string-length of
96...) and it seems to behave like 4.2

chregu 

------------------------------------------------------------------------

[2002-06-24 12:08:07] [EMAIL PROTECTED]

Same problem here, with the same string lengths causing the errors.

recode on the commandline does it perfectly right.
php 4.2 did add trailing garbage
php 4.3-dev segfaults

chregu

------------------------------------------------------------------------

[2002-06-06 18:03:32] [EMAIL PROTECTED]

Tests with PHP 4.3.0-dev and PHP 4.2.1 give the same (wrong) result: the
recoded string's length does not equal the original string's length.
Recoding a 36-character string, for example, returns a 40-byte string,
so the return value carries 4 extra bytes of 0x00 garbage at the end:

recode ("utf8..latin2", "0123456789012345678901234567890123456");

The error is reproducible at several string lengths: 36-39, 96-99,
186-189, 321-324, 523-526, 826-829, 1281-1284, 1963-1966, 2986-2989,
4521-4524, 6823-6826, and so on...:)
(Operating on the result string causes random crashes.)

Please try the examples above and report whether they work correctly.
(Sorry if the previous description was confusing ;)

Thx
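
A minimal sketch of how the length sweep described above could be automated. The helper name find_length_mismatches is hypothetical and not part of the report. It assumes pure-ASCII input, which a utf8..latin2 recode should return byte-for-byte unchanged, so any length difference points at the bug. The converter is passed in as a callable so the check also runs where the recode extension is not loaded. It is written for current PHP (7.0 or later), not for 4.x.

```php
<?php
// Hypothetical helper (not from the original report): returns an array of
// input length => output length for every length from 1 to $maxLength at
// which $convert changes the byte length of a pure-ASCII string.
function find_length_mismatches(callable $convert, int $maxLength): array
{
    $bad = array();
    for ($len = 1; $len <= $maxLength; $len++) {
        // ASCII digit string of exactly $len bytes: "0123456789012..."
        $input = substr(str_repeat('0123456789', (int) ceil($len / 10)), 0, $len);
        $output = (string) $convert($input);
        if (strlen($output) !== $len) {
            $bad[$len] = strlen($output);
        }
    }
    return $bad;
}

// Runs only where the recode extension is loaded.
if (function_exists('recode_string')) {
    $mismatches = find_length_mismatches(
        function ($s) { return recode_string('utf8..latin2', $s); },
        200
    );
    foreach ($mismatches as $len => $got) {
        echo "input $len bytes -> output $got bytes\n";
    }
}
```

On an unaffected build the loop should print nothing; on an affected one, the listed lengths could be compared against the ranges reported above.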

------------------------------------------------------------------------

[2002-06-04 04:25:07] [EMAIL PROTECTED]

Thank you for taking the time to report a problem with PHP.
Unfortunately your version of PHP is too old -- the problem
might already be fixed. Please download a new PHP
version from http://www.php.net/downloads.php

If you are able to reproduce the bug with one of the latest
versions of PHP, please change the PHP version on this bug report
to the version you tested and change the status back to "Open".
Again, thank you for your continued support of PHP.



------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
    http://bugs.php.net/17154

-- 
Edit this bug report at http://bugs.php.net/?id=17154&edit=1
