ID: 39332
Updated by: [EMAIL PROTECTED]
Reported By: herbert dot fischer at gmail dot com
-Status: Open
+Status: Feedback
Bug Type: Program Execution
Operating System: Red Hat ELAS4 Upd3
PHP Version: 5.1.6
New Comment:
Thank you for this bug report. To properly diagnose the problem, we
need a short but complete example script to be able to reproduce
this bug ourselves.
A proper reproducing script starts with <?php and ends with ?>,
is max. 10-20 lines long and does not require any external
resources such as databases, etc. If the script requires a
database to demonstrate the issue, please make sure it creates
all necessary tables, stored procedures etc.
Please avoid embedding huge scripts into the report.
Previous Comments:
------------------------------------------------------------------------
[2006-11-01 13:55:38] herbert dot fischer at gmail dot com
Description:
------------
PHP assumes that the output of an externally executed svn client is
ASCII.
Even when the external program returns an ISO-8859-1 encoded string, PHP
"parses" the encoded string as ASCII, expanding accented characters into
a literal escaped form instead of keeping their binary form.
For example:
an output like "Acentuação" becomes the literal string
"Acentua?\195?\167?\195?\163o/".
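To make the literal form concrete, here is a minimal sketch that turns the "?\NNN" sequences back into raw bytes. It assumes each sequence is a question mark, a backslash, and a three-digit decimal byte value, which is what the sample string suggests. The helper name decode_escaped_bytes is hypothetical and is not part of PHP or svn.

```php
<?php
// Hypothetical helper: converts "?\195"-style sequences back into raw bytes.
// Assumption: every sequence is "?", a backslash, and exactly three decimal digits.
function decode_escaped_bytes($s)
{
    $decoded = preg_replace_callback(
        '/\?\\\\(\d{3})/',
        function ($m) {
            // $m[1] holds the decimal byte value, e.g. "195"
            return chr((int) $m[1]);
        },
        $s
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $decoded === null ? $s : $decoded;
}

// The literal string from the report (single quotes keep the backslashes).
$escaped = 'Acentua?\195?\167?\195?\163o/';
$raw = decode_escaped_bytes($escaped);

var_dump(strlen($escaped)); // 29 bytes, matching the array(29) in the report
var_dump(bin2hex($raw));    // hex dump of the recovered bytes
```

The recovered bytes 195 167 and 195 163 are the two-byte UTF-8 encodings of "ç" and "ã", so the escaped output looks like it was derived from UTF-8 data rather than from ISO-8859-1.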
Reproduce code:
---------------
Import some files or folders with accented names into a Subversion
repository. It is possible to convert the output to UTF-8 using the
command below:
# svn list 'file:////home/svn/herbert/' | iconv -tutf-8
But not when the same command is run from PHP:
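<?php
// Run the svn client and capture its standard output as a string.
$cmd = "svn list 'file:////home/svn/herbert/'";
$out = shell_exec($cmd);
// Unpack every byte as a signed char to inspect the raw values.
$res = unpack('c*', $out);
var_dump($res);
?>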
var_dump reports:
array(29) {
[1]=>
int(65)
[2]=>
int(99)
[3]=>
int(101)
[4]=>
int(110)
[5]=>
int(116)
[6]=>
int(117)
[7]=>
int(97)
[8]=>
int(63)
[9]=>
int(92)
[10]=>
int(49)
[11]=>
int(57)
[12]=>
int(53)
[13]=>
int(63)
[14]=>
int(92)
[15]=>
int(49)
[16]=>
int(54)
[17]=>
int(55)
[18]=>
int(63)
[19]=>
int(92)
[20]=>
int(49)
[21]=>
int(57)
[22]=>
int(53)
[23]=>
int(63)
[24]=>
int(92)
[25]=>
int(49)
[26]=>
int(54)
[27]=>
int(51)
[28]=>
int(111)
[29]=>
int(47)
}
So it is not possible to convert the string to another character set,
since it is not valid encoded text.
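One possible explanation, offered as an assumption and not as a confirmed diagnosis: the "?\NNN" sequences look like the escapes a command-line client may emit when its own locale cannot represent a character, and a process started from PHP can inherit a different locale than an interactive shell. The sketch below prefixes the command with an explicit LC_ALL value so the two environments can be compared. The helper name with_locale and the locale name en_US.UTF-8 are hypothetical; the locale must exist on the host (see `locale -a`).

```php
<?php
// Hypothetical helper: prefixes a shell command with an explicit locale setting.
// $cmd is expected to be a complete, already-quoted command line.
function with_locale($locale, $cmd)
{
    return 'LC_ALL=' . escapeshellarg($locale) . ' ' . $cmd;
}

// Assumption: en_US.UTF-8 is installed on the host running PHP.
$cmd = with_locale('en_US.UTF-8', "svn list 'file:////home/svn/herbert/'");
echo $cmd, "\n";

// To compare against the original output, run the command and dump the bytes:
// $out = shell_exec($cmd);
// var_dump(unpack('c*', $out));
```

If the byte dump changes once the locale is set, the escaping would be happening in the child process and not in shell_exec itself.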
Expected result:
----------------
PHP is expected to store the string in its original binary form:
array(10) {
[1]=>
int(65)
[2]=>
int(99)
[3]=>
int(101)
[4]=>
int(110)
[5]=>
int(116)
[6]=>
int(117)
[7]=>
int(97)
[8]=>
int(-25)
[9]=>
int(-29)
[10]=>
int(111)
}
------------------------------------------------------------------------