ID:               39332
 Updated by:       [EMAIL PROTECTED]
 Reported By:      herbert dot fischer at gmail dot com
-Status:           Open
+Status:           Feedback
 Bug Type:         Program Execution
 Operating System: Red Hat ELAS4 Upd3
 PHP Version:      5.1.6
 New Comment:

Thank you for this bug report. To properly diagnose the problem, we
need a short but complete example script to be able to reproduce
this bug ourselves. 

A proper reproducing script starts with <?php and ends with ?>,
is max. 10-20 lines long and does not require any external 
resources such as databases, etc. If the script requires a 
database to demonstrate the issue, please make sure it creates 
all necessary tables, stored procedures etc.

Please avoid embedding huge scripts into the report.




Previous Comments:
------------------------------------------------------------------------

[2006-11-01 13:55:38] herbert dot fischer at gmail dot com

Description:
------------
PHP assumes that the output of an externally executed svn client is
ASCII-encoded.

Even when the external program returns an ISO-8859-1 encoded string, PHP
"parses" the encoded string as ASCII, expanding accented characters into
a literal escaped form instead of keeping their binary form.

For example:
output such as "Acentuação" becomes the literal string
"Acentua?\195?\167?\195?\163o/".

Reproduce code:
---------------
Import some files or folders with accented names into a Subversion
repository. It is possible to convert the output to UTF-8 using the
command below:

# svn list 'file:////home/svn/herbert/' | iconv -tutf-8

But not when the same command is run from PHP:

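<?php
// List the repository through the external svn client.
$cmd = "svn list 'file:////home/svn/herbert/'";
$out = shell_exec($cmd);
// Dump every byte of the output as a signed char.
$res = unpack('c*', $out);
var_dump($res);
?>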

var_dump reports:

array(29) {
  [1]=>
  int(65)
  [2]=>
  int(99)
  [3]=>
  int(101)
  [4]=>
  int(110)
  [5]=>
  int(116)
  [6]=>
  int(117)
  [7]=>
  int(97)
  [8]=>
  int(63)
  [9]=>
  int(92)
  [10]=>
  int(49)
  [11]=>
  int(57)
  [12]=>
  int(53)
  [13]=>
  int(63)
  [14]=>
  int(92)
  [15]=>
  int(49)
  [16]=>
  int(54)
  [17]=>
  int(55)
  [18]=>
  int(63)
  [19]=>
  int(92)
  [20]=>
  int(49)
  [21]=>
  int(57)
  [22]=>
  int(53)
  [23]=>
  int(63)
  [24]=>
  int(92)
  [25]=>
  int(49)
  [26]=>
  int(54)
  [27]=>
  int(51)
  [28]=>
  int(111)
  [29]=>
  int(47)
}

So it's not possible to convert the string to another character set,
since it's invalid.
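
To make the literal form in the dump above concrete, here is a minimal
sketch that turns each "?\NNN" escape back into the single byte it names.
The helper name unescape_decimal is hypothetical, and the sketch assumes
PHP 7 or later and that every escape is "?\" followed by exactly three
decimal digits, as in the example string. It only illustrates the
difference between the 29-byte literal form and the raw bytes; it is not
a fix for the reported behaviour.

```php
<?php
// Hypothetical helper: replace each literal "?\NNN" escape (NNN = decimal
// byte value) with the single byte it stands for.
function unescape_decimal(string $s): string
{
    return preg_replace_callback(
        '/\?\\\\(\d{3})/',
        function (array $m): string {
            return chr((int) $m[1]);
        },
        $s
    );
}

// The literal string from the report (single quotes keep the backslashes).
$literal = 'Acentua?\195?\167?\195?\163o/';
$raw = unescape_decimal($literal);

var_dump(strlen($literal));            // int(29)
var_dump(strlen($raw));                // int(13)
var_dump(bin2hex(substr($raw, 7, 4))); // string(8) "c3a7c3a3"
```

The four recovered bytes (195 167 195 163) are the UTF-8 encoding of "çã".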

Expected result:
----------------
PHP is expected to store the string in its original binary format.

array(10) {
  [1]=>
  int(65)
  [2]=>
  int(99)
  [3]=>
  int(101)
  [4]=>
  int(110)
  [5]=>
  int(116)
  [6]=>
  int(117)
  [7]=>
  int(97)
  [8]=>
  int(-25)
  [9]=>
  int(-29)
  [10]=>
  int(111)
}
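
As a cross-check of the expected dump, the following sketch builds the
ISO-8859-1 bytes of "Acentuação" by hand and runs them through the same
unpack() call. It assumes the iconv extension is available; the variable
names are illustrative only.

```php
<?php
// "Acentuação" as ISO-8859-1 bytes: ç = 0xE7, ã = 0xE3.
$latin1 = "Acentua\xE7\xE3o";

// 'c*' reads every byte as a signed char, so 0xE7 -> -25 and 0xE3 -> -29.
$bytes = unpack('c*', $latin1);
var_dump(count($bytes)); // int(10)
var_dump($bytes[8]);     // int(-25)
var_dump($bytes[9]);     // int(-29)

// With the bytes intact, the string converts to UTF-8 without trouble.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);
var_dump(bin2hex($utf8)); // string(24) "4163656e747561c3a7c3a36f"
```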



------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=39332&edit=1
