Edit report at https://bugs.php.net/bug.php?id=55507&edit=1
ID: 55507
Comment by: r dot fiedler at ovm dot de
Reported by: gtisza at gmail dot com
Summary: fgetcsv() handles invalid characters inconsistently
Status: Open
Type: Bug
Package: Filesystem function related
Operating System: Linux
PHP Version: Irrelevant
Block user comment: N
Private report: N
New Comment:
In the Windows versions it works correctly;
I got the error under PHP 5.2.6-1+lenny13.
Previous Comments:
------------------------------------------------------------------------
[2011-08-25 12:46:51] gtisza at gmail dot com
Description:
------------
fgetcsv() throws away the first character of a field if it is invalid in the
current locale, but ignores invalid characters which are not at the beginning
of a field. The inconsistent behavior makes it hard to locate the source of the
bug; it should either throw all invalid characters away, or none of them (IMO
the second is much better).
(This is a duplicate of bug 45356, but that one has been closed as "no
feedback", and apparently mere mortals are not allowed to reopen it, even if
they do provide the feedback...)
Test script:
---------------
<?php
setlocale(LC_ALL,'C');
$utfchar = chr(0xC3).chr(0x89); // U+00C9 (É) in UTF-8
$csv = $utfchar."x".$utfchar."x\n";
file_put_contents('test.csv', $csv);
$file = fopen('test.csv', 'r');
$data = fgetcsv($file);
fclose($file);
// Dump the first field byte by byte as hex.
for ($i = 0; $i < strlen($data[0]); $i++) {
    echo dechex(ord($data[0][$i])).' ';
}
echo "\n";
unlink('test.csv');
// expected: c3 89 78 c3 89 78 - "ÉxÉx"
// actual: 78 c3 89 78 - "xÉx"
?>
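
Possible workaround, as a sketch only: since the behavior depends on the current locale, one option may be to switch LC_CTYPE to a UTF-8 locale around the fgetcsv() call. The helper names hex_bytes() and first_row_utf8() are hypothetical (not part of PHP), and the candidate locale names are assumptions about what is installed on the system; if none of them is available, the locale stays unchanged and the problem may still occur.

```php
<?php
// Sketch only: helper names are hypothetical, not part of PHP.

// Render a string as space-separated hex bytes, e.g. "c3 89 78".
function hex_bytes(string $s): string
{
    return $s === '' ? '' : implode(' ', str_split(bin2hex($s), 2));
}

// Read the first CSV row with LC_CTYPE temporarily set to a UTF-8 locale.
// Assumption: at least one of the candidate locale names is installed.
function first_row_utf8(string $path, array $candidates = ['C.UTF-8', 'en_US.UTF-8', 'en_US.utf8'])
{
    $previous = setlocale(LC_CTYPE, '0');  // '0' only queries the current setting
    setlocale(LC_CTYPE, $candidates);      // the first installed candidate is used
    $file = fopen($path, 'r');
    if ($file === false) {
        setlocale(LC_CTYPE, $previous);
        return false;
    }
    // Explicit delimiter, enclosure and escape; length 0 means "no line limit".
    $row = fgetcsv($file, 0, ',', '"', '\\');
    fclose($file);
    setlocale(LC_CTYPE, $previous);        // restore the caller's locale
    return $row;
}
```

With the test file from the script above, echo hex_bytes(first_row_utf8('test.csv')[0]) prints the bytes of the first field, so a dropped leading character is visible directly.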
------------------------------------------------------------------------