Edit report at https://bugs.php.net/bug.php?id=55507&edit=1

 ID:                 55507
 Updated by:         yohg...@php.net
 Reported by:        gtisza at gmail dot com
 Summary:            fgetcsv() handles invalid characters inconsistently
 Status:             Open
 Type:               Bug
 Package:            Filesystem function related
 Operating System:   Linux
 PHP Version:        Irrelevant
 Block user comment: N
 Private report:     N

 New Comment:

Related to #65368


Previous Comments:
------------------------------------------------------------------------
[2012-05-06 16:47:36] dll at fugro dot no

I had similar problems with the Norwegian letter "Ø": when it was the first
letter of an element, it was simply not there after the fgetcsv transfer. The
following WORKAROUND, using explode(), worked for me.
dee ell ell at fugro dot no

//read the text file into a variable
$txt = read_txtfile("test.txt");

//explode the text into an array of $nr rows
$rowArr = explode("\n", $txt);
$nr = count($rowArr);

for ($r = 0; $r < $nr; $r++) {
    //turn "a;b" into "'a','b'" (note: the values are not escaped)
    $insert_data = "'" . str_replace(";", "','", $rowArr[$r]) . "'";

    //insert each row in the DB table "test"
    $query_string = " INSERT INTO test (name,name2)"
                  . " VALUES (" . $insert_data . ")";
    $result_id = mysql_query($query_string, $my_conn)
        or die("display_db_query:" . mysql_error());
}

if ($result_id == 1) { echo $nr . " rows transferred<br />\n"; }

function read_txtfile($infile) {
    // read text data from file into a variable
    $txt = '';
    $fo = fopen($infile, "r");
    $txt = fread($fo, filesize($infile));
    fclose($fo);
    return $txt;
}

===================================================================
If there is a need to access each data column in the row before transferring, 
these can easily be accessed by exploding each row once more in an inner loop.
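
The inner loop mentioned above could look roughly like the sketch below. The
helper name split_rows() and the sample data are made up for illustration and
are not part of the original workaround. It relies only on explode(), which
works on bytes and does not consult the locale. Note that it assumes plain,
unquoted fields: quoted values or delimiters inside a field are not handled.

```php
<?php
// Hypothetical helper (an assumption, not from the original comment):
// split raw text into rows, then split each row into columns.
function split_rows(string $txt, string $delimiter = ';'): array
{
    $rows = [];
    foreach (explode("\n", $txt) as $line) {
        $line = rtrim($line, "\r");            // tolerate Windows line endings
        if ($line === '') {
            continue;                          // skip blank lines, e.g. after the final newline
        }
        $rows[] = explode($delimiter, $line);  // inner split: one entry per column
    }
    return $rows;
}

// Sample data: "\xC3\x98" is "Ø" and "\xC3\x85" is "Å" in UTF-8.
$rows = split_rows("\xC3\x98x;y\r\n\xC3\x85z;w\n");
foreach ($rows as $cols) {
    // Print the bytes of the first column to confirm nothing was dropped.
    echo bin2hex($cols[0]), "\n";
}
```

Each $cols array then holds the columns of one row, so individual values can
be checked or escaped before they are used in a query.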

------------------------------------------------------------------------
[2011-10-22 09:33:39] r dot fiedler at ovm dot de

On the Windows versions it works correctly;
I got the error under PHP 5.2.6-1+lenny13.

------------------------------------------------------------------------
[2011-08-25 12:46:51] gtisza at gmail dot com

Description:
------------
fgetcsv() throws away the first character of a field if it is invalid in the 
current locale, but ignores invalid characters which are not at the beginning 
of a field. The inconsistent behavior makes it hard to locate the source of the 
bug; it should either throw all invalid characters away, or none of them (IMO 
the second is much better).


(This is a duplicate of bug 45356, but that one has been closed as "no 
feedback", and apparently mere mortals are not allowed to reopen it, even if 
they do provide the feedback...)

Test script:
---------------
<?php

setlocale(LC_ALL,'C');
$utfchar = chr(0xC3).chr(0x89); // U+00C9 ("É") in UTF-8

$csv = $utfchar."x".$utfchar."x\n";

file_put_contents('test.csv', $csv);
$file = fopen('test.csv', 'r');
$data = fgetcsv($file);

for ($i = 0; $i < strlen($data[0]); $i++) {
    echo dechex(ord($data[0][$i])).' ';
}
echo "\n";
unlink('test.csv');

// expected: c3 89 78 c3 89 78 - "ÉxÉx"
// actual: 78 c3 89 78 - "xÉx"

?>
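
Since the dropped bytes are hard to spot by eye, a check along the following
lines might help to locate affected fields. This is only a sketch: the function
name find_altered_fields() is hypothetical, it uses a php://memory stream
instead of a real file, and it is only meaningful for unquoted data, where a
plain byte-wise explode() gives the expected fields.

```php
<?php
// Hypothetical diagnostic (an assumption, not part of the report): compare
// what fgetcsv() returns with a byte-wise split of the same line and list
// every field whose bytes differ.
function find_altered_fields(string $line, string $delimiter = ','): array
{
    $line = rtrim($line, "\r\n");
    $expected = explode($delimiter, $line);

    // Feed the line to fgetcsv() through an in-memory stream.
    $stream = fopen('php://memory', 'w+');
    fwrite($stream, $line . "\n");
    rewind($stream);
    $actual = fgetcsv($stream, 0, $delimiter, '"', '\\');
    fclose($stream);
    if (!is_array($actual)) {
        $actual = [];                          // fgetcsv() failed or hit EOF
    }

    $diff = [];
    foreach ($expected as $i => $raw) {
        $parsed = $actual[$i] ?? null;
        if ($parsed !== $raw) {
            $diff[$i] = [
                'raw'    => bin2hex($raw),
                'parsed' => $parsed === null ? null : bin2hex($parsed),
            ];
        }
    }
    return $diff;
}

// Same bytes as the test script above: "ÉxÉx" in UTF-8.
print_r(find_altered_fields("\xC3\x89x\xC3\x89x\n"));
```

On a setup that shows the bug, field 0 should be listed with differing hex
dumps; on an unaffected setup the result should be an empty array.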



------------------------------------------------------------------------


