Edit report at https://bugs.php.net/bug.php?id=55763&edit=1
 ID:                 55763
 Comment by:         alotacents at gmail dot com
 Reported by:        talk at alexmingoia dot com
 Summary:            str_getcsv incorrectly handles line-breaks inside fields
 Status:             Open
 Type:               Bug
 Package:            Strings related
 Operating System:   OS X 10.6
 PHP Version:        5.3.8
 Block user comment: N
 Private report:     N

 New Comment:

To split the string into record lines, I used a regular expression that makes sure not to split inside double quotes, instead of using str_getcsv. Then I used str_getcsv on each line.

Example:

$s2=<<<EOD
Year,Make,Model,Description,Price
1997,Ford,E350,"ac, abs, moon",3000.00
1999,Chevy,"Venture ""Extended Edition""","",4900.00
1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00
1996,Jeep,Grand Cherokee,"MUST SELL!
air, moon roof, loaded",4799.00
EOD;

$lines = preg_split('/[\r\n]{1,2}(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/',$s2);

It outputs:

Array
(
    [0] => Year,Make,Model,Description,Price
    [1] => 1997,Ford,E350,"ac, abs, moon",3000.00
    [2] => 1999,Chevy,"Venture ""Extended Edition""","",4900.00
    [3] => 1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00
    [4] => 1996,Jeep,Grand Cherokee,"MUST SELL!
air, moon roof, loaded",4799.00
)

To convert further:

$data = array();
foreach($lines as $row) {
    $data[] = str_getcsv($row);
}

print_r($data);

which will output:

Array
(
    [0] => Array
        (
            [0] => Year
            [1] => Make
            [2] => Model
            [3] => Description
            [4] => Price
        )

    [1] => Array
        (
            [0] => 1997
            [1] => Ford
            [2] => E350
            [3] => ac, abs, moon
            [4] => 3000.00
        )

    [2] => Array
        (
            [0] => 1999
            [1] => Chevy
            [2] => Venture "Extended Edition"
            [3] => 
            [4] => 4900.00
        )

    [3] => Array
        (
            [0] => 1999
            [1] => Chevy
            [2] => Venture "Extended Edition, Very Large"
            [3] => 
            [4] => 5000.00
        )

    [4] => Array
        (
            [0] => 1996
            [1] => Jeep
            [2] => Grand Cherokee
            [3] => MUST SELL!
air, moon roof, loaded
            [4] => 4799.00
        )

)


Previous Comments:
------------------------------------------------------------------------
[2012-04-27 03:11:17] darren at dcook dot org

The problem can also be shown with the example from the Wikipedia page (http://en.wikipedia.org/wiki/Comma-separated_values):

$s2=<<<EOD
Year,Make,Model,Description,Price
1997,Ford,E350,"ac, abs, moon",3000.00
1999,Chevy,"Venture ""Extended Edition""","",4900.00
1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00
1996,Jeep,Grand Cherokee,"MUST SELL!
air, moon roof, loaded",4799.00
EOD;

$lines=str_getcsv($s2,"\n");
print_r($lines);

It outputs:

Array
(
    [0] => Year,Make,Model,Description,Price
    [1] => 1997,Ford,E350,"ac, abs, moon",3000.00
    [2] => 1999,Chevy,"Venture ""Extended Edition""","",4900.00
    [3] => 1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00
    [4] => 1996,Jeep,Grand Cherokee,"MUST SELL!
    [5] => air, moon roof, loaded",4799.00
)

But it should output:

Array
(
    [0] => Year,Make,Model,Description,Price
    [1] => 1997,Ford,E350,"ac, abs, moon",3000.00
    [2] => 1999,Chevy,"Venture ""Extended Edition""","",4900.00
    [3] => 1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00
    [4] => 1996,Jeep,Grand Cherokee,"MUST SELL!
air, moon roof, loaded",4799.00
)

------------------------------------------------------------------------
[2011-09-22 16:45:02] talk at alexmingoia dot com

Sorry... expected output should be

array(4) {
  [0]=>
  string(15) "Name,Desc,Email"
  [1]=>
  string(4) "Alex"
  [2]=>
  string(18) "Is a PHP developer "
  [3]=>
  string(16) "a...@example.com"
}

------------------------------------------------------------------------
[2011-09-22 16:41:15] talk at alexmingoia dot com

Description:
------------
RFC4180 states that fields can contain line breaks as long as they are properly enclosed by double-quotes. str_getcsv treats line-breaks inside of enclosed fields as new records in the CSV.
Setting 'auto_detect_line_endings' to TRUE or using "\r\n" instead of "\n" still produces incorrect results.

Test script:
---------------
$csv = file_get_contents('test.csv');
$csvArray = str_getcsv($csv, "\n");
var_dump($csvArray);

Expected result:
----------------
array(4) {
  [0]=>
  string(15) "Name,Desc,Email"
  [1]=>
  string(4) "Alex"
  [2]=>
  string(18) "Is a PHP developer"
  [3]=>
  string(16) "a...@example.com"
}

Actual result:
--------------
array(4) {
  [0]=>
  string(15) "Name,Desc,Email"
  [1]=>
  string(14) "Alex,"Is a PHP"
  [2]=>
  string(9) "developer"
  [3]=>
  string(17) ",a...@example.com"
}

------------------------------------------------------------------------
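
For comparison with the regular-expression workaround in the newest comment, here is a minimal sketch of another way to get the records when the whole CSV is already in a string: copy it into a php://temp stream and read it back with fgetcsv(), which reads one record at a time and keeps a line break inside a quoted field as part of that field. The helper name parse_csv_string() is hypothetical (it is not part of PHP), and the sketch assumes PHP 7.4 or later because it passes an empty escape character; it is not offered as a fix for str_getcsv itself.

```php
<?php
// Hypothetical helper, not a PHP built-in: parse a complete CSV document held
// in a string and return one array of fields per record.
function parse_csv_string(string $csv): array
{
    // Copy the string into a temporary stream so fgetcsv() can read it.
    $stream = fopen('php://temp', 'r+');
    fwrite($stream, $csv);
    rewind($stream);

    $rows = [];
    // fgetcsv() returns one record per call; a quoted field may span lines.
    // The empty escape argument (PHP 7.4+) leaves "" as the only quote escape.
    while (($row = fgetcsv($stream, 0, ',', '"', '')) !== false) {
        if ($row === [null]) {
            continue; // a blank line is reported as a single null field
        }
        $rows[] = $row;
    }
    fclose($stream);

    return $rows;
}

// Usage with a shortened version of the data from the comments above.
$s2 = <<<EOD
Year,Make,Model,Description,Price
1996,Jeep,Grand Cherokee,"MUST SELL!
air, moon roof, loaded",4799.00
EOD;

$rows = parse_csv_string($s2);
var_dump(count($rows)); // int(2): the header and one record
print_r($rows[1]);
```

The trade-off against the preg_split approach is that record splitting and field parsing happen in one pass by the same parser, so no separate pattern has to agree with it about where quotes open and close.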