First off, this is more of a tutorial than a question. It was very
tough to get working, so I figured posting it here might help someone.

I needed to run fgetcsv on a file, but the file contained some binary
characters that broke fgetcsv's ability to split the data. I ended up
writing some fairly comprehensive regular expressions (preg) to detect
those characters and to split each line just as fgetcsv would.

Obviously this isn't as fast as the built-in functions, but if you need
a workaround, this may help.
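
To put a rough number on that cost, a timing sketch along these lines compares the regex splitter from this post (defined in the script below, repeated here so the snippet runs on its own) with the built-in str_getcsv(). The sample line and the iteration count are arbitrary assumptions, and the timings will vary by machine and PHP version.

```php
<?php
// Regex splitter from this post, repeated so the snippet is self-contained.
function fgetcsv2($str){
   preg_match_all('#(?<=^|,)(")?((?(1)(?>[^"]|(?<=\\\\)")*|[^,]*))(?(1)"|(?:,|$))#',$str,$ret);
   return $ret[2];
}

// Arbitrary sample line and loop count, chosen only for illustration.
$line = '1,"Smith, John",42,plain text,"x"';
$n = 100000;

// Time the regex splitter.
$t = microtime(true);
for ($i = 0; $i < $n; $i++) $a = fgetcsv2($line);
$regex_secs = microtime(true) - $t;

// Time the built-in parser with delimiter, enclosure and escape given explicitly.
$t = microtime(true);
for ($i = 0; $i < $n; $i++) $b = str_getcsv($line, ',', '"', '\\');
$builtin_secs = microtime(true) - $t;

printf("regex: %.3fs  str_getcsv: %.3fs  same fields: %s\n",
       $regex_secs, $builtin_secs, $a === $b ? 'yes' : 'no');
```

On a clean line like this one the two functions should return the same fields, so only the timings differ.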

<?php
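// Usage: php checkfile.php [delimiter] < file.csv
// The text file must be sent via stdin. $argv[1] may hold a delimiter, in
// which case each line is split with explode(). If nothing is passed in
// $argv[1], the script assumes comma-separated, double-quoted fields.

$delimiter = isset($argv[1]) ? $argv[1] : '';
$results = array('columns' => array(), 'total_lines' => 0, 'binary_found' => array());

$file = fopen('php://stdin','r');
// Loop on the fgets() result; testing feof() alone can feed one extra
// false "line" through the loop at end of file.
while (($line = fgets($file)) !== false) {
   $line = rtrim($line, "\r\n");   // keep the line ending out of the last field
   ++$results['total_lines'];
   $chr = xh_isbinary($line);
   if($chr !== false) tally($results['binary_found'], 'Character '.$chr);
   $row = ($delimiter === '') ? fgetcsv2($line) : explode($delimiter, $line);
   if($row) tally($results['columns'], count($row).' Columns');
}
print_r($results);

// Add one to $counts[$key], creating the entry on first use.
function tally(&$counts, $key){
   $counts[$key] = isset($counts[$key]) ? $counts[$key] + 1 : 1;
}

// Return the byte value of the first control or high-bit byte in $x,
// or false if there is none. Tab, LF and CR do not count as binary.
function xh_isbinary($x){
   if(!preg_match('#[\x00-\x08]|[\x0b-\x0c]|[\x0e-\x1f]|[\x80-\xff]#',$x,$ret)) return false;
   return ord($ret[0]);
}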

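// Split one line on commas roughly as fgetcsv() would. A field may be wrapped
// in double quotes; inside them commas and backslash-escaped quotes (\") are
// kept. Doubled quotes ("") are not handled as an escape. Returns the field
// values without their surrounding quotes.
function fgetcsv2($str){
   preg_match_all('#(?<=^|,)(")?((?(1)(?>[^"]|(?<=\\\\)")*|[^,]*))(?(1)"|(?:,|$))#',$str,$ret);
   return $ret[2];
}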
?>
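
As a quick check of the splitting behaviour, here is a minimal sketch that runs the same pattern on a few hand-made lines. The sample strings are made up for illustration (they are not from the real test.csv), and the function is repeated only so the snippet runs on its own.

```php
<?php
// Same pattern as fgetcsv2() above, repeated so this runs standalone.
function fgetcsv2($str){
   preg_match_all('#(?<=^|,)(")?((?(1)(?>[^"]|(?<=\\\\)")*|[^,]*))(?(1)"|(?:,|$))#',$str,$ret);
   return $ret[2];
}

// Hypothetical sample lines, not taken from the real data.
$plain  = fgetcsv2('a,b,c');
$quoted = fgetcsv2('"Smith, John",42,"x"');
// \xE9 is byte 233 (an accented e in Latin-1), written as an escape so the
// snippet does not depend on the encoding of this source file.
$latin1 = fgetcsv2("caf\xE9,\"r\xE9sum\xE9, v2\",ok");

print_r($plain);
print_r($quoted);
echo count($latin1), " columns\n";
```

The comma inside the quoted fields stays part of the field, and the high-bit bytes pass through untouched because the pattern works byte by byte (no /u modifier).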


$ php checkfile.php < test.csv
Array
(
     [columns] => Array
         (
             [108 Columns] => 249145
             [1 Columns] => 1
         )

     [total_lines] => 249146
     [binary_found] => Array
         (
             [Character 233] => 10756
         )
)
