Hi, Joshua, :)

On Thu, 16 Jan 2003, Scott, Joshua wrote:

> I've got a CSV file which I need to process.  The format is as follows.
>
> "Smith, John J",1/1/2002,1/15/2002,"Orlando, FL",Florida
> "Doe, John L",1/1/2002,1/15/2002,Los Angeles, California
>
> I've tried splitting it using:  @row = split(",",$data);

> The problem is with the fields that contain the commas between the
> quotes.  It's splitting the fields at each of these commas as well
> and I'd like to know how to avoid that.
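
To see the problem concretely, here is a minimal sketch using your first
sample line.  A plain split on the comma gives seven pieces rather than
the five fields you want, because the commas inside the quotes are
treated like any other:

```perl
use strict;
use warnings;

my $data = q("Smith, John J",1/1/2002,1/15/2002,"Orlando, FL",Florida);

# split() knows nothing about quoting, so every comma is a boundary.
my @row = split /,/, $data;

print scalar(@row), " pieces\n";
print "[$_]\n" foreach @row;
```

The quoted name and the quoted city each end up cut in two, with the
quote characters still attached.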

The suggestions for using a module tailored for this purpose are the
way to go.  However, as a learning exercise, here's what I came up
with to satisfy your requirements:

#!/usr/bin/perl

use strict;
use warnings;

# Split CSV lines, which may have commas embedded in quoted strings.

my @lines = ( q("Smith, John J",1/1/2002,1/15/2002,"Orlando, FL",Florida),
              q("Doe, John L",1/1/2002,1/15/2002,Los Angeles, California) );

my @fields;

my $qs = q("');
my $sep = ",";

# Uncomment to watch the regex engine at work (very verbose output).
# use re 'debug';

foreach (@lines) {
   # Simple split for strings that don't contain quotes.
   if( index( $_, q(") ) == -1 and
       index( $_, q(') ) == -1 ) {
      push @fields, [ split( ',', $_ ) ];
      next;   # Already handled; don't run the regex on this line too.
   }

   # Regex for others.
   print "$_\n";
   my @matches;
   while( /                # EITHER:
           ([$qs])         # A quote character.
              ([^$qs]+?)   # Followed by a bunch of non-quote chars.
           \1              # And ending with the same quote char.
           |               # OR:
           $sep?           # Optionally the separator character.
              ([^$sep]+?)  # Followed by a bunch of non-separator chars.
           (?:$sep|$)      # Then the end of the string or the separator char.
          /gx ) {
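      # Only one alternative matches each time, so only one of $2 and $3
      # is defined.  Test with defined() so a field of "0" isn't lost.
      my $field = defined $2 ? $2 : $3;
      print "field = $field\n";
      # Throw away $1 - only used to bracket embedded quotes.
      push( @matches, $field );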
   }
   push @fields, \@matches if @matches;
}

print "@{$_}\n" foreach @fields;

Hope that is enlightening.  I'm sure there are better ways of doing
it, but I'm hardly an "expert" myself!
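
For comparison, the module route might look something like the sketch
below.  It uses Text::ParseWords, which ships with Perl; I'm only
assuming that's the kind of module the other replies had in mind
(Text::CSV from CPAN would be another candidate).  The parse_csv_line
wrapper is just a name I made up for illustration.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use Text::ParseWords qw(parse_line);

# Hypothetical helper: split one CSV line into fields, honouring quotes.
sub parse_csv_line {
   my ($line) = @_;
   # Args: delimiter pattern, keep-quotes flag (0 = strip them), the line.
   return parse_line( ',', 0, $line );
}

my @lines = ( q("Smith, John J",1/1/2002,1/15/2002,"Orlando, FL",Florida),
              q("Doe, John L",1/1/2002,1/15/2002,Los Angeles, California) );

foreach my $line (@lines) {
   my @row = parse_csv_line($line);
   print join( ' | ', @row ), "\n";
}
```

One caveat: parse_line also treats single quotes and backslashes
specially, so a stray apostrophe in the data (a name like O'Brien, say)
would trip it up.  For real-world CSV a dedicated CSV module is probably
the safer bet.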

---Jason

