Problems matching or parsing with delimiters in text

KEVIN ZEMBOWER Mon, 28 Mar 2005 08:13:34 -0800

I'm trying to read in text lines from a file that look like this:
"B-B01","Eng","Binder for Complete Set of Population Reports",13,0
"C-CD01","Eng","The Condoms CD-ROM",12,1
"F-J41a","Fre",,13,1
"F-J41a","SPA",,13,1
"M-FC01","Eng","Africa Flip Charts- Planning Your Family (E,F, Swahili)(12""x9"")",7,1
"M-FC01","Fre","Africa Flip Charts- Planning Your Family (E,F, Swahili)(12""x9"")",7,1


The first two lines are typical of most of the file. The second two have a 
blank third field and the last two show embedded commas and escaped double 
quotes in the third field. This is an output of another program, but I can 
filter it and make substitutions if that makes anything easier.

I'm trying to parse it with these statements:
while (<>) { # While there are more records in the inventory export file
             # called on the command line
   ++$ln; # increment the line number count
   my ($partno, $language, $title, $cost, $available) =
      m["(.*)","(.*)","?(.*?)"?,(.*),(.*)$];
   print "PN=$partno, L=$language, T=$title, C=$cost, A=$available\n" if $debug;
   next if $debug;
   createlangversion($partno, $language, $title, $cost, $available);
} #while there are more lines in the import data file

The output looks like this:
[EMAIL PROTECTED]:~/public_html/orderDB/obsolete$ ./loadInventory.pl ../tmp/t 
PN=B-B01, L=Eng, T=Binder for Complete Set of Population Reports, C=13, A=0
PN=C-CD01, L=Eng, T=The Condoms CD-ROM, C=12, A=1
PN=F-J41a, L=Fre, T=, C=13, A=1
PN=F-J41a, L=SPA, T=, C=13, A=1
PN=M-FC01, L=Eng, T=Africa Flip Charts- Planning Your Family (E, C=F, Swahili)(12""x9"")",7, A=1
PN=M-FC01, L=Fre, T=Africa Flip Charts- Planning Your Family (E, C=F, Swahili)(12""x9"")",7, A=1
[EMAIL PROTECTED]:~/public_html/orderDB/obsolete$ 

Note that the first four lines parsed correctly, but that the last two 
were split at the comma inside the title, so part of the title ended up in 
$cost.

Can anyone help me write a match which would parse all of these lines 
correctly? Extra bonus points for explaining it thoroughly, so I don't have to 
ask this question here again. If it's easier to just filter or substitute in 
the original input file, what should I do?
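
For reference, the pattern above goes wrong because neither .* nor .*? knows 
anything about quotes: the closing "? is optional, so the non-greedy (.*?) 
stops at the first comma it meets, even one inside the quoted title. One 
possible approach is to walk the line field by field instead of matching the 
whole record at once. What follows is only a minimal sketch, assuming each 
record sits on one physical line and that a quote inside a quoted field is 
always doubled; parse_csv_line is a hypothetical helper name, not part of the 
original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): split one CSV
# record into fields. Assumes each record sits on a single physical line
# and that a literal quote inside a quoted field is written as "".
sub parse_csv_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # drop the line ending, if any
    my @fields;
    while (1) {
        if ($line =~ /\G"((?:[^"]|"")*)"/gc) {
            # Quoted field: may contain commas; "" stands for one quote.
            (my $field = $1) =~ s/""/"/g;
            push @fields, $field;
        }
        elsif ($line =~ /\G([^,]*)/gc) {
            # Unquoted (possibly empty) field: runs up to the next comma.
            push @fields, $1;
        }
        # Stop unless a comma says another field follows.
        last unless $line =~ /\G,/gc;
    }
    return @fields;
}

# Sample records taken from the post.
my @samples = (
    '"C-CD01","Eng","The Condoms CD-ROM",12,1',
    '"F-J41a","Fre",,13,1',
    '"M-FC01","Eng","Africa Flip Charts- Planning Your Family (E,F, Swahili)(12""x9"")",7,1',
);

for my $record (@samples) {
    my ($partno, $language, $title, $cost, $available) = parse_csv_line($record);
    print "PN=$partno, L=$language, T=$title, C=$cost, A=$available\n";
}
```

The \G anchor together with the /gc modifiers makes each match resume where 
the previous one ended, so a comma is only treated as a separator when it 
falls between fields. The sketch silently stops at malformed input (for 
example, text after a closing quote), so for real data a CPAN module such as 
Text::CSV is probably the more robust choice.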

Thank you all in advance for your help and suggestions.

-Kevin Zembower

