On Sunday, January 20, 2013 3:04:39 PM UTC-7, Garry wrote:
> I'm trying to manipulate family tree data using Python.
> I'm using linux and Python 2.7.3 and have data files saved as Linux formatted
> cvs files
> The data appears in this format:
>
> Marriage,Husband,Wife,Date,Place,Source,Note0x0a
> Note: the Source field or the Note field can contain quoted data (same as the
> Place field)
>
> Actual data:
> [F0244],[I0690],[I0354],1916-06-08,"Neely's Landing, Cape Gir. Co, MO",,0x0a
> [F0245],[I0692],[I0355],1919-09-04,"Cape Girardeau Co, MO",,0x0a
>
> code snippet follows:
>
> import os
> import re
> #I'm using the following regex in an attempt to decode the data:
> RegExp2 = "^(\[[A-Z]\d{1,}\])\,(\[[A-Z]\d{1,}\])\,(\[[A-Z]\d{1,}\])\,(\d{,4}\-\d{,2}\-\d{,2})\,(.*|\".*\")\,(.*|\".*\")\,(.*|\".*\")"
> #
> line = "[F0244],[I0690],[I0354],1916-06-08,\"Neely's Landing, Cape Gir. Co, MO\",,"
> #
> (Marriage,Husband,Wife,Date,Place,Source,Note) = re.split(RegExp2,line)
> #
> #However, this does not decode the 7 fields.
> # The following error is displayed:
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: too many values to unpack
> #
> # When I use xx the fields apparently get unpacked.
> xx = re.split(RegExp2,line)
> #
> >>> print xx[0]
>
> >>> print xx[1]
> [F0244]
> >>> print xx[5]
> "Neely's Landing, Cape Gir. Co, MO"
> >>> print xx[6]
>
> >>> print xx[7]
>
> >>> print xx[8]
>
> Why is there an extra NULL field before and after my record contents?
> I'm stuck, comments and solutions greatly appreciated.
>
> Garry
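For the record, on the extra empty fields: re.split() returns the text on either side of each match along with the captured groups. When the pattern matches the entire line, the text before and after the match is empty, so the list has nine items instead of seven. The sketch below reproduces that with a shortened pattern (my own simplification for illustration, not the original RegExp2), then reads the same sample line with the standard csv module, which copes with the quoted Place field without any regex. The names `pattern`, `parts`, and `fields` are only illustrative.

```python
import csv
import re

# Sample record from the question, without the trailing newline.
line = '[F0244],[I0690],[I0354],1916-06-08,"Neely\'s Landing, Cape Gir. Co, MO",,'

# Simplified pattern (an assumption, not the original RegExp2): three
# bracketed IDs, a date, then three fields that are either quoted or
# contain no comma.
pattern = (r'^(\[[A-Z]\d+\]),(\[[A-Z]\d+\]),(\[[A-Z]\d+\]),'
           r'(\d{4}-\d{2}-\d{2}),(".*?"|[^,]*),(".*?"|[^,]*),(".*?"|[^,]*)$')

# re.split() keeps what lies before and after the match. The match covers
# the whole line, so both ends are empty strings around the 7 groups.
parts = re.split(pattern, line)
print(len(parts))             # 9
print(repr(parts[0]))         # ''
print(repr(parts[-1]))        # ''

# re.match(...).groups() returns only the 7 captured groups.
groups = re.match(pattern, line).groups()
print(len(groups))            # 7

# The csv module splits on commas and honours double-quoted fields.
fields = next(csv.reader([line]))
marriage, husband, wife, date, place, source, note = fields
print(place)                  # Neely's Landing, Cape Gir. Co, MO
```

For a whole file, passing the open file object to csv.reader and looping over it yields one such list per record. Note that csv strips the surrounding double quotes from Place, whereas the regex groups keep them.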
Thanks everyone for your comments. I'm new to Python, but I can get around in Perl and regular expressions. I sure was taking the long way to get the CSV data parsed. I hope to teach myself Python. Maybe I need to look into courses offered at the local junior college!

Garry
-- 
http://mail.python.org/mailman/listinfo/python-list