Peter Otten, your solution is very nice: it uses groupby to split on
empty lines, so it doesn't need to read the whole file into memory.
But Daniel Nogradi says:
> But the names of the fields (node, x, y) keeps changing from file to
> file, even their number is not fixed, sometimes it is (node, x, y, z).
Your version with the converters dict fails to convert the node and z
fields, etc. (in general such a converters dict is an elegant
solution, since it lets you define string, float, etc. fields):
> converters = dict(
>     x=int,
>     y=int
> )
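One way a converters dict could cope with field names that change from file to file is to fall back to a default converter for any field it does not list. This is only a minimal sketch, written for Python 3: the `convert` helper, the `label` field, and the choice of int as the default are all assumptions made for illustration, not part of Peter's code.

```python
# Hypothetical sketch: per-field converters with a fallback for unlisted fields.
converters = dict(
    x=float,    # assumed example of a field that is not an int
    label=str,  # hypothetical string field
)

def convert(key, value, default=int):
    """Convert value with the converter registered for key, else with default."""
    return converters.get(key, default)(value)

# node and z are not listed in converters, so they fall back to int.
pairs = [("node", "10"), ("x", "-1"), ("z", "5")]
record = dict((k, convert(k, v)) for k, v in pairs)
print(record)
```

With this approach only the fields that need special treatment have to be listed, so a file that adds a z field needs no change to the dict.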
I have created a version with a RE, but it's probably too rigid:
it doesn't handle files with the z field, etc.:
data = """node 10
y 1
x -1

node 11
x -2
y 1
z 5

node 12
x -3
y 1
z 6"""
import re
unpack = re.compile(r"(\D+) \s+ ([-+]? \d+) \s+" * 3, re.VERBOSE)
result = []
for obj in unpack.finditer(data):
    block = obj.groups()
    d = dict((block[i], int(block[i+1])) for i in xrange(0, 6, 2))
    result.append(d)
print result
So I have just modified and simplified your quite nice solution (I have
removed the pprint, but it's the same):
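def open(filename):
    # Stand-in for a real file: serves the sample data string defined above.
    from cStringIO import StringIO
    return StringIO(data)
from itertools import groupby
records = []
# groupby splits the lines into alternating runs of blank and non-blank lines.
for empty, record in groupby(open("records.txt"), key=str.isspace):
    if not empty:
        pairs = ([k, int(v)] for k,v in map(str.split, record))
        records.append(dict(pairs))
print records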
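For anyone on Python 3, the same groupby idea could be sketched roughly as follows. It assumes that records are separated by blank lines and that every value is an integer; `parse_records` and `sample` are hypothetical names, and `io.StringIO` stands in for a real file here.

```python
from io import StringIO
from itertools import groupby

def parse_records(lines):
    """Yield one dict per block of non-blank lines; values are assumed to be ints."""
    # str.isspace is True for blank lines, so groupby alternates between
    # runs of separator lines and runs of "key value" lines.
    for blank, block in groupby(lines, key=str.isspace):
        if not blank:
            yield {k: int(v) for k, v in map(str.split, block)}

# Small sample in the same layout as the data above (blank line between records).
sample = "node 10\ny 1\nx -1\n\nnode 11\nx -2\ny 1\nz 5\n"
records = list(parse_records(StringIO(sample)))
print(records)
```

Because `parse_records` is a generator that consumes the lines lazily, it still reads only one record at a time, and it does not care how many fields a record has or what they are called.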
Bye,
bearophile