Deborah Swanson wrote:

> This is how the list of namedtuples is originally created from a csv:
>
> infile = open("E:\\Coding projects\\Pycharm\\Moving\\Moving 2017 in - test.csv")
> rows = csv.reader(infile)
> fieldnames = next(rows)
> Record = namedtuple("Record", fieldnames)
> records = [Record._make(fieldnames)]
> records.extend(Record._make(row) for row in rows)
>
> Thanks to Peter Otten for this succinct code, and to Greg Ewing for
> suggesting namedtuples for this type of problem to begin with.
>
> Namedtuples worked beautifully for the first two thirds of this code,
> but I've run into a snag attempting to proceed.
>
> Here's my code up to the snag, and I'll explain afterwards what I'm
> trying to do:
>
> import operator
> records[1:] = sorted(records[1:], key=operator.attrgetter("title", "Date"))
>
> groups = defaultdict()
> for r in records[1:]:
>     # if the key doesn't exist, make a new group
>     if r.title not in groups.keys():
>         groups[r.title] = [r]
>     # if key (group) exists, append this record
>     else:
>         groups[r.title].append(r)
>
> # make lookup table: indices for field names
> records_idx = {}
> for idx, label in enumerate(records[0]):
>     records_idx[label] = idx
>
> LABELS = ['Location', 'ST', 'co', 'miles', 'first', 'Kind', 'Notes']
> # look at field values for each label on group
> for group in groups.values():
>     values = []
>     for idx, row in enumerate(group):
>         for label in LABELS:
>             values.append(group[[idx][records_idx[label]]])   <-snag
>
> I want to get lists of field values from the list of namedtuples, one
> list of field values for each row in each group (groups are defined in
> the section beginning with "groups = defaultdict()".
>
> LABELS defines the field names for the columns of field values of
> interest. So all the locations in this group would be in one list, all
> the states in another list, etc. (Jussi, I'm looking at your suggestion
> for the next part.)
>
> (I'm quite sure this bit of code could be written with list and dict
> comprehensions, but here I was just trying to get it to work, and
> comprehensions still confuse me a little.)
>
> Using the debugger's watch window, from
> group[[idx][records_idx[label]]], I get:
>
> idx = {int}: 0
> records_idx[label] = {int}: 4
>
> which is the correct indices for the first row of the current group (idx
> = 0) and the first field label in LABELS, 'Location' (records_idx[label]
> = 4).
>
> And if I look at
>
> group[0][4] = 'Longview'
>
> this is also correct. Longview is the Location field value for the first
> row of this group.
>
> However,
>
> group[[idx][records_idx[label]]]
> gets an Index Error: list index out of range
>
> I've run into this kind of problem with namedtuples before, trying to
> access field values with variable names, like:
>
> label = 'Location'
> records.label
>
> and I get something like "'records' has no attribute 'label'. This can
> be fixed by using the subscript form and an index, like:
>
> for idx, r in enumerate(records):
>     ...
>     records[idx] = r
>
> But here, I get the Index Error and I'm a bit baffled why. Both
> subscripts evaluate to valid indices and give the correct value when
> explicitly used.
>
> Can anyone see why I'm getting this Index error? and how to fix it?
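On the IndexError itself: in group[[idx][records_idx[label]]] the inner part [idx][records_idx[label]] is evaluated first. [idx] is a new one-element list, and indexing that with 4 fails before group is ever looked at. Here is a small sketch of the difference; the Record fields and the two rows are made-up stand-ins, not your actual csv.

```python
from collections import namedtuple

# Hypothetical stand-in for one group; field names and values are invented.
Record = namedtuple("Record", ["title", "Date", "Location"])
group = [
    Record("A", "2017-01-01", "Longview"),
    Record("A", "2017-01-02", "Kelso"),
]

idx = 0
col = 2  # position of "Location" in this stand-in Record

# [idx] is a one-element list, so [idx][col] raises for col == 2.
try:
    group[[idx][col]]
except IndexError as exc:
    error_message = str(exc)

# Two separate subscripts: first pick the row, then the field.
by_index = group[idx][col]

# getattr() looks a field up by a name held in a variable,
# which is what records.label was trying to do.
label = "Location"
by_name = getattr(group[idx], label)
```

With getattr() the lookup table of indices should not be needed at all, assuming the labels are valid field names of the namedtuple.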
I'm not completely sure I can follow you, but you seem to be mixing two
problems

(1) split a list into groups
(2) convert a list of rows into a list of columns

and making a kind of mess in the process. Functions to the rescue:

#untested

import operator
from collections import defaultdict, namedtuple

def split_into_groups(records, key):
    groups = defaultdict(list)
    for record in records:
        # no need to check if a group already exists
        # an empty list will automatically be added for every
        # missing key
        groups[key(record)].append(record)
    return groups

def extract_column(records, name):
    # you will agree that extracting one column is easy :)
    return [getattr(record, name) for record in records]

def extract_columns(records, names):
    # we can build on that to make a list of columns
    return [extract_column(records, name) for name in names]

wanted_columns = ['Location', ...]
records = ...
groups = split_into_groups(records, operator.attrgetter("title"))

Columns = namedtuple("Columns", wanted_columns)
for title, group in groups.items():
    # for easier access we turn the list of columns
    # into a namedtuple of columns
    groups[title] = Columns._make(extract_columns(group, wanted_columns))

If all worked well you should now be able to get a group with

groups["whatever"]

and all locations for that group with

groups["whatever"].Location

If there is a bug you can pinpoint the function that doesn't work and ask
for specific help on that one.

-- 
https://mail.python.org/mailman/listinfo/python-list
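
PS: a small end-to-end run of the same two steps (group by title, then turn each group's rows into columns). This is only a sketch: the three sample rows, the reduced field list, and the name columns_by_title are invented for illustration, and a separate dict is built rather than overwriting groups in place.

```python
import operator
from collections import defaultdict, namedtuple

# Made-up sample rows; only "title", "Date", "Location" and "ST"
# are field names taken from the thread.
Record = namedtuple("Record", ["title", "Date", "Location", "ST"])
records = [
    Record("house", "2017-01-02", "Longview", "WA"),
    Record("cabin", "2017-01-01", "Bend", "OR"),
    Record("house", "2017-01-01", "Kelso", "WA"),
]

def split_into_groups(records, key):
    # defaultdict(list) creates the empty list for a new key on first use
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)
    return groups

def extract_columns(records, names):
    # one list per requested field name, in the order of names
    return [[getattr(record, name) for record in records] for name in names]

wanted_columns = ["Location", "ST"]
Columns = namedtuple("Columns", wanted_columns)

# sort first so that rows inside each group are ordered by Date
records.sort(key=operator.attrgetter("title", "Date"))
groups = split_into_groups(records, operator.attrgetter("title"))

columns_by_title = {
    title: Columns._make(extract_columns(group, wanted_columns))
    for title, group in groups.items()
}
```

After this, columns_by_title["house"].Location holds the locations of the "house" rows in Date order, and columns_by_title["house"].ST the matching states.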