[prueba]
> The data often contains objects with attributes instead of tuples, and
> I expect the new namedtuple datatype to be used also as elements of
> the list to be processed.
>
> But I haven't found a nice generalized way for that kind of pattern
> that aggregates from a list of one datatype to a list of key plus
> output datatype that would make it practical and suitable for
> inclusion in the standard library.

Looks like you've searched the possibilities thoroughly and no single
aggregation function seems to meet all needs. That's usually a cue to
not try to build one and instead let simple Python loops do the work
for you (that also saves the awkward itemgetter() calls in your
examples).

To my eyes, all three examples look like straightforward,
easy-to-write, easy-to-read, fast plain Python:

    >>> from collections import defaultdict

    >>> d = defaultdict(int)
    >>> for color, n, info in data:
    ...     d[color] += n
    >>> d.items()
    [('blue', 6), ('yellow', 3), ('red', 4)]

    >>> d = defaultdict(list)
    >>> for color, n, info in data:
    ...     d[color].append(n)
    >>> d.items()
    [('blue', [5, 1]), ('yellow', [3]), ('red', [2, 2])]

    >>> d = defaultdict(set)
    >>> for color, n, info in data:
    ...     d[color].add(n)
    >>> d.items()
    [('blue', set([1, 5])), ('yellow', set([3])), ('red', set([2]))]

I don't think you can readily combine all three examples into a single
aggregator without the obfuscation and awkwardness that comes from
parameterizing all of the varying parts:

    import operator
    from collections import defaultdict
    from operator import itemgetter

    def aggregator(default_factory, adder, iterable, keyfunc, valuefunc):
        d = defaultdict(default_factory)
        for record in iterable:
            key = keyfunc(record)
            value = valuefunc(record)
            # adder must mutate d[key] in place; its return value is ignored
            adder(d[key], value)
        return d.items()

    >>> aggregator(list, list.append, data, itemgetter(0), itemgetter(1))
    [('blue', [5, 1]), ('yellow', [3]), ('red', [2, 2])]
    >>> aggregator(set, set.add, data, itemgetter(0), itemgetter(1))
    [('blue', set([1, 5])), ('yellow', set([3])), ('red', set([2]))]

Yuck! Plain Python wins.

Raymond

P.S. The aggregator doesn't work so well for:

    >>> aggregator(int, operator.iadd, data, itemgetter(0), itemgetter(1))
    [('blue', 0), ('yellow', 0), ('red', 0)]

The problem is that operator.iadd() doesn't have a way to both retrieve
and store back into a dictionary. Ints are immutable, so iadd() returns
a new value, and the aggregator throws that value away instead of
assigning it to d[key].

--
http://mail.python.org/mailman/listinfo/python-list
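
On the P.S.: one possible way around the iadd() problem is to have the
aggregator store the result back itself, so the combining function only
needs to return the new value. The sketch below is an illustration, not
something from the original thread. The name aggregate(), its combine
parameter, and the sample data (including the info strings) are all
assumptions made up here; the data was simply chosen to reproduce the
totals shown above.

```python
from collections import defaultdict
from operator import add, itemgetter

# Hypothetical sample records (color, n, info); the values are assumed,
# picked only so the results match the outputs quoted in the post.
data = [
    ("blue", 5, "a"),
    ("yellow", 3, "b"),
    ("red", 2, "c"),
    ("blue", 1, "d"),
    ("red", 2, "e"),
]

def aggregate(default_factory, combine, iterable, keyfunc, valuefunc):
    """Group records by key; combine(old, value) must RETURN the new value."""
    d = defaultdict(default_factory)
    for record in iterable:
        key = keyfunc(record)
        # Assign the result back, so immutable accumulators (int) work too.
        d[key] = combine(d[key], valuefunc(record))
    return dict(d)

totals = aggregate(int, add, data, itemgetter(0), itemgetter(1))
lists = aggregate(list, lambda acc, v: acc + [v], data,
                  itemgetter(0), itemgetter(1))
print(totals)
print(lists)
```

This handles the int case, but the list and set cases now need a
wrapper that returns the accumulator (list.append and set.add return
None), so the awkwardness moves rather than disappears. That arguably
supports the point that the plain loops are easier to read.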