On Mar 9, 6:55 pm, Raymond Hettinger <pyt...@rcn.com> wrote:
> [prueba]
>
> > The data often contains objects with attributes instead of tuples, and
> > I expect the new namedtuple datatype to be used also as elements of
> > the list to be processed.
>
> > But I haven't found a nice generalized way for that kind of pattern
> > that aggregates from a list of one datatype to a list of key plus
> > output datatype that would make it practical and suitable for
> > inclusion in the standard library.
>
> Looks like you've searched the possibilities thoroughly and no one
> aggregation function seems to meet all needs. That's usually a cue
> to not try to build one and instead let simple python loops do the
> work for you (that also saves the awkward itemgetter() calls in your
> examples). To my eyes, all three examples look like straight-forward,
> easy-to-write, easy-to-read, fast plain python:
>
> >>> d = defaultdict(int)
> >>> for color, n, info in data:
> ...     d[color] += n
> >>> d.items()
> [('blue', 6), ('yellow', 3), ('red', 4)]
>
> >>> d = defaultdict(list)
> >>> for color, n, info in data:
> ...     d[color].append(n)
> >>> d.items()
> [('blue', [5, 1]), ('yellow', [3]), ('red', [2, 2])]
>
> >>> d = defaultdict(set)
> >>> for color, n, info in data:
> ...     d[color].add(n)
> >>> d.items()
> [('blue', set([1, 5])), ('yellow', set([3])), ('red', set([2]))]
>
> I don't think you can readily combine all three examples into a single
> aggregator without the obfuscation and awkwardness that comes from
> parameterizing all of the varying parts:
>
> def aggregator(default_factory, adder, iterable, keyfunc, valuefunc):
>     d = defaultdict(default_factory)
>     for record in iterable:
>         key = keyfunc(record)
>         value = valuefunc(record)
>         adder(d[key], value)
>     return d.items()
>
> >>> aggregator(list, list.append, data, itemgetter(0), itemgetter(1))
> [('blue', [5, 1]), ('yellow', [3]), ('red', [2, 2])]
> >>> aggregator(set, set.add, data, itemgetter(0), itemgetter(1))
> [('blue', set([1, 5])), ('yellow', set([3])), ('red', set([2]))]
>
> Yuck! Plain Python wins.
>
> Raymond
>
> P.S. The aggregator doesn't work so well for:
>
> >>> aggregator(int, operator.iadd, data, itemgetter(0), itemgetter(1))
> [('blue', 0), ('yellow', 0), ('red', 0)]
>
> The problem is that operator.iadd() doesn't have a way to both retrieve
> and store back into a dictionary.

Yes, thinking about this more, one would probably need two code paths,
depending on whether the type returned by default_factory is mutable or
immutable: a mutable accumulator (list, set) can be updated in place,
while an immutable one (int) has to be stored back into the dictionary
after each update.

But you are probably right that the ratio of redundancy to variability
is pretty low for such a function, and the plain written-out for loop is
not too painful. The only redundancy is the creation and manipulation of
the dictionary and the explicit looping.

--
http://mail.python.org/mailman/listinfo/python-list
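
For what it's worth, here is a rough sketch of one way to avoid the two
code paths: have the combining function return the new accumulator and
always store it back into the dictionary. This is only an illustration,
not a proposal for the standard library. The names aggregate, appended
and added are made up for this example, and the data list is invented
so that it matches the outputs quoted above.

```python
from collections import defaultdict
from operator import add, itemgetter


def aggregate(iterable, keyfunc, valuefunc, default_factory, combine):
    """Group records by key and fold their values into one accumulator per key.

    combine(acc, value) must RETURN the new accumulator. The result is
    always stored back with d[key] = ..., so immutable accumulators such
    as int are handled the same way as mutable ones such as list or set.
    """
    d = defaultdict(default_factory)
    for record in iterable:
        key = keyfunc(record)
        d[key] = combine(d[key], valuefunc(record))
    return dict(d)


def appended(acc, value):
    # list.append returns None, so wrap it to hand the list back.
    acc.append(value)
    return acc


def added(acc, value):
    # set.add returns None as well, so wrap it the same way.
    acc.add(value)
    return acc


# Hypothetical sample data, chosen to match the quoted outputs.
data = [
    ('red', 2, 'a'),
    ('blue', 5, 'b'),
    ('yellow', 3, 'c'),
    ('blue', 1, 'd'),
    ('red', 2, 'e'),
]

sums = aggregate(data, itemgetter(0), itemgetter(1), int, add)
lists = aggregate(data, itemgetter(0), itemgetter(1), list, appended)
sets = aggregate(data, itemgetter(0), itemgetter(1), set, added)
```

The cost is the two small wrapper functions for list and set, which is
arguably the same awkwardness moved elsewhere, so the plain loop still
looks like the better deal.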