On Thu, 16 Feb 2006 13:11:49 -0800, Guido van Rossum <[EMAIL PROTECTED]> wrote:
>A bunch of Googlers were discussing the best way of doing the >following (a common idiom when maintaining a dict of lists of values >relating to a key, sometimes called a multimap): > > if key not in d: d[key] = [] > d[key].append(value) > >An alternative way to spell this uses setdefault(), but it's not very readable: > > d.setdefault(key, []).append(value) > >and it also suffers from creating an unnecessary list instance. >(Timings were inconclusive; the approaches are within 5-10% of each >other in speed.) > >My conclusion is that setdefault() is a failure -- it was a >well-intentioned construct, but doesn't actually create more readable >code. > >Google has an internal data type called a DefaultDict which gets >passed a default value upon construction. Its __getitem__ method, >instead of raising KeyError, inserts a shallow copy (!) of the given >default value into the dict when the value is not found. So the above >code, after > > d = DefaultDict([]) > >can be written as simply > > d[key].append(value) > Wouldn't it be more generally powerful to pass type or factory function to use to instantiate a default object when a missing key is encountered, e.g. d = DefaultDict(list) then d[key].append(value) but then you can also do d = DefaultDict(dict) d[key].update(a=123) or class Foo(object): pass d = DefaultDict(Foo) d[key].phone = '415-555-1212' etc. No worries about generalizing shallow copying either ;-) >Note that of all the possible semantics for __getitem__ that could >have produced similar results (e.g. not inserting the default in the >underlying dict, or not copying the default value), the chosen >semantics are the only ones that makes this example work. > >Over lunch with Alex Martelli, he proposed that a subclass of dict >with this behavior (but implemented in C) would be a good addition to >the language. It looks like it wouldn't be hard to implement. It could >be a builtin named defaultdict. The first, required, argument to the >constructor should be the default value. Remaining arguments (even >keyword args) are passed unchanged to the dict constructor. > >Some more design subtleties: > >- "key in d" still returns False if the key isn't there >- "d.get(key)" still returns None if the key isn't there >- "d.default" should be a read-only attribute giving the default value > >Feedback? > See above. Regards, Bengt Richter _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com