Re: Why defaultdict?
On 07/02/2010 06:11 AM, Steven D'Aprano wrote:
> I would like to better understand some of the design choices made in collections.defaultdict.
>
> Firstly, to initialise a defaultdict, you do this:
>
>     from collections import defaultdict
>     d = defaultdict(callable, *args)
>
> which sets an attribute of d default_factory which is called on key lookups when the key is missing. If callable is None, defaultdicts are *exactly* equivalent to built-in dicts, so I wonder why the API wasn't added on to dict rather than a separate class that needed to be imported. That is:
>
>     d = dict(*args)
>     d.default_factory = callable

That's just not what dicts, a very simple and elementary data type, do. I know this isn't really a good reason. In addition to what Chris said, I expect this would complicate the dict code a great deal.

> If you failed to explicitly set the dict's default_factory, it would behave precisely as dicts do now. So why create a new class that needs to be imported, rather than just add the functionality to dict?
>
> Is it just an aesthetic choice to support passing the factory function as the first argument? I would think that the advantage of having it built-in would far outweigh the cost of an explicit attribute assignment.

The cost of this feature would be over-complication of the built-in dict type when a subclass would do just as well.

> Second, why is the factory function not called with key? There are three obvious kinds of default values a dict might want, in order of more-to-less general:
>
> (1) The default value depends on the key in some way: return factory(key)

I agree, this is a strange choice. However, nothing's stopping you from being a bit verbose about what you want and just doing it:

    class mydict(defaultdict):
        def __missing__(self, key):
            # ...

The __missing__ method is really the more useful bit the defaultdict class adds, by the looks of it.

-- Thomas

> (2) The default value doesn't depend on the key: return factory()
> (3) The default value is a constant: return C
>
> defaultdict supports (2) and (3):
>
>     defaultdict(factory, *args)
>     defaultdict(lambda: C, *args)
>
> but it doesn't support (1). If key were passed to the factory function, it would be easy to support all three use-cases, at the cost of a slightly more complex factory function. E.g. the current idiom:
>
>     defaultdict(factory, *args)
>
> would become:
>
>     defaultdict(lambda key: factory(), *args)
>
> (There is a zeroth case as well, where the default value depends on the key and what else is in the dict: factory(d, key). But I suspect that's well and truly YAGNI territory.)

--
http://mail.python.org/mailman/listinfo/python-list
Re: Why defaultdict?
On Fri, Jul 2, 2010 at 2:20 AM, Thomas Jollans tho...@jollans.com wrote:
> On 07/02/2010 06:11 AM, Steven D'Aprano wrote:
>> I would like to better understand some of the design choices made in collections.defaultdict.
<snip>
>> Second, why is the factory function not called with key? There are three obvious kinds of default values a dict might want, in order of more-to-less general:
>>
>> (1) The default value depends on the key in some way: return factory(key)
>
> I agree, this is a strange choice. However, nothing's stopping you from being a bit verbose about what you want and just doing it:
>
>     class mydict(defaultdict):
>         def __missing__(self, key):
>             # ...
>
> the __missing__ method is really the more useful bit the defaultdict class adds, by the looks of it.

Nitpick: You only need to subclass dict, not defaultdict, to use __missing__(). See the part of the docs Raymond Hettinger quoted.

Cheers,
Chris
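
For readers following along, here is a minimal sketch of what Chris's nitpick amounts to: a plain dict subclass that passes the missing key to its factory (Steven's case 1). The class name KeyedDefaultDict and its factory attribute are made up for illustration; nothing in the thread or the standard library defines them.

```python
class KeyedDefaultDict(dict):
    """Hypothetical dict subclass whose factory receives the missing key."""

    def __init__(self, factory, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.factory = factory  # illustrative attribute name, not a dict API

    def __missing__(self, key):
        # d[key] calls this only when the key is absent.
        value = self.factory(key)
        self[key] = value  # store the value, as defaultdict does
        return value


squares = KeyedDefaultDict(lambda k: k * k)
print(squares[4])  # 16
print(squares)     # {4: 16}
```

Storing the computed value is a design choice copied from defaultdict; a __missing__ that only returns the value without storing it is equally legal.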
Re: Why defaultdict?
On 07/02/2010 11:26 AM, Chris Rebert wrote:
> On Fri, Jul 2, 2010 at 2:20 AM, Thomas Jollans tho...@jollans.com wrote:
>> On 07/02/2010 06:11 AM, Steven D'Aprano wrote:
>>> I would like to better understand some of the design choices made in collections.defaultdict.
> <snip>
>>> Second, why is the factory function not called with key? There are three obvious kinds of default values a dict might want, in order of more-to-less general:
>>>
>>> (1) The default value depends on the key in some way: return factory(key)
>>
>> I agree, this is a strange choice. However, nothing's stopping you from being a bit verbose about what you want and just doing it:
>>
>>     class mydict(defaultdict):
>>         def __missing__(self, key):
>>             # ...
>>
>> the __missing__ method is really the more useful bit the defaultdict class adds, by the looks of it.
>
> Nitpick: You only need to subclass dict, not defaultdict, to use __missing__(). See the part of the docs Raymond Hettinger quoted.

Sorry Raymond, I didn't see you. This is where I cancel my "filter out google groups users" experiment.
Re: Why defaultdict?
On Fri, 02 Jul 2010 04:11:49 +, Steven D'Aprano wrote:
> I would like to better understand some of the design choices made in collections.defaultdict.
[...]

Thanks to all who replied.

-- Steven
Why defaultdict?
I would like to better understand some of the design choices made in collections.defaultdict.

Firstly, to initialise a defaultdict, you do this:

    from collections import defaultdict
    d = defaultdict(callable, *args)

which sets an attribute of d default_factory which is called on key lookups when the key is missing. If callable is None, defaultdicts are *exactly* equivalent to built-in dicts, so I wonder why the API wasn't added on to dict rather than a separate class that needed to be imported. That is:

    d = dict(*args)
    d.default_factory = callable

If you failed to explicitly set the dict's default_factory, it would behave precisely as dicts do now. So why create a new class that needs to be imported, rather than just add the functionality to dict?

Is it just an aesthetic choice to support passing the factory function as the first argument? I would think that the advantage of having it built-in would far outweigh the cost of an explicit attribute assignment.

Second, why is the factory function not called with key? There are three obvious kinds of default values a dict might want, in order of more-to-less general:

(1) The default value depends on the key in some way: return factory(key)
(2) The default value doesn't depend on the key: return factory()
(3) The default value is a constant: return C

defaultdict supports (2) and (3):

    defaultdict(factory, *args)
    defaultdict(lambda: C, *args)

but it doesn't support (1). If key were passed to the factory function, it would be easy to support all three use-cases, at the cost of a slightly more complex factory function. E.g. the current idiom:

    defaultdict(factory, *args)

would become:

    defaultdict(lambda key: factory(), *args)

(There is a zeroth case as well, where the default value depends on the key and what else is in the dict: factory(d, key). But I suspect that's well and truly YAGNI territory.)

Thanks in advance,

-- Steven
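
To make cases (2) and (3) in the post above concrete, here is a small sketch using the real collections.defaultdict. The variable names (groups, consts, plain, raised) and the constant 42 are arbitrary choices for illustration.

```python
from collections import defaultdict

# Case (2): the default does not depend on the key.
groups = defaultdict(list)
groups["a"].append(1)          # list() is called for the missing key "a"

# Case (3): a constant default, wrapped in a zero-argument lambda.
C = 42
consts = defaultdict(lambda: C)
value = consts["missing"]      # the factory result is returned and stored

# With default_factory set to None, a missing key raises KeyError,
# just as it would for a plain dict.
plain = defaultdict(None, {"x": 1})
try:
    plain["y"]
except KeyError:
    raised = True
else:
    raised = False

print(groups, value, raised)
```

Note that in both working cases the lookup also inserts the key, which is why "missing" ends up in consts after a mere read.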
Re: Why defaultdict?
On Thu, Jul 1, 2010 at 9:11 PM, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:
> I would like to better understand some of the design choices made in collections.defaultdict.

Perhaps python-dev should've been CC-ed...

> Firstly, to initialise a defaultdict, you do this:
>
>     from collections import defaultdict
>     d = defaultdict(callable, *args)
>
> which sets an attribute of d default_factory which is called on key lookups when the key is missing. If callable is None, defaultdicts are *exactly* equivalent to built-in dicts, so I wonder why the API wasn't added on to dict rather than a separate class that needed to be imported. That is:
>
>     d = dict(*args)
>     d.default_factory = callable
>
> If you failed to explicitly set the dict's default_factory, it would behave precisely as dicts do now. So why create a new class that needs to be imported, rather than just add the functionality to dict?

Don't know personally, but here's one thought: If it was done that way, passing around a dict could result in it getting a default_factory set where there wasn't one before, which could lead to strange results if you weren't anticipating that. The defaultdict solution avoids this.

<snip>
> Second, why is the factory function not called with key?

Agree, I've never understood this. Ruby's Hash::new does it better (http://ruby-doc.org/core/classes/Hash.html), and even supports your case 0; it calls the equivalent of default_factory(d, key) when generating a default value.

> There are three obvious kinds of default values a dict might want, in order of more-to-less general:
>
> (1) The default value depends on the key in some way: return factory(key)
> (2) The default value doesn't depend on the key: return factory()
> (3) The default value is a constant: return C
>
> defaultdict supports (2) and (3):
>
>     defaultdict(factory, *args)
>     defaultdict(lambda: C, *args)
>
> but it doesn't support (1). If key were passed to the factory function, it would be easy to support all three use-cases, at the cost of a slightly more complex factory function.
<snip>
> (There is a zeroth case as well, where the default value depends on the key and what else is in the dict: factory(d, key). But I suspect that's well and truly YAGNI territory.)

Cheers,
Chris
--
http://blog.rebertia.com
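
As a rough illustration of the "case 0" behaviour Chris describes, the following sketch calls the factory with both the dict and the key. The class ContextDefaultDict and its factory attribute are hypothetical names invented here, and storing the result automatically is an assumption of this sketch rather than a claim about how Ruby behaves.

```python
class ContextDefaultDict(dict):
    """Hypothetical dict subclass: the factory is called as factory(d, key)."""

    def __init__(self, factory, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.factory = factory  # illustrative attribute name

    def __missing__(self, key):
        # The factory sees the dict itself, so the default can depend
        # on the key and on whatever is already stored.
        value = self.factory(self, key)
        self[key] = value
        return value


# Each new key is labelled with how many entries existed before it.
labels = ContextDefaultDict(lambda d, key: "%s#%d" % (key, len(d)))
print(labels["a"])  # a#0
print(labels["b"])  # b#1
```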
Re: Why defaultdict?
On Jul 1, 9:11 pm, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:
> I would like to better understand some of the design choices made in collections.defaultdict.
. . .
> If callable is None, defaultdicts are *exactly* equivalent to built-in dicts, so I wonder why the API wasn't added on to dict rather than a separate class that needed to be imported.
. . .
> Second, why is the factory function not called with key? There are three obvious kinds of default values a dict might want, in order of more-to-less general:
>
> (1) The default value depends on the key in some way: return factory(key)
> (2) The default value doesn't depend on the key: return factory()
> (3) The default value is a constant: return C

The __missing__() magic method lets you provide a factory with a key. That method is supported by dict subclasses, making it easy to create almost any desired behavior. A defaultdict is an example. It is a dict subclass that calls a zero argument factory function. But with __missing__() you can roll your own dict subclass to meet your other needs. A defaultdict was provided to meet one commonly requested set of use cases (mostly ones using int() and list() as factory functions).

From the docs at http://docs.python.org/library/stdtypes.html#mapping-types-dict :

'''New in version 2.5: If a subclass of dict defines a method __missing__(), if the key key is not present, the d[key] operation calls that method with the key key as argument. The d[key] operation then returns or raises whatever is returned or raised by the __missing__(key) call if the key is not present. No other operations or methods invoke __missing__(). If __missing__() is not defined, KeyError is raised. __missing__() must be a method; it cannot be an instance variable. For an example, see collections.defaultdict.'''

Raymond
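
The docs passage Raymond quotes says that only d[key] invokes __missing__(). A short sketch of what that means in practice, using a made-up subclass named Upper whose __missing__ returns a value without storing it:

```python
class Upper(dict):
    """Hypothetical subclass: missing keys map to their upper-cased form."""

    def __missing__(self, key):
        # Returned to the caller of d[key], but deliberately not stored.
        return key.upper()


d = Upper()
looked_up = d["a"]      # __missing__ runs: "A"
via_get = d.get("a")    # get() does not call __missing__: None
present = "a" in d      # membership test is unaffected: False

print(looked_up, via_get, present)  # A None False
```

This is why code that relies on a default value has to use subscripting rather than get() or a membership test.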