I'm concerned that the on_missing() part of the
proposal is gratuitous. The main use cases for defaultdict have a simple
factory that supplies a zero, empty list, or empty set. The on_missing()
hook is only there to support the rarer case of needing a key to
compute a default value. The hook is not needed for the main use
cases.
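For reference, the three main use cases mentioned above can be sketched as follows. This assumes the type is spelled collections.defaultdict, as it is in current Python; the sample data is made up for illustration.

```python
from collections import defaultdict

# Counting: int() supplies 0 for a missing key.
counts = defaultdict(int)
for word in "a b a c a".split():
    counts[word] += 1

# Grouping: list() supplies an empty list for a missing key.
groups = defaultdict(list)
for name in ["apple", "avocado", "banana"]:
    groups[name[0]].append(name)

# Collecting unique items: set() supplies an empty set for a missing key.
seen = defaultdict(set)
for user, page in [("ann", "/home"), ("ann", "/home"), ("bob", "/faq")]:
    seen[user].add(page)
```

None of these factories needs to know which key was missing, which is why the key-aware hook is not involved.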
As it stands, we're adding a method to regular
dicts that cannot be usefully called directly. Essentially, it is a
framework method meant to be overridden in a subclass. So, it only makes
sense in the context of subclassing. In the meantime, we've added an
oddball method to the main dict API, arguably the most important object API in
Python.
To use the hook, you write something like
this:

    class D(dict):
        def on_missing(self, key):
            return somefunc(key)

However, we can already do something like that
without the hook:

    class D(dict):
        def __getitem__(self, key):
            try:
                return dict.__getitem__(self, key)
            except KeyError:
                self[key] = value = somefunc(key)
                return value

The latter form is already possible, doesn't
require modifying a basic API, and is arguably clearer about when it is called
and what it does (the former doesn't explicitly show that the returned value
gets saved in the dictionary).
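A runnable sketch of the hook-free form, for anyone who wants to try it. Here somefunc is a hypothetical stand-in (it just upper-cases the key); any function that computes a default from the key would do.

```python
def somefunc(key):
    # Hypothetical stand-in: computes a default value from the key.
    return key.upper()


class D(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            # Compute the default from the key and store it before returning.
            self[key] = value = somefunc(key)
            return value


d = D()
first = d["spam"]          # missing key: default is computed and stored
stored = dict(d)           # plain-dict snapshot of the contents
fallback = d.get("eggs")   # dict.get() does not go through __getitem__
```

Note that only subscript lookups are affected; d.get("eggs") still returns None and stores nothing.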
Since we can already do the latter form,
we can get some insight into whether the need has ever actually arisen in
real code. I scanned the usual sources (my own code, the standard library,
and my most commonly used third-party libraries) and found no instances of code
like that. The closest approximation was safe_substitute() in
string.Template, where missing keys returned themselves as a default value.
Other than that one case, I found nothing, so I conclude that there isn't
sufficient need to warrant adding a funky method to the API for regular dicts.
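The safe_substitute() behavior referred to above looks like this from the caller's side. This is only a small illustration with a made-up template string, not a look at how string.Template is implemented.

```python
from string import Template

t = Template("$who likes $what")

# safe_substitute() leaves a placeholder in place when its key is missing.
result = t.safe_substitute(who="tim")

# substitute() raises KeyError for the same missing key.
try:
    t.substitute(who="tim")
except KeyError as exc:
    missing = exc.args[0]
```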
I wondered why the safe_substitute() example was unique. I think the answer is that we
normally handle default computations through simple in-line code ("if k in d:
do1() else do2()" or a try/except pair). Overriding on_missing() then is
really only useful when you need to create a type that can be passed to a client
function that was expecting a regular dictionary. So it does come up, but
not much.
Aside: Why on_missing() is an oddball among
dict methods. When teaching dicts to beginners, all the methods are easily
explainable except this one. You don't call this method directly, you only
use it when subclassing, you have to override it to do anything useful, it hooks
KeyError but only when raised by __getitem__ and not other methods,
etc. I'm concerned that even having this method in the regular
dict API will create confusion about when to use dict.get(), when to
use dict.setdefault(), when to catch a KeyError, or when to LBYL. Adding
this one extra choice makes the choice more difficult.
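For comparison, here are the four existing choices named above, side by side. This is a minimal sketch with made-up keys and defaults.

```python
d = {"a": 1}

# dict.get(): return a default without storing it.
v1 = d.get("b", 0)

# dict.setdefault(): return a default and store it under the key.
v2 = d.setdefault("c", [])

# Catch KeyError: try the lookup and handle the failure.
try:
    v3 = d["z"]
except KeyError:
    v3 = -1

# LBYL (look before you leap): test for the key first.
if "a" in d:
    v4 = d["a"]
else:
    v4 = -1
```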
My recommendation: Dump the on_missing()
hook. That leaves the dict API unmolested and allows a more
straightforward implementation/explanation of collections.default_dict or
whatever it ends up being named. The result is delightfully simple and
easy to understand/explain.
Raymond