Re: [Python-Dev] Proposal: defaultdict
d.get(key, [], True).append(value) hmm. are you sure you didn't just reinvent setdefault ? I'm reasonably sure I copied it on purpose, only with a name that isn't 100% misleading as to what it does ;) Heh. From the original Usenet posting that suggested the capability that was added in the form of setdefault: | I suggest a minor change: another optional argument to | get so that | | dict.get(item,default,flag) | | is equivalent to | | if dict.has_key(item): | VALUE IS dict[item] | else: | if flag: dict[item] = default-- This is all that's new | VALUE IS default | | but presumably more efficient. The response was a chorus of people saying Not a bad idea, but that flag sucks. It needs a separate method. :-) -- g ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/19/06, Steve Holden [EMAIL PROTECTED] wrote: You are missing the rationale of the PEP process. The point is *not* documentation. The point of the PEP process is to channel and collect discussion, so that the BDFL can make a decision. The BDFL is not bound at all to the PEP process. To document things, we use (or should use) documentation. One could wish this ideal had been the case for the import extensions defined in PEP 302. (A bit off-topic, but that hit home, so I'll reply...) Agreed, and it's my fault they weren't, to some extent. I did try to find a suitable place, but the import docs are generally fairly scattered, and there wasn't a particularly good place to put the changes. Any suggestions would be gratefully accepted... Paul. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On Sun, Feb 19, 2006, Josiah Carlson wrote: I agree, there is nothing perfect. But at least in all of my use-cases, and the majority of the ones I've seen 'in the wild', my previous post provided an implementation that worked precisely like desired, and precisely like a regular dictionary, except when accessing a non-existant key via: value = dd[key] . __contains__, etc., all work exactly like they do with a non-defaulting dictionary. Iteration via popitem(), pop(key), items(), iteritems(), __iter__, etc., all work the way you would expect them. This is the telling point, IMO. My company makes heavy use of a default dict (actually, it's a default class because using constants as the lookup keys is mostly what we do and the convenience of foo.bar is compelling over foo['bar']). Anyway, our semantics are as Josiah outlines, and I can't see much use case for the alternatives. Those of you arguing something different: do you have a real use case (that you've implemented in real code)? -- Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/ 19. A language that doesn't affect the way you think about programming, is not worth knowing. --Alan Perlis ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/20/06, Aahz [EMAIL PROTECTED] wrote: On Sun, Feb 19, 2006, Josiah Carlson wrote: I agree, there is nothing perfect. But at least in all of my use-cases, and the majority of the ones I've seen 'in the wild', my previous post provided an implementation that worked precisely like desired, and precisely like a regular dictionary, except when accessing a non-existant key via: value = dd[key] . __contains__, etc., all work exactly like they do with a non-defaulting dictionary. Iteration via popitem(), pop(key), items(), iteritems(), __iter__, etc., all work the way you would expect them. This is the telling point, IMO. My company makes heavy use of a default dict (actually, it's a default class because using constants as the lookup keys is mostly what we do and the convenience of foo.bar is compelling over foo['bar']). Anyway, our semantics are as Josiah outlines, and I can't see much use case for the alternatives. Can you say, for the record (since nobody else seems to care), if d.getorset(key, func) would work in your use cases? Those of you arguing something different: do you have a real use case (that you've implemented in real code)? (again, for the record) getorset provides the minimum needed functionality in a clean and intuitive way. Why go for a complicated solution when you simply don't need it? -- Adam Olsen, aka Rhamphoryncus ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Adam Olsen [EMAIL PROTECTED] wrote: Can you say, for the record (since nobody else seems to care), if d.getorset(key, func) would work in your use cases? It doesn't work for the multiset/accumulation case: dd[key] += 1 - Josiah ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On Mon, Feb 20, 2006, Adam Olsen wrote: On 2/20/06, Aahz [EMAIL PROTECTED] wrote: On Sun, Feb 19, 2006, Josiah Carlson wrote: I agree, there is nothing perfect. But at least in all of my use-cases, and the majority of the ones I've seen 'in the wild', my previous post provided an implementation that worked precisely like desired, and precisely like a regular dictionary, except when accessing a non-existant key via: value = dd[key] . __contains__, etc., all work exactly like they do with a non-defaulting dictionary. Iteration via popitem(), pop(key), items(), iteritems(), __iter__, etc., all work the way you would expect them. This is the telling point, IMO. My company makes heavy use of a default dict (actually, it's a default class because using constants as the lookup keys is mostly what we do and the convenience of foo.bar is compelling over foo['bar']). Anyway, our semantics are as Josiah outlines, and I can't see much use case for the alternatives. Can you say, for the record (since nobody else seems to care), if d.getorset(key, func) would work in your use cases? Because I haven't been reading this thread all that closely, you'll have to remind me what this means. Those of you arguing something different: do you have a real use case (that you've implemented in real code)? (again, for the record) getorset provides the minimum needed functionality in a clean and intuitive way. Why go for a complicated solution when you simply don't need it? Ditto above. -- Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/ 19. A language that doesn't affect the way you think about programming, is not worth knowing. --Alan Perlis ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/20/06, Josiah Carlson [EMAIL PROTECTED] wrote: Adam Olsen [EMAIL PROTECTED] wrote: Can you say, for the record (since nobody else seems to care), if d.getorset(key, func) would work in your use cases? It doesn't work for the multiset/accumulation case: dd[key] += 1 This is actually a fairly powerful argument for a subclass that redefines __getitem__ in favor of a new dict method. (Not to mention that it's much easier to pick a name for the subclass than for the method. :-) See the new thread I started. -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On Feb 20, 2006, at 12:38 PM, Aahz wrote: ... Can you say, for the record (since nobody else seems to care), if d.getorset(key, func) would work in your use cases? Because I haven't been reading this thread all that closely, you'll have to remind me what this means. Roughly the same (save for method/function difference) as: def getorset(d, key, func): if key not in d: d[key] = func() return d[key] Alex ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On Mon, Feb 20, 2006, Alex Martelli wrote: On Feb 20, 2006, at 12:38 PM, Aahz wrote: ... Can you say, for the record (since nobody else seems to care), if d.getorset(key, func) would work in your use cases? Because I haven't been reading this thread all that closely, you'll have to remind me what this means. Roughly the same (save for method/function difference) as: def getorset(d, key, func): if key not in d: d[key] = func() return d[key] That has the problem of looking clumsy, and doubly so for our use case where it's an attribute-based dict. Our style relies on the clean look of code like this: if order.street: ... Even as a dict, that doesn't look horrible: if order['street']: ... OTOH, this starts looking ugly: if order.get('street'): ... And this is just plain bad: if getattr(order, 'street'): ... Note that because we have to deal with *both* the possibility that the attribute/key may not be there *and* that it might be blank -- but both are semantically equivalent for our application -- there's no other clean coding style. Now, I realize this is different from the primary use case for needing mutable values, but any proposed default dict solution that doesn't cleanly support my use case is less interesting to me. -- Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/ 19. A language that doesn't affect the way you think about programming, is not worth knowing. --Alan Perlis ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/20/06, Jim Jewett [EMAIL PROTECTED] wrote: Adam Olsen asked: ... d.getorset(key, func) would work in your use cases? It is an improvement over setdefault, because it doesn't always evaluate the expensive func. (But why should every call have to pass in the function, when it is a property of the dictionary?) Because usually it's a property of how you use it, not a property of the dictionary. The dictionary is just a generic storage mechanism. [snip] In other words, the program would work correctly if I passed in a normal but huge dictionary; I want to avoid that for reasons of efficiency. This isn't the only use for a mapping, but it is the only one I've seen where KeyError is expected by the program's normal flow. Looking at your explanation, I agree, getorset is useless for that use case. However, I'm beginning to think we shouldn't be comparing them. defaultdict is a powerful but heavyweight option, intended for complicated behavior. getorset and setdefault are intended to be very lightweight, even lighter than the try/except KeyError and if key not in X: X[key] = default memes we have right now. getorset's factory function is only appropriate for preexisting functions, not user defined ones. Essentially, I believe getorset should be discussed on its own merits, independent of the addition of a defaultdict class. Perhaps discussion of it (and the deprecation of setdefault) should wait until after defaultdict has been completed? -- Adam Olsen, aka Rhamphoryncus ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On Sat, 2006-02-18 at 12:53 +0100, Pierre Barbier de Reuille wrote: Guido Over lunch with Alex Martelli, he proposed that a subclass of Guido dict with this behavior (but implemented in C) would be a good Guido addition to the language. I agree that .setdefault() is a well-intentioned failure, although I'm much less concerned about any potential performance impact than the fact that it's completely unreadable. And while I like the basic idea, I also agree that deriving from dict is problematic, both because of the constructor signature is tough to forward, but also because dict is such a fundamental type that APIs that return dicts may have to be changed to allow passing in a factory type. I'd rather like to see what Pierre proposes, with a few minor differences. Well, first not ot break the current interface, and second because I think it reads better I would prefer : d = {'a': 1}' d['b'] # raises KeyError d.get('c') # evaluates to None d.default = 42 d['b'] # evaluates to 42 d.get('c') # evaluates to 42 So far so good. And to undo the default, you can simply do : del d.default Although this I'm not crazy about. If you let .default be a callable, you could also write this as def keyerror(): raise KeyError d.default = keyerror or possibly just this as a shortcut: d.default = KeyError The only question in my mind is whether or not getting a non-existent value under the influence of a given default value should stick that value in the dictionary or not. Agreed. I'm not sure whether .get(onearg) should return None or .default. I /think/ I want the latter, but I'd have to play with some real code to know for sure. -Barry signature.asc Description: This is a digitally signed message part ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On Fri, 2006-02-17 at 11:09 -0800, Guido van Rossum wrote: Thanks for all the constructive feedback. Here are some responses and a new proposal. - Yes, I'd like to kill setdefault() in 3.0 if not sooner. A worthy goal, but not possible unless you want to break existing code. I don't think it's worth a DeprecationWarning either. Slating it for removal in 3.0 seems fine. Everything else about your proposal seems great. -Barry signature.asc Description: This is a digitally signed message part ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Adam Olsen [EMAIL PROTECTED] wrote: However, I'm beginning to think we shouldn't be comparing them. defaultdict is a powerful but heavyweight option, intended for complicated behavior. Check out Guido's patch. It's not that heavyweight, and its intended behavior is to make some operations *more* intuitive, if not a bit faster in some cases. Whether or not getorset is introduced, I don't much care, as defaultdict will cover every use case I've been using setdefault for, as well as most of my use cases for get. - Josiah ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/19/06, Josiah Carlson [EMAIL PROTECTED] wrote: My post probably hasn't convinced you, but much of the confusion, I believe, is based on Martin's original belief that 'k in dd' should always return true if there is a default. One can argue that way, but then you end up on the circular train of thought that gets you to you can't do anything useful if that is the case, .popitem() doesn't work, len() is undefined, Keep it simple, keep it sane. A default factory implementation fundamentally modifies the behavior of the mapping. There is no single answer to the question what is the right behavior for contains, len, popitem as that depends on what the code that consumes the mapping is written like, what it is attempting to do, and what you are attempting to override it to do. Or, simply, on why you are providing a default value. Resisting the temptation to guess the why and just leaving the methods as is seems the best choice; overriding __contains__ to return true is much easier than reversing that behavior would be. An example when it could theoretically be used, if not particularly useful. The gettext.install() function was just updated to take a names parameter which controls which gettext accessor functions it adds to the builtin namespace. Its implementation looks for method in names to decide. Passing a default-true dict would allow the future behavior to be bind all checked names, but only if __contains__ returns True. Even though it would make a poor base implementation, and these effects aren't a good candidate for it, the code style that could best leverage such a __contains__ exists. Michael -- Michael Urman http://www.tortall.net/mu/blog ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Michael Urman wrote: On 2/19/06, Josiah Carlson [EMAIL PROTECTED] wrote: My post probably hasn't convinced you, but much of the confusion, I believe, is based on Martin's original belief that 'k in dd' should always return true if there is a default. One can argue that way, but then you end up on the circular train of thought that gets you to you can't do anything useful if that is the case, .popitem() doesn't work, len() is undefined, Keep it simple, keep it sane. A default factory implementation fundamentally modifies the behavior of the mapping. There is no single answer to the question what is the right behavior for contains, len, popitem as that depends on what the code that consumes the mapping is written like, what it is attempting to do, and what you are attempting to override it to do. Or, simply, on why you are providing a default value. Resisting the temptation to guess the why and just leaving the methods as is seems the best choice; overriding __contains__ to return true is much easier than reversing that behavior would be. I agree that there is simply no universally correct answer for the various uses of default_factory. I think ambiguity on points like this is a sign that something is overly general. In many of the concrete cases it is fairly clear how these methods should work. In the most obvious case (default_factory=list) what seems to be to be the correct implementation is one that no one is proposing, that is, x in d means d.get(x). But that uses the fact that the return value of default_factory() is a false value, which we cannot assume in general. And it effects .keys() -- which I would propose overriding for multidict (so it only returns keys with non-empty lists for values), but I don't see how it could be made correct for default_factory. I just don't see why we should cram all these potential features into dict by using a vague feature like default_factory. Why can't we just add a half-dozen new types of collections (to the module of the same name)? Each one will get its own page of documentation, a name, a proper __repr__, and well defined meaning for all of these methods that it shares with dict only insofar as it makes sense to share. Note that even if we use defaultdict or autodict or something besides changing dict itself, we still won't get a good __contains__, a good repr, or any of the other features that specific collection implementations will give us. Isn't there anyone else who sees the various dict-like objects being passed around as recipes, and thinks that maybe that's a sign they should go in the stdlib? The best of those recipes aren't all-encompassing, they just do one kind of container well. -- Ian Bicking | [EMAIL PROTECTED] | http://blog.ianbicking.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Raymond Hettinger [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] [Terry Reedy] One is a 'universal dict' that maps every key to something -- the default if nothing else. That should not have the default ever explicitly entered. Udict.keys() should only give the keys *not* mapped to the universal value. Would you consider it a mapping invariant that k in dd implies k in dd.keys()? Is the notion of __contains__ at odds with notion of universality? No and not sure. I'll leave it to Martin v. Löwis to explain/defend his particular notion of a udict. My main point is that there are multiple legitimate variations of the notion of a default dict, so that there is no 'one right way' to design one. I notice that Michael Urman and Ian Bicking said much the same today. Of course, having said that different variations are useful in different situations, I would nowise claim that all variations can serve as drop-in replacements for regular dicts everywhere they are now used. I think that a new default-dict feature should cater to such variations. Beyond that, I don't know whether it is better to modify dict (with blank hooks) or add a new subclassable default-dict base type. Terry Jan Reedy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Michael Urman [EMAIL PROTECTED] wrote: On 2/19/06, Josiah Carlson [EMAIL PROTECTED] wrote: My post probably hasn't convinced you, but much of the confusion, I believe, is based on Martin's original belief that 'k in dd' should always return true if there is a default. One can argue that way, but then you end up on the circular train of thought that gets you to you can't do anything useful if that is the case, .popitem() doesn't work, len() is undefined, Keep it simple, keep it sane. A default factory implementation fundamentally modifies the behavior of the mapping. There is no single answer to the question what is the right behavior for contains, len, popitem as that depends on what the code that consumes the mapping is written like, what it is attempting to do, and what you are attempting to override it to do. Or, simply, on why you are providing a default value. Resisting the temptation to guess the why and just leaving the methods as is seems the best choice; overriding __contains__ to return true is much easier than reversing that behavior would be. I agree, there is nothing perfect. But at least in all of my use-cases, and the majority of the ones I've seen 'in the wild', my previous post provided an implementation that worked precisely like desired, and precisely like a regular dictionary, except when accessing a non-existant key via: value = dd[key] . __contains__, etc., all work exactly like they do with a non-defaulting dictionary. Iteration via popitem(), pop(key), items(), iteritems(), __iter__, etc., all work the way you would expect them. The only nit is that code which iterates like: for key in keys: try: value = dd[key] except KeyError: continue (where 'keys' has nothing to do with dd.keys(), it is merely a listing of keys which are desired at this particular point) However, the following works like it always did: for key in keys: if key not in dd: continue value = dd[key] An example when it could theoretically be used, if not particularly useful. The gettext.install() function was just updated to take a names parameter which controls which gettext accessor functions it adds to the builtin namespace. Its implementation looks for method in names to decide. Passing a default-true dict would allow the future behavior to be bind all checked names, but only if __contains__ returns True. Even though it would make a poor base implementation, and these effects aren't a good candidate for it, the code style that could best leverage such a __contains__ exists. Indeed, there are cases where an always-true __contains__ exists, and the pure-Python implementation I previously posted can be easily modified to offer such a feature. However, because there are also use cases for the not-always-true __contains__, picking either as the one true way seems a bit unnecessary. Presumably, if one goes into the collections module, the other will too. Actually, they could share all of their code except for a simple flag which determines the always-true __contains__. With minor work, that 'flag', or really the single bit it would require, may even be embeddable into the type object. Arguably, there should be a handful of these defaulting dictionary-like objects, and for each variant, it should be documented what their use-cases are, and any gotcha's that will inevitably come up. - Josiah ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Adam Olsen wrote: Demo/metaclass/Meta.py:55 That wouldn't break. If you had actually read the code, you would have seen it is try: ga = dict['__getattr__'] except KeyError: pass How would it break if dict had a default factory? ga would get the __getattr__ value, and everything would be fine. The KeyError is ignored, after all. Demo/tkinter/guido/AttrDialog.py:121 # Subclasses override self.classes Hmm try: cl = self.classes[c] except KeyError: cl = 'unknown' So cl wouldn't be 'unknown'. Why would that be a problem? Lib/ConfigParser.py:623 try: v = map[var] except KeyError: raise InterpolationMissingOptionError( option, section, rest, var) So there is no InterpolationMissingOptionError. *Of course not*. The whole point would be to provide a value for all interpolation variables. Lib/random.py:315 This entire functions samples k elements with indices between 0 and len(population). Now, people shouldn't be passing dictionaries in in the first place; that specific code tests whether there valid values at indices 0, n//2, and n. If the dictionary isn't really a sequence (i.e. if it doesn't provide values at all indices), the function may later fail even if it passes that test. With a default-valued dictionary, the function would not fail, but a large number of samples might be the default value. Lib/string.py:191 Same like ConfigParser: the intperpolation will always succeed, interpolating all values (rather than leaving $identifier in the string). That would be precisely the expected behaviour. Lib/weakref.py:56 # Currently uses UserDict but I assume it will switch to dict eventually Or, rather, UserDict might grow the on_missing feature as well. That is irrelevant for this issue, though: o = self.data[key]() if o is None: raise KeyError, key # line 56 else: return o So we are looking for lookup failures in self.data, here: self.dict is initialized to {} in UserDict, with no default factory. So there cannot be a change in behaviour. Perhaps the KeyError shouldn't ever get triggered in this case, I'm not sure. I think that's besides the point though. The programmer clearly expected it would. No. I now see your problem: An except KeyError does *not* mean that the programmer clearly expects it will raise an KeyError. Instead, the programmer expects it *might* raise a KeyError, and tries to deal with this situation. If the situation doesn't arise, the code continue just fine. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Guido van Rossum wrote: On 2/17/06, Ian Bicking [EMAIL PROTECTED] wrote: Guido van Rossum wrote: d = {} d.default_factory = set ... d[key].add(value) Another option would be: d = {} d.default_factory = set d.get_default(key).add(value) Unlike .setdefault, this would use a factory associated with the dictionary, and no default value would get passed in. Unlike the proposal, this would not override __getitem__ (not overriding __getitem__ is really the only difference with the proposal). It would be clear reading the code that you were not implicitly asserting they key in d was true. get_default isn't the best name, but another name isn't jumping out at me at the moment. Of course, it is not a Pythonic argument to say that an existing method should be overridden, or functionality made nameless simply because we can't think of a name (looking to anonymous functions of course ;) I'm torn. While trying to implement this I came across some ugliness in PyDict_GetItem() -- it would make sense if this also called on_missing(), but it must return a value without incrementing its refcount, and isn't supposed to raise exceptions -- so what to do if on_missing() returns a value that's not inserted in the dict? If the __getattr__()-like operation that supplies and inserts a dynamic default was a separate method, we wouldn't have this problem. OTOH most reviewers here seem to appreciate on_missing() as a way to do various other ways of alterning a dict's __getitem__() behavior behind a caller's back -- perhaps it could even be (ab)used to implement case-insensitive lookup. I don't like the fact that on_missing()/default_factory can change the behaviour of __getitem__, which upto now has been something simple and understandable. Why don't we put the on_missing()/default_factory functionality into get() instead? d.get(key, default) does what it did before. d.get(key) invokes on_missing() (and dict would have default_factory == type(None)) Bye, Walter Dörwald ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Guido van Rossum [EMAIL PROTECTED] writes: I'm torn. While trying to implement this I came across some ugliness in PyDict_GetItem() -- it would make sense if this also called on_missing(), but it must return a value without incrementing its refcount, and isn't supposed to raise exceptions This last bit has been a painful lie for quite some time. I don't know what can be done about it, though -- avoid the use of PyDict_GetItem() in situations where you don't expect string only dicts (so using it on globals and instance dicts would still be ok)? -- so what to do if on_missing() returns a value that's not inserted in the dict? Well, like some others I am a bit uncomfortable with changing the semantics of such an important operation on such an important data structure. But then I'm also not that unhappy with setdefault, so I must be weird. If the __getattr__()-like operation that supplies and inserts a dynamic default was a separate method, we wouldn't have this problem. Yes. OTOH most reviewers here seem to appreciate on_missing() as a way to do various other ways of alterning a dict's __getitem__() behavior behind a caller's back -- perhaps it could even be (ab)used to implement case-insensitive lookup. Well, I'm not sure I do. There seems to be quite a conceptual difference between being able to make a new kind of dictionary and mess with the behaviour of one that exists already, but I don't know if that matters in practice (the fact that you can currently do things like import sys; sys.__dict__.clear() doesn't seem to cause real problems). Finally, I'll just note that subclassing to modify the behaviour of a builtin type has generally been actively discouraged in python so far. If all dictionary lookups went through a method that you could override in Python (i.e. subclasses could replace ma_lookup, in effect) this would be easy to do in Python code. But they don't, and bug reports suggesting that they do have been rejected in the past (and I agree with the rejection, fwiw). So that rambled a bit. But in essence: I'd much prefer much prefer an addtion of a method or a type than modifictaion of existing behaviour. Cheers, mwh -- If you're talking useful, I'm not your bot. -- Tim Peters, 08 Nov 2001 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Quoting [EMAIL PROTECTED]: Guido Over lunch with Alex Martelli, he proposed that a subclass of Guido dict with this behavior (but implemented in C) would be a good Guido addition to the language. Instead, why not define setdefault() the way it should have been done in the first place? When you create a dict it has the current behavior. If you then call its setdefault() method that becomes the default value for missing keys. d = {'a': 1}' d['b'] # raises KeyError d.get('c') # evaluates to None d.setdefault(42) d['b'] # evaluates to 42 d.get('c') # evaluates to 42 For symmetry, setdefault() should probably be undoable: deldefault(), removedefault(), nodefault(), default_free(), whatever. Well, first not ot break the current interface, and second because I think it reads better I would prefer : d = {'a': 1}' d['b'] # raises KeyError d.get('c') # evaluates to None d.default = 42 d['b'] # evaluates to 42 d.get('c') # evaluates to 42 And to undo the default, you can simply do : del d.default And of course, you can get the current value : d.default But then, as proposed many times, I would rather see a function call. Like : d.default = lambda key: 42 The argument of the function is the current key. It would allow things like that : d.default = time_comsuming_operation where time_comsuming_operation get a single argument. The only question in my mind is whether or not getting a non-existent value under the influence of a given default value should stick that value in the dictionary or not. down-with-more-builtins-ly, y'rs, Skip ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/pierre.barbier%40cirad.fr ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On Feb 18, 2006, at 2:33 AM, Martin v. Löwis wrote: I don't understand. In the rationale of PEP 333, it says The rationale for requiring a dictionary is to maximize portability between servers. The alternative would be to define some subset of a dictionary's methods as being the standard and portable interface. That rationale is not endangered: if the environment continues to be a dict exactly, servers continue to be guaranteed what precise set of operations is available on the environment. Yes it is endangered. Well, as you say: you get a KeyError if there is an error with the key. With a default_factory, there isn't normally an error with the key. But there should be. Consider the case of two servers. One which takes all the items out of the dictionary (using items()) and puts them in some other data structure. Then it checks if the Date header has been set. It was not, so it adds it. Consider another similar server which checks if the Date header has been set on the dict passed in by the user. The default_factory then makes one up. Different behavior due to internal implementation details of how the server uses the dict object, which is what the restriction to _exactly_ dict prevents. Consider another server which takes the dict instance and transports it across thread boundaries, from the wsgi-app's thread to the main server thread. Because WSGI specifies that you can only use 'dict', and the server checked that type(obj) == dict, it is guaranteed that using the dict won't run thread-unsafe code. That is now broken, since dict.__getitem__ can now invoke arbitrary user code. That is a major change. James ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
James Y Knight wrote: But there should be. Consider the case of two servers. One which takes all the items out of the dictionary (using items()) and puts them in some other data structure. Then it checks if the Date header has been set. It was not, so it adds it. Consider another similar server which checks if the Date header has been set on the dict passed in by the user. The default_factory then makes one up. Different behavior due to internal implementation details of how the server uses the dict object, which is what the restriction to _exactly_ dict prevents. Right. I would claim that this is an artificial example: you can't provide a HTTP_DATE value in a default_factory implementation, since you don't know what the key is. However, you are now making up a different rationale from the one the PEP specifies: The PEP says that you need an exact dict so that everybody knows precisely how the dictionary behaves; instead of having to define which precise subset of the dict API is to be used. *That* goal is still achieved: everybody knows that the dict might have an on_missing/default_factory implementation. So to find out whether HTTP_DATE has a value (which might be defaulted), you need to invoke d['HTTP_DATE']. Consider another server which takes the dict instance and transports it across thread boundaries, from the wsgi-app's thread to the main server thread. Because WSGI specifies that you can only use 'dict', and the server checked that type(obj) == dict, it is guaranteed that using the dict won't run thread-unsafe code. That is now broken, since dict.__getitem__ can now invoke arbitrary user code. That is a major change. Not at all. dict.__getitem__ could always invoke arbitrary user code, through __hash__. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/18/06, James Y Knight [EMAIL PROTECTED] wrote: On Feb 18, 2006, at 2:33 AM, Martin v. Löwis wrote: Well, as you say: you get a KeyError if there is an error with the key. With a default_factory, there isn't normally an error with the key. But there should be. Consider the case of two servers. One which takes all the items out of the dictionary (using items()) and puts them in some other data structure. Then it checks if the Date header has been set. It was not, so it adds it. Consider another similar server which checks if the Date header has been set on the dict passed in by the user. The default_factory then makes one up. Different behavior due to internal implementation details of how the server uses the dict object, which is what the restriction to _exactly_ dict prevents. It just occured to me, what affect does this have on repr? Does it attempt to store the default_factory in the representation, or does it remove it? Is it even possible to store a reference to a builtin such as list and have eval restore it? -- Adam Olsen, aka Rhamphoryncus ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Guido van Rossum [EMAIL PROTECTED] writes: If the __getattr__()-like operation that supplies and inserts a dynamic default was a separate method, we wouldn't have this problem. Why implement it in the dictionary type at all? If, for intance, the default value functionality were provided as a decorator, it could be used with all kinds of mappings. I.e. you could have something along these lines: class defaultwrapper(object): def __init__(self, base, factory): self.__base = base self.__factory = factory def __getitem__(self, key): try: return self.__base[key] except KeyError: value = self.__factory() self.__base[key] = value return value def __getattr__(self, attr): return getattr(self.__base, attr) def test(): dd = defaultwrapper({}, list) dd[abc].append(1) dd[abc].append(2) dd[def].append(1) assert sorted(dd.keys()) == [abc, def] assert sorted(dd.values()) == [[1], [1, 2]] assert sorted(dd.items()) == [(abc, [1, 2]), (def, [1])] assert dd.has_key(abc) assert not dd.has_key(xyz) The precise semantics would have to be determined yet, of course. OTOH most reviewers here seem to appreciate on_missing() as a way to do various other ways of alterning a dict's __getitem__() behavior behind a caller's back -- perhaps it could even be (ab)used to implement case-insensitive lookup. case-insensitive lookup could be implemented with another wrapper/decorator. If you need both case-insitivity and a default value, you can easily stack the decorators. Bernhard -- Intevation GmbH http://intevation.de/ Skencil http://skencil.org/ Thuban http://thuban.intevation.org/ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
At 01:44 PM 02/18/2006 -0500, James Y Knight wrote: On Feb 18, 2006, at 2:33 AM, Martin v. Löwis wrote: I don't understand. In the rationale of PEP 333, it says The rationale for requiring a dictionary is to maximize portability between servers. The alternative would be to define some subset of a dictionary's methods as being the standard and portable interface. That rationale is not endangered: if the environment continues to be a dict exactly, servers continue to be guaranteed what precise set of operations is available on the environment. Yes it is endangered. So we'll update the spec to say you can't use a dict that has the default set. It's not reasonable to expect that language changes might not require updates to a PEP. Certainly, we don't have to worry about being backward compatible when it's only Python 2.5 that's affected by the change. :) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Would people perhaps feel better if defaultdict *wasn't* a subclass of dict, but a distinct mapping type of its own? That would make it clearer that it's not meant to be a drop-in replacement for a dict in arbitrary contexts. Greg ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
[Greg Ewing] Would people perhaps feel better if defaultdict *wasn't* a subclass of dict, but a distinct mapping type of its own? That would make it clearer that it's not meant to be a drop-in replacement for a dict in arbitrary contexts. Absolutely. That's the right way to avoid Liskov violations from altered invariants and API changes. Besides, with Python's propensity for duck typing, there's no reason to subclass when we don't have to. Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Bengt Richter wrote: My guess is that realistically default_factory will be used to make clean code for filling a dict, and then turning the factory off if it's to be passed into unknown contexts. This suggests that maybe the autodict behaviour shouldn't be part of the dict itself, but provided by a wrapper around the dict. The you can fill the dict through the wrapper, and still have a normal dict underneath to use for other purposes. Greg ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On Sat, 18 Feb 2006 10:44:15 +0100 (CET), =?iso-8859-1?Q?Walter_D=F6rwald?= [EMAIL PROTECTED] wrote: Guido van Rossum wrote: On 2/17/06, Ian Bicking [EMAIL PROTECTED] wrote: Guido van Rossum wrote: d =3D {} d.default_factory =3D set ... d[key].add(value) Another option would be: d =3D {} d.default_factory =3D set d.get_default(key).add(value) Unlike .setdefault, this would use a factory associated with the diction= ary, and no default value would get passed in. Unlike the proposal, this would not override __getitem__ (not overriding __getitem__ is really the only difference with the proposal). It would = be clear reading the code that you were not implicitly asserting they key in d was true. get_default isn't the best name, but another name isn't jumping out at= me at the moment. Of course, it is not a Pythonic argument to say that an existing method should be overridden, or functio= nality made nameless simply because we can't think of a name (looking to anonymous functions of course ;) I'm torn. While trying to implement this I came across some ugliness in P= yDict_GetItem() -- it would make sense if this also called on_missing(), but it must return a value without incrementing its refcount, and isn't supposed to raise exceptions -- so what to do if on_m= issing() returns a value that's not inserted in the dict? If the __getattr__()-like operation that supplies and inserts a dynamic default was a separate method, we wouldn't have this problem. OTOH most reviewers here seem to appreciate on_missing() as a way to do v= arious other ways of alterning a dict's __getitem__() behavior behind a caller's back -- perhaps it could even be= (ab)used to implement case-insensitive lookup. I don't like the fact that on_missing()/default_factory can change the beha= viour of __getitem__, which upto now has been something simple and understandable. Why don't we put the on_missing()/default_factory functionality into get() = instead? d.get(key, default) does what it did before. d.get(key) invokes on_missing(= ) (and dict would have default_factory =3D=3D type(None)) OTOH, I forgot why it was desirable in the first place to overload d[k] with defaulting logic. E.g., why wouldn't d.defaulting[k] be ok to write when you want the d.default_factory action? on_missing feels more like a tracing hook though, so maybe it could always act either way if defined. Also, for those wanting to avoid lambda:42 as factory, would a callable test cost a lot? Of course then the default_factory name might require revision. Regards, Bengt Richter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Martin v. Löwis wrote: Guido van Rossum wrote: Feedback? I would like this to be part of the standard dictionary type, rather than being a subtype. d.setdefault([]) (one argument) should install a default value, and d.cleardefault() should remove that setting; d.default should be read-only. Alternatively, d.default could be assignable and del-able. The issue with setting the default this way is that a copy would have to be created if the behavior was to differ from the sometimes-confusing default argument behavior for functions. Also, I think has_key/in should return True if there is a default. It certainly seems desirable to see True where d[some_key] doesn't raise an exception, but one could argue either way. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Guido van Rossum wrote: On 2/16/06, Guido van Rossum [EMAIL PROTECTED] wrote: Over lunch with Alex Martelli, he proposed that a subclass of dict with this behavior (but implemented in C) would be a good addition to the language. It looks like it wouldn't be hard to implement. It could be a builtin named defaultdict. The first, required, argument to the constructor should be the default value. Remaining arguments (even keyword args) are passed unchanged to the dict constructor. Thanks for all the constructive feedback. Here are some responses and a new proposal. - Yes, I'd like to kill setdefault() in 3.0 if not sooner. - It would indeed be nice if this was an optional feature of the standard dict type. - I'm ignoring the request for other features (ordering, key transforms). If you want one of these, write a PEP! - Many, many people suggested to use a factory function instead of a default value. This is indeed a much better idea (although slightly more cumbersome for the simplest cases). One might think about calling it if it were callable, otherwise using it literally. Of course this would require jiggery-pokery int eh cases where you actually *wantes* the default value to be a callable (you'd have to provide a callable to return the callable as a default). - Some people seem to think that a subclass constructor signature must match the base class constructor signature. That's not so. The subclass constructor must just be careful to call the base class constructor with the correct arguments. Think of the subclass constructor as a factory function. True, but then this does get in the way of treating the base dict and its defaulting subtype polymorphically. That might not be a big issue. - There's a fundamental difference between associating the default value with the dict object, and associating it with the call. So proposals to invent a better name/signature for setdefault() don't compete. (As to one specific such proposal, adding an optional bool as the 3rd argument to get(), I believe I've explained enough times in the past that flag-like arguments that always get a constant passed in at the call site are a bad idea and should usually be refactored into two separate methods.) - The inconsistency introduced by __getitem__() returning a value for keys while get(), __contains__(), and keys() etc. don't show it, cannot be resolved usefully. You'll just have to live with it. Modifying get() to do the same thing as __getitem__() doesn't seem useful -- it just takes away a potentially useful operation. So here's a new proposal. Let's add a generic missing-key handling method to the dict class, as well as a default_factory slot initialized to None. The implementation is like this (but in C): def on_missing(self, key): if self.default_factory is not None: value = self.default_factory() self[key] = value return value raise KeyError(key) When __getitem__() (and *only* __getitem__()) finds that the requested key is not present in the dict, it calls self.on_missing(key) and returns whatever it returns -- or raises whatever it raises. __getitem__() doesn't need to raise KeyError any more, that's done by on_missing(). The on_missing() method can be overridden to implement any semantics you want when the key isn't found: return a value without inserting it, insert a value without copying it, only do it for certain key types/values, make the default incorporate the key, etc. But the default implementation is designed so that we can write d = {} d.default_factory = list to create a dict that inserts a new list whenever a key is not found in __getitem__(), which is most useful in the original use case: implementing a multiset so that one can write d[key].append(value) to add a new key/value to the multiset without having to handle the case separately where the key isn't in the dict yet. This also works for sets instead of lists: d = {} d.default_factory = set ... d[key].add(value) This seems like a very good compromise. [non-functional alternatives ...] regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Greg Ewing [EMAIL PROTECTED] wrote: Bengt Richter wrote: My guess is that realistically default_factory will be used to make clean code for filling a dict, and then turning the factory off if it's to be passed into unknown contexts. This suggests that maybe the autodict behaviour shouldn't be part of the dict itself, but provided by a wrapper around the dict. The you can fill the dict through the wrapper, and still have a normal dict underneath to use for other purposes. I prefer this to changing dictionaries directly. The actual wrapper could sit in the collections module, ready for subclassing/replacement of the on_missing method. - Josiah ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Also, I think has_key/in should return True if there is a default. It certainly seems desirable to see True where d[some_key] doesn't raise an exception, but one could argue either way. Some things can be agreed by everyone: * if __contains__ always returns True, then it is a useless feature (since scripts containing a line such as if k in dd can always eliminate that line without affecting the algorithm). * if defaultdicts are supposed to be drop-in dict substitutes, then having __contains__ always return True will violate basic dict invariants: del d[some_key] assert some_key not in d Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Martin v. Löwis wrote: Adam Olsen wrote: Still -1. It's better, but it violates the principle of encapsulation by mixing how-you-use-it state with what-it-stores state. In doing that it has the potential to break an API documented as accepting a dict. Code that expects d[key] to raise an exception (and catches the resulting KeyError) will now silently succeed. Of course it will, and without quotes. That's the whole point. I believe that necessitates a PEP to document it. You are missing the rationale of the PEP process. The point is *not* documentation. The point of the PEP process is to channel and collect discussion, so that the BDFL can make a decision. The BDFL is not bound at all to the PEP process. To document things, we use (or should use) documentation. One could wish this ideal had been the case for the import extensions defined in PEP 302. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Quoting [EMAIL PROTECTED]: The only question in my mind is whether or not getting a non-existent value under the influence of a given default value should stick that value in the dictionary or not. It seems to me that there are at least two types of default dicts, which have opposite answers to that question. One is a 'universal dict' that maps every key to something -- the default if nothing else. That should not have the default ever explicitly entered. Udict.keys() should only give the keys *not* mapped to the universal value. Another is the accumlator dict. The default value is the identity (0, [], or whatever) for the type of accumulation. An adict must have the identity added, even though that null will usually be immedially incremented by +=1 or .append(ob) or whatever. Guido's last proposal was for the default default_dict to cater to the second type (and others needing the same behavior) while catering to the first by making the default fill-in method over-rideable. It we go with, for instance, wrappers in the collections module instead of modification of dict, then perhaps there should be at least two wrappers included, with each of these two behaviors. Terry Jan Reedy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Raymond Hettinger wrote: Also, I think has_key/in should return True if there is a default. * if __contains__ always returns True, then it is a useless feature (since scripts containing a line such as if k in dd can always eliminate that line without affecting the algorithm). If you mean if __contains__ always returns True for a default dict, then it is a useless feature, I disagree. The code using if k in dd cannot be eliminated if you don't know that you have a default dict. * if defaultdicts are supposed to be drop-in dict substitutes, then having __contains__ always return True will violate basic dict invariants: del d[some_key] assert some_key not in d If you have a default value, you cannot ultimately del a key. This sequence is *not* a basic mapping invariant. If it was, then it would be also an invariant that, after del d[some_key], d[some_key] will raise a KeyError. This kind of invariant doesn't take into account that there might be a default value. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
[Martin v. Löwis] If you have a default value, you cannot ultimately del a key. This sequence is *not* a basic mapping invariant. You believe that key deletion is not basic to mappings? This kind of invariant doesn't take into account that there might be a default value. Precisely. Therefore, a defaultdict subclass violates the Liskov Substitution Principle. Of course, the __del__ followed __contains__ sequence is not the only invariant that is thrown-off. There are plenty of examples. Here's one that is absolutely basic to the method's contract: k, v = dd.popitem() assert k not in dd Any code that was expecting a dictionary and uses popitem() as a means of looping over and consuming entries will fail. No one should kid themselves that a default dictionary is a drop-in substitute. Much of the dict's API has an ambiguous meaning when applied to defaultdicts. If all keys are in-theory predefined, what is the meaning of len(dd)? Should dd.items() include any entries where the value is equal to the default or should the collection never store those? If the former, then how do you access the entries without looping over the whole contents? If the latter, then do you worry that dd[v]=k does not imply (k,v) in dd.items()? Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Raymond Hettinger wrote: If you have a default value, you cannot ultimately del a key. This sequence is *not* a basic mapping invariant. You believe that key deletion is not basic to mappings? No, not in the sense that the key will go away through deletion. I view a mapping as a modifiable partial function. There is some initial key/value association (in a classic mapping, it is initially empty), and then there are modifications. Key deletion means to reset the key to the initial association. Of course, the __del__ followed __contains__ sequence is not the only invariant that is thrown-off. There are plenty of examples. Here's one that is absolutely basic to the method's contract: k, v = dd.popitem() assert k not in dd Any code that was expecting a dictionary and uses popitem() as a means of looping over and consuming entries will fail. Well, code that loops over a dictionary using popitem typically terminates when the dictionary becomes false (or its length becomes zero). That code wouldn't be affected by the behaviour of in. No one should kid themselves that a default dictionary is a drop-in substitute. Much of the dict's API has an ambiguous meaning when applied to defaultdicts. Right. But it is only ambiguous until specified. Of course, in the face of ambiguity, refuse the temptation to guess. If all keys are in-theory predefined, what is the meaning of len(dd)? Taking my definition from the beginning of the message, it is the number of keys that have been modified from the initial mapping. Should dd.items() include any entries where the value is equal to the default or should the collection never store those? It should include all modified items, and none of the unmodified ones. Explicitly assigning the default value still makes the entry modified; you need to del it to set it back to unmodified. If the former, then how do you access the entries without looping over the whole contents? Not sure I understand the question. You use d[k] to access an entry. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
[Terry Reedy] One is a 'universal dict' that maps every key to something -- the default if nothing else. That should not have the default ever explicitly entered. Udict.keys() should only give the keys *not* mapped to the universal value. Would you consider it a mapping invariant that k in dd implies k in dd.keys()? Is the notion of __contains__ at odds with notion of universality? Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Raymond Hettinger [EMAIL PROTECTED] wrote: [Martin v. Löwis] This kind of invariant doesn't take into account that there might be a default value. Precisely. Therefore, a defaultdict subclass violates the Liskov Substitution Principle. class defaultdict(dict): def __getitem__(self, key): try: return dict.__getitem__(self, key) except KeyError: return self.on_missing(key) def on_missing(self, key): if not hasattr(self, 'default') or not callable(self.default): raise KeyError, key r = self[key] = self.default() return r In my opinion, the above implementation as a subclass does the right thing in regards to __del__, __contains__, get, pop, popitem, __len__, has_key, and anything else I can think of. Does it violate the Liskov Substitution Principle? Yes, but only if user code relies on dd[key] raising a KeyError on a lack of a key. This can be easily remedied by removing the default when it is unneeded, at which point, you get your Liskov Substitution. Of course, the __del__ followed __contains__ sequence is not the only invariant that is thrown-off. There are plenty of examples. Here's one that is absolutely basic to the method's contract: k, v = dd.popitem() assert k not in dd Any code that was expecting a dictionary and uses popitem() as a means of looping over and consuming entries will fail. a = defaultdict() a.default = list a['hello'] [] k, v = a.popitem() assert k not in a Seems to work for the above implementation. No one should kid themselves that a default dictionary is a drop-in substitute. Much of the dict's API has an ambiguous meaning when applied to defaultdicts. Actually, if one is careful, the dict's API is completely unchanged, except for direct access to the object via b = a[i]. del a['hello'] Traceback (most recent call last): File stdin, line 1, in ? KeyError: 'hello' 'hello' in a False a.get('hello') a.pop('hello') Traceback (most recent call last): File stdin, line 1, in ? KeyError: 'pop(): dictionary is empty' a.popitem() Traceback (most recent call last): File stdin, line 1, in ? KeyError: 'popitem(): dictionary is empty' len(a) 0 a.has_key('hello') False If all keys are in-theory predefined, what is the meaning of len(dd)? It depends on the sequence of actions. Play around with the above defaultdict implementation. From what I understood of Guido's original post, this is essentially what he was proposing, only implemented in C. Should dd.items() include any entries where the value is equal to the default or should the collection never store those? Yes, it should store any value which was stored via 'dd[k]=v', or any default value created via access by 'v=dd[k]' . If the former, then how do you access the entries without looping over the whole contents? Presumably one is looking for a single kind of default (empty list, 0, etc.) because one wanted to accumulate into them, similar to one of the following... for item, value in input: try: d[item] += value #or d[item].append(value) except KeyError: d[item] = value #or d[item] = [value] which becomes for item in input: dd[item] += 1 #or dd[item].append(value) Once accumulation has occurred, iteration over them via .iteritems(), .items(), .popitem(), etc., would progress exactly the same way as with a regular dictionary. If the code which is using the accumulated data does things like... for key in wanted_keys: try: value = dd[key] except KeyError: continue #do something nontrivial with value rather than... for key in wanted_keys: if key not in dd: continue value = dd[key] #do something nontrivial with value Then the user has at least three options to make it 'work right': 1. User can change to using 'in' to iterate rather than relying on a KeyError. 2. User could remember to remove the default. 3. User can create a copy of the default dictionary via dict(dd) and pass it into the code which relies on the non-defaulting dictionary. If the latter, then do you worry that dd[v]=k does not imply (k,v) in dd.items()? I personally wouldn't want the latter. My post probably hasn't convinced you, but much of the confusion, I believe, is based on Martin's original belief that 'k in dd' should always return true if there is a default. One can argue that way, but then you end up on the circular train of thought that gets you to you can't do anything useful if that is the case, .popitem() doesn't work, len() is undefined, Keep it simple, keep it sane. - Josiah ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe:
Re: [Python-Dev] Proposal: defaultdict
Guido van Rossum wrote: d = DefaultDict([]) can be written as simply d[key].append(value) Feedback? Probably a good idea, has been proposed multiple times on clpy. One good thing would be to be able to specify either a default value or a factory function. While at it, other interesting dict subclasses could be: * sorteddict, practically reinvented by every larger project * keytransformdict, such as d = keytransformdict(str.lower). Georg ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Guido van Rossum wrote: A bunch of Googlers were discussing the best way of doing the following (a common idiom when maintaining a dict of lists of values relating to a key, sometimes called a multimap): if key not in d: d[key] = [] d[key].append(value) An alternative way to spell this uses setdefault(), but it's not very readable: d.setdefault(key, []).append(value) and it also suffers from creating an unnecessary list instance. (Timings were inconclusive; the approaches are within 5-10% of each other in speed.) My conclusion is that setdefault() is a failure -- it was a well-intentioned construct, but doesn't actually create more readable code. Google has an internal data type called a DefaultDict which gets passed a default value upon construction. Its __getitem__ method, instead of raising KeyError, inserts a shallow copy (!) of the given default value into the dict when the value is not found. So the above code, after d = DefaultDict([]) can be written as simply d[key].append(value) Using a shallow copy of the default seems a bit too magical to me. How would this be done? Via copy.copy? And passing [] to the constructor of dict has a different meaning already. Fetching the default via a static/class method would solve both problems: class default_dict(dict): def __getitem__(self, key): if key in self: return dict.__getitem__(self, key) else: default = self.getdefault() self[key] = default return default class multi_map(default_dict): @staticmethod def getdefault(self): return [] class counting_dict(default_dict): @staticmethod def getdefault(self): return 0 [...] Bye, Walter Dörwald ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On Thu, 16 Feb 2006 13:11:49 -0800, Guido van Rossum [EMAIL PROTECTED] wrote: A bunch of Googlers were discussing the best way of doing the following (a common idiom when maintaining a dict of lists of values relating to a key, sometimes called a multimap): if key not in d: d[key] = [] d[key].append(value) An alternative way to spell this uses setdefault(), but it's not very readable: d.setdefault(key, []).append(value) and it also suffers from creating an unnecessary list instance. (Timings were inconclusive; the approaches are within 5-10% of each other in speed.) My conclusion is that setdefault() is a failure -- it was a well-intentioned construct, but doesn't actually create more readable code. Google has an internal data type called a DefaultDict which gets passed a default value upon construction. Its __getitem__ method, instead of raising KeyError, inserts a shallow copy (!) of the given default value into the dict when the value is not found. So the above code, after d = DefaultDict([]) can be written as simply d[key].append(value) Wouldn't it be more generally powerful to pass type or factory function to use to instantiate a default object when a missing key is encountered, e.g. d = DefaultDict(list) then d[key].append(value) but then you can also do d = DefaultDict(dict) d[key].update(a=123) or class Foo(object): pass d = DefaultDict(Foo) d[key].phone = '415-555-1212' etc. No worries about generalizing shallow copying either ;-) Note that of all the possible semantics for __getitem__ that could have produced similar results (e.g. not inserting the default in the underlying dict, or not copying the default value), the chosen semantics are the only ones that makes this example work. Over lunch with Alex Martelli, he proposed that a subclass of dict with this behavior (but implemented in C) would be a good addition to the language. It looks like it wouldn't be hard to implement. It could be a builtin named defaultdict. The first, required, argument to the constructor should be the default value. Remaining arguments (even keyword args) are passed unchanged to the dict constructor. Some more design subtleties: - key in d still returns False if the key isn't there - d.get(key) still returns None if the key isn't there - d.default should be a read-only attribute giving the default value Feedback? See above. Regards, Bengt Richter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Thomas Heller wrote: Probably a good idea, has been proposed multiple times on clpy. One good thing would be to be able to specify either a default value or a factory function. While at it, other interesting dict subclasses could be: * sorteddict, practically reinvented by every larger project You mean ordereddict, not sorteddict, I hope. Well, yes. * keytransformdict, such as d = keytransformdict(str.lower). Not sure what you mean by that. What *I* would like is probably more ambitious: I want a dict that allows case-insensitive lookup of string keys This is exactly what this would do. All keys are transformed to lowercase when setting and looking up. plus ideally I want to use it as class or instance dictionary. Use case: COM wrappers. regards, Georg ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/17/06, Raymond Hettinger [EMAIL PROTECTED] wrote: Over lunch with Alex Martelli, he proposed that a subclass of dict with this behavior (but implemented in C) would be a good addition to the language I would like to add something like this to the collections module, +1 but a PEP is probably needed to deal with issues like: +0 (You're probably right, but I fear there's no perfect answer, so discussions could go round in circles...) Paul. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Martin v. Lwis wrote: Guido van Rossum wrote: Feedback? I would like this to be part of the standard dictionary type, rather than being a subtype. d.setdefault([]) (one argument) should install a default value, and d.cleardefault() should remove that setting; d.default should be read-only. Alternatively, d.default could be assignable and del-able. Also, I think has_key/in should return True if there is a default. And exactly what use would it then be ? Michael Foord Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Guido van Rossum wrote: A bunch of Googlers were discussing the best way of doing the following (a common idiom when maintaining a dict of lists of values relating to a key, sometimes called a multimap): if key not in d: d[key] = [] d[key].append(value) /.../ Feedback? +1. check it in, already (as collections.defaultdict, perhaps?) alternatively, you could specialize even further: collections.multimap, which deals with list values only (that shallow copy thing feels a bit questionable, but all alternatives feel slightly overgeneralized...) /F ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On Thu, Feb 16, 2006 at 01:11:49PM -0800, Guido van Rossum wrote: Over lunch with Alex Martelli, he proposed that a subclass of dict with this behavior (but implemented in C) would be a good addition to the language. It looks like it wouldn't be hard to implement. It could be a builtin named defaultdict. The first, required, argument to the constructor should be the default value. Remaining arguments (even keyword args) are passed unchanged to the dict constructor. Should a dict subclass really change the constructor/initializer signature in an incompatible way? -- Thomas Wouters [EMAIL PROTECTED] Hi! I'm a .signature virus! copy me into your .signature file to help me spread! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Martin v. Löwis wrote: Also, I think has_key/in should return True if there is a default. and keys should return all possible key values! /F ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
My conclusion is that setdefault() is a failure -- it was a well-intentioned construct, but doesn't actually create more readable code. It was an across the board failure: naming, clarity, efficiency. Can we agree to slate dict.setdefault() to disappear in Py3.0? Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
At 10:10 AM 02/17/2006 +0100, Georg Brandl wrote: Guido van Rossum wrote: d = DefaultDict([]) can be written as simply d[key].append(value) Feedback? Probably a good idea, has been proposed multiple times on clpy. One good thing would be to be able to specify either a default value or a factory function. +1 on factory function, e.g. DefaultDict(list). A default value isn't very useful, because for immutable defaults, setdefault() works well enough. If what you want is a copy of some starting object, you can always do something like DefaultDict({1:2,3:4}.copy). ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Fredrik Lundh wrote: Raymond Hettinger wrote: I would like to add something like this to the collections module, but a PEP is probably needed to deal with issues like: frankly, now that Guido is working 50% on Python, do we really have to use the full PEP process also for simple things like this? I'd say we let the BDFL roam free. (if he adds something really lousy, it can always be tweaked/removed before the next final release. not every checkin needs to be final...). +1. Georg ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/16/06, Guido van Rossum [EMAIL PROTECTED] wrote: A bunch of Googlers were discussing the best way of doing the following (a common idiom when maintaining a dict of lists of values relating to a key, sometimes called a multimap): if key not in d: d[key] = [] d[key].append(value) An alternative way to spell this uses setdefault(), but it's not very readable: d.setdefault(key, []).append(value) I'd like to see it done passing a factory function (and with a better name): d.getorset(key, list).append(value) The name is slightly odd but it is effective. Plus it avoids creating a new class when a slight tweak to an existing one will do. Over lunch with Alex Martelli, he proposed that a subclass of dict with this behavior (but implemented in C) would be a good addition to the language. It looks like it wouldn't be hard to implement. It could be a builtin named defaultdict. The first, required, argument to the constructor should be the default value. Remaining arguments (even keyword args) are passed unchanged to the dict constructor. -1 (atleast until you can explain why that's better than .getorset()) -- Adam Olsen, aka Rhamphoryncus ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Phillip J. Eby wrote: At 10:10 AM 02/17/2006 +0100, Georg Brandl wrote: Guido van Rossum wrote: d = DefaultDict([]) can be written as simply d[key].append(value) Feedback? Probably a good idea, has been proposed multiple times on clpy. One good thing would be to be able to specify either a default value or a factory function. +1 on factory function, e.g. DefaultDict(list). A default value isn't very useful, because for immutable defaults, setdefault() works well enough. If what you want is a copy of some starting object, you can always do something like DefaultDict({1:2,3:4}.copy). +1 here, too (for permitting a factory function only). This doesn't really limit usage, as you can still supply DefaultDict(partial(copy, x)) or DefaultDict(partial(deepcopy, x)), or (heaven forbid) a lambda expression. . . As others have mentioned, the basic types are all easy, since the typename can be used directly. +1 on supplying that factory function to the constructor, too (the default value is a fundamental part of the defaultdict). That is, I'd prefer: d = defaultdict(func) # The defaultdict is fully defined, but not yet populated d.update(init_values) over: d = defaultdict(init_values) # The defaultdict is partially populated, but not yet fully defined! d.default(func) That is, something that is the same the normal dict except for: def __init__(self, default): self.default = default def __getitem__(self, key): return self.get(key, self.default()) Considering some of Raymond's questions in light of the above * implications of a __getitem__ succeeding while get(value, x) returns x (possibly different from the overall default) * implications of a __getitem__ succeeding while __contains__ would fail These behaviours seem reasonable for a default dictionary - containment is based on whether or not the key actually exists in the dictionary as it currently stands, and the default is really a default default that can be overridden using 'get'. * whether to add this to the collections module (I would say yes) * whether to allow default functions as well as default values (so you could instantiate a new default list) My preference is for factory functions only, to eliminate ambiguity. # bag like behavior dd = collections.default_dict(int) for elem in collection: dd[elem] += 1 # setdefault-like behavior dd = collections.default_dict(list) for page_number, page in enumerate(book): for word in page.split(): dd[word].append(word) -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- http://www.boredomandlaziness.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Adam Olsen wrote: Over lunch with Alex Martelli, he proposed that a subclass of dict with this behavior (but implemented in C) would be a good addition to the language. It looks like it wouldn't be hard to implement. It could be a builtin named defaultdict. The first, required, argument to the constructor should be the default value. Remaining arguments (even keyword args) are passed unchanged to the dict constructor. -1 (atleast until you can explain why that's better than .getorset()) Because the default default is a fundamental characteristic of the default dictionary (meaning it works with normal indexing syntax), whereas getorset makes it a characteristic of the method call. Besides, if there are going to be any method changes on normal dicts, I'd rather see a boolean third argument set to the get method. That is (for a normal dict): def get(self, key, *args): set = False no_default = False if len(args) == 2: default, set = args elif args: default, = args else: no_default = True if key in self: return self[key] if no_default: raise KeyError(repr(key)) if set: self[key] = default return default Using Guido's original example: d.get(key, [], True).append(value) I don't really think this is a replacement for defaultdict, though. Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- http://www.boredomandlaziness.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Nick Coghlan wrote: Using Guido's original example: d.get(key, [], True).append(value) hmm. are you sure you didn't just reinvent setdefault ? /F ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Guido Over lunch with Alex Martelli, he proposed that a subclass of Guido dict with this behavior (but implemented in C) would be a good Guido addition to the language. Instead, why not define setdefault() the way it should have been done in the first place? When you create a dict it has the current behavior. If you then call its setdefault() method that becomes the default value for missing keys. d = {'a': 1}' d['b'] # raises KeyError d.get('c') # evaluates to None d.setdefault(42) d['b'] # evaluates to 42 d.get('c') # evaluates to 42 For symmetry, setdefault() should probably be undoable: deldefault(), removedefault(), nodefault(), default_free(), whatever. The only question in my mind is whether or not getting a non-existent value under the influence of a given default value should stick that value in the dictionary or not. down-with-more-builtins-ly, y'rs, Skip ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Fredrik Lundh wrote: Nick Coghlan wrote: Using Guido's original example: d.get(key, [], True).append(value) hmm. are you sure you didn't just reinvent setdefault ? I'm reasonably sure I copied it on purpose, only with a name that isn't 100% misleading as to what it does ;) I think collections.defaultdict is a better approach, though. Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- http://www.boredomandlaziness.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
[EMAIL PROTECTED] wrote: Guido Over lunch with Alex Martelli, he proposed that a subclass of Guido dict with this behavior (but implemented in C) would be a good Guido addition to the language. Instead, why not define setdefault() the way it should have been done in the first place? When you create a dict it has the current behavior. If you then call its setdefault() method that becomes the default value for missing keys. That puts it off until 3.0. From what I read I think defaultdict won't become builtin anyway. Georg ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On Thu, Feb 16, 2006 at 01:11:49PM -0800, Guido van Rossum wrote: [snip] Google has an internal data type called a DefaultDict which gets passed a default value upon construction. Its __getitem__ method, instead of raising KeyError, inserts a shallow copy (!) of the given default value into the dict when the value is not found. So the above code, after d = DefaultDict([]) can be written as simply d[key].append(value) Note that of all the possible semantics for __getitem__ that could have produced similar results (e.g. not inserting the default in the underlying dict, or not copying the default value), the chosen semantics are the only ones that makes this example work. Having __getitem__ insert the returned default value allows it to work with a larger variety of classes. My own ForgivingDict does not do this and works fine for ints and lists but not much else. fd = ForgivingDict(list) fd[key] += [val] # extends the list and does a __setitem__ The += operator isn't useful for dicts. How can you make a defaultdict with a defaultdict as the default? My head asploded when I tried it with the constructor arg. It does seem possible with the 'd.default = func' syntax # empty defaultdict constructor d = defaultdict() d.default = d tree = defaultdict() tree.default = d.copy -jackdied ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
+1It's about time!- COn 2/16/06, Guido van Rossum [EMAIL PROTECTED] wrote: A bunch of Googlers were discussing the best way of doing thefollowing (a common idiom when maintaining a dict of lists of valuesrelating to a key, sometimes called a multimap):if key not in d: d[key] = [] d[key].append(value)An alternative way to spell this uses setdefault(), but it's not very readable:d.setdefault(key, []).append(value)and it also suffers from creating an unnecessary list instance. (Timings were inconclusive; the approaches are within 5-10% of eachother in speed.)My conclusion is that setdefault() is a failure -- it was awell-intentioned construct, but doesn't actually create more readable code.Google has an internal data type called a DefaultDict which getspassed a default value upon construction. Its __getitem__ method,instead of raising KeyError, inserts a shallow copy (!) of the given default value into the dict when the value is not found. So the abovecode, afterd = DefaultDict([])can be written as simplyd[key].append(value)Note that of all the possible semantics for __getitem__ that could have produced similar results (e.g. not inserting the default in theunderlying dict, or not copying the default value), the chosensemantics are the only ones that makes this example work.Over lunch with Alex Martelli, he proposed that a subclass of dict with this behavior (but implemented in C) would be a good addition tothe language. It looks like it wouldn't be hard to implement. It couldbe a builtin named defaultdict. The first, required, argument to the constructor should be the default value. Remaining arguments (evenkeyword args) are passed unchanged to the dict constructor.Some more design subtleties:- key in d still returns False if the key isn't there - d.get(key) still returns None if the key isn't there- d.default should be a read-only attribute giving the default valueFeedback?Guido van Rossum (home page: http://www.python.org/~guido/)___Python-Dev mailing listPython-Dev@python.org http://mail.python.org/mailman/listinfo/python-devUnsubscribe: http://mail.python.org/mailman/options/python-dev/monpublic%40gmail.com -- A programmer learning programming from Perl is like a chemistry student learning the definition of 'exothermic' with dynamite. - evilpenguin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Fredrik Lundh [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Raymond Hettinger wrote: I would like to add something like this to the collections module, but a PEP is probably needed to deal with issues like: frankly, now that Guido is working 50% on Python, do we really have to use the full PEP process also for simple things like this? I'd say we let the BDFL roam free. PEPs are useful for question-answering purposes even after approval. The design phase can be cut short by simply posting the approved design doc. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Raymond Hettinger wrote: Over lunch with Alex Martelli, he proposed that a subclass of dict with this behavior (but implemented in C) would be a good addition to the language I would like to add something like this to the collections module, but a PEP is probably needed to deal with issues like: * implications of a __getitem__ succeeding while get(value, x) returns x (possibly different from the overall default) * implications of a __getitem__ succeeding while __contains__ would fail * whether to add this to the collections module (I would say yes) * whether to allow default functions as well as default values (so you could instantiate a new default list) * comparing all the existing recipes and third-party modules that have already done this * evaluating its fitness for common use cases (i.e. bags and dict of lists). It doesn't seem that useful for bags, assuming we're talking about an {object: count} implementation of bags; bags should really have a more set-like interface than a dict-like interface. A dict of lists typically means a multi-valued dict. In that case it seems like x[key_not_found] should return the empty list, as that means zero values; even though zero values also means that x.has_key(key_not_found) should return False as well. *but* getting x[key_not_found] does not (for a multi-valued dict) mean that suddently has_key should return true. I find the side-effect nature of __getitem__ as proposed in default_dict to be rather confusing, and when reading code it will very much break my expectations. I assume that attribute access and [] access will not have side effects. Coming at it from that direction, I'm -1, though I'm +1 on dealing with the specific use case that started this (x.setdefault(key, []).append(value)). An implementation targetted specifically at multi-valued dictionaries seems like it would be better. Incidentally, on Web-SIG we've discussed wsgiref, and it includes a mutli-values, ordered, case-insensitive dictionary. Such a dictionary(ish) object has clear applicability for HTTP headers, but certainly it is something I've used many times elsewhere. In a case-sensitive form it applies to URL variables. Really there's several combinations of features, each with different uses. So we have now... dicts: unordered, key:value (associative), single-value sets: unordered, not key:value, single-value lists: ordered, not key:value, multi-value We don't have... bags: unordered, not key:value, multi-value multi-dict: unordered, key:value, multi-value ordered-dict: ordered, key:value, single-value ordered-multi-dict: ordered, key:value, single-value For all key:value collections, normalized keys can be useful. (Though notably the wsgiref Headers object does not have normalized keys, but instead does case-insensitive comparisons.) I don't know where dict-of-dict best fits in here. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Terry Reedy wrote: I'd say we let the BDFL roam free. PEPs are useful for question-answering purposes even after approval. The design phase can be cut short by simply posting the approved design doc. not for trivialities. it'll take Guido more time to write a PEP than to implement the damn thing. is that really a good use of his time ? why is python-dev suddenly full of control freaks ? /F ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Also, I think has_key/in should return True if there is a default. Fredrik and keys should return all possible key values! I think keys() and in should reflect reality. Only when you do something like x = d['nonexistent'] or x = d.get('nonexistent') should the default value come into play. Skip ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/16/06, Guido van Rossum [EMAIL PROTECTED] wrote: Over lunch with Alex Martelli, he proposed that a subclass of dict with this behavior (but implemented in C) would be a good addition to the language. It looks like it wouldn't be hard to implement. It could be a builtin named defaultdict. The first, required, argument to the constructor should be the default value. Remaining arguments (even keyword args) are passed unchanged to the dict constructor. Thanks for all the constructive feedback. Here are some responses and a new proposal. - Yes, I'd like to kill setdefault() in 3.0 if not sooner. - It would indeed be nice if this was an optional feature of the standard dict type. - I'm ignoring the request for other features (ordering, key transforms). If you want one of these, write a PEP! - Many, many people suggested to use a factory function instead of a default value. This is indeed a much better idea (although slightly more cumbersome for the simplest cases). - Some people seem to think that a subclass constructor signature must match the base class constructor signature. That's not so. The subclass constructor must just be careful to call the base class constructor with the correct arguments. Think of the subclass constructor as a factory function. - There's a fundamental difference between associating the default value with the dict object, and associating it with the call. So proposals to invent a better name/signature for setdefault() don't compete. (As to one specific such proposal, adding an optional bool as the 3rd argument to get(), I believe I've explained enough times in the past that flag-like arguments that always get a constant passed in at the call site are a bad idea and should usually be refactored into two separate methods.) - The inconsistency introduced by __getitem__() returning a value for keys while get(), __contains__(), and keys() etc. don't show it, cannot be resolved usefully. You'll just have to live with it. Modifying get() to do the same thing as __getitem__() doesn't seem useful -- it just takes away a potentially useful operation. So here's a new proposal. Let's add a generic missing-key handling method to the dict class, as well as a default_factory slot initialized to None. The implementation is like this (but in C): def on_missing(self, key): if self.default_factory is not None: value = self.default_factory() self[key] = value return value raise KeyError(key) When __getitem__() (and *only* __getitem__()) finds that the requested key is not present in the dict, it calls self.on_missing(key) and returns whatever it returns -- or raises whatever it raises. __getitem__() doesn't need to raise KeyError any more, that's done by on_missing(). The on_missing() method can be overridden to implement any semantics you want when the key isn't found: return a value without inserting it, insert a value without copying it, only do it for certain key types/values, make the default incorporate the key, etc. But the default implementation is designed so that we can write d = {} d.default_factory = list to create a dict that inserts a new list whenever a key is not found in __getitem__(), which is most useful in the original use case: implementing a multiset so that one can write d[key].append(value) to add a new key/value to the multiset without having to handle the case separately where the key isn't in the dict yet. This also works for sets instead of lists: d = {} d.default_factory = set ... d[key].add(value) I went through several iterations to obtain this design; my first version of on_missing() would just raise KeyError(key), requiring you to always provide a subclass; this is more minimalistic but less useful and would probably raise the bar for using the feature to some extent. To saev you attempts to simplify this, here are some near-misses I considered that didn't quite work out: - def on_missing(self, key): if self.default_factory is not None: return self.default_factory() raise KeyError(key) This would require the multiset example to subclass, since default_factory doesn't see the key so it can't insert it. - def on_missing(self, key): if self.default_factory is not None: return self.default_factory(key) raise KeyError(key) This appears to fix that problem, but now you can't write d.default_value = list since (a) list(key) doesn't return an empty list, and (b) it also doesn't insert the key into the dict; attempting to assign a callback function to default_factory that solves these issues fail because the callback doesn't have access to the dict instance (unless there's only one). - Do away with on_missing() and just include its body at the end of __getitem__(), to be invoked when the key isn't found. This is less general in case you want different default semantics (e.g. not inserting the default, or making the default a function of the key) -- you'd have to override __getitem__() for that, which
Re: [Python-Dev] Proposal: defaultdict
On Friday 17 February 2006 14:09, Guido van Rossum wrote: So here's a new proposal. I like the version you came up with. It has sufficient functionality to make it easy to use, and enough flexibility to be useful in more specialized cases. I'm quite certain it would handle all the cases I've actually dealt with where I wanted a variation of a mapping with default values. -Fred -- Fred L. Drake, Jr. fdrake at acm.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/16/06, Guido van Rossum [EMAIL PROTECTED] wrote: A bunch of Googlers were discussing the best way of doing the ... Wow, what a great discussion! As you'll recall, I had also mentioned the callable factory as a live possibility, and there seems to be a strong sentiment in favor of that; not really a weakness case for HOFs, as you feared it might be during the lunchtime discussion. Out of all I've read here, I like the idea of having a collections.autodict (a much nicer name than defaultdict, a better collocation for 2.5 than the builtins). One point I think nobody has made is that whenever reasonably possible the setting of a callback (the callable factory here) should include *a and **k to use when calling back. So, for example: ad = collections.autodict(copy.copy, whatever) would easily cover the use case of Google's DefaultDict (yes, partial would also cover this use case, but having *a and **k is usefully more general). If you're convinced copy.copy is an overwhelmingly popular use case (I'm not, personally), then this specific idiom might also be summarized in a classmethod, a la ad = collections.autodict.by_copy(whatever) This way, all autodicts would start out empty (and be filled by update if needed). An alternative would be to have autodict's ctor have the same signature as dict's, with a separate .set_initial method to pass the factory (and *a, **k) -- this way an autodict might start out populated, but would always start with some default factory, such as lambda:None I guess. I think the first alternative (autodict always starts empty, but with a specifically chosen factory [including *a, **k]) is more useful. Alex ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/17/06, Alex Martelli [EMAIL PROTECTED] wrote: On 2/16/06, Guido van Rossum [EMAIL PROTECTED] wrote: A bunch of Googlers were discussing the best way of doing the ... Wow, what a great discussion! As you'll recall, I had also mentioned the callable factory as a live possibility, and there seems to be a strong sentiment in favor of that; not really a weakness case for HOFs, as you feared it might be during the lunchtime discussion. :-) You seem to have missed my revised proposal. Out of all I've read here, I like the idea of having a collections.autodict (a much nicer name than defaultdict, a better collocation for 2.5 than the builtins). One point I think nobody has made is that whenever reasonably possible the setting of a callback (the callable factory here) should include *a and **k to use when calling back. That's your C/C++ brain talking. :-) If you need additional data passed to a callback (to be provided at the time the callback is *set*, not when it is *called*) the customary approach is to make the callback a parameterless lambda; you can also use a bound method, etc. There's no need to complicate ever piece of code that calls a callback with the machinery to store and use arbirary arguments and keyword arguments. I forgot to mention in my revised proposal that the API for setting the default_factory is slightly odd: d = {} # or dict() d.default_factory = list rather than d = dict(default_factory=list) This is of course because we cut off that way when we defined what arbitrary keyword arguments to the dict constructor would do. My original proposal solved this by creating a subclass. But there were several suggestions that this would be fine functionality to add to the standard dict type -- and then I really don't see any other way to do this. (Yes, I could have a set_default_factory() method -- but a simple settable attribute seems more pythonic!) -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Guido van Rossum wrote: So here's a new proposal. Let's add a generic missing-key handling method to the dict class, as well as a default_factory slot initialized to None. The implementation is like this (but in C): def on_missing(self, key): if self.default_factory is not None: value = self.default_factory() self[key] = value return value raise KeyError(key) When __getitem__() (and *only* __getitem__()) finds that the requested key is not present in the dict, it calls self.on_missing(key) and returns whatever it returns -- or raises whatever it raises. __getitem__() doesn't need to raise KeyError any more, that's done by on_missing(). Will this also work when PyDict_GetItem() does not find the key? Thomas ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
[cc to py-dev again] Guido van Rossum wrote: On 2/17/06, Thomas Heller [EMAIL PROTECTED] wrote: Guido van Rossum wrote: So here's a new proposal. Let's add a generic missing-key handling method to the dict class, as well as a default_factory slot initialized to None. The implementation is like this (but in C): def on_missing(self, key): if self.default_factory is not None: value = self.default_factory() self[key] = value return value raise KeyError(key) When __getitem__() (and *only* __getitem__()) finds that the requested key is not present in the dict, it calls self.on_missing(key) and returns whatever it returns -- or raises whatever it raises. __getitem__() doesn't need to raise KeyError any more, that's done by on_missing(). Will this also work when PyDict_GetItem() does not find the key? Ouch, tricky. It should, of course, but the code will be a tad tricky because it's not supposed to inc the refcount. Thanks for reminding me! Ahem, I'm still looking for ways to 'overtake' the dict to implement weird and fancy things. Can on_missing be overridden in subclasses (writing the subclass in C would not be a problem)? Thanks, Thomas ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/17/06, Thomas Heller [EMAIL PROTECTED] wrote: Ahem, I'm still looking for ways to 'overtake' the dict to implement weird and fancy things. Can on_missing be overridden in subclasses (writing the subclass in C would not be a problem)? Why ahem? The answer is yes. -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Guido van Rossum wrote: d = {} d.default_factory = set ... d[key].add(value) Another option would be: d = {} d.default_factory = set d.get_default(key).add(value) Unlike .setdefault, this would use a factory associated with the dictionary, and no default value would get passed in. Unlike the proposal, this would not override __getitem__ (not overriding __getitem__ is really the only difference with the proposal). It would be clear reading the code that you were not implicitly asserting they key in d was true. get_default isn't the best name, but another name isn't jumping out at me at the moment. Of course, it is not a Pythonic argument to say that an existing method should be overridden, or functionality made nameless simply because we can't think of a name (looking to anonymous functions of course ;) -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On Fri, Feb 17, 2006, Guido van Rossum wrote: But the default implementation is designed so that we can write d = {} d.default_factory = list +1 I actually like the fact that you're forced to use a separate statement for setting the default_factory. From my POV, this can go into 2.5. (I was only +0 on the previous proposal and I was -1 on making it a built-in; this extension is much nicer.) -- Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/ 19. A language that doesn't affect the way you think about programming, is not worth knowing. --Alan Perlis ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/17/06, Guido van Rossum [EMAIL PROTECTED] wrote: - There's a fundamental difference between associating the default value with the dict object, and associating it with the call. So proposals to invent a better name/signature for setdefault() don't compete. That's a feature, not a bug. :) See below. - The inconsistency introduced by __getitem__() returning a value for keys while get(), __contains__(), and keys() etc. don't show it, cannot be resolved usefully. You'll just have to live with it. Modifying get() to do the same thing as __getitem__() doesn't seem useful -- it just takes away a potentially useful operation. Again, see below. So here's a new proposal. Let's add a generic missing-key handling method to the dict class, as well as a default_factory slot initialized to None. The implementation is like this (but in C): def on_missing(self, key): if self.default_factory is not None: value = self.default_factory() self[key] = value return value raise KeyError(key) When __getitem__() (and *only* __getitem__()) finds that the requested key is not present in the dict, it calls self.on_missing(key) and returns whatever it returns -- or raises whatever it raises. __getitem__() doesn't need to raise KeyError any more, that's done by on_missing(). Still -1. It's better, but it violates the principle of encapsulation by mixing how-you-use-it state with what-it-stores state. In doing that it has the potential to break an API documented as accepting a dict. Code that expects d[key] to raise an exception (and catches the resulting KeyError) will now silently succeed. I believe that necessitates a PEP to document it. It's also makes it harder to read code. You may expect d[key] to raise an exception, but it won't because of a single line up several pages (or in another file entierly!) d.getorset(key, func) has no such problems and has a much simpler specification: def getorset(self, key, func): try: return self[key] except KeyError: value = self[key] = func() return value -- Adam Olsen, aka Rhamphoryncus ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Fuzzyman wrote: Also, I think has_key/in should return True if there is a default. And exactly what use would it then be ? Code that checks if d.has_key(k): print d[k] would work correctly. IOW, you could use a dictionary with a default key just as if it were a normal dictionary - which is a useful property, IMO. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/17/06, Adam Olsen [EMAIL PROTECTED] wrote: It's also makes it harder to read code. You may expect d[key] to raise an exception, but it won't because of a single line up several pages (or in another file entierly!) Such are the joys of writing polymorphic code. I don't really see how you can avoid this kind of confusion -- I could have given you some other mapping object that does weird stuff. -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Adam Olsen wrote: Still -1. It's better, but it violates the principle of encapsulation by mixing how-you-use-it state with what-it-stores state. In doing that it has the potential to break an API documented as accepting a dict. Code that expects d[key] to raise an exception (and catches the resulting KeyError) will now silently succeed. Of course it will, and without quotes. That's the whole point. I believe that necessitates a PEP to document it. You are missing the rationale of the PEP process. The point is *not* documentation. The point of the PEP process is to channel and collect discussion, so that the BDFL can make a decision. The BDFL is not bound at all to the PEP process. To document things, we use (or should use) documentation. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/17/06, Ian Bicking [EMAIL PROTECTED] wrote: Guido van Rossum wrote: d = {} d.default_factory = set ... d[key].add(value) Another option would be: d = {} d.default_factory = set d.get_default(key).add(value) Unlike .setdefault, this would use a factory associated with the dictionary, and no default value would get passed in. Unlike the proposal, this would not override __getitem__ (not overriding __getitem__ is really the only difference with the proposal). It would be clear reading the code that you were not implicitly asserting they key in d was true. get_default isn't the best name, but another name isn't jumping out at me at the moment. Of course, it is not a Pythonic argument to say that an existing method should be overridden, or functionality made nameless simply because we can't think of a name (looking to anonymous functions of course ;) I'm torn. While trying to implement this I came across some ugliness in PyDict_GetItem() -- it would make sense if this also called on_missing(), but it must return a value without incrementing its refcount, and isn't supposed to raise exceptions -- so what to do if on_missing() returns a value that's not inserted in the dict? If the __getattr__()-like operation that supplies and inserts a dynamic default was a separate method, we wouldn't have this problem. OTOH most reviewers here seem to appreciate on_missing() as a way to do various other ways of alterning a dict's __getitem__() behavior behind a caller's back -- perhaps it could even be (ab)used to implement case-insensitive lookup. I'm not going to do a point-by-point to your longer post (I don't have the time); let's (again) agree to disagree and I'll sleep on it. -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Guido van Rossum wrote: On 2/17/06, Adam Olsen [EMAIL PROTECTED] wrote: It's also makes it harder to read code. You may expect d[key] to raise an exception, but it won't because of a single line up several pages (or in another file entierly!) Such are the joys of writing polymorphic code. I don't really see how you can avoid this kind of confusion -- I could have given you some other mapping object that does weird stuff. The way you avoid confusion is by not working with code or programmers who write bad code. Python and polymorphic code in general pushes the responsibility for many errors from the language structure onto the programmer -- it is the programmers' responsibility to write good code. Python has never kept people from writing obcenely horrible code. We ought to have an obfuscated Python contest just to prove that point -- it is through practice and convention that readable Python code happens, not through the restrictions of the language. (Honestly, I think such a contest would be a good idea.) I know *I* at least don't like code that mixes up access and modification. Maybe not everyone does (or maybe not everyone thinks of getitem as access, but that's unlikely). I will assert that it is Pythonic to keep access and modification separate, which is why methods and attributes are different things, and why assignment is not an expression, and why functions with side effects typically return None, or have names that are very explicit about the side effect, with names containing command verbs like update or set. All of these distinguish access from modification. Note that all of what I'm saying *only* applies to the overriding of __getitem__, not the addition of any new method. I think multidict is better for the places it applies, but I see no problem at all with a new method on dictionaries that calls on_missing. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Guido van Rossum wrote: I'm torn. While trying to implement this I came across some ugliness in PyDict_GetItem() -- it would make sense if this also called on_missing(), but it must return a value without incrementing its refcount, and isn't supposed to raise exceptions -- so what to do if on_missing() returns a value that's not inserted in the dict? I think there should be a guideline to use PyObject_GetItem/PyMapping_GetItemString normally, i.e. in all cases where you would write d[k] in Python code. It should be considered a bug if PyDict_GetItem is used in a place that should invoke defaulting; IOW, the function should be reserved to really low-level cases (e.g. if it is known that the dict doesn't have any defaulting, e.g. the string interned dictionary). There should be a policy whether name-lookup invokes defaulting (i.e. __dict__ access); I think it should. This would cause __getattr__ to have no effect if the object's dictionary has a default factory (unless that raises a KeyError). Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/17/06, Guido van Rossum [EMAIL PROTECTED] wrote: On 2/17/06, Adam Olsen [EMAIL PROTECTED] wrote: It's also makes it harder to read code. You may expect d[key] to raise an exception, but it won't because of a single line up several pages (or in another file entierly!) Such are the joys of writing polymorphic code. I don't really see how you can avoid this kind of confusion -- I could have given you some other mapping object that does weird stuff. You could pass a float in as well. But if the function is documented as taking a dict, and the programmer expects a dict.. that now has to be changed to dict without a default. Or they have to code defensively since d[key] may or may not raise KeyError, so they must avoid depending on it either way. -- Adam Olsen, aka Rhamphoryncus ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/17/06, Adam Olsen [EMAIL PROTECTED] wrote: Such are the joys of writing polymorphic code. I don't really see how you can avoid this kind of confusion -- I could have given you some other mapping object that does weird stuff. You could pass a float in as well. But if the function is documented as taking a dict, and the programmer expects a dict.. that now has to be changed to dict without a default. Or they have to code defensively since d[key] may or may not raise KeyError, so they must avoid depending on it either way. I'd like to see a real-life example of code that would break this way. I believe that *most* code that takes a dict will work just fine if that dict has a default factory. -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Adam Olsen wrote: You could pass a float in as well. But if the function is documented as taking a dict, and the programmer expects a dict.. that now has to be changed to dict without a default. Or they have to code defensively since d[key] may or may not raise KeyError, so they must avoid depending on it either way. Can you give an example of real, existing code that will break if a such a dict is passed? Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/17/06, Martin v. Löwis [EMAIL PROTECTED] wrote: Adam Olsen wrote: Still -1. It's better, but it violates the principle of encapsulation by mixing how-you-use-it state with what-it-stores state. In doing that it has the potential to break an API documented as accepting a dict. Code that expects d[key] to raise an exception (and catches the resulting KeyError) will now silently succeed. Of course it will, and without quotes. That's the whole point. Consider these two pieces of code: if key in d: dosomething(d[key]) else: dosomethingelse() try: dosomething(d[key]) except KeyError: dosomethingelse() Before they were the same (assuming dosomething() won't raise KeyError). Now they would behave differently. The latter is even the prefered form, since it only invokes a single dict lookup: On 2/16/06, Delaney, Timothy (Tim) [EMAIL PROTECTED] wrote: try: v = d[key] except: v = d[key] = value Obviously this example could be changed to use default_factory, but I find it hard to believe the only use of that pattern is to set default keys. Of course you could just assume that of all the people passing your function a dict, none of them will ever use the default_factory when they build the dict. Should be easy, right? -- Adam Olsen, aka Rhamphoryncus ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Martin v. Löwis wrote: I know *I* at least don't like code that mixes up access and modification. Maybe not everyone does (or maybe not everyone thinks of getitem as access, but that's unlikely). I will assert that it is Pythonic to keep access and modification separate, which is why methods and attributes are different things, and why assignment is not an expression, and why functions with side effects typically return None, or have names that are very explicit about the side effect, with names containing command verbs like update or set. All of these distinguish access from modification. Do you never write d[some_key].append(some_value) This is modification and access, all in a single statement, and all without assignment operator. (d[some_key]) is access. (...).append(some_value) is modification. Expressions are compound; of course you can mix both access and modification in a single expression. d[some_key] is access that returns something, and .append(some_value) modifies that something, it doesn't modify d. I don't see the setting of the default value as a modification. The default value has been there, all the time. It only is incarnated lazily. It is lazily incarnated for multidict, because there is no *noticeable* side effect -- if there is any internal side effects that is an implementation detail. However for default_factory=list, the result of .keys(), .has_key(), and .items() changes when you do d[some_key]. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Adam Olsen wrote: The latter is even the prefered form, since it only invokes a single dict lookup: On 2/16/06, Delaney, Timothy (Tim) [EMAIL PROTECTED] wrote: try: v = d[key] except: v = d[key] = value Obviously this example could be changed to use default_factory, but I find it hard to believe the only use of that pattern is to set default keys. I'd go further -- I doubt many cases where try:except KeyError: is used could be refactored to use default_factory -- default_factory can only be used to set default keys to something that can be determined sometime close to the time the dictionary is created, and that the default is not dependent on the context in which the key is fetched, and that default value will not cause unintended side effects if the dictionary leaks out of the code where it was initially used (like if the dictionary is returned to someone). Any default factory is more often an algorithmic detail than truly part of the nature of the dictionary itself. For instance, here is something I do often: try: value = cache[key] except KeyError: ... calculate value ... cache[key] = value Realistically, factoring ... calculate value ... into a factory that calculates the value would be difficult, produce highly unreadable code, perform worse, and have more bugs. For simple factories like list and dict the factory works okay. For immutable values like 0 and None, the factory (lambda : 0 and lambda : None) is a wasteful way to create a default value (because storing the value in the dictionary is unnecessary). For non-trivial factories the whole thing falls apart, and one can just hope that no one will try to use this feature and will instead stick with the try:except KeyError: technique. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Martin v. Löwis writes: You are missing the rationale of the PEP process. The point is *not* documentation. The point of the PEP process is to channel and collect discussion, so that the BDFL can make a decision. The BDFL is not bound at all to the PEP process. To document things, we use (or should use) documentation. You are oversimplifying significantly. The purpose of the PEP process is to lay out and document the reasoning that went into forming the decision. The BDFL is *allowed* to be capricious, but he's sensible enough to choose not to: in cases where it matters, he tries to document the reasoning behind his decisions. In fact, he does better than that... he gets the PEP author to document it for him! The PEP (whether accepted, rejected, or in-progress) serves as the official documentation of how the decision was made (or of what option it is that is still undecided). If a _trivial_ decision is already made, there's no need for a PEP, but if a difficult decision has been made, then documenting it in a PEP saves years of having to justify it to newbies. -- Michael Chermside ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Adam Olsen wrote: Consider these two pieces of code: if key in d: dosomething(d[key]) else: dosomethingelse() try: dosomething(d[key]) except KeyError: dosomethingelse() Before they were the same (assuming dosomething() won't raise KeyError). Now they would behave differently. I personally think they should continue to do the same thing, i.e. in should return True if there is a default; in the current proposal, it should invoke the default factory. But that's beside the point: Where is the real example where this difference would matter? (I'm not asking for a realistic example, I'm asking for a real one) Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Ian Bicking wrote: It is lazily incarnated for multidict, because there is no *noticeable* side effect -- if there is any internal side effects that is an implementation detail. However for default_factory=list, the result of .keys(), .has_key(), and .items() changes when you do d[some_key]. That's why I think has_key and in should return True for any key. This leaves keys(), items(), and values(). From a pure point of view, they should return infinite sets. Practicality beats purity, so yes, d[k] could be considered a modifying operation. If you look carefully, you find that many access operations also have side effects. For example, .read() on a file not only returns some data, but also advances the file position. Queue.get not only returns the next item, but also removes it from the queue. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/17/06, Martin v. Löwis [EMAIL PROTECTED] wrote: That's why I think has_key and in should return True for any key. This leaves keys(), items(), and values(). From a pure point of view, they should return infinite sets. Practicality beats purity, so yes, d[k] could be considered a modifying operation. I think practicality beats purity even for has_key/in; IMO these operations are more useful when they match keys() instead of always returning True. But someone should start writing some code to play with this. I have a working patch (including a hack for PyDict_GetItem()): python.org/sf/1433928 So there's no excuse to be practical now. -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/17/06, Martin v. Löwis [EMAIL PROTECTED] wrote: Adam Olsen wrote: You could pass a float in as well. But if the function is documented as taking a dict, and the programmer expects a dict.. that now has to be changed to dict without a default. Or they have to code defensively since d[key] may or may not raise KeyError, so they must avoid depending on it either way. Can you give an example of real, existing code that will break if a such a dict is passed? I only got halfway through the grep KeyError results, but.. Demo/metaclass/Meta.py:55 Demo/tkinter/guido/AttrDialog.py:121 # Subclasses override self.classes Lib/ConfigParser.py:623 Lib/random.py:315 Lib/string.py:191 Lib/weakref.py:56 # Currently uses UserDict but I assume it will switch to dict eventually And the pièce de résistance.. Doc/tools/anno-api.py:51 It has this: try: info = rcdict[s] except KeyError: sys.stderr.write(No refcount data for %s\n % s) else: ... rcdict is loaded from refcounts.load(). refcounts.load() calls refcounts.loadfile(), which has this (inside a loop): try: entry = d[function] except KeyError: entry = d[function] = Entry(function) A prime candidate for a default. Perhaps the KeyError shouldn't ever get triggered in this case, I'm not sure. I think that's besides the point though. The programmer clearly expected it would. -- Adam Olsen, aka Rhamphoryncus ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Adam Olsen wrote: And the pièce de résistance.. Doc/tools/anno-api.py:51 It has this: try: info = rcdict[s] except KeyError: sys.stderr.write(No refcount data for %s\n % s) else: ... rcdict is loaded from refcounts.load(). refcounts.load() calls refcounts.loadfile(), which has this (inside a loop): try: entry = d[function] except KeyError: entry = d[function] = Entry(function) A prime candidate for a default. Perhaps the KeyError shouldn't ever get triggered in this case, I'm not sure. I think that's besides the point though. The programmer clearly expected it would. Assuming the following override: class EntryDict(dict): def on_missing(self, key): value = Entry(key) self[key] = value return value Then what it means is that the behaviour of missing functions get an empty refcount entry propagates to the rcdict code. So the consequence is that the code in anno-api will never print an error message - all functions are deemed to have associated refcount data in refcount.dat. But that would be a bug in refcounts.loadfile: if it returns an EntryDict instead of a normal dict it is, in effect, returning an *infinite* dictionary that contains refcount definitions for every possible function name (some of them are just populated on demand). So *if* refcounts.loadfile was converted to use an EntryDict, it would need to return dict(d) instead of returning d directly. And this is where the question of whether has_key/__having__ return True or False when default_factory is set is important. If they return False, then the LBYL (if key in d:) and EAFTP (try/except) approaches give *different answers*. More importantly, LBYL will never have side effects, whereas EAFTP may. If the methods always return True (as Martin suggests), then we retain the current behaviour where there is no real difference between the two approaches. Given the amount of time spent in recent years explaining this fact, I don't think it is an equivalence that should be broken lightly (IOW, I've persuaded myself that I agree with Martin) The alternative would be to have an additional query API will_default that reflects whether or not a given key is actually present in the dictionary (if key not in d.keys() would serve a similar purpose, but requires building the list of keys). Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- http://www.boredomandlaziness.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
Guido van Rossum wrote: But there were several suggestions that this would be fine functionality to add to the standard dict type -- and then I really don't see any other way to do this. Given the constructor problem, and the issue with this function expects a plain dictionary, I think your original instinct to use a subclass may have been correct. The constructor is much cleaner that way: # bag like behavior dd = collections.autodict(int) for elem in collection: dd[elem] += 1 # setdefault-like behavior dd = collections.autodict(list) for page_number, page in enumerate(book): for word in page.split(): dd[word].append(word) And it can be a simple fact that for an autodict, if key in d and d[key] may give different answers. Much cleaner than making the semantics of normal dicts dependent on: a. whether or not on_missing has been overridden b. whether or not default_factory has been set Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- http://www.boredomandlaziness.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
At 11:58 AM 02/17/2006 -0800, Guido van Rossum wrote: I forgot to mention in my revised proposal that the API for setting the default_factory is slightly odd: d = {} # or dict() d.default_factory = list rather than d = dict(default_factory=list) This is of course because we cut off that way when we defined what arbitrary keyword arguments to the dict constructor would do. My original proposal solved this by creating a subclass. But there were several suggestions that this would be fine functionality to add to the standard dict type -- and then I really don't see any other way to do this. (Yes, I could have a set_default_factory() method -- but a simple settable attribute seems more pythonic!) Why not a classmethod constructor: d = dict.with_factory(list) Admittedly, the name's not that great. Actually, it's almost as bad as setdefault in some ways. But I'd rather set the default and create the dictionary in one operation, since when reading it as two, you first think 'd is a dictionary', and then 'oh, but it has a default factory', as opposed to d is a dict with a factory in one thought. But maybe that's just me. :) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/17/06, Nick Coghlan [EMAIL PROTECTED] wrote: And this is where the question of whether has_key/__having__ return True or False when default_factory is set is important. If they return False, then the LBYL (if key in d:) and EAFTP (try/except) approaches give *different answers*. More importantly, LBYL will never have side effects, whereas EAFTP may. If the methods always return True (as Martin suggests), then we retain the current behaviour where there is no real difference between the two approaches. Given the amount of time spent in recent years explaining this fact, I don't think it is an equivalence that should be broken lightly (IOW, I've persuaded myself that I agree with Martin) The alternative would be to have an additional query API will_default that reflects whether or not a given key is actually present in the dictionary (if key not in d.keys() would serve a similar purpose, but requires building the list of keys). Looking at it from the which invariants hold POV isn't always the right perspective. Reality is that some amount of code that takes a dict won't work if you give it a dict with a default_factory. Well, that's nothing new. Some code also breaks if you pass it a dict containing key or value types it doesn't expect, or if you pass it an anydbm instance, or os.environ on Windows (which implements case-insensitive keys). From the POV of someone who decides to use a dict with a default_factory (or overriding on-missing()), having the 'in' operator always return True is d*mn annoying -- it means that any kind of introspection of the dict doesn't work. Take for example the multiset use case. Suppose you're aware that you're using a dict with this special behavior. Now you've built up your multiset and now you want to use it. Part of your app is interested in knowing the list of values associated with each key. But another part may be interested only in whether a particular key hs *any* values associated. If key in d returns whether that key is currently present, you can write if key in d: print whatever But under Martin and your proposed semantics, you'd have to write if d.get(key): print whatever or (worse) if d[key]: # inserts an empty list into the dict! print whatever I'd much rather be able to write if key in d and get the result I want... -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On Sat, 18 Feb 2006 00:52:51 +0100, =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= [EMAIL PROTECTED] wrote: Adam Olsen wrote: Consider these two pieces of code: if key in d: dosomething(d[key]) else: dosomethingelse() try: dosomething(d[key]) except KeyError: dosomethingelse() Before they were the same (assuming dosomething() won't raise KeyError). Now they would behave differently. I personally think they should continue to do the same thing, i.e. in should return True if there is a default; in the current proposal, it should invoke the default factory. But that's beside the point: Where is the real example where this difference would matter? (I'm not asking for a realistic example, I'm asking for a real one) My guess is that realistically default_factory will be used to make clean code for filling a dict, and then turning the factory off if it's to be passed into unknown contexts. Those contexts can then use old code to do as above, or if worth it can temporarily set a factory to do some work. Tightly coupled code I guess could pass factory-enabled dicts between each other. IOW, no code should break unless you pass a factory-enabled dict where you shouldn't ;-) That said, maybe enabling/disabling could be separate from d.default_factory (e.g., d.defaults_enabled) as that could allow e.g. foo(**kw) more options in how to copy kw and what foo could do. Would total copy including defaulting state be best? What other copies must be sanitized? type('Foo',(), **{'this':'one?'}) It will be interesting to see what comes out of the woodwork ;-) Regards, Bengt Richter ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On 2/17/06, Adam Olsen [EMAIL PROTECTED] wrote: if key in d: dosomething(d[key]) else: dosomethingelse() try: dosomething(d[key]) except KeyError: dosomethingelse() I agree with the gut feeling that these should still do the same thing. Could we modify d.get() instead? class ddict(dict): ... default_value_factory = None ... def get(self, k, d=None): ... v = super(ddict, self).get(k, d) ... if v is not None or d is not None or self.default_value_factory is None: ... return v ... return self.setdefault(k, self.default_value_factory()) ... d = ddict() d.default_value_factory = list d.get('list', []) [] d['list'] Traceback (most recent call last): File stdin, line 1, in ? KeyError: 'list' d.get('list').append(5) d['list'] [5] There was never an exception raised by d.get so this wouldn't change (assuming the C is implemented more carefully than the python above). What are the problems with this other than, like setdefault, it only works on values with mutator methods (i.e., no counting dicts)? Is the lack of counting dicts that d.__getitem__ supports a deal breaker? d.default_value_factory = int d.get('count') += 1 SyntaxError: can't assign to function call How does the above either in dict or a subclass compare to five line or smaller custom subclasses using something like the following? def append(self, k, val): self.setdefault(k, []).append(val) or def accumulate(self, k, val): try: self[k] += val except KeyError: self[k] = val Michael -- Michael Urman http://www.tortall.net/mu/blog ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposal: defaultdict
On Fri, 17 Feb 2006, Phillip J. Eby wrote: d = {} # or dict() d.default_factory = list Why not a classmethod constructor: d = dict.with_factory(list) But I'd rather set the default and create the dictionary in one operation, since when reading it as two, you first think 'd is a dictionary', and then 'oh, but it has a default factory', as opposed to d is a dict with a factory in one thought. Also, class method would mean less typing (esp if dictionary name happens to be longer than a couple of characters ;-) But I'd like to suggest a different name: d = dict.with_default( list) Ilya ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com