Re: [Python-Dev] defaultdict proposal round three
Raymond Hettinger wrote:
> Like autodict could mean anything.

Everything is meaningless until you know something about it. If you'd never seen Python before, would you know what 'dict' meant?

If I were seeing defaultdict for the first time, I would need to look up the docs before I was confident I knew exactly what it did -- as I've mentioned before, my initial guess would have been wrong. The same procedure would lead me to an understanding of 'autodict' just as quickly.

Maybe 'autodict' isn't the best term either -- I'm open to suggestions. But my instincts still tell me that 'defaultdict' is the best term for something *else* that we might want to add one day as well, so I'm just trying to make sure we don't squander it lightly.

--
Greg
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] defaultdict proposal round three
Greg Ewing wrote:
> Raymond Hettinger wrote:
>> Like autodict could mean anything.
>
> Everything is meaningless until you know something about it. If you'd never seen Python before, would you know what 'dict' meant?
>
> If I were seeing defaultdict for the first time, I would need to look up the docs before I was confident I knew exactly what it did -- as I've mentioned before, my initial guess would have been wrong. The same procedure would lead me to an understanding of 'autodict' just as quickly.
>
> Maybe 'autodict' isn't the best term either -- I'm open to suggestions. But my instincts still tell me that 'defaultdict' is the best term for something *else* that we might want to add one day as well, so I'm just trying to make sure we don't squander it lightly.

Given that the default entries behind the non-existent keys don't actually exist, something like virtual_dict might be appropriate. Or phantom_dict, or ghost_dict.

I agree that the naming of things is important.

regards
 Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/
Re: [Python-Dev] defaultdict proposal round three
Steve Holden wrote:
> Given that the default entries behind the non-existent keys don't actually exist, something like virtual_dict might be appropriate.

No, that would suggest to me something like a wrapper object that delegates most of the mapping protocol to something else. That's even less like what we're discussing.

In our case the default values are only virtual until you use them, upon which they become real. Sort of like a wave function collapse... hmmm... I suppose 'heisendict' wouldn't fly, would it?

--
Greg
Re: [Python-Dev] defaultdict proposal round three
Greg Ewing wrote:
> Fuzzyman wrote:
>> I've had problems in code that needs to treat strings, lists and dictionaries differently (assigning values to a container where all three need different handling) and telling the difference but allowing duck typing is *problematic*.
>
> You need to rethink your design so that you don't have to make that kind of distinction.

Well... to *briefly* explain the use case, it's for value assignment in ConfigObj.

It basically accepts as valid values strings and lists of strings [#]_. You can also create new subsections by assigning a dictionary.

It needs to be able to recognise lists in order to check each list member is a string. (See note below, it still needs to be able to recognise lists when writing, even if it is not doing type checking on assignment.)

It needs to be able to recognise dictionaries in order to create a new section instance (rather than directly assigning the dictionary).

This is *terribly* convenient for the user (trivial example of creating a new config file programmatically):

    from configobj import ConfigObj
    cfg = ConfigObj(newfilename)
    cfg['key'] = 'value'
    cfg['key2'] = ['value1', 'value2', 'value3']
    cfg['section'] = {'key': 'value', 'key2': ['value1', 'value2', 'value3']}
    cfg.write()

Writes out:

    key = value
    key2 = value1, value2, value3
    [section]
    key = value
    key2 = value1, value2, value3

(Note none of those values needed quoting, so they aren't.)

Obviously I could force the creation of sections and the assignment of list values to use separate methods, but it's much less readable and unnecessary.

The code as is works and has a nice API. It still needs to be able to tell what *type* of value is being assigned.

Mapping and sequence protocols are so loosely defined that in order to support 'list like objects' and 'dictionary like objects' some arbitrary decision about what methods they should support has to be made. (For example a read only mapping container is unlikely to implement __setitem__ or methods like update.)

At first we defined a mapping object as one that defines __getitem__ and keys (not update as I previously said), and list like objects as ones that define __getitem__ and *not* keys. For strings we required a basestring subclass. In the end I think we ripped this out and just settled on isinstance tests.

All the best,

Michael Foord

.. [#] Although it has two modes. In the 'default' mode you can assign any object as a value and a string representation is written out. A more strict mode checks values at the point you assign them - so errors will be raised at that point rather than propagating into the config file. When writing you still need to be able to recognise lists because each element is properly quoted.
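The two classification schemes described in this message could be sketched roughly as follows. This is an illustration only, not ConfigObj's actual code: the names `classify_by_protocol`, `classify_by_type` and `ReadOnlyMapping` are made up for the example, and `str` stands in for the `basestring` check mentioned above (Python 3 has no `basestring`).

```python
def classify_by_protocol(value):
    """First attempt: decide by which methods the object defines."""
    if isinstance(value, str):
        return 'string'
    if hasattr(value, '__getitem__'):
        # Mapping-like if it also has keys(), otherwise list-like.
        return 'mapping' if hasattr(value, 'keys') else 'sequence'
    return 'other'


def classify_by_type(value):
    """Later approach: plain isinstance tests against concrete types."""
    if isinstance(value, str):
        return 'string'
    if isinstance(value, dict):
        return 'mapping'
    if isinstance(value, list):
        return 'sequence'
    return 'other'


class ReadOnlyMapping:
    """Hypothetical dict-like object with only __getitem__ and keys()."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def keys(self):
        return self._data.keys()


ro = ReadOnlyMapping({'key': 'value'})
protocol_result = classify_by_protocol(ro)  # accepted as a mapping
type_result = classify_by_type(ro)          # rejected: not a real dict
converted = classify_by_type(dict(ro))      # explicit conversion is accepted
```

The trade-off is visible in the last three lines: the protocol test accepts any object that happens to define the chosen methods, while the isinstance test is strict and leaves the conversion to the caller.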
Re: [Python-Dev] defaultdict proposal round three
Raymond Hettinger wrote:
> Like autodict could mean anything.

fwiw, the first google hit for autodict appears to be part of someone's link farm

    At this website we have assistance with autodict. In addition to information for autodict we also have the best web sites concerning dictionary, non profit and new york. This makes autodict.com the most reliable guide for autodict on the Internet.

and the second is a description of a self-initializing dictionary data type for Python.

/F
Re: [Python-Dev] defaultdict proposal round three
Fuzzyman wrote:
>     cfg = ConfigObj(newfilename)
>     cfg['key'] = 'value'
>     cfg['key2'] = ['value1', 'value2', 'value3']
>     cfg['section'] = {'key': 'value', 'key2': ['value1', 'value2', 'value3']}

If the main purpose is to support this kind of notational convenience, then I'd be inclined to require all the values used with this API to be concrete strings, lists or dicts. If you're going to make types part of the API, I think it's better to do so with a firm hand rather than being half-hearted and wishy-washy about it.

Then, if it's really necessary to support a wider variety of types, provide an alternative API that separates the different cases and isn't type-dependent at all. If someone has a need for this API, using it isn't going to be much of an inconvenience, since he won't be able to write out constructors for his types using notation as compact as the above anyway.

--
Greg
Re: [Python-Dev] defaultdict proposal round three
Greg Ewing wrote:
> Fuzzyman wrote:
>>     cfg = ConfigObj(newfilename)
>>     cfg['key'] = 'value'
>>     cfg['key2'] = ['value1', 'value2', 'value3']
>>     cfg['section'] = {'key': 'value', 'key2': ['value1', 'value2', 'value3']}
>
> If the main purpose is to support this kind of notational convenience, then I'd be inclined to require all the values used with this API to be concrete strings, lists or dicts. If you're going to make types part of the API, I think it's better to do so with a firm hand rather than being half-hearted and wishy-washy about it.
>
[snip..]

Thanks, that's the solution we settled on. We use ``isinstance`` tests to determine types.

The user can always do something like:

    cfg['section'] = dict(dict_like_object)

Which isn't so horrible.

All the best,

Michael
Re: [Python-Dev] defaultdict proposal round three
[Alex]
> I'd love to remove setdefault in 3.0 -- but I don't think it can be done before that: default_factory won't cover the occasional use cases where setdefault is called with different defaults at different locations, and, rare as those cases may be, any 2.* should not break any existing code that uses that approach.

I'm not too concerned about this one. Whenever setdefault gets deprecated, then ALL code that used it would have to be changed. If there were cases with different defaults, a regular try/except would do the job just fine (heck, it might even be faster because there won't be a wasted instantiation in the cases where the key already exists).

There may be other reasons to delay removing setdefault(), but multiple default use case isn't one of them.

>> An alternative is to have two possible attributes:
>>     d.default_factory = list
>> or
>>     d.default_value = 0
>> with an exception being raised when both are defined (the test is done when the attribute is created, not when the lookup is performed).
>
> I see default_value as a way to get exactly the same beginner's error we already have with function defaults:

That makes sense.

I'm somewhat happy with the patch as it stands now. The only part that needs serious rethinking is putting on_missing() in regular dicts. See my other email on that subject.

Raymond
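Two points in this message lend themselves to a short sketch. The first part shows the try/except replacement for setdefault() with a different default at each call site; the second shows the function-default mistake that a shared `default_value` would resemble. The helper names `get_or_create` and `remember` are invented for the illustration and appear nowhere in the thread.

```python
def get_or_create(d, key, factory):
    """Return d[key], building the default only when the key is missing."""
    try:
        return d[key]
    except KeyError:
        value = d[key] = factory()
        return value


index = {}
# Different defaults at different call sites, with no wasted instantiation
# when the key already exists.
get_or_create(index, 'words', list).append('spam')
get_or_create(index, 'words', list).append('eggs')
get_or_create(index, 'seen', set).add('spam')


def remember(item, seen=[]):
    # The default list is created once, when the function is defined,
    # so every call without an explicit 'seen' shares the same object.
    seen.append(item)
    return seen


first = remember('a')
second = remember('b')
shared = first is second  # both names refer to the one default list
```

By contrast, `d.setdefault(key, [])` evaluates the `[]` argument on every call, whether or not the key is already present.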
Re: [Python-Dev] defaultdict proposal round three
On Feb 22, 2006, at 7:21 AM, Raymond Hettinger wrote:
...
> I'm somewhat happy with the patch as it stands now. The only part that needs serious rethinking is putting on_missing() in regular dicts. See my other email on that subject.

What if we named it _on_missing? Hook methods intended only to be overridden in subclasses are sometimes spelled that way, and it removes the need to teach about it to beginners -- it looks private so we don't explain it at that point.

My favorite example is Queue.Queue: I teach it (and in fact evangelize for it as the one sane way to do threading;-) in Python 101, *without* ever mentioning _get, _put etc -- THOSE I teach in Patterns with Python as the very best example of the Gof4's classic Template Method design pattern. If dict had _on_missing I'd have another wonderful example to teach from! (I believe the Library Reference avoids teaching about _get, _put etc, too, though I haven't checked it for a while).

TM is my favorite DP, so I'm biased in favor of Guido's design, and I think that by giving the hook method (not meant to be called, only overridden) a private name we're meeting enough of your and /F's concerns to let _on_missing remain. Its existence does simplify the implementation of defaultdict (and some other dict subclasses), and if the implementation is easy to explain, it may be a good idea, after all;-)

Alex
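As a rough sketch of the Template Method idea discussed here: in Python as it exists today, dict subclasses can define `__missing__`, which `dict.__getitem__` calls when a key is absent. The example below uses that hook as a stand-in for the `_on_missing` spelling proposed in this message; the class name `CountingDict` is hypothetical.

```python
class CountingDict(dict):
    """Hypothetical subclass: a missing key starts out at zero."""

    def __missing__(self, key):
        # Called by dict.__getitem__ only when 'key' is absent.
        # The hook is overridden, never called directly by user code.
        self[key] = 0
        return 0


counts = CountingDict()
for word in 'spam eggs spam'.split():
    counts[word] += 1

has_ham = 'ham' in counts  # membership tests do not trigger the hook
```

The base class owns the lookup algorithm and the subclass supplies only the one step that varies, which is the division of labour the message attributes to Queue's `_get` and `_put`.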
Re: [Python-Dev] defaultdict proposal round three
Fredrik Lundh wrote:
> fwiw, the first google hit for autodict appears to be part of someone's link farm
>
>     At this website we have assistance with autodict. In addition to information for autodict we also have the best web sites concerning dictionary, non profit and new york.

Hmmm, looks like some sort of bot that takes the words in your search and stuffs them into its response. I wonder if they realise how silly the results end up sounding?

I've seen these sorts of things before, but I haven't quite figured out yet how they manage to get into Google's database if they're auto-generated. Anyone have any clues what goes on?

--
Greg Ewing, Computer Science Dept, University of Canterbury, Christchurch, New Zealand
"Carpe post meridiam!" (I'm not a morning person.)
Re: [Python-Dev] defaultdict proposal round three
Delaney, Timothy (Tim) wrote:
> However, *because* Python uses duck typing, I tend to feel that subclasses in Python *should* be drop-in replacements.

Duck-typing means that the only reliable way to assess whether two types are sufficiently compatible for some purpose is to consult the documentation -- you can't just look at the base class list.

I think this should work both ways. It should be okay to *not* document autodict as being a subclass of dict, even if it happens to be implemented that way.

I've adopted a convention like this in PyGUI, where I document the classes in terms of a conceptual interface hierarchy, without promising that they will be implemented that way.

Greg
Re: [Python-Dev] defaultdict proposal round three
Guido van Rossum wrote:
> It's quite tricky to implement a fully transparent wrapper that supports all the special methods (__setitem__ etc.).

I was thinking the wrapper would only be a means of filling the dict -- it wouldn't even pretend to implement the full dict interface. The only method it would really need to have is __getitem__.

> The semantics of defaultdict are crystal clear. __contains__(), keys() and friends represent the *actual*, *current* keys.

If you're happy with that, then I am too. I was never particularly attached to the wrapper idea -- I just mentioned it as a possible alternative.

Just one more thing -- have you made a final decision about the name yet? I'd still prefer something like 'autodict', because to me 'defaultdict' suggests a type that just returns default values without modifying the dict. Maybe it should be reserved for some possible future type that behaves that way.

Also, considering the intended use cases (accumulation, etc.) it seems more accurate to think of the value produced by the factory as an 'initial value' rather than a 'default value', and I'd prefer to see it described that way in the docs. If that is done, having 'default' in the name wouldn't be so appropriate.

Greg
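For reference, the semantics debated here can be observed with `collections.defaultdict` as it ships in Python today (the thread predates its release, so details under discussion may differ from this). A small sketch of the "actual, current keys" behaviour and of the lookup that turns a default into a real entry:

```python
from collections import defaultdict

d = defaultdict(list)

before = 'x' in d          # membership reflects only keys actually stored
d['x'].append(1)           # subscript lookup creates the entry, then mutates it
after = 'x' in d

peek = d.get('y')          # get() does not invoke the factory
keys_now = list(d)         # so 'y' was never inserted
```

This is the behaviour behind the 'initial value' wording suggested above: the factory's result is stored on first access, so later lookups return the same stored object.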
Re: [Python-Dev] defaultdict proposal round three
Bengt Richter wrote:
> you could write
>
>     d = dict()**list

Or alternatively,

    ld = dict[list]

i.e. a dict of lists. In the maximally twisted form of this idea, the result wouldn't be a dict but a new *type* of dict, which you would then instantiate:

    d = ld(your_favourite_args_here)

This solves both the constructor-argument problem (the new type can have the same constructor signature as a regular dict with no conflict) and the perceived-Liskov-nonsubstitutability problem (there's no requirement that the new type have any particular conceptual and/or actual inheritance relationship to any other type).

Plus being a really cool introduction to the concepts of metaclasses, higher-order functions and all that neat head-exploding stuff. :-)

Resolving-not-to-coin-any-more-multihyphenated-hyperpolysyllabic-words-like-'perceived-Liskov-nonsubstitutability'-this-week-ly,

Greg
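The "new type of dict" idea can be approximated with an ordinary higher-order function. The sketch below is one possible reading, not anything proposed in the thread: `dict_of` is a made-up name standing in for the `dict[list]` spelling, and it relies on the `__missing__` hook that dict subclasses support in current Python.

```python
def dict_of(factory):
    """Build and return a new dict subclass whose missing keys use 'factory'."""

    class _AutoDict(dict):
        def __missing__(self, key):
            value = self[key] = factory()
            return value

    _AutoDict.__name__ = 'dict_of_%s' % getattr(factory, '__name__', 'factory')
    return _AutoDict


ld = dict_of(list)         # a new type, analogous to ld = dict[list]
d = ld(a=[1])              # same constructor signature as a regular dict
d['b'].append(2)           # missing key gets a fresh list
```

Because the factory is bound when the type is created, the constructor stays free to accept exactly the arguments a plain dict accepts.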
Re: [Python-Dev] defaultdict proposal round three
Greg Ewing wrote:
> Delaney, Timothy (Tim) wrote:
>> However, *because* Python uses duck typing, I tend to feel that subclasses in Python *should* be drop-in replacements.
>
> Duck-typing means that the only reliable way to assess whether two types are sufficiently compatible for some purpose is to consult the documentation -- you can't just look at the base class list.

What's the API for that?

I've had problems in code that needs to treat strings, lists and dictionaries differently (assigning values to a container where all three need different handling) and telling the difference but allowing duck typing is *problematic*.

Slightly-off-topic'ly-yours,

Michael Foord
Re: [Python-Dev] defaultdict proposal round three
On 2/20/06, Bengt Richter wrote:
> How about doing it as an expression, empowering ( ;-) the dict just after creation? E.g., for
>
>     d = dict()
>     d.default_factory = list
>
> you could write
>
>     d = dict()**list

Bengt, can you let your overactive imagination rest for a while? I recommend that you sit back, relax for a season, and reflect on the zen nature of Pythonicity. Then come back and hopefully you'll be able to post without embarrassing yourself continuously.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] defaultdict proposal round three
On 2/21/06, Fuzzyman wrote:
> I've had problems in code that needs to treat strings, lists and dictionaries differently (assigning values to a container where all three need different handling) and telling the difference but allowing duck typing is *problematic*.

Consider designing APIs that don't require you to make that kind of distinction, if you're worried about edge cases and classifying arbitrary other objects correctly. It's totally possible to create an object that behaves like a hybrid of a string and a dict.

If you're only interested in classifying the three specific built-ins you mention, I'd check for the presence of certain attributes: hasattr(x, "lower") - x is a string of some kind; hasattr(x, "sort") - x is a list; hasattr(x, "update") - x is a dict. Also, hasattr(x, "union") - x is a set; hasattr(x, "readline") - x is a file.

That's duck typing!

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
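The attribute checks listed in this message might be collected into a helper along these lines. This is a sketch under stated assumptions: the function name `classify` and the return labels are invented here, and the order of the tests is my own choice. Sets define `update` as well as `union`, so the set test has to come before the dict test for the built-ins to be told apart.

```python
import io


def classify(x):
    """Guess the kind of a built-in container from the attributes it has."""
    if hasattr(x, 'lower'):
        return 'string'      # str (bytes also has lower)
    if hasattr(x, 'sort'):
        return 'list'
    if hasattr(x, 'union'):
        return 'set'         # checked before 'update': sets have both
    if hasattr(x, 'update'):
        return 'dict'
    if hasattr(x, 'readline'):
        return 'file'
    return 'unknown'


kinds = [classify(obj) for obj in ('abc', [1], {1}, {'k': 1}, io.StringIO(''))]
```

The ordering caveat is a small instance of the edge-case problem the message warns about: the scheme works for the specific built-ins named, not for arbitrary objects.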
Re: [Python-Dev] defaultdict proposal round three
On Feb 21, 2006, at 1:51 AM, Greg Ewing wrote:
...
> Just one more thing -- have you made a final decision about the name yet? I'd still prefer something like 'autodict', because to me 'defaultdict' suggests

autodict is shorter and sharper and I prefer it, too: +1

> etc.) it seems more accurate to think of the value produced by the factory as an 'initial value' rather than a 'default value', and I'd prefer to see it

If we call the type autodict, then having the factory attribute named autofactory seems to fit. This leaves it open to the reader's imagination to choose whether to think of the value as initial or default -- it's the *auto* (automatic) value.

Alex
Re: [Python-Dev] defaultdict proposal round three
Guido van Rossum wrote:
> On 2/21/06, Fuzzyman wrote:
>> I've had problems in code that needs to treat strings, lists and dictionaries differently (assigning values to a container where all three need different handling) and telling the difference but allowing duck typing is *problematic*.
>
> Consider designing APIs that don't require you to make that kind of distinction, if you're worried about edge cases and classifying arbitrary other objects correctly. It's totally possible to create an object that behaves like a hybrid of a string and a dict.

Understood.

> If you're only interested in classifying the three specific built-ins you mention, I'd check for the presence of certain attributes: hasattr(x, "lower") - x is a string of some kind; hasattr(x, "sort") - x is a list; hasattr(x, "update") - x is a dict. Also, hasattr(x, "union") - x is a set; hasattr(x, "readline") - x is a file.
>
> That's duck typing!

Sure, but that requires a dictionary like object to define an update method, and a list like object to define a sort method.

The mapping and sequence protocols are so loosely defined that some arbitrary decision like this has to be made. (Any object that defines __getitem__ could follow either or both and duck typing doesn't help you unless you're prepared to make an additional requirement that is outside the loose requirements of the protocol.)

I can't remember how we solved it, but I think we decided that an object would be treated as a string if it passed isinstance, and a dictionary or sequence if it has __getitem__ (but isn't a string instance or subclass). If it has update as well as __getitem__ it is a dictionary-alike.

All the best,

Michael Foord
Re: [Python-Dev] defaultdict proposal round three
On 2/21/06, Alex Martelli wrote:
> On Feb 21, 2006, at 1:51 AM, Greg Ewing wrote:
> ...
>> Just one more thing -- have you made a final decision about the name yet? I'd still prefer something like 'autodict', because to me 'defaultdict' suggests
>
> autodict is shorter and sharper and I prefer it, too: +1

Apart from it somehow hashing to the same place as autodidact in my brain :), I don't like it as much; someone who doesn't already know what it is doesn't have a clue what an automatic dictionary would offer compared to a regular one. IMO default conveys just enough of a hint that something is being defaulted. A name long enough to convey all the details of why, when, and it defaults wouldn't be practical. (Look up the history of botanical names under Linnaeus for a simile.)

I'll let it brew in SF for a while but I expect to be checking this in at PyCon.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] defaultdict proposal round three
On Tue, 21 Feb 2006 05:58:52 -0800, Guido van Rossum wrote:
> On 2/20/06, Bengt Richter wrote:
>> How about doing it as an expression, empowering ( ;-) the dict just after creation? E.g., for
>>
>>     d = dict()
>>     d.default_factory = list
>>
>> you could write
>>
>>     d = dict()**list
>
> Bengt, can you let your overactive imagination rest for a while? I recommend that you sit back, relax for a season, and reflect on the zen nature of Pythonicity. Then come back and hopefully you'll be able to post without embarrassing yourself continuously.

It is tempting to seek vindication re "embarrassing yourself continuously" but I'll let it go, and treat it as an opportunity to explore the nature of my ego a little further ;-)

I am not embarrassed by having an overactive imagination, thank you, but if it is causing a problem for you here, I apologize, and will withdraw.

Thanks for the nudge. I really have been wasting a lot of time using python trivial pursuits as an escape from tackling stuff that I haven't been ready for. It's time I focus. Thanks, and good luck. I'll be off now ;-)

Regards,
Bengt Richter
Re: [Python-Dev] defaultdict proposal round three
Alex Martelli wrote:
> If we call the type autodict, then having the factory attribute named autofactory seems to fit.

Or just 'factory', since it's the only kind of factory the object is going to have.

--
Greg Ewing, Computer Science Dept, University of Canterbury, Christchurch, New Zealand
"Carpe post meridiam!" (I'm not a morning person.)
Re: [Python-Dev] defaultdict proposal round three
Fuzzyman wrote:
> I've had problems in code that needs to treat strings, lists and dictionaries differently (assigning values to a container where all three need different handling) and telling the difference but allowing duck typing is *problematic*.

You need to rethink your design so that you don't have to make that kind of distinction.

--
Greg Ewing, Computer Science Dept, University of Canterbury, Christchurch, New Zealand
"Carpe post meridiam!" (I'm not a morning person.)
Re: [Python-Dev] defaultdict proposal round three
> Alex Martelli wrote:
>> If we call the type autodict, then having the factory attribute named autofactory seems to fit.
>
> Or just 'factory', since it's the only kind of factory the object is going to have.

Gack, no. You guys are drifting towards complete ambiguity. You might as well call it thingie_that_doth_return_an_object. The word factory by itself says nothing about lookups and default values.

Like autodict could mean anything. Keep in mind that we may well end up having this side-by-side with collections.ordered_dict. The word auto tells you nothing about how this is different from a regular dict or ordered dictionary. It's meaningless.

Please, stick with defaultdictionary and default_factory. While not perfectly descriptive, they suggest just enough to jog the memory and make the code readable. Try to resist generalizing the name into nothingness.

Raymond
Re: [Python-Dev] defaultdict proposal round three
On Feb 20, 2006, at 5:41 AM, Guido van Rossum wrote:
...
> Alternative A: add a new method to the dict type with the semantics of __getattr__ from the last proposal, using default_factory if not None (except on_missing is inlined). This avoids the discussion about broken invariants, but one could argue that it adds to an already overly broad API.
>
> Alternative B: provide a dict subclass that implements the __getattr__ semantics from the last proposal. It could be an unrelated type for all I care, but I do care about implementation inheritance since it should perform just as well as an unmodified dict object, and that's hard to do without sharing implementation (copying would be worse).

Let's do both!...;-). Add a method X to dict as per A _and_ provide in collections a subclass of dict that sets __getattr__ to X and also takes the value of default_dict as the first mandatory argument to __init__.

Yes, mapping is a fat interface, chock full of convenience methods, but that's what makes it OK to add another, when it's really convenient; and nearly nobody's been arguing against defaultdict, only about the details of its architecture, so the convenience of this X can be taken as established. As long as DictMixin changes accordingly, the downsides are small.

Also having a collections.defaultdict as well as method X would be my preference, for even more convenience.

From my POV, either or both of these additions would be an improvement wrt 2.4 (as would most of the other alternatives debated here), but I'm keen to have _some_ alternative get in, rather than all being blocked out of 2.5 by analysis paralysis.

Alex
Re: [Python-Dev] defaultdict proposal round three
On Mon, 20 Feb 2006 05:41:43 -0800, Guido van Rossum [EMAIL PROTECTED] wrote: I'm withdrawing the last proposal. I'm not convinced by the argument that __contains__ should always return True (perhaps it should also insert the value?), nor by the complaint that a holy invariant would be violated (so what?). But the amount of discussion and the number of different viewpoints present makes it clear that the feature as I last proposed would be forever divisive. I see two alternatives. These will cause a different kind of philosophical discussion; so be it. I'll describe them relative to the last proposal; for those who wisely skipped the last thread, here's a link to the proposal: http://mail.python.org/pipermail/python-dev/2006-February/061261.html. Alternative A: add a new method to the dict type with the semantics of __getattr__ from the last proposal, using default_factory if not None (except on_missing is inlined). This avoids the discussion about broken invariants, but one could argue that it adds to an already overly broad API. Alternative B: provide a dict subclass that implements the __getattr__ semantics from the last proposal. It could be an unrelated type for all I care, but I do care about implementation inheritance since it should perform just as well as an unmodified dict object, and that's hard to do without sharing implementation (copying would be worse). Parting shots: - Even if the default_factory were passed to the constructor, it still ought to be a writable attribute so it can be introspected and modified. A defaultdict that can't change its default factory after its creation is less useful. - It would be unwise to have a default value that would be called if it was callable: what if I wanted the default to be a class instance that happens to have a __call__ method for unrelated reasons? 
You'd have to put it in a lambda: thing_with_unrelated__call__method Callability is an elusive propperty; APIs should not attempt to dynamically decide whether an argument is callable or not. - A third alternative would be to have a new method that takes an explicit defaut factory argument. This differs from setdefault() only in the type of the second argument. I'm not keen on this; the original use case came from an example where the readability of d.setdefault(key, []).append(value) was questioned, and I'm not sure that d.something(key, list).append(value) is any more readable. IOW I like (and I believe few have questioned) associating the default factory with the dict object instead of with the call site. Let the third round of the games begin! Sorry if I missed it, but is it established that defaulting lookup will be spelled the same as traditional lookup, i.e. d[k] or d.__getitem__(k) ? IOW, are default-enabled dicts really going to be be passed into unknown contexts for use as a dict workalike? I can see using on_missing for external side effects like logging etc., or _maybe_ modifying the dict with a known separate set of keys that wouldn't be used for the normal purposes of the dict. ISTM a defaulting dict could only reasonably be passed into contexts that expected it, but that could still be useful also. How about d = dict() for a totally normal dict, and d.defaulting to get a view that uses d.default_factory if present? E.g., d = dict() d.default_factory = list for i,name in enumerate('Eeny Meeny Miny Moe'.split()): # prefix insert order d.defaulting[name].append(i) # or hoist d.defaulting = dd[name].append(i) Maybe d.defaulting could be a descriptor? If the above were done, could d.on_missing be independent and always active if present? E.g., d.on_missing = lambda self, key: self.__setitem__(key, 0) or 0 would be allowed to work on its own first, irrespective of whether default_factory was set. 
If it created d[key] it would effectively override default_factory if active, and if not active, it would still act, letting you instrument a normal dict with special effects. Of course, if you wanted to write an on_missing handler to use default_factory like your original example, you could. So on_missing would always trigger if present, for missing keys, but d.defaulting[k] would only call d.default_factory if the latter was set and the key was missing even after on_missing (if present) did something (e.g., it could be logging passively). Regards, Bengt Richter
Re: [Python-Dev] defaultdict proposal round three
On 2/20/06, Raymond Hettinger wrote: [GvR] Alternative A: add a new method to the dict type with the semantics of __getattr__ from the last proposal Did you mean __getitem__? Yes, sorry, I meant __getitem__. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] defaultdict proposal round three
On Feb 20, 2006, at 8:35 AM, Raymond Hettinger wrote: [GvR] I'm not convinced by the argument that __contains__ should always return True Me either. I cannot think of a more useless behavior or one more likely to have unexpected consequences. Besides, as Josiah pointed out, it is much easier for a subclass override to substitute always True return values than vice-versa. Agreed on all counts. I prefer this approach over subclassing. The mental load from an additional method is less than the load from a separate type (even a subclass). Also, avoidance of invariant issues is a big plus. Besides, if this allows setdefault() to be deprecated, it becomes an all-around win. I'd love to remove setdefault in 3.0 -- but I don't think it can be done before that: default_factory won't cover the occasional use cases where setdefault is called with different defaults at different locations, and, rare as those cases may be, any 2.* should not break any existing code that uses that approach. - Even if the default_factory were passed to the constructor, it still ought to be a writable attribute so it can be introspected and modified. A defaultdict that can't change its default factory after its creation is less useful. Right! My preference is to have default_factory not passed to the constructor, so we are left with just one way to do it. But that is a nit. No big deal either way, but I see passing the default factory to the ctor as the one obvious way to do it, so I'd rather have it (be it with a subclass or a classmethod-alternate constructor). I won't weep bitter tears if this drops out, though. - It would be unwise to have a default value that would be called if it was callable: what if I wanted the default to be a class instance that happens to have a __call__ method for unrelated reasons? Callability is an elusive property; APIs should not attempt to dynamically decide whether an argument is callable or not. 
That makes sense, though it seems over-the-top to need a zero-factory for a multiset. But int is a convenient zero-factory. An alternative is to have two possible attributes: d.default_factory = list or d.default_value = 0 with an exception being raised when both are defined (the test is done when the attribute is created, not when the lookup is performed). I see default_value as a way to get exactly the same beginner's error we already have with function defaults: a mutable object will not work as beginners expect, and we can confidently predict (based on the function defaults case) that python-list and python-help and python-tutor and a bazillion other venues will see an unending stream of confused beginners (in addition to those confused by mutable objects as default values for function arguments, but those can't be avoided). I presume you consider the one obvious way is to use default_value for immutables and default_factory for mutables, but based on a lot of experience teaching Python I feel certain that this won't be obvious to many, MANY users (and not just non-Dutch ones, either). Alex
Re: [Python-Dev] defaultdict proposal round three
Guido van Rossum wrote: Alternative A: add a new method to the dict type with the semantics of __getattr__ from the last proposal, using default_factory if not None (except on_missing is inlined). I'm not certain I understood this right but (after s/__getattr__/__getitem__) this seems to suggest that for keeping a dict of counts the code wouldn't really improve much: dd = {} dd.default_factory = int for item in items: # I want to do ``dd[item] += 1`` but with a regular method instead # of __getitem__, this is not possible dd[item] = dd.somenewmethod(item) + 1 I don't think that's much better than just calling ``dd.get(item, 0)``. Did I misunderstand Alternative A? Alternative B: provide a dict subclass that implements the __getattr__ semantics from the last proposal. If I didn't misinterpret Alternative A, I'd definitely prefer Alternative B. A dict of counts is by far my most common use case... STeVe -- Grammar am for people who can't think for myself. --- Bucky Katt, Get Fuzzy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] defaultdict proposal round three
On 2/20/06, Steven Bethard [EMAIL PROTECTED] wrote: Guido van Rossum wrote: Alternative A: add a new method to the dict type with the semantics of [__getitem__] from the last proposal, using default_factory if not None (except on_missing is inlined). I'm not certain I understood this right but [...] this seems to suggest that for keeping a dict of counts the code wouldn't really improve much: You don't need a new feature for that use case; d[k] = d.get(k, 0) + 1 is perfectly fine there and hard to improve upon. It's the slightly more esoteric use case where the default is a list and you want to append to that list that we're trying to improve: currently the shortest version is d.setdefault(k, []).append(v) but that lacks legibility and creates an empty list that is thrown away most of the time. We're trying to obtain the minimal form d.foo(k).append(v) where the new list is created by implicitly calling d.default_factory if d[k] doesn't yet exist, and d.default_factory is set to the list constructor. -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
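[Editorial sketch, not part of the original message.] The contrast Guido draws between the setdefault() idiom and a factory attached to the dict can be tried out today. The d.foo(k) method from the thread is hypothetical and was never added; as a stand-in, this assumes the collections.defaultdict type that later shipped, where plain indexing plays that role. The `pairs` data is made up for illustration.

```python
from collections import defaultdict

pairs = [("a", 1), ("b", 2), ("a", 3)]  # made-up sample data

# Current idiom: a fresh empty list is built on every call,
# even when the key already exists and the list is thrown away.
d = {}
for k, v in pairs:
    d.setdefault(k, []).append(v)

# Factory idiom: list() is only called when the key is missing.
dd = defaultdict(list)
for k, v in pairs:
    dd[k].append(v)
```

Both loops build the same mapping; the difference is where the default lives (call site versus the dict object) and how often the empty list is created.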
Re: [Python-Dev] defaultdict proposal round three
Sorry to chime in so late, but why are we setting a value when the key isn't defined? It seems there are many situations where you want: a) default values, and b) the ability to determine if a value was defined. There are many times that I want d[key] to give me a value even when it isn't defined, but that doesn't always mean I want to _save_ that value in the dict. Sometimes I do, sometimes I don't. We should have some means of describing this in any defaultdict implementation. On 2/20/06, Guido van Rossum wrote: I'm withdrawing the last proposal. [...] -- Crutcher Dunnavant littlelanguages.com monket.samedi-studios.com
Re: [Python-Dev] defaultdict proposal round three
I'm thinking something much closer to this (note default_factory gets the key):

    def on_missing(self, key):
        if self.default_factory is not None:
            value = self.default_factory(key)
            if self.on_missing_define_key:
                self[key] = value
            return value
        raise KeyError(key)

On 2/20/06, Crutcher Dunnavant wrote: Sorry to chime in so late, but why are we setting a value when the key isn't defined? [...] -- Crutcher Dunnavant littlelanguages.com monket.samedi-studios.com
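[Editorial sketch, not part of the original message.] Crutcher's on_missing variant, where the factory receives the key and storing the value is optional, can be approximated with a plain dict subclass. This is only a sketch under assumptions: the thread's hook is called on_missing, whereas released Python spells the hook `__missing__`; the class name `KeyedDefaultDict` and the `store_missing` flag (standing in for on_missing_define_key) are hypothetical names chosen for illustration.

```python
class KeyedDefaultDict(dict):
    """Hypothetical sketch: the factory receives the key; storing is optional."""

    def __init__(self, default_factory=None, store_missing=True):
        super().__init__()
        self.default_factory = default_factory
        self.store_missing = store_missing  # stands in for on_missing_define_key

    def __missing__(self, key):
        # dict.__getitem__ calls this hook only when the key is absent.
        if self.default_factory is None:
            raise KeyError(key)
        value = self.default_factory(key)
        if self.store_missing:
            self[key] = value
        return value


# Look up without saving: the key stays absent afterwards.
lengths = KeyedDefaultDict(len, store_missing=False)
probe = lengths["spam"]

# Look up and save: the computed value is inserted.
stored = KeyedDefaultDict(len)
stored["eggs"]
```

With `store_missing=False` a membership test still reports whether a value was really defined, which is the distinction the message asks for.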
Re: [Python-Dev] defaultdict proposal round three
Alex Martelli wrote: I prefer this approach over subclassing. The mental load from an additional method is less than the load from a separate type (even a subclass). Also, avoidance of invariant issues is a big plus. Besides, if this allows setdefault() to be deprecated, it becomes an all-around win. I'd love to remove setdefault in 3.0 -- but I don't think it can be done before that: default_factory won't cover the occasional use cases where setdefault is called with different defaults at different locations, and, rare as those cases may be, any 2.* should not break any existing code that uses that approach. Would it be deprecated in 2.*, or start deprecating in 3.0? Also, is default_factory=list threadsafe in the same way .setdefault is? That is, you can safely do this from multiple threads: d.setdefault(key, []).append(value) I believe this is safe with very few caveats -- setdefault itself is atomic (or else I'm writing some bad code ;). My impression is that default_factory will not generally be threadsafe in the way setdefault is. For instance: def make_list(): return [] d = dict d.default_factory = make_list # from multiple threads: d.getdef(key).append(value) This would not be correct (a value can be lost if two threads concurrently enter make_list for the same key). In the case of default_factory=list (using the list builtin) is the story different? Will this work on Jython, IronPython, or PyPy? Will this be a documented guarantee? Or alternately, are we just creating a new way to punish people who use threads? And if we push threadsafety up to user code, are we trading a very small speed issue (creating lists that are thrown away) for a much larger speed issue (acquiring a lock)? I tried to make a test for this threadsafety, actually -- using a technique besides setdefault which I knew was bad (try:except KeyError:). And (except using time.sleep(), which is cheating), I wasn't actually able to trigger the bug. 
Which is frustrating, because I know the bug is there. So apparently threadsafety is hard to test in this case. (If anyone is interested in trying it, I can email what I have.) Note that multidict -- among other possible concrete collection patterns (like Bag, OrderedDict, or others) -- can be readily implemented with threading guarantees. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] defaultdict proposal round three
Steven Bethard wrote: Alternative A: add a new method to the dict type with the semantics of __getattr__ from the last proposal, using default_factory if not None (except on_missing is inlined). I'm not certain I understood this right but (after s/__getattr__/__getitem__) this seems to suggest that for keeping a dict of counts the code wouldn't really improve much: dd = {} dd.default_factory = int for item in items: # I want to do ``dd[item] += 1`` but with a regular method instead # of __getitem__, this is not possible dd[item] = dd.somenewmethod(item) + 1 This would be better done with a bag (a set that can contain multiple instances of the same item): dd = collections.Bag() for item in items: dd.add(item) Then to see how many there are of an item, perhaps something like: dd.count(item) No collections.Bag exists, but of course one should. It has nice properties -- inclusion is done with __contains__ (with dicts it probably has to be done with get), you can't accidentally go below zero, the methods express intent, and presumably it will implement only a meaningful set of methods. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
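[Editorial sketch, not part of the original message.] No collections.Bag was ever added, but the collections.Counter class that arrived later covers much of what Ian describes, so it is used here as an approximate stand-in. Two differences from his description are worth flagging: Counter has no add() method (indexing is used instead), and its counts can go below zero. The `items` list is made up for illustration.

```python
from collections import Counter

items = ["spam", "eggs", "spam"]  # made-up sample data

bag = Counter()
for item in items:
    bag[item] += 1  # Counter has no add(); indexing does the job

spam_count = bag["spam"]
missing_count = bag["ham"]  # reading a missing item gives 0 and does not insert it
```

Membership tests work through __contains__ as Ian wanted, and reading a count for an unseen item leaves the container unchanged.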
Re: [Python-Dev] defaultdict proposal round three
On 2/20/06, Ian Bicking wrote: Would it be deprecated in 2.*, or start deprecating in 3.0? 3.0 will have no backwards compatibility allowances. Whenever someone says "remove this in 3.0" they mean exactly that. There will be too many incompatibilities in 3.0 to be bothered with deprecating them all; most likely we'll have to have some kind of (semi-)automatic conversion tool. Deprecation in 2.x is generally done to indicate that a feature will be removed in 2.y for y = x+1. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] defaultdict proposal round three
On Feb 20, 2006, at 12:33 PM, Guido van Rossum wrote: ... You don't need a new feature for that use case; d[k] = d.get(k, 0) + 1 is perfectly fine there and hard to improve upon. I see d[k]+=1 as a substantial improvement -- conceptually more direct, I've now seen one more k than I had seen before. Alex
Re: [Python-Dev] defaultdict proposal round three
On 2/20/06, Ian Bicking wrote: Also, is default_factory=list threadsafe in the same way .setdefault is? That is, you can safely do this from multiple threads: d.setdefault(key, []).append(value) I believe this is safe with very few caveats -- setdefault itself is atomic (or else I'm writing some bad code ;).

Only if the key is a string and all values in the dict are also strings (or other builtins). And I don't think that Jython or IronPython promise anything here. Here's a sketch of a situation that isn't thread-safe:

    class C:
        def __eq__(self, other):
            return False
        def __hash__(self):
            return hash("abc")

    d = {C(): 42}
    print d["abc"]

Because "abc" and C() have the same hash value, the lookup will compare "abc" to C() which will invoke C.__eq__().

Why are you so keen on using a dictionary to share data between threads that may both modify it? IMO this is asking for trouble -- the advice about sharing data between threads is always to use the Queue module.

[...] Note that multidict -- among other possible concrete collection patterns (like Bag, OrderedDict, or others) -- can be readily implemented with threading guarantees.

I don't believe that this is as easy as you think. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] defaultdict proposal round three
On 2/20/06, Alex Martelli [EMAIL PROTECTED] wrote: On Feb 20, 2006, at 12:33 PM, Guido van Rossum wrote: ... You don't need a new feature for that use case; d[k] = d.get(k, 0) + 1 is perfectly fine there and hard to improve upon. I see d[k]+=1 as a substantial improvement -- conceptually more direct, I've now seen one more k than I had seen before. Yes, I now agree. This means that I'm withdrawing proposal A (new method) and championing only B (a subclass that implements __getitem__() calling on_missing() and on_missing() defined in that subclass as before, calling default_factory unless it's None). I don't think this crisis is big enough to need *two* solutions, and this example shows B's superiority over A. -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
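[Editorial sketch, not part of the original message.] The counting example that tipped the decision toward the subclass can be shown side by side. This assumes the collections.defaultdict type that eventually shipped as the outcome of proposal B; the `words` list is made up for illustration.

```python
from collections import defaultdict

words = "eeny meeny miny moe eeny".split()  # made-up sample data

# Plain dict: the key has to be spelled twice.
counts = {}
for w in words:
    counts[w] = counts.get(w, 0) + 1

# Subclass with a factory: int() returns 0, so += works on a missing key.
dcounts = defaultdict(int)
for w in words:
    dcounts[w] += 1
```

The augmented assignment works because the lookup half of `+=` goes through __getitem__, which is exactly where the subclass supplies the default; a separate method (proposal A) could not be used that way.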
Re: [Python-Dev] defaultdict proposal round three
Guido van Rossum wrote: Why are you so keen on using a dictionary to share data between threads that may both modify it? IMO this is asking for trouble -- the advice about sharing data between threads is always to use the Queue module. I use them often for shared caches. But yeah, it's harder than I thought at first -- I think the actual cases I'm using work, since they use simple keys (ints, strings), but yeah, thread guarantees are too difficult to handle in general. Damn threads. -- Ian Bicking / http://blog.ianbicking.org
Re: [Python-Dev] defaultdict proposal round three
I wrote: # I want to do ``dd[item] += 1``

Guido van Rossum wrote: You don't need a new feature for that use case; d[k] = d.get(k, 0) + 1 is perfectly fine there and hard to improve upon.

Alex Martelli wrote: I see d[k]+=1 as a substantial improvement -- conceptually more direct, I've now seen one more k than I had seen before.

Guido van Rossum wrote: Yes, I now agree. This means that I'm withdrawing proposal A (new method) and championing only B (a subclass that implements __getitem__() calling on_missing() and on_missing() defined in that subclass as before, calling default_factory unless it's None).

Probably already obvious from my previous post, but FWIW, +1.

Two unaddressed issues:

* What module should hold the type? I hope the collections module isn't too controversial.

* Should default_factory be an argument to the constructor? The three answers I see:

  - No. I'm not a big fan of this answer. Since the whole point of creating a defaultdict type is to provide a default, requiring two statements (the constructor call and the default_factory assignment) to initialize such a dictionary seems a little inconvenient.

  - Yes and it should be followed by all the normal dict constructor arguments. This is okay, but a few errors, like ``defaultdict({1:2})`` will pass silently (until you try to use the dict, of course).

  - Yes and it should be the only constructor argument. This is my favorite mainly because I think it's simple, and I couldn't think of good examples where I really wanted to do ``defaultdict(list, some_dict_or_iterable)`` or ``defaultdict(list, **some_keyword_args)``. It's also forward compatible if we need to add some of the dict constructor args in later.

STeVe -- Grammar am for people who can't think for myself. --- Bucky Katt, Get Fuzzy
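[Editorial sketch, not part of the original message.] For comparison with the three constructor options listed above: the defaultdict that later shipped in the collections module takes the factory first and then whatever dict() itself accepts, which corresponds to the second option. The sketch below only exercises that released behaviour; the keys and values are made up for illustration.

```python
from collections import defaultdict

# Factory first, then anything dict() itself would accept.
dd = defaultdict(list, {"k1": ["v1"]}, k2=["v2"])
dd["k3"].append("v3")

# The factory stays a writable attribute after construction.
dd.default_factory = set
dd["k4"].add("v4")

# Passing a dict where the factory belongs is rejected up front
# rather than passing silently.
try:
    defaultdict({1: 2})
    rejected = False
except TypeError:
    rejected = True
```

In the released type the ``defaultdict({1:2})`` mistake raises TypeError at construction time, so the silent-failure concern raised for the second option does not arise there.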
Re: [Python-Dev] defaultdict proposal round three
On 2/20/06, Steven Bethard wrote: [...] * Should default_factory be an argument to the constructor? The three answers I see: [...] - Yes and it should be the only constructor argument. This is my favorite mainly because I think it's simple, and I couldn't think of good examples where I really wanted to do ``defaultdict(list, some_dict_or_iterable)`` or ``defaultdict(list, **some_keyword_args)``. It's also forward compatible if we need to add some of the dict constructor args in later. While #3 is my preferred solution as well, it does pose a Liskov violation if this is a direct dict subclass instead of storing a dict internally (can't remember the name of the design pattern that does this). But I think it is good to have the constructor be different since it does also help drive home the point that this is not a standard dict. -Brett
Re: [Python-Dev] defaultdict proposal round three
On 2/20/06, Raymond Hettinger wrote: An alternative is to have two possible attributes: d.default_factory = list or d.default_value = 0 with an exception being raised when both are defined (the test is done when the attribute is created, not when the lookup is performed). Why not have the factory function take the key being looked up as an argument? Seems like there would be uses to customize the default based on the key. It also forces list factory functions and constant factory functions (amongst others) to be handled the same way: d.default_factory = lambda k : list() d.default_factory = lambda k : 0 Dan Gass
Re: [Python-Dev] defaultdict proposal round three
[Steven Bethard] * Should default_factory be an argument to the constructor? The three answers I see: - No. I'm not a big fan of this answer. Since the whole point of creating a defaultdict type is to provide a default, requiring two statements (the constructor call and the default_factory assignment) to initialize such a dictionary seems a little inconvenient. You still have to allow assignments to the default_factory attribute to allow the factory to be changed: dd.default_factory = SomeFactory If it's too much effort to do the initial setup in two lines, a classmethod could serve as an alternate constructor (leaving the regular constructor fully interchangeable with dicts): dd = defaultdict.setup(list, {'k1': 'v1', 'k2': 'v2'}) or when there are no initial values: dd = defaultdict.setup(list) Raymond
Re: [Python-Dev] defaultdict proposal round three
On 2/20/06, Dan Gass [EMAIL PROTECTED] wrote: Why not have the factory function take the key being looked up as an argument? Seems like there would be uses to customize the default based on the key. It also forces you to handle list factory functions and constant factory functions (amongst others) to be handled the same way: d.default_factory = lambda k : list() d.default_factory = lambda k : 0 Guido's currently backing a subclass that implements __getitem__() calling on_missing() and on_missing() ... calling default_factory unless it's None. I think for 90% of the use-cases, you don't need a key argument. If you do, you should subclass defaultdict and override the on_missing() method. STeVe -- Grammar am for people who can't think for myself. --- Bucky Katt, Get Fuzzy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] defaultdict proposal round three
On 2/20/06, Brett Cannon [EMAIL PROTECTED] wrote: While #3 is my preferred solution as well, it does pose a Liskov violation if this is a direct dict subclass instead of storing a dict internally (can't remember the name of the design pattern that does this). But I think it is good to have the constructor be different since it does also help drive home the point that this is not a standard dict. I've heard this argument a few times now from different folks and I'm tired of it. It's wrong. It's not true. It's a dead argument. It's pushing up the daisies, so to speak. Please stop abusing Barbara Liskov's name and remember that the constructor signature is *not* part of the interface to an instance! Changing the constructor signature in a subclass does *not* cause *any* Liskov violations because the constructor is not called by *users* of the object -- it is only called to *create* an object. As the *user* of an object you're not allowed to *create* another instance (unless the object provides an explicit API to do so, of course, in which case you deal with that API's signature, not with the constructor). -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] defaultdict proposal round three
On 2/20/06, Dan Gass [EMAIL PROTECTED] wrote:
> Why not have the factory function take the key being looked up as an argument?

This was considered and rejected already. You can already customize based on the key by overriding on_missing() [*]. If the factory were to take a key argument, we couldn't use list or int as the factory function; we'd have to write lambda key: list(). There aren't that many use cases for having the factory function depend on the key anyway; it's mostly on_missing() that needs the key so it can insert the new value into the dict.

[*] Earlier in this thread I wrote that on_missing() could be inlined. I take that back; I think it's better to have it be available explicitly so you can override it without having to override __getitem__(). This is faster, assuming most __getitem__() calls find the key already inserted, and reduces the amount of code you have to write to customize the behavior; it also reduces worries about how to call the superclass __getitem__ method (catching KeyError *might* catch an unrelated KeyError caused by a bug in the key's __hash__ or __eq__ method).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
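The trade-off described here can be sketched as follows. One assumption to note: the hook is called on_missing() in this thread, but the released collections.defaultdict spells it __missing__(), so that is the name used below. The class name `KeyAwareDict` is hypothetical.

```python
from collections import defaultdict


class KeyAwareDict(defaultdict):
    """Hypothetical subclass whose factory receives the missing key."""

    def __missing__(self, key):
        # Released spelling of the hook called on_missing() in the thread.
        if self.default_factory is None:
            raise KeyError(key)
        value = self.default_factory(key)  # the factory sees the key here
        self[key] = value                  # insert, as the base class does
        return value


# The common case: zero-argument factories such as list or int work directly.
lists = defaultdict(list)
lists["a"].append(1)

# The rarer case: a default that depends on the key needs the subclass.
squares = KeyAwareDict(lambda k: k * k)
print(squares[7], 7 in squares, 8 in squares)
```

Keeping the factory zero-argument means `list` and `int` can be passed as-is; only key-dependent defaults pay the cost of a subclass.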
Re: [Python-Dev] defaultdict proposal round three
Guido van Rossum wrote:
> I see two alternatives.

Have you considered the third alternative that's been mentioned -- a wrapper?

The issue of __contains__ etc. could be sidestepped by not giving the wrapper a __contains__ method at all. If you want to do an 'in' test you do it on the underlying dict, and then the semantics are clear.

Greg
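One way the wrapper idea might look, as a sketch only. The class name `DefaultingView` and its attributes are hypothetical, and nothing like it was part of the patch. It also assumes Python 3.6 or later, where setting `__contains__` to None makes `in` raise TypeError; merely omitting the method would let `in` fall back to calling `__getitem__` with integer indices.

```python
class DefaultingView:
    """Hypothetical wrapper around a plain dict; not a dict subclass."""

    # Assumption: on Python 3.6+ this makes `key in view` raise TypeError
    # instead of falling back to the sequence protocol via __getitem__.
    __contains__ = None

    def __init__(self, data, default_factory):
        self.data = data                    # the underlying plain dict
        self.default_factory = default_factory

    def __getitem__(self, key):
        if key in self.data:
            return self.data[key]
        value = self.data[key] = self.default_factory()
        return value

    def __setitem__(self, key, value):
        self.data[key] = value


counts = {}
view = DefaultingView(counts, int)
view["x"] += 1

# Membership is asked of the underlying dict, where the semantics are clear.
print("x" in counts, "y" in counts)
```

Membership tests go to `counts` directly, so there is no question of whether a defaulted key counts as present.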
Re: [Python-Dev] defaultdict proposal round three
Brett Cannon wrote:
> While #3 is my preferred solution as well, it does pose a Liskov violation if this is a direct dict subclass

I'm not sure we should be too worried about that. Inheritance in Python has always been more about implementation than interface, so Liskov doesn't really apply in the same way it does in statically typed languages.

In other words, just because A inherits from B in Python isn't meant to imply that an A is a drop-in replacement for a B.

Greg
Re: [Python-Dev] defaultdict proposal round three
Greg Ewing wrote:
> In other words, just because A inherits from B in Python isn't meant to imply that an A is a drop-in replacement for a B.

Hmm - this is interesting. I'm not arguing Liskov violations or anything ...

However, *because* Python uses duck typing, I tend to feel that subclasses in Python *should* be drop-in replacements. If it's not a drop-in replacement, then it should probably not subclass, but just use duck typing (probably by wrapping).

Subclassing implies a stronger relationship to me. Which is why I think I prefer using a wrapper for a default dict, rather than a subclass.

Tim Delaney
Re: [Python-Dev] defaultdict proposal round three
Delaney, Timothy (Tim) wrote:
> However, *because* Python uses duck typing, I tend to feel that subclasses in Python *should* be drop-in replacements. If it's not a drop-in replacement, then it should probably not subclass, but just use duck typing (probably by wrapping).

Inheritance is more about code reuse than about polymorphism.

Regards,
Martin
Re: [Python-Dev] defaultdict proposal round three
On Feb 20, 2006, at 3:04 PM, Brett Cannon wrote:
> ...
> - Yes and it should be the only constructor argument. This is my
> ...
> While #3 is my preferred solution as well, it does pose a Liskov violation if this is a direct dict subclass instead of storing a dict

How so? Liskov's principle is (in her own words):

    If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2 then S is a subtype of T.

How can this ever be broken by the mere presence of incompatible signatures for T's and S's ctors?

I believe the principle, as stated above, was imperfectly stated, btw (it WAS preceded by "something like the following substitution property", indicating that Liskov was groping towards a good formulation), but that's an aside -- the point is that the principle is about substitution of _objects_, i.e., _instances_ of the types S and T, not about substitution of the _types_ themselves for each other. Instances exist and are supposed to satisfy their invariants _after_ ctors are done executing; ctor's signatures don't matter.

In Python, of course, you _could_ call type(o2)(...) and possibly get different behavior if that was changed into type(o1)(...) -- the curse of powerful introspection;-). But then, isn't it trivial to obtain cases in which the behavior is NOT unchanged? If it was always unchanged, what would be the point of ever subclassing?-) Say that o2 is an int and o1 is a bool -- just a "print o2" already breaks the principle as stated (it's harder to get a simpler P than this...).

Unless you have explicitly documented invariants (such as any 'print o' must emit 1+ digits followed by a newline for integers), you cannot say that some alleged subclass is breaking Liskov's property, in general. Mere change of behavior in the most general case cannot qualify, if method overriding is to be any use; such change IS traditionally allowed as long as preconditions are looser and postconditions are stricter; and I believe that in any real-world subclassing, with sufficient introspection you'll always find a violation.

E.g., a subtype IS allowed to add methods, by Liskov's specific example; but then, len(dir(o1)) cannot fail to be a higher number than len(dir(o2)), from which you can easily construct a P which changes behavior for any definition you care to choose. E.g., pick constant N as the len(dir(...)) for instances of type T, and say that M>N is the len(dir(...)) for instances of S. Well, then, math.sqrt(N-len(dir(o2))) is well defined -- but change o2 into o1, and since N-M is <0, you'll get an exception.

If you can give an introspection-free example showing how Liskov substitution would be broken by a mere change to incompatible signature in the ctor, I'll be grateful; but I don't think it can be done.

Alex
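The len(dir(...)) argument above can be run directly. This is a minimal sketch with hypothetical class names `T` and `S` and a hypothetical function `program` standing in for P. It shows only that introspection can always distinguish a subtype that adds a method, not that anything is wrong with the subtype.

```python
import math


class T:
    def ping(self):
        return "ping"


class S(T):
    def pong(self):  # a subtype is allowed to add methods
        return "pong"


N = len(dir(T()))  # attribute count for instances of T
M = len(dir(S()))  # larger than N, because S adds `pong`


def program(o):
    """Stand-in for P: well defined for every instance of T itself."""
    return math.sqrt(N - len(dir(o)))


print(program(T()))
try:
    program(S())
except ValueError as exc:
    # math.sqrt rejects the negative argument N - M
    print("substituting an S instance changes P's behavior:", exc)
```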
Re: [Python-Dev] defaultdict proposal round three
[Alex]
> I see d[k]+=1 as a substantial improvement -- conceptually more direct, I've now seen one more k than I had seen before.

[Guido]
> Yes, I now agree. This means that I'm withdrawing proposal A (new method) and championing only B (a subclass that implements __getitem__() calling on_missing() and on_missing() defined in that subclass as before, calling default_factory unless it's None). I don't think this crisis is big enough to need *two* solutions, and this example shows B's superiority over A.

FWIW, I'm happy with the proposal and think it is a nice addition to Py2.5.

Raymond
Re: [Python-Dev] defaultdict proposal round three
On Feb 20, 2006, at 5:05 PM, Raymond Hettinger wrote:
> [Alex]
>> I see d[k]+=1 as a substantial improvement -- conceptually more direct, I've now seen one more k than I had seen before.
>
> [Guido]
>> Yes, I now agree. This means that I'm withdrawing proposal A (new method) and championing only B (a subclass that implements __getitem__() calling on_missing() and on_missing() defined in that subclass as before, calling default_factory unless it's None). I don't think this crisis is big enough to need *two* solutions, and this example shows B's superiority over A.
>
> FWIW, I'm happy with the proposal and think it is a nice addition to Py2.5.

OK, sounds great to me. collections.defaultdict, then?

Alex
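For reference, the d[k] += 1 idiom that tipped the decision, sketched against collections.defaultdict as it later shipped. The sample word list is made up for illustration.

```python
from collections import defaultdict

# Illustrative input only.
words = "the quick brown fox jumps over the lazy dog the end".split()

# With a plain dict, each update has to spell out the default.
counts_get = {}
for w in words:
    counts_get[w] = counts_get.get(w, 0) + 1

# With a defaulting subclass, the update reads as "one more w".
counts_dd = defaultdict(int)
for w in words:
    counts_dd[w] += 1

print(counts_dd["the"], counts_get == counts_dd)
```

Both loops build the same mapping; the second simply drops the explicit default from every update.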
Re: [Python-Dev] defaultdict proposal round three
in two ways: 1) dict.get doesn't work for object dicts or in exec/eval contexts, and 2) dict.get requires me to generate the default value even if I'm not going to use it, a process which may be expensive.

On 2/20/06, Raymond Hettinger [EMAIL PROTECTED] wrote:
> [Crutcher Dunnavant]
>> There are many times that I want d[key] to give me a value even when it isn't defined, but that doesn't always mean I want to _save_ that value in the dict.
>
> How does that differ from the existing dict.get method?
>
> Raymond

--
Crutcher Dunnavant [EMAIL PROTECTED]
littlelanguages.com
monket.samedi-studios.com
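Point 2 above, and the wish to get a default without saving it, can be sketched with the hook as it was eventually released (__missing__ on a dict subclass). The names `LazyDefaultDict` and `expensive` are hypothetical, and this is not how collections.defaultdict behaves, since that class does store the value.

```python
class LazyDefaultDict(dict):
    """Hypothetical: d[key] yields a default for a missing key, unsaved."""

    def __init__(self, default_factory, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # Called by dict.__getitem__ only when the key is absent.
        return self.default_factory()


calls = []


def expensive():
    """Stand-in for a costly default; records each invocation."""
    calls.append(1)
    return "computed"


d = LazyDefaultDict(expensive, present="stored")

d.get("present", expensive())  # default built eagerly, then discarded
eager_calls = len(calls)

d["present"]                   # key exists: factory not called
value = d["absent"]            # key missing: factory called, nothing stored
print(eager_calls, len(calls), "absent" in d)
```

With dict.get the default is built before the lookup happens; with the hook it is built only on a miss.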
Re: [Python-Dev] defaultdict proposal round three
On 2/20/06, Guido van Rossum [EMAIL PROTECTED] wrote:
> [stuff with typos]

Here's the proofread version:

I have a patch ready that implements this. I've assigned it to Raymond for review. I'm just reusing the same SF patch as before: http://python.org/sf/1433928 .

One subtlety: for maximal flexibility and speed, the standard dict type now defines an on_missing(key) method; however this version *just* raises KeyError and the implementation actually doesn't call it unless the class is a subtype (with the possibility of overriding on_missing()).

collections.defaultdict overrides on_missing(key) to insert and return self.default_factory() if it is not None; otherwise it raises KeyError. (It should really call the base class on_missing() but I figured I'd just in-line it which is easier to code in C than a super-call.)

The defaultdict signature takes an optional positional argument which is the default_factory, defaulting to None. The remaining positional and all keyword arguments are passed to the dict constructor. IOW:

    d = defaultdict(list, [(1, 2)])

is equivalent to:

    d = defaultdict()
    d.default_factory = list
    d.update([(1, 2)])

At this point, repr(d) will be:

    defaultdict(<type 'list'>, {1: 2})

Once Raymond approves the patch I'll check it in.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] defaultdict proposal round three
Martin v. Löwis wrote:
> Delaney, Timothy (Tim) wrote:
>> However, *because* Python uses duck typing, I tend to feel that subclasses in Python *should* be drop-in replacements. If it's not a drop-in replacement, then it should probably not subclass, but just use duck typing (probably by wrapping).
>
> Inheritance is more about code reuse than about polymorphism.

Oh - it's definitely no hard-and-fast rule. However, I have found that *usually* people (including myself) only subclass when they want an "is-a" relationship, whereas duck typing is "behaves-like".

In any case, Guido has produced a patch, and the tone of his message sounded like a Pronouncement ...

Tim Delaney
Re: [Python-Dev] defaultdict proposal round three
[Guido]
> ...
> What's the practical use case for not wanting __contains__() to function?

I don't know. I have practical use cases for wanting __contains__() to function, but there's been no call for those. For an example, think of any real use ;-)

For example, I often use dicts to represent multisets, where a key maps to a strictly positive count of the number of times that key appears in the multiset. A default of 0 is the right thing to return for a key not in the multiset, so that M[k] += 1 works to add another k to multiset M regardless of whether k was already present.

I sure hope I can implement multiset intersection as, e.g.,

    def minter(a, b):
        if len(b) < len(a):  # make `a` the smaller, and iterate over it
            a, b = b, a
        result = defaultdict defaulting to 0, however that's spelled
        for k in a:
            if k in b:
                result[k] = min(a[k], b[k])
        return result

Replacing the loop nest with:

        for k in a:
            result[k] = min(a[k], b[k])

would be semantically correct so far as it goes, but pragmatically wrong: I maintain my strictly positive count invariant because consuming RAM to hold elements that aren't there can be a pragmatic disaster. (When `k` is in `a` but not in `b`, I don't want `k` to be stored in `result`)

I have other examples, but they come so easily it's better to leave that an exercise for the reader.
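A runnable version of the multiset example, assuming the placeholder line is spelled defaultdict(int) as in the released collections module. The function `naive_minter` and the sample multisets are additions for illustration; they show what the unguarded loop does to memory use.

```python
from collections import defaultdict


def minter(a, b):
    """Multiset intersection; the `in` guard keeps zero counts out."""
    if len(b) < len(a):  # make `a` the smaller, and iterate over it
        a, b = b, a
    result = defaultdict(int)
    for k in a:
        if k in b:  # a membership test does not insert a default
            result[k] = min(a[k], b[k])
    return result


def naive_minter(a, b):
    """Illustrative only: same loop without the guard."""
    result = defaultdict(int)
    for k in a:
        result[k] = min(a[k], b[k])  # b[k] inserts missing keys as 0
    return result


a = defaultdict(int, {"x": 2, "y": 1})
b = defaultdict(int, {"x": 5, "z": 3})

guarded = minter(a, b)
size_after_guarded = len(b)  # b is untouched by the guarded version

naive = naive_minter(a, b)
size_after_naive = len(b)    # b has grown: "y" was inserted with count 0

print(dict(guarded), dict(naive), size_after_guarded, size_after_naive)
```

The unguarded loop stores a zero count for "y" in both the result and `b`, which is exactly the broken invariant the message warns about.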