Re: [Python-Dev] Proposal: defaultdict

2006-03-01 Thread Gareth McCaughan
d.get(key, [], True).append(value)
  
  hmm.  are you sure you didn't just reinvent setdefault ?
 
 I'm reasonably sure I copied it on purpose, only with a name that isn't 100% 
 misleading as to what it does ;)

Heh. From the original Usenet posting that suggested the capability
that was added in the form of setdefault:

  | I suggest a minor change: another optional argument to
  | get so that
  | 
  | dict.get(item,default,flag)
  | 
  | is equivalent to
  | 
  | if dict.has_key(item):
  |   VALUE IS dict[item]
  | else:
  |   if flag: dict[item] = default   <-- This is all that's new
  |   VALUE IS default
  | 
  | but presumably more efficient.

The response was a chorus of people saying "Not a bad idea, but
that flag sucks. It needs a separate method." :-)
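
For comparison, the flag spelling from the quoted posting and the separate
method that grew out of it can be put side by side. This is only a sketch:
get_with_flag is a hypothetical helper written for illustration, not a real
dict method.

```python
def get_with_flag(d, key, default=None, flag=False):
    """Hypothetical three-argument get() from the quoted posting."""
    if key in d:
        return d[key]
    if flag:
        d[key] = default  # the only new behaviour: store the default
    return default

index = {}
get_with_flag(index, "a", [], True).append(1)  # flag spelling
index.setdefault("a", []).append(2)            # separate-method spelling
index.setdefault("b", []).append(3)
```

Both spellings return the stored list, so the append lands in the dict.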

-- 
g

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Proposal: defaultdict

2006-02-20 Thread Paul Moore
On 2/19/06, Steve Holden [EMAIL PROTECTED] wrote:
  You are missing the rationale of the PEP process. The point is
  *not* documentation. The point of the PEP process is to channel
  and collect discussion, so that the BDFL can make a decision.
  The BDFL is not bound at all to the PEP process.
 
  To document things, we use (or should use) documentation.
 
 
 One could wish this ideal had been the case for the import extensions
 defined in PEP 302.

(A bit off-topic, but that hit home, so I'll reply...)

Agreed, and it's my fault they weren't, to some extent. I did try to
find a suitable place, but the import docs are generally fairly
scattered, and there wasn't a particularly good place to put the
changes.

Any suggestions would be gratefully accepted...
Paul.


Re: [Python-Dev] Proposal: defaultdict

2006-02-20 Thread Aahz
On Sun, Feb 19, 2006, Josiah Carlson wrote:

 I agree, there is nothing perfect.  But at least in all of my use-cases,
 and the majority of the ones I've seen 'in the wild', my previous post
 provided an implementation that worked precisely like desired, and
 precisely like a regular dictionary, except when accessing a
 non-existent key via: value = dd[key].  __contains__, etc., all work
 exactly like they do with a non-defaulting dictionary. Iteration via
 popitem(), pop(key), items(), iteritems(), __iter__, etc., all work the
 way you would expect them. 

This is the telling point, IMO.  My company makes heavy use of a default
dict (actually, it's a default class because using constants as the
lookup keys is mostly what we do and the convenience of foo.bar is
compelling over foo['bar']).  Anyway, our semantics are as Josiah
outlines, and I can't see much use case for the alternatives.

Those of you arguing something different: do you have a real use case
(that you've implemented in real code)?
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

19. A language that doesn't affect the way you think about programming,
is not worth knowing.  --Alan Perlis


Re: [Python-Dev] Proposal: defaultdict

2006-02-20 Thread Adam Olsen
On 2/20/06, Aahz [EMAIL PROTECTED] wrote:
 On Sun, Feb 19, 2006, Josiah Carlson wrote:
 
  I agree, there is nothing perfect.  But at least in all of my use-cases,
  and the majority of the ones I've seen 'in the wild', my previous post
  provided an implementation that worked precisely like desired, and
  precisely like a regular dictionary, except when accessing a
  non-existent key via: value = dd[key].  __contains__, etc., all work
  exactly like they do with a non-defaulting dictionary. Iteration via
  popitem(), pop(key), items(), iteritems(), __iter__, etc., all work the
  way you would expect them.

 This is the telling point, IMO.  My company makes heavy use of a default
 dict (actually, it's a default class because using constants as the
 lookup keys is mostly what we do and the convenience of foo.bar is
 compelling over foo['bar']).  Anyway, our semantics are as Josiah
 outlines, and I can't see much use case for the alternatives.

Can you say, for the record (since nobody else seems to care), if
d.getorset(key, func) would work in your use cases?


 Those of you arguing something different: do you have a real use case
 (that you've implemented in real code)?

(again, for the record) getorset provides the minimum needed
functionality in a clean and intuitive way.  Why go for a complicated
solution when you simply don't need it?

--
Adam Olsen, aka Rhamphoryncus


Re: [Python-Dev] Proposal: defaultdict

2006-02-20 Thread Josiah Carlson

Adam Olsen [EMAIL PROTECTED] wrote:
 Can you say, for the record (since nobody else seems to care), if
 d.getorset(key, func) would work in your use cases?

It doesn't work for the multiset/accumulation case:

dd[key] += 1
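
The limitation can be seen in a short sketch. getorset below is a
hypothetical free function following the shape discussed in this thread,
and collections.defaultdict (the class later added to the standard library)
stands in for the defaulting dictionary.

```python
from collections import defaultdict

def getorset(d, key, func):
    """Hypothetical helper: return d[key], creating it with func() if absent."""
    if key not in d:
        d[key] = func()
    return d[key]

words = ["spam", "eggs", "spam"]

# Mutable values work: the returned list is the one stored in the dict.
groups = {}
for w in words:
    getorset(groups, w[0], list).append(w)

# Immutable values do not: incrementing the returned int only rebinds n.
counts = {}
for w in words:
    n = getorset(counts, w, int)
    n += 1  # counts[w] is still 0

# A mapping whose __getitem__ supplies the default supports += directly.
dd = defaultdict(int)
for w in words:
    dd[w] += 1
```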

 - Josiah



Re: [Python-Dev] Proposal: defaultdict

2006-02-20 Thread Aahz
On Mon, Feb 20, 2006, Adam Olsen wrote:
 On 2/20/06, Aahz [EMAIL PROTECTED] wrote:
 On Sun, Feb 19, 2006, Josiah Carlson wrote:

 I agree, there is nothing perfect.  But at least in all of my use-cases,
 and the majority of the ones I've seen 'in the wild', my previous post
 provided an implementation that worked precisely like desired, and
 precisely like a regular dictionary, except when accessing a
 non-existent key via: value = dd[key].  __contains__, etc., all work
 exactly like they do with a non-defaulting dictionary. Iteration via
 popitem(), pop(key), items(), iteritems(), __iter__, etc., all work the
 way you would expect them.

 This is the telling point, IMO.  My company makes heavy use of a default
 dict (actually, it's a default class because using constants as the
 lookup keys is mostly what we do and the convenience of foo.bar is
 compelling over foo['bar']).  Anyway, our semantics are as Josiah
 outlines, and I can't see much use case for the alternatives.
 
 Can you say, for the record (since nobody else seems to care), if
 d.getorset(key, func) would work in your use cases?

Because I haven't been reading this thread all that closely, you'll have
to remind me what this means.

 Those of you arguing something different: do you have a real use case
 (that you've implemented in real code)?
 
 (again, for the record) getorset provides the minimum needed
 functionality in a clean and intuitive way.  Why go for a complicated
 solution when you simply don't need it?

Ditto above.
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

19. A language that doesn't affect the way you think about programming,
is not worth knowing.  --Alan Perlis


Re: [Python-Dev] Proposal: defaultdict

2006-02-20 Thread Guido van Rossum
On 2/20/06, Josiah Carlson [EMAIL PROTECTED] wrote:
 Adam Olsen [EMAIL PROTECTED] wrote:
  Can you say, for the record (since nobody else seems to care), if
  d.getorset(key, func) would work in your use cases?

 It doesn't work for the multiset/accumulation case:

 dd[key] += 1

This is actually a fairly powerful argument for a subclass that
redefines __getitem__ in favor of a new dict method. (Not to mention
that it's much easier to pick a name for the subclass than for the
method. :-) See the new thread I started.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Proposal: defaultdict

2006-02-20 Thread Alex Martelli
On Feb 20, 2006, at 12:38 PM, Aahz wrote:
...
 Can you say, for the record (since nobody else seems to care), if
 d.getorset(key, func) would work in your use cases?

 Because I haven't been reading this thread all that closely, you'll  
 have
 to remind me what this means.

Roughly the same (save for method/function difference) as:

def getorset(d, key, func):
   if key not in d: d[key] = func()
   return d[key]


Alex



Re: [Python-Dev] Proposal: defaultdict

2006-02-20 Thread Aahz
On Mon, Feb 20, 2006, Alex Martelli wrote:
 On Feb 20, 2006, at 12:38 PM, Aahz wrote:
 ...
 Can you say, for the record (since nobody else seems to care), if
 d.getorset(key, func) would work in your use cases?

 Because I haven't been reading this thread all that closely, you'll  
 have
 to remind me what this means.
 
 Roughly the same (save for method/function difference) as:
 
 def getorset(d, key, func):
     if key not in d: d[key] = func()
     return d[key]

That has the problem of looking clumsy, and doubly so for our use case
where it's an attribute-based dict.  Our style relies on the clean look
of code like this:

    if order.street:
        ...

Even as a dict, that doesn't look horrible:

    if order['street']:
        ...

OTOH, this starts looking ugly:

    if order.get('street'):
        ...

And this is just plain bad:

    if getattr(order, 'street'):
        ...

Note that because we have to deal with *both* the possibility that the
attribute/key may not be there *and* that it might be blank -- but both
are semantically equivalent for our application -- there's no other
clean coding style.

Now, I realize this is different from the primary use case for needing
mutable values, but any proposed default dict solution that doesn't
cleanly support my use case is less interesting to me.
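
A rough sketch of the kind of attribute-based defaulting class described
above. The class name Record and its fields are hypothetical; the real
implementation is not shown in this thread.

```python
class Record:
    """Hypothetical attribute-style record: missing attributes read as ''."""

    def __init__(self, **fields):
        self.__dict__.update(fields)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if name.startswith("__"):
            raise AttributeError(name)  # leave special-method lookups alone
        return ""

order = Record(city="Paris", street="")

# Missing and blank are treated the same way by a plain truth test.
has_street = bool(order.street)   # present but blank
has_zip = bool(order.zipcode)     # never set
has_city = bool(order.city)
```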
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

19. A language that doesn't affect the way you think about programming,
is not worth knowing.  --Alan Perlis


Re: [Python-Dev] Proposal: defaultdict

2006-02-20 Thread Adam Olsen
On 2/20/06, Jim Jewett [EMAIL PROTECTED] wrote:
 Adam Olsen asked:
  ... d.getorset(key, func) would work in your use cases?

 It is an improvement over setdefault, because it doesn't
 always evaluate the expensive func.  (But why should every
 call have to pass in the function, when it is a property of
 the dictionary?)

Because usually it's a property of how you use it, not a property of
the dictionary.  The dictionary is just a generic storage mechanism.
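
The evaluation difference is easy to demonstrate. In this sketch, expensive
and getorset are hypothetical stand-ins; only dict.setdefault is the real
method.

```python
calls = []

def expensive():
    """Stand-in for a costly factory; records each invocation."""
    calls.append(1)
    return []

def getorset(d, key, func):
    """Hypothetical helper: call func() only when key is absent."""
    if key not in d:
        d[key] = func()
    return d[key]

d = {"k": ["existing"]}

d.setdefault("k", expensive())   # argument is evaluated before the call
eager_calls = len(calls)

getorset(d, "k", expensive)      # factory is passed, not called
lazy_calls = len(calls) - eager_calls
```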


 [snip]
 In other words, the program would work correctly if I passed
 in a normal but huge dictionary; I want to avoid that for reasons
 of efficiency.  This isn't the only use for a mapping, but it is
 the only one I've seen where KeyError is expected by the
 program's normal flow.

Looking at your explanation, I agree, getorset is useless for that use case.

However, I'm beginning to think we shouldn't be comparing them.
defaultdict is a powerful but heavyweight option, intended for
complicated behavior.  getorset and setdefault are intended to be
very lightweight, even lighter than the "try/except KeyError" and "if
key not in X: X[key] = default" memes we have right now.  getorset's
factory function is only appropriate for preexisting functions, not
user defined ones.

Essentially, I believe getorset should be discussed on its own merits,
independent of the addition of a defaultdict class.  Perhaps
discussion of it (and the deprecation of setdefault) should wait until
after defaultdict has been completed?

--
Adam Olsen, aka Rhamphoryncus


Re: [Python-Dev] Proposal: defaultdict

2006-02-20 Thread Barry Warsaw
On Sat, 2006-02-18 at 12:53 +0100, Pierre Barbier de Reuille wrote:

  Guido Over lunch with Alex Martelli, he proposed that a subclass of
  Guido dict with this behavior (but implemented in C) would be a good
  Guido addition to the language.

I agree that .setdefault() is a well-intentioned failure, although I'm
much less concerned about any potential performance impact than the fact
that it's completely unreadable.  And while I like the basic idea, I
also agree that deriving from dict is problematic, both because the
constructor signature is tough to forward and because dict is such
a fundamental type that APIs that return dicts may have to be changed to
allow passing in a factory type.

I'd rather like to see what Pierre proposes, with a few minor
differences.

 Well, first not to break the current interface, and second because I think it
 reads better, I would prefer:
 
   d = {'a': 1}
   d['b']  # raises KeyError
   d.get('c')  # evaluates to None
   d.default = 42
   d['b']  # evaluates to 42
   d.get('c')  # evaluates to 42

So far so good.

 And to undo the default, you can simply do :
 
   del d.default

Although this I'm not crazy about.  If you let .default be a callable,
you could also write this as

def keyerror(): raise KeyError
d.default = keyerror

or possibly just this as a shortcut:

d.default = KeyError

  The only question in my mind is whether or not getting a non-existent value
  under the influence of a given default value should stick that value in the
  dictionary or not.

Agreed.  I'm not sure whether .get(onearg) should return None
or .default.  I /think/ I want the latter, but I'd have to play with
some real code to know for sure.
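
One way to play with these semantics is a small subclass. This is a sketch
under stated assumptions: DefaultingDict is a hypothetical name, it relies
on the __missing__ hook that dict subclasses support in current Python, and
it takes the "do not store the default" and "get() returns .default"
options, both of which are open questions above.

```python
class DefaultingDict(dict):
    """Hypothetical dict with a settable .default callable."""

    default = KeyError  # sentinel: behave like a plain dict

    def __missing__(self, key):
        if self.default is KeyError:
            raise KeyError(key)
        return self.default()  # value is returned but not stored

    def get(self, key, *args):
        if args or key in self or self.default is KeyError:
            return super().get(key, *args)
        return self.default()

d = DefaultingDict(a=1)
before = d.get("c")          # no default set yet
d.default = lambda: 42
after_item = d["b"]
after_get = d.get("c")
d.default = KeyError         # undo the default
```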

-Barry





Re: [Python-Dev] Proposal: defaultdict

2006-02-20 Thread Barry Warsaw
On Fri, 2006-02-17 at 11:09 -0800, Guido van Rossum wrote:

 Thanks for all the constructive feedback. Here are some responses and
 a new proposal.
 
 - Yes, I'd like to kill setdefault() in 3.0 if not sooner.

A worthy goal, but not possible unless you want to break existing code.
I don't think it's worth a DeprecationWarning either.  Slating it for
removal in 3.0 seems fine.

Everything else about your proposal seems great.

-Barry





Re: [Python-Dev] Proposal: defaultdict

2006-02-20 Thread Josiah Carlson

Adam Olsen [EMAIL PROTECTED] wrote:
 However, I'm beginning to think we shouldn't be comparing them.
 defaultdict is a powerful but heavyweight option, intended for
 complicated behavior.

Check out Guido's patch.  It's not that heavyweight, and its intended
behavior is to make some operations *more* intuitive, if not a bit
faster in some cases.

Whether or not getorset is introduced, I don't much care, as defaultdict
will cover every use case I've been using setdefault for, as well as
most of my use cases for get.

 - Josiah



Re: [Python-Dev] Proposal: defaultdict

2006-02-19 Thread Michael Urman
On 2/19/06, Josiah Carlson [EMAIL PROTECTED] wrote:
 My post probably hasn't convinced you, but much of the confusion, I
 believe, is based on Martin's original belief that 'k in dd' should
 always return true if there is a default.  One can argue that way, but
 then you end up on the circular train of thought that gets you to "you
 can't do anything useful if that is the case, .popitem() doesn't work,
 len() is undefined, ...".  Keep it simple, keep it sane.

A default factory implementation fundamentally modifies the behavior
of the mapping. There is no single answer to the question "what is the
right behavior for contains, len, popitem?", as that depends on what the
code that consumes the mapping is written like, what it is attempting
to do, and what you are attempting to override it to do. Or, simply,
on why you are providing a default value. Resisting the temptation to
guess the why and just leaving the methods as is seems  the best
choice; overriding __contains__ to return true is much easier than
reversing that behavior would be.

An example of when it could theoretically be used, if not particularly
useful: the gettext.install() function was just updated to take a
names parameter which controls which gettext accessor functions it
adds to the builtin namespace. Its implementation looks for "method in
names" to decide. Passing a default-true dict would allow the future
behavior to be "bind all checked names", but only if __contains__
returns True.

Even though it would make a poor base implementation, and these
effects aren't a good candidate for it,  the code style that could
best leverage such a __contains__ exists.
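
A sketch of that coding style follows. EverythingDict and bind are
hypothetical; bind only imitates a consumer that tests "method in names"
and is not the actual gettext.install() code.

```python
from collections import defaultdict

class EverythingDict(defaultdict):
    """Hypothetical variant: every key is reported as present."""

    def __contains__(self, key):
        return True

def bind(names, available=("gettext", "ngettext", "lgettext")):
    """Stand-in for a consumer that checks 'method in names'."""
    return [m for m in available if m in names]

some = bind({"ngettext": True})
everything = bind(EverythingDict(lambda: True))
```

Note that len() of the always-true mapping is still 0, which is the sort of
inconsistency discussed above.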

Michael
--
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] Proposal: defaultdict

2006-02-19 Thread Ian Bicking
Michael Urman wrote:
 On 2/19/06, Josiah Carlson [EMAIL PROTECTED] wrote:
 
My post probably hasn't convinced you, but much of the confusion, I
believe, is based on Martin's original belief that 'k in dd' should
always return true if there is a default.  One can argue that way, but
then you end up on the circular train of thought that gets you to "you
can't do anything useful if that is the case, .popitem() doesn't work,
len() is undefined, ...".  Keep it simple, keep it sane.
 
 
 A default factory implementation fundamentally modifies the behavior
 of the mapping. There is no single answer to the question "what is the
 right behavior for contains, len, popitem?", as that depends on what the
 code that consumes the mapping is written like, what it is attempting
 to do, and what you are attempting to override it to do. Or, simply,
 on why you are providing a default value. Resisting the temptation to
 guess the why and just leaving the methods as is seems  the best
 choice; overriding __contains__ to return true is much easier than
 reversing that behavior would be.

I agree that there is simply no universally correct answer for the 
various uses of default_factory.  I think ambiguity on points like this 
is a sign that something is overly general.

In many of the concrete cases it is fairly clear how these methods
should work.  In the most obvious case (default_factory=list) what seems
to me to be the correct implementation is one that no one is proposing,
that is, "x in d" means "d.get(x)".  But that uses the fact that the
return value of default_factory() is a false value, which we cannot
assume in general.  And it affects .keys() -- which I would propose
overriding for multidict (so it only returns keys with non-empty lists
for values), but I don't see how it could be made correct for
default_factory.
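
A minimal sketch of such a multidict, assuming list values only. MultiDict
is a hypothetical name, and building it on collections.defaultdict is a
convenience for the example rather than part of the suggestion.

```python
from collections import defaultdict

class MultiDict(defaultdict):
    """Hypothetical multidict: keys with empty lists count as absent."""

    def __init__(self):
        super().__init__(list)

    def __contains__(self, key):
        return bool(self.get(key))   # "x in d" means "d.get(x)"

    def keys(self):
        return [k for k, v in self.items() if v]

md = MultiDict()
md["a"].append(1)
md["b"]            # a bare lookup creates an empty list for "b"
```

len(md) and plain iteration still see "b", so a complete version would
have to override those as well.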

I just don't see why we should cram all these potential features into 
dict by using a vague feature like default_factory.  Why can't we just 
add a half-dozen new types of collections (to the module of the same 
name)?  Each one will get its own page of documentation, a name, a 
proper __repr__, and well defined meaning for all of these methods that 
it shares with dict only insofar as it makes sense to share.

Note that even if we use defaultdict or autodict or something besides 
changing dict itself, we still won't get a good __contains__, a good 
repr, or any of the other features that specific collection 
implementations will give us.

Isn't there anyone else who sees the various dict-like objects being 
passed around as recipes, and thinks that maybe that's a sign they 
should go in the stdlib?  The best of those recipes aren't 
all-encompassing, they just do one kind of container well.

-- 
Ian Bicking  |  [EMAIL PROTECTED]  |  http://blog.ianbicking.org


Re: [Python-Dev] Proposal: defaultdict

2006-02-19 Thread Terry Reedy

Raymond Hettinger [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 [Terry Reedy]
 One is a 'universal dict' that maps every key to something -- the 
 default if
 nothing else.  That should not have the default ever explicitly entered.
 Udict.keys() should only give the keys *not* mapped to the universal 
 value.

 Would you consider it a mapping invariant that k in dd implies k in
 dd.keys()?

 Is the notion of __contains__ at odds with notion of universality?

No and not sure.  I'll leave it to Martin v. Löwis to explain/defend his 
particular notion of a udict.

My main point is that there are multiple legitimate variations of the 
notion of a default dict, so that there is no 'one right way' to design 
one.  I notice that Michael Urman and Ian Bicking said much the same today.

Of course, having said that different variations are useful in different 
situations, I would nowise claim that all variations can serve as drop-in 
replacements for regular dicts everywhere they are now used.

I think that a new default-dict feature should cater to such variations. 
Beyond that, I don't know whether it is better to modify dict (with blank 
hooks) or add a new subclassable default-dict base type.

Terry Jan Reedy





Re: [Python-Dev] Proposal: defaultdict

2006-02-19 Thread Josiah Carlson

Michael Urman [EMAIL PROTECTED] wrote:
 
 On 2/19/06, Josiah Carlson [EMAIL PROTECTED] wrote:
  My post probably hasn't convinced you, but much of the confusion, I
  believe, is based on Martin's original belief that 'k in dd' should
  always return true if there is a default.  One can argue that way, but
  then you end up on the circular train of thought that gets you to "you
  can't do anything useful if that is the case, .popitem() doesn't work,
  len() is undefined, ...".  Keep it simple, keep it sane.
 
 A default factory implementation fundamentally modifies the behavior
 of the mapping. There is no single answer to the question "what is the
 right behavior for contains, len, popitem?", as that depends on what the
 code that consumes the mapping is written like, what it is attempting
 to do, and what you are attempting to override it to do. Or, simply,
 on why you are providing a default value. Resisting the temptation to
 guess the why and just leaving the methods as is seems  the best
 choice; overriding __contains__ to return true is much easier than
 reversing that behavior would be.

I agree, there is nothing perfect.  But at least in all of my use-cases,
and the majority of the ones I've seen 'in the wild', my previous post
provided an implementation that worked precisely like desired, and
precisely like a regular dictionary, except when accessing a
non-existent key via: value = dd[key].  __contains__, etc., all work
exactly like they do with a non-defaulting dictionary. Iteration via
popitem(), pop(key), items(), iteritems(), __iter__, etc., all work the
way you would expect them. The only nit is that code which iterates
like:

    for key in keys:
        try:
            value = dd[key]
        except KeyError:
            continue

(where 'keys' has nothing to do with dd.keys(), it is merely a listing
of keys which are desired at this particular point)  However, the
following works like it always did:

    for key in keys:
        if key not in dd:
            continue
        value = dd[key]
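
The difference between the two loops can be checked with a runnable sketch.
collections.defaultdict is used here as a stand-in for the implementation
from the earlier post, which is not reproduced in this message.

```python
from collections import defaultdict

wanted = ["a", "b", "c"]

# Style 1: relies on KeyError, which a defaulting mapping never raises.
dd1 = defaultdict(int, a=1)
seen1 = []
for key in wanted:
    try:
        value = dd1[key]
    except KeyError:
        continue
    seen1.append(key)

# Style 2: explicit membership test, unaffected by the default.
dd2 = defaultdict(int, a=1)
seen2 = []
for key in wanted:
    if key not in dd2:
        continue
    seen2.append(key)
```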

 An example when it could theoretically be used, if not particularly
 useful. The gettext.install() function was just updated to take a
 names parameter which controls which gettext accessor functions it
 adds to the builtin namespace. Its implementation looks for method in
 names to decide. Passing a default-true dict would allow the future
 behavior to be bind all checked names, but only if __contains__
 returns True.
 
 Even though it would make a poor base implementation, and these
 effects aren't a good candidate for it,  the code style that could
 best leverage such a __contains__ exists.

Indeed, there are cases where an always-true __contains__ exists, and
the pure-Python implementation I previously posted can be easily
modified to offer such a feature.  However, because there are also use
cases for the not-always-true __contains__, picking either as the one
true way seems a bit unnecessary.

Presumably, if one goes into the collections module, the other will too. 
Actually, they could share all of their code except for a simple flag
which determines the always-true __contains__.  With minor work, that
'flag', or really the single bit it would require, may even be
embeddable into the type object.  Arguably, there should be a handful of
these defaulting dictionary-like objects, and for each variant, it
should be documented what their use-cases are, and any gotcha's that
will inevitably come up.

 - Josiah



Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Martin v. Löwis
Adam Olsen wrote:
 Demo/metaclass/Meta.py:55

That wouldn't break. If you had actually read the code, you would have
seen it is

    try:
        ga = dict['__getattr__']
    except KeyError:
        pass

How would it break if dict had a default factory? ga would get the
__getattr__ value, and everything would be fine. The KeyError is
ignored, after all.

 Demo/tkinter/guido/AttrDialog.py:121  # Subclasses override self.classes

Hmm

    try:
        cl = self.classes[c]
    except KeyError:
        cl = 'unknown'

So cl wouldn't be 'unknown'. Why would that be a problem?


 Lib/ConfigParser.py:623

    try:
        v = map[var]
    except KeyError:
        raise InterpolationMissingOptionError(
            option, section, rest, var)

So there is no InterpolationMissingOptionError. *Of course not*.
The whole point would be to provide a value for all interpolation
variables.

 Lib/random.py:315

This entire function samples k elements with indices between 0
and len(population). Now, people shouldn't be passing dictionaries
in in the first place; that specific code tests whether there are
valid values at indices 0, n//2, and n. If the dictionary
isn't really a sequence (i.e. if it doesn't provide values
at all indices), the function may later fail even if it passes
that test.

With a default-valued dictionary, the function would not fail,
but a large number of samples might be the default value.

 Lib/string.py:191

Same as ConfigParser: the interpolation will always succeed,
interpolating all values (rather than leaving $identifier in the
string). That would be precisely the expected behaviour.
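
The interpolation case can be sketched with today's string.Template, using
collections.defaultdict as a stand-in for a dict with a default factory.
The template text and the "?" default are made up for illustration.

```python
from collections import defaultdict
from string import Template

tmpl = Template("Hello $name, your id is $ident")

# A plain dict raises KeyError for the missing variable.
plain = {"name": "Ada"}
try:
    tmpl.substitute(plain)
    missing = None
except KeyError as exc:
    missing = exc.args[0]

# A defaulting mapping supplies a value for every variable instead.
defaulting = defaultdict(lambda: "?", name="Ada")
filled = tmpl.substitute(defaulting)
```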

 Lib/weakref.py:56  # Currently uses UserDict but I assume it will
 switch to dict eventually

Or, rather, UserDict might grow the on_missing feature as well.

That is irrelevant for this issue, though:

    o = self.data[key]()
    if o is None:
        raise KeyError, key  # line 56
    else:
        return o

So we are looking for lookup failures in self.data, here:
self.data is initialized to {} in UserDict, with no
default factory. So there cannot be a change in behaviour.


 Perhaps the KeyError shouldn't ever get triggered in this case, I'm
 not sure.  I think that's besides the point though.  The programmer
 clearly expected it would.

No. I now see your problem: An "except KeyError" does *not* mean
that the programmer "clearly expects it will" raise a KeyError.
Instead, the programmer expects it *might* raise a KeyError, and
tries to deal with this situation.

If the situation doesn't arise, the code continues just fine.

Regards,
Martin


Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Walter Dörwald
Guido van Rossum wrote:
 On 2/17/06, Ian Bicking [EMAIL PROTECTED] wrote:
 Guido van Rossum wrote:
  d = {}
  d.default_factory = set
  ...
  d[key].add(value)

 Another option would be:

d = {}
d.default_factory = set
d.get_default(key).add(value)

 Unlike .setdefault, this would use a factory associated with the dictionary,
 and no default value would get passed in.  Unlike the proposal, this would
 not override __getitem__ (not overriding __getitem__ is really the only
 difference with the proposal).  It would be clear reading the code that you
 were not implicitly asserting that "key in d" was true.

 get_default isn't the best name, but another name isn't jumping out at me 
 at the moment.  Of course, it is not a Pythonic
 argument to say that an existing method should be overridden, or 
 functionality made nameless simply because we can't think
 of a name (looking to anonymous functions of course ;)

 I'm torn. While trying to implement this I came across some ugliness in
 PyDict_GetItem() -- it would make sense if this also called on_missing(),
 but it must return a value without incrementing its refcount, and isn't
 supposed to raise exceptions -- so what to do if on_missing() returns a
 value that's not inserted in the dict?

 If the __getattr__()-like operation that supplies and inserts a
 dynamic default was a separate method, we wouldn't have this problem.

 OTOH most reviewers here seem to appreciate on_missing() as a way to do
 various other ways of altering a dict's __getitem__() behavior behind a
 caller's back -- perhaps it could even be (ab)used to implement
 case-insensitive lookup.

I don't like the fact that on_missing()/default_factory can change the
behaviour of __getitem__, which up to now has been something simple and
understandable.  Why don't we put the on_missing()/default_factory
functionality into get() instead?

d.get(key, default) does what it did before. d.get(key) invokes on_missing() 
(and dict would have default_factory == type(None))
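
A sketch of how that could look in pure Python. GetDefaultDict is a
hypothetical name, and whether on_missing() should store the value is not
specified here; this version assumes it does not.

```python
class GetDefaultDict(dict):
    """Hypothetical dict where one-argument get() uses default_factory."""

    default_factory = type(None)   # calling it returns None, as get() does now

    def on_missing(self, key):
        return self.default_factory()

    def get(self, key, *default):
        if key in self:
            return self[key]
        if default:
            return default[0]      # two-argument form is unchanged
        return self.on_missing(key)

d = GetDefaultDict()
a = d.get("x")            # same result as a plain dict
d.default_factory = set
b = d.get("x")            # a fresh set, not stored
c = d.get("x", 0)         # explicit default still wins
```

__getitem__ is untouched, so d["x"] still raises KeyError.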

Bye,
   Walter Dörwald





Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Michael Hudson
Guido van Rossum [EMAIL PROTECTED] writes:

 I'm torn. While trying to implement this I came across some ugliness
 in PyDict_GetItem() -- it would make sense if this also called
 on_missing(), but it must return a value without incrementing its
 refcount, and isn't supposed to raise exceptions

This last bit has been a painful lie for quite some time.  I don't
know what can be done about it, though -- avoid the use of
PyDict_GetItem() in situations where you don't expect string only
dicts (so using it on globals and instance dicts would still be ok)?

 -- so what to do if
 on_missing() returns a value that's not inserted in the dict?

Well, like some others I am a bit uncomfortable with changing the
semantics of such an important operation on such an important data
structure.  But then I'm also not that unhappy with setdefault, so I
must be weird.

 If the __getattr__()-like operation that supplies and inserts a
 dynamic default was a separate method, we wouldn't have this problem.

Yes.

 OTOH most reviewers here seem to appreciate on_missing() as a way to
 do various other ways of altering a dict's __getitem__() behavior
 behind a caller's back -- perhaps it could even be (ab)used to
 implement case-insensitive lookup.

Well, I'm not sure I do.

There seems to be quite a conceptual difference between being able to
make a new kind of dictionary and mess with the behaviour of one that
exists already, but I don't know if that matters in practice (the fact
that you can currently do things like import sys; sys.__dict__.clear()
doesn't seem to cause real problems).

Finally, I'll just note that subclassing to modify the behaviour of a
builtin type has generally been actively discouraged in python so far.
If all dictionary lookups went through a method that you could
override in Python (i.e. subclasses could replace ma_lookup, in
effect) this would be easy to do in Python code.  But they don't, and
bug reports suggesting that they do have been rejected in the past
(and I agree with the rejection, fwiw).
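
A small illustration of the point about lookups not going through an
overridable method, as it behaves in CPython. The class name Shouting is made
up for the example.

```python
class Shouting(dict):
    """Hypothetical subclass that only overrides __getitem__."""

    def __getitem__(self, key):
        return str(dict.__getitem__(self, key)).upper()


d = Shouting(greeting="hello")
via_subscript = d["greeting"]  # goes through the Python-level override
via_get = d.get("greeting")    # dict.get looks the key up directly in C
```

The override is visible through d[key] but not through d.get(key), which is
why a subclass cannot reliably change lookup behaviour everywhere.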

So that rambled a bit.  But in essence: I'd much prefer the addition of
a method or a type to modification of existing behaviour.

Cheers,
mwh

-- 
  If you're talking useful, I'm not your bot.
-- Tim Peters, 08 Nov 2001


Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Pierre Barbier de Reuille
Quoting [EMAIL PROTECTED]:


 Guido Over lunch with Alex Martelli, he proposed that a subclass of
 Guido dict with this behavior (but implemented in C) would be a good
 Guido addition to the language.

 Instead, why not define setdefault() the way it should have been done in the
 first place?  When you create a dict it has the current behavior.  If you
 then call its setdefault() method that becomes the default value for missing
 keys.

 d = {'a': 1}
 d['b']  # raises KeyError
 d.get('c')  # evaluates to None
 d.setdefault(42)
 d['b']  # evaluates to 42
 d.get('c')  # evaluates to 42

 For symmetry, setdefault() should probably be undoable: deldefault(),
 removedefault(), nodefault(), default_free(), whatever.

Well, first so as not to break the current interface, and second because I
think it reads better, I would prefer:

  d = {'a': 1}
  d['b']  # raises KeyError
  d.get('c')  # evaluates to None
  d.default = 42
  d['b']  # evaluates to 42
  d.get('c')  # evaluates to 42

And to undo the default, you can simply do :

  del d.default

And of course, you can get the current value :

  d.default

But then, as proposed many times, I would rather see a function call. Like :

d.default = lambda key: 42

The argument of the function is the current key. It would allow things like
this:

d.default = time_consuming_operation

where time_consuming_operation takes a single argument.
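
A rough sketch of how such a key-aware d.default could behave, written for
present-day Python. The class name KeyedDefaultDict is invented, the
__missing__ hook it relies on is the one dict subclasses support today (not
something proposed in this message), and the decision not to store the
computed value is an assumption, since the message leaves that open.

```python
class KeyedDefaultDict(dict):
    """Hypothetical dict with an optional, assignable `default` function."""

    def __missing__(self, key):
        # dict.__getitem__ calls __missing__ on subclasses for absent keys
        if not hasattr(self, "default"):
            raise KeyError(key)
        return self.default(key)  # assumption: the result is not stored

    def get(self, key, fallback=None):
        if key in self or hasattr(self, "default"):
            return self[key]
        return fallback


d = KeyedDefaultDict(a=1)
d.default = lambda key: 42
with_default = (d["b"], d.get("c"))

del d.default                 # undo the default again
after_delete = d.get("c")
```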


 The only question in my mind is whether or not getting a non-existent value
 under the influence of a given default value should stick that value in the
 dictionary or not.

 down-with-more-builtins-ly, y'rs,

 Skip





Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread James Y Knight
On Feb 18, 2006, at 2:33 AM, Martin v. Löwis wrote:
 I don't understand. In the rationale of PEP 333, it says
 The rationale for requiring a dictionary is to maximize portability
 between servers. The alternative would be to define some subset of a
 dictionary's methods as being the standard and portable interface.

 That rationale is not endangered: if the environment continues to
 be a dict exactly, servers continue to be guaranteed what precise
 set of operations is available on the environment.

Yes it is endangered.

 Well, as you say: you get a KeyError if there is an error with the  
 key.
 With a default_factory, there isn't normally an error with the key.

But there should be. Consider the case of two servers. One which  
takes all the items out of the dictionary (using items()) and puts  
them in some other data structure. Then it checks if the Date  
header has been set. It was not, so it adds it. Consider another  
similar server which checks if the Date header has been set on the  
dict passed in by the user. The default_factory then makes one up.  
Different behavior due to internal implementation details of how the  
server uses the dict object, which is what the restriction to  
_exactly_ dict prevents.
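
The two-server scenario can be sketched as follows. The function names and
the "Date" key are hypothetical stand-ins, and collections.defaultdict (the
type that later shipped) is used here only as a convenient dict with a
default factory.

```python
from collections import defaultdict


def server_copy_first(headers):
    """Hypothetical server A: copies the items, then checks for Date."""
    copied = dict(headers.items())
    if "Date" not in copied:
        copied["Date"] = "server-date"
    return copied["Date"]


def server_index_directly(headers):
    """Hypothetical server B: indexes the dict it was handed."""
    try:
        return headers["Date"]
    except KeyError:
        return "server-date"


result_a = server_copy_first(defaultdict(lambda: "made-up"))
result_b = server_index_directly(defaultdict(lambda: "made-up"))
```

With a plain dict both servers agree; with a defaulting dict server A supplies
its own date while server B sees the made-up one.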

Consider another server which takes the dict instance and transports  
it across thread boundaries, from the wsgi-app's thread to the main  
server thread. Because WSGI specifies that you can only use 'dict',  
and the server checked that type(obj) == dict, it is guaranteed that  
using the dict won't run thread-unsafe code. That is now broken,  
since dict.__getitem__ can now invoke arbitrary user code. That is a  
major change.

James


Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Martin v. Löwis
James Y Knight wrote:
 But there should be. Consider the case of two servers. One which  takes
 all the items out of the dictionary (using items()) and puts  them in
 some other data structure. Then it checks if the Date  header has been
 set. It was not, so it adds it. Consider another  similar server which
 checks if the Date header has been set on the  dict passed in by the
 user. The default_factory then makes one up.  Different behavior due to
 internal implementation details of how the  server uses the dict object,
 which is what the restriction to  _exactly_ dict prevents.

Right. I would claim that this is an artificial example: you can't
provide a HTTP_DATE value in a default_factory implementation, since
you don't know what the key is.

However, you are now making up a different rationale from the one the
PEP specifies: The PEP says that you need an exact dict so that
everybody knows precisely how the  dictionary behaves; instead of having
to define which precise subset of the dict API  is to be used.

*That* goal is still achieved: everybody knows that the dict might
have an on_missing/default_factory implementation. So to find out
whether HTTP_DATE has a value (which might be defaulted), you need
to invoke d['HTTP_DATE'].

 Consider another server which takes the dict instance and transports  it
 across thread boundaries, from the wsgi-app's thread to the main  server
 thread. Because WSGI specifies that you can only use 'dict',  and the
 server checked that type(obj) == dict, it is guaranteed that  using the
 dict won't run thread-unsafe code. That is now broken,  since
 dict.__getitem__ can now invoke arbitrary user code. That is a  major
 change.

Not at all. dict.__getitem__ could always invoke arbitrary user code,
through __hash__.
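
A short demonstration of that point: looking up a key in a plain dict already
runs the key's __hash__ (and possibly __eq__). The NoisyKey class and the
calls list are made up for the example.

```python
calls = []  # records every __hash__ invocation


class NoisyKey:
    """Hypothetical key type whose __hash__ is ordinary user code."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        calls.append(self.name)
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, NoisyKey) and self.name == other.name


d = {NoisyKey("a"): 1}    # hashing happens on insertion
calls.clear()
value = d[NoisyKey("a")]  # a plain dict lookup calls __hash__ again
```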

Regards,
Martin


Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Adam Olsen
On 2/18/06, James Y Knight [EMAIL PROTECTED] wrote:
 On Feb 18, 2006, at 2:33 AM, Martin v. Löwis wrote:
  Well, as you say: you get a KeyError if there is an error with the
  key.
  With a default_factory, there isn't normally an error with the key.

 But there should be. Consider the case of two servers. One which
 takes all the items out of the dictionary (using items()) and puts
 them in some other data structure. Then it checks if the Date
 header has been set. It was not, so it adds it. Consider another
 similar server which checks if the Date header has been set on the
 dict passed in by the user. The default_factory then makes one up.
 Different behavior due to internal implementation details of how the
 server uses the dict object, which is what the restriction to
 _exactly_ dict prevents.

It just occurred to me: what effect does this have on repr?  Does it
attempt to store the default_factory in the representation, or does it
remove it?  Is it even possible to store a reference to a builtin such
as list and have eval restore it?
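
For reference, this is how the question plays out with collections.defaultdict
as it exists in current Python 3, which is not necessarily what any proposal in
this thread would have done. The repr names the factory, but the text cannot
be fed back to eval.

```python
from collections import defaultdict

dd = defaultdict(list)
dd["a"].append(1)
text = repr(dd)  # includes the factory's own repr, e.g. <class 'list'>

try:
    eval(text, {"defaultdict": defaultdict})
    round_trips = True
except SyntaxError:
    round_trips = False  # "<class 'list'>" is not valid expression syntax
```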

--
Adam Olsen, aka Rhamphoryncus


Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Bernhard Herzog
Guido van Rossum [EMAIL PROTECTED] writes:

 If the __getattr__()-like operation that supplies and inserts a
 dynamic default was a separate method, we wouldn't have this problem.

Why implement it in the dictionary type at all?  If, for instance, the
default value functionality were provided as a decorator, it could be
used with all kinds of mappings.  I.e. you could have something along
these lines:

class defaultwrapper(object):

    def __init__(self, base, factory):
        self.__base = base
        self.__factory = factory

    def __getitem__(self, key):
        try:
            return self.__base[key]
        except KeyError:
            # missing key: create the default and store it in the base mapping
            value = self.__factory()
            self.__base[key] = value
            return value

    def __getattr__(self, attr):
        # everything else (keys, items, has_key, ...) is delegated to the base
        return getattr(self.__base, attr)


def test():
    dd = defaultwrapper({}, list)
    dd["abc"].append(1)
    dd["abc"].append(2)
    dd["def"].append(1)
    assert sorted(dd.keys()) == ["abc", "def"]
    assert sorted(dd.values()) == [[1], [1, 2]]
    assert sorted(dd.items()) == [("abc", [1, 2]), ("def", [1])]
    assert dd.has_key("abc")
    assert not dd.has_key("xyz")


The precise semantics would have to be determined yet, of course.

 OTOH most reviewers here seem to appreciate on_missing() as a way to
 do various other ways of alterning a dict's __getitem__() behavior
 behind a caller's back -- perhaps it could even be (ab)used to
 implement case-insensitive lookup.

Case-insensitive lookup could be implemented with another
wrapper/decorator.  If you need both case-insensitivity and a default
value, you can easily stack the decorators.
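
One way the stacking might look, written for present-day Python. Both
wrapper classes are hypothetical and implement only the two methods the
example needs.

```python
class LowerKeys:
    """Hypothetical wrapper: folds string keys to lower case."""

    def __init__(self, base):
        self._base = base

    def __getitem__(self, key):
        return self._base[key.lower()]

    def __setitem__(self, key, value):
        self._base[key.lower()] = value


class WithDefault:
    """Hypothetical wrapper: creates and stores a default for missing keys."""

    def __init__(self, base, factory):
        self._base = base
        self._factory = factory

    def __getitem__(self, key):
        try:
            return self._base[key]
        except KeyError:
            value = self._factory()
            self._base[key] = value
            return value

    def __setitem__(self, key, value):
        self._base[key] = value


store = {}  # the ordinary dict underneath both wrappers
dd = WithDefault(LowerKeys(store), list)
dd["Abc"].append(1)
dd["ABC"].append(2)
```

The underlying store stays a plain dict with a single lower-case key.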

   Bernhard

-- 
Intevation GmbH http://intevation.de/
Skencil   http://skencil.org/
Thuban  http://thuban.intevation.org/


Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Phillip J. Eby
At 01:44 PM 02/18/2006 -0500, James Y Knight wrote:
On Feb 18, 2006, at 2:33 AM, Martin v. Löwis wrote:
  I don't understand. In the rationale of PEP 333, it says
  The rationale for requiring a dictionary is to maximize portability
  between servers. The alternative would be to define some subset of a
  dictionary's methods as being the standard and portable interface.
 
  That rationale is not endangered: if the environment continues to
  be a dict exactly, servers continue to be guaranteed what precise
  set of operations is available on the environment.

Yes it is endangered.

So we'll update the spec to say you can't use a dict that has the default 
set.  It's not reasonable to expect that language changes might not require 
updates to a PEP.  Certainly, we don't have to worry about being backward 
compatible when it's only Python 2.5 that's affected by the change.  :)



Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Greg Ewing
Would people perhaps feel better if defaultdict
*wasn't* a subclass of dict, but a distinct mapping
type of its own? That would make it clearer that it's
not meant to be a drop-in replacement for a dict
in arbitrary contexts.

Greg


Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Raymond Hettinger
[Greg Ewing]
 Would people perhaps feel better if defaultdict
 *wasn't* a subclass of dict, but a distinct mapping
 type of its own? That would make it clearer that it's
 not meant to be a drop-in replacement for a dict
 in arbitrary contexts.

Absolutely.  That's the right way to avoid Liskov violations from altered 
invariants and API changes.  Besides, with Python's propensity for duck typing, 
there's no reason to subclass when we don't have to.


Raymond




Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Greg Ewing
Bengt Richter wrote:

 My guess is that realistically default_factory will be used
 to make clean code for filling a dict, and then turning the factory
 off if it's to be passed into unknown contexts.

This suggests that maybe the autodict behaviour shouldn't
be part of the dict itself, but provided by a wrapper
around the dict.

Then you can fill the dict through the wrapper, and still
have a normal dict underneath to use for other purposes.
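
For comparison, the quoted fill-then-turn-off pattern can be sketched with
collections.defaultdict as it exists in current Python, which is not one of
the designs under discussion here. The word list is made up.

```python
from collections import defaultdict

groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)  # filling relies on the factory

groups.default_factory = None  # from here on, missing keys raise KeyError
plain = dict(groups)           # or hand out an ordinary dict copy instead

try:
    groups["z"]
    raised = False
except KeyError:
    raised = True
```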

Greg


Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Bengt Richter
On Sat, 18 Feb 2006 10:44:15 +0100 (CET), Walter Dörwald
[EMAIL PROTECTED] wrote:

Guido van Rossum wrote:
 On 2/17/06, Ian Bicking [EMAIL PROTECTED] wrote:
 Guido van Rossum wrote:
  d = {}
  d.default_factory = set
  ...
  d[key].add(value)

 Another option would be:

d = {}
d.default_factory = set
d.get_default(key).add(value)

 Unlike .setdefault, this would use a factory associated with the
 dictionary, and no default value would get passed in.
 Unlike the proposal, this would not override __getitem__ (not overriding
 __getitem__ is really the only difference with the proposal).  It would
 be clear reading the code that you were not implicitly asserting that
 key in d was true.

 get_default isn't the best name, but another name isn't jumping out at
 me at the moment.  Of course, it is not a Pythonic argument to say that
 an existing method should be overridden, or functionality made nameless
 simply because we can't think of a name (looking to anonymous functions
 of course ;)

 I'm torn. While trying to implement this I came across some ugliness in
 PyDict_GetItem() -- it would make sense if this also called
 on_missing(), but it must return a value without incrementing its
 refcount, and isn't supposed to raise exceptions -- so what to do if
 on_missing() returns a value that's not inserted in the dict?

 If the __getattr__()-like operation that supplies and inserts a
 dynamic default was a separate method, we wouldn't have this problem.

 OTOH most reviewers here seem to appreciate on_missing() as a way to do
 various other ways of altering a dict's __getitem__() behavior behind a
 caller's back -- perhaps it could even be (ab)used to implement
 case-insensitive lookup.

I don't like the fact that on_missing()/default_factory can change the
behaviour of __getitem__, which up to now has been something simple and
understandable. Why don't we put the on_missing()/default_factory
functionality into get() instead?

d.get(key, default) does what it did before. d.get(key) invokes on_missing()
(and dict would have default_factory == type(None))

OTOH, I forgot why it was desirable in the first place to overload d[k]
with defaulting logic. E.g., why wouldn't d.defaulting[k] be ok to write
when you want the d.default_factory action?
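
A possible shape for d.defaulting[k], as a sketch only. The names
ExplicitDefaultDict and _DefaultingView are invented, and storing the created
value is an assumption.

```python
class _DefaultingView:
    """Hypothetical helper returned by ExplicitDefaultDict.defaulting."""

    def __init__(self, owner):
        self._owner = owner

    def __getitem__(self, key):
        owner = self._owner
        if key not in owner:
            if owner.default_factory is None:
                raise KeyError(key)
            owner[key] = owner.default_factory()  # assumption: value is stored
        return dict.__getitem__(owner, key)


class ExplicitDefaultDict(dict):
    """Hypothetical dict: d[k] is untouched, d.defaulting[k] applies the factory."""

    default_factory = None

    @property
    def defaulting(self):
        return _DefaultingView(self)


d = ExplicitDefaultDict()
d.default_factory = set
d.defaulting["k"].add(1)  # the defaulting behaviour is visible at the call site
```

Plain d[k] keeps raising KeyError for missing keys, so code that receives the
dict sees ordinary dict behaviour.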

on_missing feels more like a tracing hook though, so maybe it could always
act either way if defined.

Also, for those wanting to avoid lambda:42 as factory, would a callable test
cost a lot? Of course then the default_factory name might require revision.

Regards,
Bengt Richter



Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Steve Holden
Martin v. Löwis wrote:
 Guido van Rossum wrote:
 
Feedback?
 
 
 I would like this to be part of the standard dictionary type,
 rather than being a subtype.
 
 d.setdefault([]) (one argument) should install a default value,
 and d.cleardefault() should remove that setting; d.default
 should be read-only. Alternatively, d.default could be assignable
 and del-able.
 
The issue with setting the default this way is that a copy would have to 
be created if the behavior was to differ from the sometimes-confusing 
default argument behavior for functions.
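
The sharing problem can be shown with a hypothetical dict that stores a
default *value* rather than a factory. The class name, the copy_default flag
and the use of __missing__ are all assumptions made for the sketch.

```python
import copy


class ValueDefaultDict(dict):
    """Hypothetical dict holding one default value for all missing keys."""

    def __init__(self, default, copy_default=False):
        super().__init__()
        self.default = default
        self.copy_default = copy_default

    def __missing__(self, key):
        # without a copy, every missing key receives the very same object
        value = copy.copy(self.default) if self.copy_default else self.default
        self[key] = value
        return value


shared = ValueDefaultDict([])
shared["a"].append(1)
shared["b"].append(2)  # also changes shared["a"]: both keys hold one list

copied = ValueDefaultDict([], copy_default=True)
copied["a"].append(1)
copied["b"].append(2)  # independent lists
```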


 Also, I think has_key/in should return True if there is a default.
 
It certainly seems desirable to see True where d[some_key] doesn't raise 
an exception, but one could argue either way.

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006  www.python.org/pycon/



Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Steve Holden
Guido van Rossum wrote:
 On 2/16/06, Guido van Rossum [EMAIL PROTECTED] wrote:
 
Over lunch with Alex Martelli, he proposed that a subclass of dict
with this behavior (but implemented in C) would be a good addition to
the language. It looks like it wouldn't be hard to implement. It could
be a builtin named defaultdict. The first, required, argument to the
constructor should be the default value. Remaining arguments (even
keyword args) are passed unchanged to the dict constructor.
 
 
 Thanks for all the constructive feedback. Here are some responses and
 a new proposal.
 
 - Yes, I'd like to kill setdefault() in 3.0 if not sooner.
 
 - It would indeed be nice if this was an optional feature of the
 standard dict type.
 
 - I'm ignoring the request for other features (ordering, key
 transforms). If you want one of these, write a PEP!
 
 - Many, many people suggested to use a factory function instead of a
 default value. This is indeed a much better idea (although slightly
 more cumbersome for the simplest cases).
 
One might think about calling it if it were callable, otherwise using it 
literally. Of course this would require jiggery-pokery in the cases 
where you actually *wanted* the default value to be a callable (you'd 
have to provide a callable to return the callable as a default).
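
A tiny sketch of that call-it-if-callable rule and of the extra wrapping it
forces. The helper name resolve_default is hypothetical.

```python
def resolve_default(default):
    """Hypothetical rule: call the default if callable, else use it as-is."""
    return default() if callable(default) else default


from_factory = resolve_default(list)          # called: yields a new list
from_value = resolve_default(42)              # not callable: used literally
callable_default = resolve_default(lambda: len)  # wrapper needed to get `len` itself
```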

 - Some people seem to think that a subclass constructor signature must
 match the base class constructor signature. That's not so. The
 subclass constructor must just be careful to call the base class
 constructor with the correct arguments. Think of the subclass
 constructor as a factory function.
 
True, but then this does get in the way of treating the base dict and 
its defaulting subtype polymorphically. That might not be a big issue.

 - There's a fundamental difference between associating the default
 value with the dict object, and associating it with the call. So
 proposals to invent a better name/signature for setdefault() don't
 compete. (As to one specific such proposal, adding an optional bool as
 the 3rd argument to get(), I believe I've explained enough times in
 the past that flag-like arguments that always get a constant passed in
 at the call site are a bad idea and should usually be refactored into
 two separate methods.)
 
 - The inconsistency introduced by __getitem__() returning a value for
 keys while get(), __contains__(), and keys() etc. don't show it,
 cannot be resolved usefully. You'll just have to live with it.
 Modifying get() to do the same thing as __getitem__() doesn't seem
 useful -- it just takes away a potentially useful operation.
 
 So here's a new proposal.
 
 Let's add a generic missing-key handling method to the dict class, as
 well as a default_factory slot initialized to None. The implementation
 is like this (but in C):
 
 def on_missing(self, key):
   if self.default_factory is not None:
     value = self.default_factory()
     self[key] = value
     return value
   raise KeyError(key)
 
 When __getitem__() (and *only* __getitem__()) finds that the requested
 key is not present in the dict, it calls self.on_missing(key) and
 returns whatever it returns -- or raises whatever it raises.
 __getitem__() doesn't need to raise KeyError any more, that's done by
 on_missing().
 
 The on_missing() method can be overridden to implement any semantics
 you want when the key isn't found: return a value without inserting
 it, insert a value without copying it, only do it for certain key
 types/values, make the default incorporate the key, etc.
 
 But the default implementation is designed so that we can write
 
 d = {}
 d.default_factory = list
 
 to create a dict that inserts a new list whenever a key is not found
 in __getitem__(), which is most useful in the original use case:
 implementing a multiset so that one can write
 
 d[key].append(value)
 
 to add a new key/value to the multiset without having to handle the
 case separately where the key isn't in the dict yet. This also works
 for sets instead of lists:
 
 d = {}
 d.default_factory = set
 ...
 d[key].add(value)
 
This seems like a very good compromise.
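
The quoted proposal can be approximated in pure Python as a subclass. This is
only a sketch: the proposal puts the hook into dict itself, in C, whereas the
class OnMissingDict below is a made-up stand-in.

```python
class OnMissingDict(dict):
    """Hypothetical pure-Python stand-in for the proposed dict behaviour."""

    default_factory = None

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.on_missing(key)  # only __getitem__ uses the hook

    def on_missing(self, key):
        if self.default_factory is not None:
            value = self.default_factory()
            self[key] = value
            return value
        raise KeyError(key)


d = OnMissingDict()
d.default_factory = list
d["colours"].append("red")
d["colours"].append("blue")
untouched = d.get("shapes")  # get() and `in` do not consult the factory
```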

[non-functional alternatives ...]
 
regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006  www.python.org/pycon/



Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Josiah Carlson

Greg Ewing [EMAIL PROTECTED] wrote:
 Bengt Richter wrote:
 
  My guess is that realistically default_factory will be used
  to make clean code for filling a dict, and then turning the factory
  off if it's to be passed into unknown contexts.
 
 This suggests that maybe the autodict behaviour shouldn't
 be part of the dict itself, but provided by a wrapper
 around the dict.
 
 Then you can fill the dict through the wrapper, and still
 have a normal dict underneath to use for other purposes.

I prefer this to changing dictionaries directly.  The actual wrapper
could sit in the collections module, ready for subclassing/replacement
of the on_missing method.

 - Josiah



Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Raymond Hettinger
  Also, I think has_key/in should return True if there is a default.

 It certainly seems desirable to see True where d[some_key]
 doesn't raise an exception, but one could argue either way.

Some things can be agreed by everyone:

* if __contains__ always returns True, then it is a useless feature (since 
scripts containing a line such as if k in dd can always eliminate that line 
without affecting the algorithm).

* if defaultdicts are supposed to be drop-in dict substitutes, then having
__contains__ always return True will violate basic dict invariants:
   del d[some_key]
   assert some_key not in d
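
The second point can be checked with a hypothetical dict subclass whose
__contains__ always answers True. The class name AlwaysContains is made up.

```python
class AlwaysContains(dict):
    """Hypothetical dict that claims to contain every key."""

    def __contains__(self, key):
        return True


plain = {"a": 1}
del plain["a"]
plain_invariant_holds = "a" not in plain  # True for an ordinary dict

dd = AlwaysContains(a=1)
del dd["a"]
dd_invariant_holds = "a" not in dd        # False: the key still "exists"
```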


Raymond 


Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Steve Holden
Martin v. Löwis wrote:
 Adam Olsen wrote:
 
Still -1.  It's better, but it violates the principle of encapsulation
by mixing how-you-use-it state with what-it-stores state.  In doing
that it has the potential to break an API documented as accepting a
dict.  Code that expects d[key] to raise an exception (and catches the
resulting KeyError) will now silently succeed.
 
 
 Of course it will, and without quotes. That's the whole point.
 
 
I believe that necessitates a PEP to document it.
 
 
 You are missing the rationale of the PEP process. The point is
 *not* documentation. The point of the PEP process is to channel
 and collect discussion, so that the BDFL can make a decision.
 The BDFL is not bound at all to the PEP process.
 
 To document things, we use (or should use) documentation.
 

One could wish this ideal had been the case for the import extensions 
defined in PEP 302.

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006  www.python.org/pycon/



Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Terry Reedy
 Quoting [EMAIL PROTECTED]:
 The only question in my mind is whether or not getting a non-existent
 value under the influence of a given default value should stick that
 value in the dictionary or not.

It seems to me that there are at least two types of default dicts, which 
have opposite answers to that question.

One is a 'universal dict' that maps every key to something -- the default 
if nothing else.  That should not have the default ever explicitly entered. 
Udict.keys() should only give the keys *not* mapped to the universal value.

Another is the accumulator dict.  The default value is the identity (0, [], 
or whatever) for the type of accumulation.  An adict must have the identity 
added, even though that null will usually be immediately incremented by += 1 
or .append(ob) or whatever.

Guido's last proposal was for the default default_dict to cater to the 
second type (and others needing the same behavior) while catering to the 
first by making the default fill-in method over-rideable.

If we go with, for instance, wrappers in the collections module instead of 
modification of dict, then perhaps there should be at least two wrappers 
included, with each of these two behaviors.
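
The two behaviours might be sketched like this. UniversalDict is a made-up
class built on the __missing__ hook of present-day dict subclasses, and the
accumulator uses collections.defaultdict as it exists in current Python; neither
is the design being debated here.

```python
from collections import defaultdict


class UniversalDict(dict):
    """Hypothetical 'universal dict': every key maps to something."""

    def __init__(self, universal, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.universal = universal

    def __missing__(self, key):
        return self.universal  # the default is never explicitly entered


u = UniversalDict(0, a=5)
looked_up = u["anything"]  # yields the universal value
stored_keys = list(u)      # only keys with explicit values

acc = defaultdict(list)    # accumulator: the identity value is stored
acc["x"].append(1)
```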

Terry Jan Reedy





Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Martin v. Löwis
Raymond Hettinger wrote:
Also, I think has_key/in should return True if there is a default.
 * if __contains__ always returns True, then it is a useless feature (since 
 scripts containing a line such as if k in dd can always eliminate that line 
 without affecting the algorithm).

If you mean if __contains__ always returns True for a default dict,
then it is a useless feature, I disagree. The code using if k in dd
cannot be eliminated if you don't know that you have a default dict.

 * if defaultdicts are supposed to be drop-in dict substitutes, then having
 __contains__ always return True will violate basic dict invariants:
del d[some_key]
assert some_key not in d

If you have a default value, you cannot ultimately del a key. This
sequence is *not* a basic mapping invariant. If it was, then it would
be also an invariant that, after del d[some_key], d[some_key] will
raise a KeyError. This kind of invariant doesn't take into account
that there might be a default value.

Regards,
Martin


Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Raymond Hettinger
[Martin v. Löwis]
 If you have a default value, you cannot ultimately del a key. This
 sequence is *not* a basic mapping invariant.

You believe that key deletion is not basic to mappings?


 This kind of invariant doesn't take into account
 that there might be a default value.

Precisely.  Therefore, a defaultdict subclass violates the Liskov Substitution 
Principle.

Of course, the __delitem__ followed by __contains__ sequence is not the only 
invariant that is thrown off.  There are plenty of examples.  Here's one that 
is absolutely basic to the method's contract:

k, v = dd.popitem()
assert k not in dd

Any code that was expecting a dictionary and uses popitem() as a means of 
looping over and consuming entries will fail.

No one should kid themselves that a default dictionary is a drop-in substitute. 
Much of the dict's API has an ambiguous meaning when applied to defaultdicts.

If all keys are in-theory predefined, what is the meaning of len(dd)?

Should dd.items() include any entries where the value is equal to the default
or should the collection never store those?  If the former, then how do you
access the entries without looping over the whole contents?  If the latter,
then do you worry that dd[k]=v does not imply (k,v) in dd.items()?


Raymond 



Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Martin v. Löwis
Raymond Hettinger wrote:
 If you have a default value, you cannot ultimately del a key. This
 sequence is *not* a basic mapping invariant.
 
 
 You believe that key deletion is not basic to mappings?

No, not in the sense that the key will go away through deletion.
I view a mapping as a modifiable partial function. There is some
initial key/value association (in a classic mapping, it is initially
empty), and then there are modifications. Key deletion means to
reset the key to the initial association.

 Of course, the __delitem__ followed by __contains__ sequence is not the only
 invariant that is thrown-off.  There are plenty of examples.  Here's one
 that is absolutely basic to the method's contract:
 
k, v = dd.popitem()
assert k not in dd
 
 Any code that was expecting a dictionary and uses popitem() as a means
 of looping over and consuming entries will fail.

Well, code that loops over a dictionary using popitem typically
terminates when the dictionary becomes false (or its length becomes
zero). That code wouldn't be affected by the behaviour of in.

 No one should kid themselves that a default dictionary is a drop-in
 substitute. Much of the dict's API has an ambiguous meaning when applied
 to defaultdicts.

Right. But it is only ambiguous until specified. Of course, in the face
of ambiguity, refuse the temptation to guess.

 If all keys are in-theory predefined, what is the meaning of len(dd)?

Taking my definition from the beginning of the message, it is the number
of keys that have been modified from the initial mapping.

 Should dd.items() include any entries where the value is equal to the
 default or should the collection never store those?

It should include all modified items, and none of the unmodified ones.
Explicitly assigning the default value still makes the entry modified;
you need to del it to set it back to unmodified.

 If the former, then
 how do you access the entries without looping over the whole contents? 

Not sure I understand the question. You use d[k] to access an entry.
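
One way to sketch the "modifiable partial function" reading, under the
assumption that lookups of unmodified keys return the default without storing
it. PartialFunctionDict is a hypothetical name, and the question of what 'in'
should return is deliberately left out:

```python
class PartialFunctionDict(dict):
    """Hypothetical mapping: every key maps to `default` until modified."""

    def __init__(self, default):
        super().__init__()
        self.default = default

    def __missing__(self, key):
        # Unmodified keys report the initial association; nothing is stored.
        return self.default


d = PartialFunctionDict(0)
d["x"] = 0          # explicitly assigning the default still counts as modified
unmodified = d["y"]  # reads the default, leaves len(d) alone
```

Here len(d) and d.items() cover only the modified keys, and del d["x"] resets
"x" to the initial association.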

Regards,
Martin


Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Raymond Hettinger
[Terry Reedy]
 One is a 'universal dict' that maps every key to something -- the default if 
 nothing else.  That should not have the default ever explicitly entered. 
 Udict.keys() should only give the keys *not* mapped to the universal value.

Would you consider it a mapping invariant that k in dd implies k in 
dd.keys()?

Is the notion of __contains__ at odds with notion of universality?


Raymond 



Re: [Python-Dev] Proposal: defaultdict

2006-02-18 Thread Josiah Carlson
Raymond Hettinger [EMAIL PROTECTED] wrote:
 [Martin v. Löwis]
  This kind of invariant doesn't take into account
  that there might be a default value.
 
 Precisely.  Therefore, a defaultdict subclass violates the Liskov 
 Substitution 
 Principle.

class defaultdict(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.on_missing(key)
    def on_missing(self, key):
        if not hasattr(self, 'default') or not callable(self.default):
            raise KeyError(key)
        r = self[key] = self.default()
        return r

In my opinion, the above implementation as a subclass does the right
thing in regards to __del__, __contains__, get, pop, popitem, __len__,
has_key, and anything else I can think of.  Does it violate the Liskov
Substitution Principle?  Yes, but only if user code relies on dd[key]
raising a KeyError on a lack of a key.  This can be easily remedied by
removing the default when it is unneeded, at which point, you get your
Liskov Substitution.


 Of course, the __del__ followed __contains__ sequence is not the only 
 invariant 
 that is thrown-off.  There are plenty of examples.  Here's one that is 
 absolutely basic to the method's contract:
 
 k, v = dd.popitem()
 assert k not in dd
 
 Any code that was expecting a dictionary and uses popitem() as a means of 
 looping over and consuming entries will fail.


>>> a = defaultdict()
>>> a.default = list
>>> a['hello']
[]
>>> k, v = a.popitem()
>>> assert k not in a
>>>

Seems to work for the above implementation.


 No one should kid themselves that a default dictionary is a drop-in 
 substitute. 
 Much of the dict's API has an ambiguous meaning when applied to defaultdicts.

Actually, if one is careful, the dict's API is completely unchanged,
except for direct access to the object via b = a[i].

>>> del a['hello']
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 'hello'
>>> 'hello' in a
False
>>> a.get('hello')
>>> a.pop('hello')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 'pop(): dictionary is empty'
>>> a.popitem()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 'popitem(): dictionary is empty'
>>> len(a)
0
>>> a.has_key('hello')
False

 If all keys are in-theory predefined, what is the meaning of len(dd)?

It depends on the sequence of actions.  Play around with the above
defaultdict implementation.  From what I understood of Guido's original
post, this is essentially what he was proposing, only implemented in C.

 Should dd.items() include any entries where the value is equal to the default 
 or 
 should the collection never store those?

Yes, it should store any value which was stored via 'dd[k]=v', or any
default value created via access by 'v=dd[k]' .

 If the former, then how do you access 
 the entries without looping over the whole contents?

Presumably one is looking for a single kind of default (empty list, 0,
etc.) because one wanted to accumulate into them, similar to one of the
following...

for item, value in input:
    try:
        d[item] += value
        #or d[item].append(value)
    except KeyError:
        d[item] = value
        #or d[item] = [value]

which becomes

for item, value in input:
    dd[item] += value
    #or dd[item].append(value)

Once accumulation has occurred, iteration over them via .iteritems(),
.items(), .popitem(), etc., would progress exactly the same way as with
a regular dictionary.  If the code which is using the accumulated data
does things like...

for key in wanted_keys:
    try:
        value = dd[key]
    except KeyError:
        continue
    #do something nontrivial with value

rather than...

for key in wanted_keys:
    if key not in dd:
        continue
    value = dd[key]
    #do something nontrivial with value

Then the user has at least three options to make it 'work right':
1. User can change to using 'in' to iterate rather than relying on a
KeyError.
2. User could remember to remove the default.
3. User can create a copy of the default dictionary via dict(dd) and
pass it into the code which relies on the non-defaulting dictionary.
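
Options 2 and 3 can be sketched as follows. This uses the
collections.defaultdict that later shipped in the standard library (where
"removing the default" means setting default_factory to None) rather than the
class above; lookup_all() is a hypothetical consumer that relies on KeyError:

```python
from collections import defaultdict

dd = defaultdict(int)
for word in ["a", "b", "a"]:
    dd[word] += 1


def lookup_all(mapping, wanted_keys):
    """Consumer written for plain dicts: skips keys that raise KeyError."""
    found = {}
    for key in wanted_keys:
        try:
            found[key] = mapping[key]
        except KeyError:
            continue
    return found


# Option 3: hand the consumer a plain dict copy.
result_copy = lookup_all(dict(dd), ["a", "zzz"])

# Option 2: remove the default before handing the mapping over.
dd.default_factory = None
result_nodefault = lookup_all(dd, ["a", "zzz"])
```

Either way the consumer sees a KeyError for "zzz" and no entry is created
behind its back.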


 If the latter, then do you 
 worry that dd[v]=k does not imply (k,v) in dd.items()?

I personally wouldn't want the latter.

My post probably hasn't convinced you, but much of the confusion, I
believe, is based on Martin's original belief that 'k in dd' should
always return true if there is a default.  One can argue that way, but
then you end up on the circular train of thought that gets you to "you
can't do anything useful if that is the case, .popitem() doesn't work,
len() is undefined, ..."  Keep it simple, keep it sane.

 - Josiah


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Georg Brandl
Guido van Rossum wrote:

   d = DefaultDict([])
 
 can be written as simply
 
   d[key].append(value)

 Feedback?

Probably a good idea, has been proposed multiple times on clpy.
One good thing would be to be able to specify either a default value
or a factory function.

While at it, other interesting dict subclasses could be:
* sorteddict, practically reinvented by every larger project
* keytransformdict, such as d = keytransformdict(str.lower).

Georg



Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Walter Dörwald
Guido van Rossum wrote:

 A bunch of Googlers were discussing the best way of doing the
 following (a common idiom when maintaining a dict of lists of values
 relating to a key, sometimes called a multimap):
 
   if key not in d: d[key] = []
   d[key].append(value)
 
 An alternative way to spell this uses setdefault(), but it's not very 
 readable:
 
   d.setdefault(key, []).append(value)
 
 and it also suffers from creating an unnecessary list instance.
 (Timings were inconclusive; the approaches are within 5-10% of each
 other in speed.)
 
 My conclusion is that setdefault() is a failure -- it was a
 well-intentioned construct, but doesn't actually create more readable
 code.
 
 Google has an internal data type called a DefaultDict which gets
 passed a default value upon construction. Its __getitem__ method,
 instead of raising KeyError, inserts a shallow copy (!) of the given
 default value into the dict when the value is not found. So the above
 code, after
 
   d = DefaultDict([])
 
 can be written as simply
 
   d[key].append(value)

Using a shallow copy of the default seems a bit too magical to me. How 
would this be done? Via copy.copy?

And passing [] to the constructor of dict has a different meaning already.

Fetching the default via a static/class method would solve both problems:

class default_dict(dict):
    def __getitem__(self, key):
        if key in self:
            return dict.__getitem__(self, key)
        else:
            default = self.getdefault()
            self[key] = default
            return default

class multi_map(default_dict):
    @staticmethod
    def getdefault():
        return []

class counting_dict(default_dict):
    @staticmethod
    def getdefault():
        return 0

 [...]

Bye,
Walter Dörwald


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Bengt Richter
On Thu, 16 Feb 2006 13:11:49 -0800, Guido van Rossum [EMAIL PROTECTED] wrote:

A bunch of Googlers were discussing the best way of doing the
following (a common idiom when maintaining a dict of lists of values
relating to a key, sometimes called a multimap):

  if key not in d: d[key] = []
  d[key].append(value)

An alternative way to spell this uses setdefault(), but it's not very readable:

  d.setdefault(key, []).append(value)

and it also suffers from creating an unnecessary list instance.
(Timings were inconclusive; the approaches are within 5-10% of each
other in speed.)

My conclusion is that setdefault() is a failure -- it was a
well-intentioned construct, but doesn't actually create more readable
code.

Google has an internal data type called a DefaultDict which gets
passed a default value upon construction. Its __getitem__ method,
instead of raising KeyError, inserts a shallow copy (!) of the given
default value into the dict when the value is not found. So the above
code, after

  d = DefaultDict([])

can be written as simply

  d[key].append(value)

Wouldn't it be more generally powerful to pass type or factory function
to use to instantiate a default object when a missing key is encountered, e.g.

   d = DefaultDict(list)

then

   d[key].append(value)

but then you can also do

   d = DefaultDict(dict)
   d[key].update(a=123)

or

   class Foo(object): pass
   d = DefaultDict(Foo)
   d[key].phone = '415-555-1212'

etc. No worries about generalizing shallow copying either ;-)
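
To illustrate why a factory sidesteps the copying question, here is a rough
comparison. SharedDefaultDict is a hypothetical class that inserts the same
default object for every missing key (no copy at all); the factory version
uses the collections.defaultdict that later shipped in the standard library:

```python
from collections import defaultdict


class SharedDefaultDict(dict):
    """Hypothetical: stores the *same* default object for every missing key."""

    def __init__(self, default):
        super().__init__()
        self.default = default

    def __missing__(self, key):
        self[key] = self.default
        return self.default


shared = SharedDefaultDict([])
shared["a"].append(1)
shared["b"].append(2)  # mutates the very list stored under "a"

fresh = defaultdict(list)
fresh["a"].append(1)
fresh["b"].append(2)   # list() was called once per missing key
```

With a factory every missing key gets its own object, so no copy semantics
need to be specified.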

  
Note that of all the possible semantics for __getitem__ that could
have produced similar results (e.g. not inserting the default in the
underlying dict, or not copying the default value), the chosen
semantics are the only ones that makes this example work.

Over lunch with Alex Martelli, he proposed that a subclass of dict
with this behavior (but implemented in C) would be a good addition to
the language. It looks like it wouldn't be hard to implement. It could
be a builtin named defaultdict. The first, required, argument to the
constructor should be the default value. Remaining arguments (even
keyword args) are passed unchanged to the dict constructor.

Some more design subtleties:

- key in d still returns False if the key isn't there
- d.get(key) still returns None if the key isn't there
- d.default should be a read-only attribute giving the default value

Feedback?

See above.

Regards,
Bengt Richter



Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Georg Brandl
Thomas Heller wrote:

 Probably a good idea, has been proposed multiple times on clpy.
 One good thing would be to be able to specify either a default value
 or a factory function.
 
 While at it, other interesting dict subclasses could be:
 * sorteddict, practically reinvented by every larger project
 
 You mean ordereddict, not sorteddict, I hope.

Well, yes.

 * keytransformdict, such as d = keytransformdict(str.lower).
 
 Not sure what you mean by that.
 
 What *I* would like is probably more ambitious:  I want a dict that allows 
 case-insensitive
 lookup of string keys

This is exactly what this would do. All keys are transformed to lowercase when
setting and looking up.
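
A minimal sketch of what such a class might look like. KeyTransformDict and
its constructor signature are assumptions for illustration, and only item
access and containment are covered (get(), update() and setdefault() would
need the same treatment):

```python
class KeyTransformDict(dict):
    """Hypothetical dict that passes every key through `transform`."""

    def __init__(self, transform):
        super().__init__()
        self.transform = transform

    def __setitem__(self, key, value):
        super().__setitem__(self.transform(key), value)

    def __getitem__(self, key):
        return super().__getitem__(self.transform(key))

    def __contains__(self, key):
        return super().__contains__(self.transform(key))


d = KeyTransformDict(str.lower)
d["Content-Type"] = "text/plain"
```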

 plus ideally I want to use it as class or instance dictionary.
 Use case: COM wrappers.

regards,
Georg



Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Paul Moore
On 2/17/06, Raymond Hettinger [EMAIL PROTECTED] wrote:
  Over lunch with Alex Martelli, he proposed that a subclass of dict
  with this behavior (but implemented in C) would be a good addition to
  the language

 I would like to add something like this to the collections module,

+1

 but a PEP is probably needed to deal with issues like:

+0 (You're probably right, but I fear there's no perfect answer, so
discussions could go round in circles...)

Paul.


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Fuzzyman
Martin v. Löwis wrote:

 Guido van Rossum wrote:

  Feedback?

 I would like this to be part of the standard dictionary type,
 rather than being a subtype.

 d.setdefault([]) (one argument) should install a default value,
 and d.cleardefault() should remove that setting; d.default
 should be read-only. Alternatively, d.default could be assignable
 and del-able.

 Also, I think has_key/in should return True if there is a default.

And exactly what use would it then be?

Michael Foord

 Regards,
 Martin



Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Fredrik Lundh
Guido van Rossum wrote:

 A bunch of Googlers were discussing the best way of doing the
 following (a common idiom when maintaining a dict of lists of values
 relating to a key, sometimes called a multimap):

   if key not in d: d[key] = []
   d[key].append(value)

/.../

 Feedback?

+1.  check it in, already (as collections.defaultdict, perhaps?)

alternatively, you could specialize even further: collections.multimap,
which deals with list values only (that shallow copy thing feels a bit
questionable, but all alternatives feel slightly overgeneralized...)

/F





Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Thomas Wouters
On Thu, Feb 16, 2006 at 01:11:49PM -0800, Guido van Rossum wrote:

 Over lunch with Alex Martelli, he proposed that a subclass of dict
 with this behavior (but implemented in C) would be a good addition to
 the language. It looks like it wouldn't be hard to implement. It could
 be a builtin named defaultdict. The first, required, argument to the
 constructor should be the default value. Remaining arguments (even
 keyword args) are passed unchanged to the dict constructor.

Should a dict subclass really change the constructor/initializer signature
in an incompatible way?
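
For illustration, this is how the signature question plays out in the
collections.defaultdict that eventually shipped in the standard library: the
first positional argument is the factory, and everything after it goes to the
dict constructor, so a mapping can no longer be passed first.

```python
from collections import defaultdict

# Factory first, then the usual dict constructor arguments.
d = defaultdict(list, {"a": [1]}, b=[2])

# A mapping in first position is rejected, unlike dict({"a": [1]}).
try:
    defaultdict({"a": [1]})
    accepted = True
except TypeError:
    accepted = False
```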

-- 
Thomas Wouters [EMAIL PROTECTED]

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Fredrik Lundh
Martin v. Löwis wrote:

 Also, I think has_key/in should return True if there is a default.

and keys should return all possible key values!

/F





Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Raymond Hettinger
 My conclusion is that setdefault() is a failure -- it was a
 well-intentioned construct, but doesn't actually create more readable
 code.

It was an across the board failure:  naming, clarity, efficiency.
Can we agree to slate dict.setdefault() to disappear in Py3.0?


Raymond


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Phillip J. Eby
At 10:10 AM 02/17/2006 +0100, Georg Brandl wrote:
Guido van Rossum wrote:

d = DefaultDict([])
 
  can be written as simply
 
d[key].append(value)

  Feedback?

Probably a good idea, has been proposed multiple times on clpy.
One good thing would be to be able to specify either a default value
or a factory function.

+1 on factory function, e.g. DefaultDict(list).  A default value isn't 
very useful, because for immutable defaults, setdefault() works well 
enough.  If what you want is a copy of some starting object, you can always 
do something like DefaultDict({1:2,3:4}.copy).
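
A short sketch of the bound-method trick, using the collections.defaultdict
that later shipped in the standard library in place of the DefaultDict being
discussed; the name `template` is only for illustration:

```python
from collections import defaultdict

template = {1: 2, 3: 4}
dd = defaultdict(template.copy)  # the bound method itself is the factory

dd["a"][5] = 6      # modifies only the copy stored under "a"
untouched = dd["b"]  # a second, independent shallow copy
```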



Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Georg Brandl
Fredrik Lundh wrote:
 Raymond Hettinger wrote:
 
 I would like to add something like this to the collections module, but a PEP 
 is
 probably needed to deal with issues like:
 
 frankly, now that Guido is working 50% on Python, do we really have to use
 the full PEP process also for simple things like this?
 
 I'd say we let the BDFL roam free.
 
 (if he adds something really lousy, it can always be tweaked/removed before
 the next final release.  not every checkin needs to be final...).

+1.

Georg



Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Adam Olsen
On 2/16/06, Guido van Rossum [EMAIL PROTECTED] wrote:
 A bunch of Googlers were discussing the best way of doing the
 following (a common idiom when maintaining a dict of lists of values
 relating to a key, sometimes called a multimap):

   if key not in d: d[key] = []
   d[key].append(value)

 An alternative way to spell this uses setdefault(), but it's not very 
 readable:

   d.setdefault(key, []).append(value)

I'd like to see it done passing a factory function (and with a better name):

d.getorset(key, list).append(value)

The name is slightly odd but it is effective.  Plus it avoids creating
a new class when a slight tweak to an existing one will do.
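
Since dict has no such method, the idea can only be sketched as a free
function here; getorset() below is hypothetical and just shows the intended
semantics (the factory runs only when the key is missing):

```python
def getorset(d, key, factory):
    """Hypothetical helper: return d[key], creating it via factory() if absent."""
    try:
        return d[key]
    except KeyError:
        value = d[key] = factory()
        return value


calls = []


def make_list():
    calls.append(1)  # record each time the factory actually runs
    return []


d = {}
getorset(d, "k", make_list).append(1)
getorset(d, "k", make_list).append(2)
```

Unlike d.setdefault(key, []), no throwaway list is built on the second call.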


 Over lunch with Alex Martelli, he proposed that a subclass of dict
 with this behavior (but implemented in C) would be a good addition to
 the language. It looks like it wouldn't be hard to implement. It could
 be a builtin named defaultdict. The first, required, argument to the
 constructor should be the default value. Remaining arguments (even
 keyword args) are passed unchanged to the dict constructor.

-1 (at least until you can explain why that's better than .getorset())

--
Adam Olsen, aka Rhamphoryncus


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Nick Coghlan
Phillip J. Eby wrote:
 At 10:10 AM 02/17/2006 +0100, Georg Brandl wrote:
 Guido van Rossum wrote:

   d = DefaultDict([])

 can be written as simply

   d[key].append(value)
 Feedback?
 Probably a good idea, has been proposed multiple times on clpy.
 One good thing would be to be able to specify either a default value
 or a factory function.
 
 +1 on factory function, e.g. DefaultDict(list).  A default value isn't 
 very useful, because for immutable defaults, setdefault() works well 
 enough.  If what you want is a copy of some starting object, you can always 
 do something like DefaultDict({1:2,3:4}.copy).

+1 here, too (for permitting a factory function only).

This doesn't really limit usage, as you can still supply 
DefaultDict(partial(copy, x)) or DefaultDict(partial(deepcopy, x)), or (heaven 
forbid) a lambda expression. . .
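
For example, with the collections.defaultdict that later shipped in the
standard library standing in for DefaultDict, a deep copy of a nested
template could be sketched like this (`template` is an illustrative name):

```python
from collections import defaultdict
from copy import deepcopy
from functools import partial

template = {"tags": []}
dd = defaultdict(partial(deepcopy, template))

dd["a"]["tags"].append("x")  # touches only the copy stored under "a"
other = dd["b"]              # a fresh deep copy of the template
```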

As others have mentioned, the basic types are all easy, since the typename can 
be used directly.

+1 on supplying that factory function to the constructor, too (the default 
value is a fundamental part of the defaultdict). That is, I'd prefer:

   d = defaultdict(func)
   # The defaultdict is fully defined, but not yet populated
   d.update(init_values)

over:

   d = defaultdict(init_values)
   # The defaultdict is partially populated, but not yet fully defined!
   d.default(func)

That is, something that is the same the normal dict except for:

   def __init__(self, default):
       self.default = default

   def __getitem__(self, key):
       return self.get(key, self.default())

Considering some of Raymond's questions in light of the above
 * implications of a __getitem__ succeeding while get(value, x) returns x 
 (possibly different from the overall default)
 * implications of a __getitem__ succeeding while __contains__ would fail

These behaviours seem reasonable for a default dictionary - containment is 
based on whether or not the key actually exists in the dictionary as it 
currently stands, and the default is really a default default that can be 
overridden using 'get'.
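
As a point of comparison, the collections.defaultdict that later shipped in
the standard library behaves this way for containment and get(), although
unlike the sketch above its __getitem__ also stores the default:

```python
from collections import defaultdict

dd = defaultdict(list)

before = "k" in dd              # containment reflects actual contents
via_get = dd.get("k", "other")  # get() uses its own default, not the factory
size_after_get = len(dd)        # get() did not insert anything

value = dd["k"]                 # indexing calls the factory and stores the result
after = "k" in dd
```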

 * whether to add this to the collections module (I would say yes)
 * whether to allow default functions as well as default values (so you could 
 instantiate a new default list)

My preference is for factory functions only, to eliminate ambiguity.

# bag like behavior
dd = collections.default_dict(int)
for elem in collection:
    dd[elem] += 1

# setdefault-like behavior
dd = collections.default_dict(list)
for page_number, page in enumerate(book):
    for word in page.split():
        dd[word].append(page_number)



-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Nick Coghlan
Adam Olsen wrote:
 Over lunch with Alex Martelli, he proposed that a subclass of dict
 with this behavior (but implemented in C) would be a good addition to
 the language. It looks like it wouldn't be hard to implement. It could
 be a builtin named defaultdict. The first, required, argument to the
 constructor should be the default value. Remaining arguments (even
 keyword args) are passed unchanged to the dict constructor.
 
 -1 (atleast until you can explain why that's better than .getorset())

Because the default default is a fundamental characteristic of the default 
dictionary (meaning it works with normal indexing syntax), whereas getorset 
makes it a characteristic of the method call.

Besides, if there are going to be any method changes on normal dicts, I'd 
rather see a boolean third argument set to the get method.

That is (for a normal dict):

   def get(self, key, *args):
       set = False
       no_default = False
       if len(args) == 2:
           default, set = args
       elif args:
           default, = args
       else:
           no_default = True

       if key in self:
           return self[key]
       if no_default:
           raise KeyError(repr(key))
       if set:
           self[key] = default
       return default

Using Guido's original example:

   d.get(key, [], True).append(value)

I don't really think this is a replacement for defaultdict, though.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Fredrik Lundh
Nick Coghlan wrote:

 Using Guido's original example:

   d.get(key, [], True).append(value)

hmm.  are you sure you didn't just reinvent setdefault ?

/F 





Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread skip

Guido Over lunch with Alex Martelli, he proposed that a subclass of
Guido dict with this behavior (but implemented in C) would be a good
Guido addition to the language.

Instead, why not define setdefault() the way it should have been done in the
first place?  When you create a dict it has the current behavior.  If you
then call its setdefault() method that becomes the default value for missing
keys.

d = {'a': 1}
d['b']  # raises KeyError
d.get('c')  # evaluates to None
d.setdefault(42)
d['b']  # evaluates to 42
d.get('c')  # evaluates to 42

For symmetry, setdefault() should probably be undoable: deldefault(),
removedefault(), nodefault(), default_free(), whatever.

The only question in my mind is whether or not getting a non-existent value
under the influence of a given default value should stick that value in the
dictionary or not.

down-with-more-builtins-ly, y'rs,

Skip


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Nick Coghlan
Fredrik Lundh wrote:
 Nick Coghlan wrote:
 
 Using Guido's original example:

   d.get(key, [], True).append(value)
 
 hmm.  are you sure you didn't just reinvent setdefault ?

I'm reasonably sure I copied it on purpose, only with a name that isn't 100% 
misleading as to what it does ;)

I think collections.defaultdict is a better approach, though.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Georg Brandl
[EMAIL PROTECTED] wrote:
 Guido Over lunch with Alex Martelli, he proposed that a subclass of
 Guido dict with this behavior (but implemented in C) would be a good
 Guido addition to the language.
 
 Instead, why not define setdefault() the way it should have been done in the
 first place?  When you create a dict it has the current behavior.  If you
 then call its setdefault() method that becomes the default value for missing
 keys.

That puts it off until 3.0.

From what I read I think defaultdict won't become builtin anyway.

Georg



Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Jack Diederich
On Thu, Feb 16, 2006 at 01:11:49PM -0800, Guido van Rossum wrote:
[snip]
 Google has an internal data type called a DefaultDict which gets
 passed a default value upon construction. Its __getitem__ method,
 instead of raising KeyError, inserts a shallow copy (!) of the given
 default value into the dict when the value is not found. So the above
 code, after
 
   d = DefaultDict([])
 
 can be written as simply
 
   d[key].append(value)
 
 Note that of all the possible semantics for __getitem__ that could
 have produced similar results (e.g. not inserting the default in the
 underlying dict, or not copying the default value), the chosen
 semantics are the only ones that makes this example work.

Having __getitem__ insert the returned default value allows it to
work with a larger variety of classes.  My own ForgivingDict does not
do this and works fine for ints and lists but not much else.

fd = ForgivingDict(list)
fd[key] += [val] # extends the list and does a __setitem__

The += operator isn't useful for dicts.
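
The difference can be sketched with a hypothetical stand-in for such a
non-inserting dict (NonInsertingDict is an assumed name, not the actual
ForgivingDict):

```python
class NonInsertingDict(dict):
    """Hypothetical stand-in: returns a fresh default but does not store it."""

    def __init__(self, factory):
        super().__init__()
        self.factory = factory

    def __missing__(self, key):
        return self.factory()


fd = NonInsertingDict(list)
fd["a"] += [1]     # __getitem__, then __setitem__: the extended list is stored
fd["b"].append(2)  # mutates a temporary list that is never stored
```

Only augmented assignment writes the value back, which is why method calls
such as append() or update() need the inserting semantics.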

How can you make a defaultdict with a defaultdict as the default?
My head asploded when I tried it with the constructor arg.
It does seem possible with the 'd.default = func' syntax

# empty defaultdict constructor
d = defaultdict()
d.default = d
tree = defaultdict()
tree.default = d.copy
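
With a factory-function constructor the nesting can also be spelled
recursively. A sketch using the collections.defaultdict that later shipped in
the standard library; tree() is a hypothetical helper name:

```python
from collections import defaultdict


def tree():
    """Factory returning a defaultdict whose missing values are more trees."""
    return defaultdict(tree)


root = tree()
root["a"]["b"]["c"] = 1  # intermediate levels are created on demand
```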

-jackdied


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread CM
+1

It's about time!

- C

On 2/16/06, Guido van Rossum [EMAIL PROTECTED] wrote:
 A bunch of Googlers were discussing the best way of doing the
 following (a common idiom when maintaining a dict of lists of values
 relating to a key, sometimes called a multimap):

   if key not in d: d[key] = []
   d[key].append(value)

 An alternative way to spell this uses setdefault(), but it's not very
 readable:

   d.setdefault(key, []).append(value)

 and it also suffers from creating an unnecessary list instance.
 (Timings were inconclusive; the approaches are within 5-10% of each
 other in speed.)

 My conclusion is that setdefault() is a failure -- it was a
 well-intentioned construct, but doesn't actually create more readable
 code.

 Google has an internal data type called a DefaultDict which gets
 passed a default value upon construction. Its __getitem__ method,
 instead of raising KeyError, inserts a shallow copy (!) of the given
 default value into the dict when the value is not found. So the above
 code, after

   d = DefaultDict([])

 can be written as simply

   d[key].append(value)

 Note that of all the possible semantics for __getitem__ that could
 have produced similar results (e.g. not inserting the default in the
 underlying dict, or not copying the default value), the chosen
 semantics are the only ones that makes this example work.

 Over lunch with Alex Martelli, he proposed that a subclass of dict
 with this behavior (but implemented in C) would be a good addition to
 the language. It looks like it wouldn't be hard to implement. It could
 be a builtin named defaultdict. The first, required, argument to the
 constructor should be the default value. Remaining arguments (even
 keyword args) are passed unchanged to the dict constructor.

 Some more design subtleties:

 - key in d still returns False if the key isn't there
 - d.get(key) still returns None if the key isn't there
 - d.default should be a read-only attribute giving the default value

 Feedback?

 Guido van Rossum (home page: http://www.python.org/~guido/)

-- 
A programmer learning programming from Perl is like a chemistry student
learning the definition of 'exothermic' with dynamite. - evilpenguin


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Terry Reedy

Fredrik Lundh [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Raymond Hettinger wrote:

 I would like to add something like this to the collections module, but a
 PEP is probably needed to deal with issues like:

 frankly, now that Guido is working 50% on Python, do we really have to
 use the full PEP process also for simple things like this?

 I'd say we let the BDFL roam free.

PEPs are useful for question-answering purposes even after approval.  The 
design phase can be cut short by simply posting the approved design doc.






Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Ian Bicking
Raymond Hettinger wrote:
Over lunch with Alex Martelli, he proposed that a subclass of dict
with this behavior (but implemented in C) would be a good addition to
the language
 
 
 I would like to add something like this to the collections module, but a
 PEP is probably needed to deal with issues like:
 
 * implications of a __getitem__ succeeding while get(value, x) returns x
   (possibly different from the overall default)
 * implications of a __getitem__ succeeding while __contains__ would fail
 * whether to add this to the collections module (I would say yes)
 * whether to allow default functions as well as default values (so you
   could instantiate a new default list)
 * comparing all the existing recipes and third-party modules that have
   already done this
 * evaluating its fitness for common use cases (i.e. bags and dict of lists).

It doesn't seem that useful for bags, assuming we're talking about an 
{object: count} implementation of bags; bags should really have a more 
set-like interface than a dict-like interface.

A dict of lists typically means a multi-valued dict.  In that case it 
seems like x[key_not_found] should return the empty list, as that means 
zero values; even though zero values also means that 
x.has_key(key_not_found) should return False as well.  *but* getting 
x[key_not_found] does not (for a multi-valued dict) mean that suddenly 
has_key should return true.  I find the side-effect nature of 
__getitem__ as proposed in default_dict to be rather confusing, and when 
reading code it will very much break my expectations.  I assume that 
attribute access and [] access will not have side effects.  Coming at it 
from that direction, I'm -1, though I'm +1 on dealing with the specific 
use case that started this (x.setdefault(key, []).append(value)).
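
To make the side effect concrete, here is a minimal sketch. It uses
collections.defaultdict, the type that eventually shipped in Python 2.5
(so it postdates this message and only illustrates the semantics being
debated). The variable names are made up for the example.

```python
from collections import defaultdict

# A multi-valued mapping: each key maps to a list of values.
x = defaultdict(list)

before = "key_not_found" in x   # nothing has been stored yet
values = x["key_not_found"]     # looks like a plain read via []
after = "key_not_found" in x    # ...but the read inserted the key

print(before, values, after)
```

The membership test changes purely because of the lookup in between,
which is the behavior objected to here.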

An implementation targeted specifically at multi-valued dictionaries 
seems like it would be better.  Incidentally, on Web-SIG we've discussed 
wsgiref, and it includes a multi-valued, ordered, case-insensitive 
dictionary.  Such a dictionary(ish) object has clear applicability for 
HTTP headers, but certainly it is something I've used many times 
elsewhere.  In a case-sensitive form it applies to URL variables. 
Really there's several combinations of features, each with different uses.

So we have now...

dicts: unordered, key:value (associative), single-value
sets: unordered, not key:value, single-value
lists: ordered, not key:value, multi-value

We don't have...

bags: unordered, not key:value, multi-value
multi-dict: unordered, key:value, multi-value
ordered-dict: ordered, key:value, single-value
ordered-multi-dict: ordered, key:value, multi-value

For all key:value collections, normalized keys can be useful.  (Though 
notably the wsgiref Headers object does not have normalized keys, but 
instead does case-insensitive comparisons.)  I don't know where 
dict-of-dict best fits in here.



-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Fredrik Lundh
Terry Reedy wrote:

  I'd say we let the BDFL roam free.

 PEPs are useful for question-answering purposes even after approval.  The
 design phase can be cut short by simply posting the approved design doc.

not for trivialities.  it'll take Guido more time to write a PEP than to
implement the damn thing.  is that really a good use of his time ?

why is python-dev suddenly full of control freaks ?

/F





Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread skip

 Also, I think has_key/in should return True if there is a default.

Fredrik> and keys should return all possible key values!

I think keys() and in should reflect reality.  Only when you do something
like

x = d['nonexistent']

or

x = d.get('nonexistent')

should the default value come into play.

Skip


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Guido van Rossum
On 2/16/06, Guido van Rossum [EMAIL PROTECTED] wrote:
 Over lunch with Alex Martelli, he proposed that a subclass of dict
 with this behavior (but implemented in C) would be a good addition to
 the language. It looks like it wouldn't be hard to implement. It could
 be a builtin named defaultdict. The first, required, argument to the
 constructor should be the default value. Remaining arguments (even
 keyword args) are passed unchanged to the dict constructor.

Thanks for all the constructive feedback. Here are some responses and
a new proposal.

- Yes, I'd like to kill setdefault() in 3.0 if not sooner.

- It would indeed be nice if this was an optional feature of the
standard dict type.

- I'm ignoring the request for other features (ordering, key
transforms). If you want one of these, write a PEP!

- Many, many people suggested to use a factory function instead of a
default value. This is indeed a much better idea (although slightly
more cumbersome for the simplest cases).

- Some people seem to think that a subclass constructor signature must
match the base class constructor signature. That's not so. The
subclass constructor must just be careful to call the base class
constructor with the correct arguments. Think of the subclass
constructor as a factory function.

- There's a fundamental difference between associating the default
value with the dict object, and associating it with the call. So
proposals to invent a better name/signature for setdefault() don't
compete. (As to one specific such proposal, adding an optional bool as
the 3rd argument to get(), I believe I've explained enough times in
the past that flag-like arguments that always get a constant passed in
at the call site are a bad idea and should usually be refactored into
two separate methods.)

- The inconsistency introduced by __getitem__() returning a value for
keys while get(), __contains__(), and keys() etc. don't show it,
cannot be resolved usefully. You'll just have to live with it.
Modifying get() to do the same thing as __getitem__() doesn't seem
useful -- it just takes away a potentially useful operation.

So here's a new proposal.

Let's add a generic missing-key handling method to the dict class, as
well as a default_factory slot initialized to None. The implementation
is like this (but in C):

def on_missing(self, key):
  if self.default_factory is not None:
    value = self.default_factory()
    self[key] = value
    return value
  raise KeyError(key)

When __getitem__() (and *only* __getitem__()) finds that the requested
key is not present in the dict, it calls self.on_missing(key) and
returns whatever it returns -- or raises whatever it raises.
__getitem__() doesn't need to raise KeyError any more, that's done by
on_missing().
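
As a rough pure-Python model of the dispatch described above: the
proposal puts this logic in C inside dict itself, so the subclass below
is only a stand-in, and the name MissingDict is hypothetical.

```python
class MissingDict(dict):
    """Hypothetical stand-in for a dict with the proposed hook."""

    default_factory = None  # the proposed slot, initialized to None

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            # Only __getitem__ consults the hook; get(), 'in' and keys()
            # are inherited unchanged from dict.
            return self.on_missing(key)

    def on_missing(self, key):
        if self.default_factory is not None:
            value = self.default_factory()
            self[key] = value
            return value
        raise KeyError(key)


d = MissingDict()
d.default_factory = list
d["a"].append(1)
d["a"].append(2)
print(d, "b" in d, d.get("b"))
```

With default_factory left at None, the same class behaves like an
ordinary dict and a missing key still raises KeyError.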

The on_missing() method can be overridden to implement any semantics
you want when the key isn't found: return a value without inserting
it, insert a value without copying it, only do it for certain key
types/values, make the default incorporate the key, etc.
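
Two of these variations, sketched under an assumption: the hook is
spelled __missing__ below because that is the name under which dict
subclasses received it in Python 2.5, not the on_missing name used in
this message. The class names are invented for the example.

```python
class KeyedDefault(dict):
    """Default is computed from the key and is NOT stored in the dict."""

    def __missing__(self, key):
        return "<no entry for %r>" % (key,)


class IntKeysOnly(dict):
    """Insert a fresh list on a miss, but only for int keys."""

    def __missing__(self, key):
        if isinstance(key, int):
            value = self[key] = []
            return value
        raise KeyError(key)


kd = KeyedDefault()
placeholder = kd["a"]   # computed from the key, dict stays empty

ik = IntKeysOnly()
ik[1].append("x")       # int key: a list is created and stored
```

Looking up a non-int key in IntKeysOnly still raises KeyError, so the
defaulting can be limited to the keys where it makes sense.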

But the default implementation is designed so that we can write

d = {}
d.default_factory = list

to create a dict that inserts a new list whenever a key is not found
in __getitem__(), which is most useful in the original use case:
implementing a multiset so that one can write

d[key].append(value)

to add a new key/value to the multiset without having to handle the
case separately where the key isn't in the dict yet. This also works
for sets instead of lists:

d = {}
d.default_factory = set
...
d[key].add(value)

I went through several iterations to obtain this design; my first
version of on_missing() would just raise KeyError(key), requiring you
to always provide a subclass; this is more minimalistic but less
useful and would probably raise the bar for using the feature to some
extent.

To save you attempts to simplify this, here are some near-misses I
considered that didn't quite work out:

- def on_missing(self, key):
    if self.default_factory is not None:
      return self.default_factory()
    raise KeyError(key)

This would require the multiset example to subclass, since
default_factory doesn't see the key so it can't insert it.

- def on_missing(self, key):
    if self.default_factory is not None:
      return self.default_factory(key)
    raise KeyError(key)

This appears to fix that problem, but now you can't write
d.default_factory = list since (a) list(key) doesn't return an empty
list, and (b) it also doesn't insert the key into the dict; attempting
to assign a callback function to default_factory that solves these
issues fails because the callback doesn't have access to the dict
instance (unless there's only one).

- Do away with on_missing() and just include its body at the end of
__getitem__(), to be invoked when the key isn't found.

This is less general in case you want different default semantics
(e.g. not inserting the default, or making the default a function of
the key) -- you'd have to override __getitem__() for that, which 

Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Fred L. Drake, Jr.
On Friday 17 February 2006 14:09, Guido van Rossum wrote:
  So here's a new proposal.

I like the version you came up with.  It has sufficient functionality to make 
it easy to use, and enough flexibility to be useful in more specialized 
cases.  I'm quite certain it would handle all the cases I've actually dealt 
with where I wanted a variation of a mapping with default values.


  -Fred

-- 
Fred L. Drake, Jr.   fdrake at acm.org


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Alex Martelli
On 2/16/06, Guido van Rossum [EMAIL PROTECTED] wrote:
 A bunch of Googlers were discussing the best way of doing the
   ...
Wow, what a great discussion!  As you'll recall, I had also mentioned
the callable factory as a live possibility, and there seems to be a
strong sentiment in favor of that; not really a weakness case for
HOFs, as you feared it might be during the lunchtime discussion.

Out of all I've read here, I like the idea of having a
collections.autodict (a much nicer name than defaultdict, a better
collocation for 2.5 than the builtins). One point I think nobody has
made is that whenever reasonably possible the setting of a callback
(the callable factory here) should include *a and **k to use when
calling back.  So, for example:

ad = collections.autodict(copy.copy, whatever)

would easily cover the use case of Google's DefaultDict (yes, partial
would also cover this use case, but having *a and **k is usefully more
general).  If you're convinced copy.copy is an overwhelmingly popular
use case (I'm not, personally), then this specific idiom might also be
summarized in a classmethod, a la

ad = collections.autodict.by_copy(whatever)


This way, all autodicts would start out empty (and be filled by update
if needed).  An alternative would be to have autodict's ctor have the
same signature as dict's, with a separate .set_initial method to pass
the factory (and *a, **k) -- this way an autodict might start out
populated, but would always start with some default factory, such as
lambda:None I guess. I think the first alternative (autodict always
starts empty, but with a specifically chosen factory [including *a,
**k]) is more useful.


Alex


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Guido van Rossum
On 2/17/06, Alex Martelli [EMAIL PROTECTED] wrote:
 On 2/16/06, Guido van Rossum [EMAIL PROTECTED] wrote:
  A bunch of Googlers were discussing the best way of doing the
...
 Wow, what a great discussion!  As you'll recall, I had also mentioned
 the callable factory as a live possibility, and there seems to be a
 strong sentiment in favor of that; not really a weakness case for
 HOFs, as you feared it might be during the lunchtime discussion.

:-)

You seem to have missed my revised proposal.

 Out of all I've read here, I like the idea of having a
 collections.autodict (a much nicer name than defaultdict, a better
 collocation for 2.5 than the builtins). One point I think nobody has
 made is that whenever reasonably possible the setting of a callback
 (the callable factory here) should include *a and **k to use when
 calling back.

That's your C/C++ brain talking. :-)

If you need additional data passed to a callback (to be provided at
the time the callback is *set*, not when it is *called*) the customary
approach is to make the callback a parameterless lambda; you can also
use a bound method, etc. There's no need to complicate every piece of
code that calls a callback with the machinery to store and use
arbitrary arguments and keyword arguments.
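
For instance, a sketch of binding the extra data at the time the
callback is set. It assumes collections.defaultdict and
functools.partial as they later shipped in Python 2.5; the template
value is an arbitrary placeholder.

```python
import copy
from collections import defaultdict
from functools import partial

template = [0]  # placeholder default to be shallow-copied per key

# The data is captured when the factory is created, so the dict only
# ever needs to call factory() with no arguments.
d1 = defaultdict(lambda: copy.copy(template))
d2 = defaultdict(partial(copy.copy, template))

d1["x"].append(1)
d2["y"].append(2)
print(d1["x"], d2["y"], template)
```

Either spelling keeps the calling side trivial, which is the point made
above: the machinery for extra arguments lives with whoever sets the
callback.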

I forgot to mention in my revised proposal that the API for setting
the default_factory is slightly odd:

  d = {}   # or dict()
  d.default_factory = list

rather than

  d = dict(default_factory=list)

This is of course because we cut off that way when we defined what
arbitrary keyword arguments to the dict constructor would do. My
original proposal solved this by creating a subclass. But there were
several suggestions that this would be fine functionality to add to
the standard dict type -- and then I really don't see any other way to
do this. (Yes, I could have a set_default_factory() method -- but a
simple settable attribute seems more pythonic!)
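
A small check of why the keyword form is unavailable, runnable on
released Python. It also shows that the attribute form is a proposal
only: a plain built-in dict does not accept new attributes.

```python
# Keyword arguments to dict() become ordinary items, so this builds a
# one-entry dict rather than configuring a factory.
d = dict(default_factory=list)
print(d)

# On a released built-in dict the proposed attribute cannot be set.
plain = {}
try:
    plain.default_factory = list
    settable = True
except AttributeError:
    settable = False
print(settable)
```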

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Thomas Heller
Guido van Rossum wrote:
 So here's a new proposal.
 
 Let's add a generic missing-key handling method to the dict class, as
 well as a default_factory slot initialized to None. The implementation
 is like this (but in C):
 
 def on_missing(self, key):
   if self.default_factory is not None:
     value = self.default_factory()
     self[key] = value
     return value
   raise KeyError(key)
 
 When __getitem__() (and *only* __getitem__()) finds that the requested
 key is not present in the dict, it calls self.on_missing(key) and
 returns whatever it returns -- or raises whatever it raises.
 __getitem__() doesn't need to raise KeyError any more, that's done by
 on_missing().

Will this also work when PyDict_GetItem() does not find the key?

Thomas



Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Thomas Heller
[cc to py-dev again]

Guido van Rossum wrote:
 On 2/17/06, Thomas Heller [EMAIL PROTECTED] wrote:
 Guido van Rossum wrote:
 So here's a new proposal.

 Let's add a generic missing-key handling method to the dict class, as
 well as a default_factory slot initialized to None. The implementation
 is like this (but in C):

 def on_missing(self, key):
   if self.default_factory is not None:
     value = self.default_factory()
     self[key] = value
     return value
   raise KeyError(key)

 When __getitem__() (and *only* __getitem__()) finds that the requested
 key is not present in the dict, it calls self.on_missing(key) and
 returns whatever it returns -- or raises whatever it raises.
 __getitem__() doesn't need to raise KeyError any more, that's done by
 on_missing().
 Will this also work when PyDict_GetItem() does not find the key?
 
 Ouch, tricky. It should, of course, but the code will be a tad tricky
 because it's not supposed to inc the refcount. Thanks for reminding
 me!
 

Ahem, I'm still looking for ways to 'overtake' the dict to implement
weird and fancy things.  Can on_missing be overridden in subclasses (writing
the subclass in C would not be a problem)?

Thanks,

Thomas



Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Guido van Rossum
On 2/17/06, Thomas Heller [EMAIL PROTECTED] wrote:
 Ahem, I'm still looking for ways to 'overtake' the dict to implement
 weird and fancy things.  Can on_missing be overridden in subclasses (writing
 the subclass in C would not be a problem)?

Why ahem?

The answer is yes.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Ian Bicking
Guido van Rossum wrote:
 d = {}
 d.default_factory = set
 ...
 d[key].add(value)

Another option would be:

   d = {}
   d.default_factory = set
   d.get_default(key).add(value)

Unlike .setdefault, this would use a factory associated with the 
dictionary, and no default value would get passed in.  Unlike the 
proposal, this would not override __getitem__ (not overriding 
__getitem__ is really the only difference with the proposal).  It would 
be clear reading the code that you were not implicitly asserting that
key in d was true.
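
A minimal sketch of that alternative, assuming a subclass in place of
a change to dict itself. FactoryDict is a hypothetical name, and
get_default is the placeholder name from this message.

```python
class FactoryDict(dict):
    """dict with a per-instance factory and an explicit accessor."""

    default_factory = None

    def get_default(self, key):
        # __getitem__ is untouched; only this method inserts defaults.
        try:
            return self[key]
        except KeyError:
            if self.default_factory is None:
                raise
            value = self[key] = self.default_factory()
            return value


d = FactoryDict()
d.default_factory = set
d.get_default("k").add(1)

try:
    d["other"]          # plain [] access still raises
    raised = False
except KeyError:
    raised = True
print(d, raised)
```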

get_default isn't the best name, but another name isn't jumping out at 
me at the moment.  Of course, it is not a Pythonic argument to say that 
an existing method should be overridden, or functionality made nameless 
simply because we can't think of a name (looking to anonymous functions 
of course ;)

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Aahz
On Fri, Feb 17, 2006, Guido van Rossum wrote:

 But the default implementation is designed so that we can write
 
 d = {}
 d.default_factory = list

+1

I actually like the fact that you're forced to use a separate statement
for setting the default_factory.  From my POV, this can go into 2.5.

(I was only +0 on the previous proposal and I was -1 on making it a
built-in; this extension is much nicer.)
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

19. A language that doesn't affect the way you think about programming,
is not worth knowing.  --Alan Perlis


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Adam Olsen
On 2/17/06, Guido van Rossum [EMAIL PROTECTED] wrote:
 - There's a fundamental difference between associating the default
 value with the dict object, and associating it with the call. So
 proposals to invent a better name/signature for setdefault() don't
 compete.

That's a feature, not a bug. :)  See below.


 - The inconsistency introduced by __getitem__() returning a value for
 keys while get(), __contains__(), and keys() etc. don't show it,
 cannot be resolved usefully. You'll just have to live with it.
 Modifying get() to do the same thing as __getitem__() doesn't seem
 useful -- it just takes away a potentially useful operation.

Again, see below.


 So here's a new proposal.

 Let's add a generic missing-key handling method to the dict class, as
 well as a default_factory slot initialized to None. The implementation
 is like this (but in C):

 def on_missing(self, key):
   if self.default_factory is not None:
     value = self.default_factory()
     self[key] = value
     return value
   raise KeyError(key)

 When __getitem__() (and *only* __getitem__()) finds that the requested
 key is not present in the dict, it calls self.on_missing(key) and
 returns whatever it returns -- or raises whatever it raises.
 __getitem__() doesn't need to raise KeyError any more, that's done by
 on_missing().

Still -1.  It's better, but it violates the principle of encapsulation
by mixing how-you-use-it state with what-it-stores state.  In doing
that it has the potential to break an API documented as accepting a
dict.  Code that expects d[key] to raise an exception (and catches the
resulting KeyError) will now silently "succeed".  I believe that
necessitates a PEP to document it.
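
The kind of caller this describes can be sketched as follows.
collections.defaultdict (which later shipped in Python 2.5) stands in
for a dict with default_factory set, and describe() is a made-up helper.

```python
from collections import defaultdict


def describe(d, key):
    """Written against a plain dict: relies on KeyError for misses."""
    try:
        return "found %r" % (d[key],)
    except KeyError:
        return "missing"


plain = {}
defaulting = defaultdict(list)

r1 = describe(plain, "k")       # takes the except branch
r2 = describe(defaulting, "k")  # never raises, and inserts "k"
print(r1, r2, "k" in defaulting)
```

The function body is identical in both calls; only the object passed in
decides which branch runs.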

It also makes it harder to read code.  You may expect d[key] to
raise an exception, but it won't because of a single line up several
pages (or in another file entirely!)

d.getorset(key, func) has no such problems and has a much simpler specification:

def getorset(self, key, func):
  try:
    return self[key]
  except KeyError:
    value = self[key] = func()
    return value
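
A runnable variant of the same idea, written as a free function since
dict has no such method. The call counter is only there to show that
func runs on a miss and not otherwise.

```python
def getorset(d, key, func):
    """Return d[key], creating it with func() only if it is missing."""
    try:
        return d[key]
    except KeyError:
        value = d[key] = func()
        return value


calls = []


def make_list():
    calls.append(1)   # record each time the factory actually runs
    return []


d = {}
getorset(d, "a", make_list).append(1)
getorset(d, "a", make_list).append(2)
print(d, len(calls))
```

By contrast, d.setdefault("a", []) builds a new list on every call,
whether or not the key is already present.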

--
Adam Olsen, aka Rhamphoryncus


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Martin v. Löwis
Fuzzyman wrote:
Also, I think has_key/in should return True if there is a default.

  

 And exactly what use would it then be ?

Code that checks

if d.has_key(k):
  print d[k]

would work correctly. IOW, you could use a dictionary with a default
key just as if it were a normal dictionary - which is a useful
property, IMO.

Regards,
Martin


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Guido van Rossum
On 2/17/06, Adam Olsen [EMAIL PROTECTED] wrote:
 It also makes it harder to read code.  You may expect d[key] to
 raise an exception, but it won't because of a single line up several
 pages (or in another file entirely!)

Such are the joys of writing polymorphic code. I don't really see how
you can avoid this kind of confusion -- I could have given you some
other mapping object that does weird stuff.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Martin v. Löwis
Adam Olsen wrote:
 Still -1.  It's better, but it violates the principle of encapsulation
 by mixing how-you-use-it state with what-it-stores state.  In doing
 that it has the potential to break an API documented as accepting a
 dict.  Code that expects d[key] to raise an exception (and catches the
 resulting KeyError) will now silently "succeed".

Of course it will, and without quotes. That's the whole point.

 I believe that necessitates a PEP to document it.

You are missing the rationale of the PEP process. The point is
*not* documentation. The point of the PEP process is to channel
and collect discussion, so that the BDFL can make a decision.
The BDFL is not bound at all to the PEP process.

To document things, we use (or should use) documentation.

Regards,
Martin


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Guido van Rossum
On 2/17/06, Ian Bicking [EMAIL PROTECTED] wrote:
 Guido van Rossum wrote:
  d = {}
  d.default_factory = set
  ...
  d[key].add(value)

 Another option would be:

d = {}
d.default_factory = set
d.get_default(key).add(value)

 Unlike .setdefault, this would use a factory associated with the
 dictionary, and no default value would get passed in.  Unlike the
 proposal, this would not override __getitem__ (not overriding
 __getitem__ is really the only difference with the proposal).  It would
 be clear reading the code that you were not implicitly asserting that
 key in d was true.

 get_default isn't the best name, but another name isn't jumping out at
 me at the moment.  Of course, it is not a Pythonic argument to say that
 an existing method should be overridden, or functionality made nameless
 simply because we can't think of a name (looking to anonymous functions
 of course ;)

I'm torn. While trying to implement this I came across some ugliness
in PyDict_GetItem() -- it would make sense if this also called
on_missing(), but it must return a value without incrementing its
refcount, and isn't supposed to raise exceptions -- so what to do if
on_missing() returns a value that's not inserted in the dict?

If the __getattr__()-like operation that supplies and inserts a
dynamic default was a separate method, we wouldn't have this problem.

OTOH most reviewers here seem to appreciate on_missing() as a way to
do various other ways of altering a dict's __getitem__() behavior
behind a caller's back -- perhaps it could even be (ab)used to
implement case-insensitive lookup.

I'm not going to do a point-by-point to your longer post (I don't have
the time); let's (again) agree to disagree and I'll sleep on it.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Ian Bicking
Guido van Rossum wrote:
 On 2/17/06, Adam Olsen [EMAIL PROTECTED] wrote:
 
It also makes it harder to read code.  You may expect d[key] to
raise an exception, but it won't because of a single line up several
pages (or in another file entirely!)
 
 
 Such are the joys of writing polymorphic code. I don't really see how
 you can avoid this kind of confusion -- I could have given you some
 other mapping object that does weird stuff.

The way you avoid confusion is by not working with code or programmers 
who write bad code.  Python and polymorphic code in general pushes the 
responsibility for many errors from the language structure onto the 
programmer -- it is the programmers' responsibility to write good code. 
Python has never kept people from writing obscenely horrible code.  We 
ought to have an obfuscated Python contest just to prove that point -- 
it is through practice and convention that readable Python code happens, 
not through the restrictions of the language.  (Honestly, I think such a 
contest would be a good idea.)

I know *I* at least don't like code that mixes up access and 
modification.  Maybe not everyone does (or maybe not everyone thinks of 
getitem as access, but that's unlikely).  I will assert that it is 
Pythonic to keep access and modification separate, which is why methods 
and attributes are different things, and why assignment is not an 
expression, and why functions with side effects typically return None, 
or have names that are very explicit about the side effect, with names 
containing command verbs like update or set.  All of these 
distinguish access from modification.

Note that all of what I'm saying *only* applies to the overriding of 
__getitem__, not the addition of any new method.  I think multidict is 
better for the places it applies, but I see no problem at all with a new 
method on dictionaries that calls on_missing.

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Martin v. Löwis
Guido van Rossum wrote:
 I'm torn. While trying to implement this I came across some ugliness
 in PyDict_GetItem() -- it would make sense if this also called
 on_missing(), but it must return a value without incrementing its
 refcount, and isn't supposed to raise exceptions -- so what to do if
 on_missing() returns a value that's not inserted in the dict?

I think there should be a guideline to use
PyObject_GetItem/PyMapping_GetItemString normally, i.e. in all cases
where you would write d[k] in Python code.

It should be considered a bug if PyDict_GetItem is used in a place
that should invoke defaulting; IOW, the function should be reserved
to really low-level cases (e.g. if it is known that the dict doesn't
have any defaulting, e.g. the string interned dictionary).

There should be a policy whether name-lookup invokes defaulting
(i.e. __dict__ access); I think it should. This would cause
__getattr__ to have no effect if the object's dictionary has
a default factory (unless that raises a KeyError).

Regards,
Martin


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Adam Olsen
On 2/17/06, Guido van Rossum [EMAIL PROTECTED] wrote:
 On 2/17/06, Adam Olsen [EMAIL PROTECTED] wrote:
  It also makes it harder to read code.  You may expect d[key] to
  raise an exception, but it won't because of a single line up several
  pages (or in another file entirely!)

 Such are the joys of writing polymorphic code. I don't really see how
 you can avoid this kind of confusion -- I could have given you some
 other mapping object that does weird stuff.

You could pass a float in as well.  But if the function is documented
as taking a dict, and the programmer expects a dict.. that now has to
be changed to dict without a default.  Or they have to code
defensively since d[key] may or may not raise KeyError, so they must
avoid depending on it either way.

--
Adam Olsen, aka Rhamphoryncus


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Guido van Rossum
On 2/17/06, Adam Olsen [EMAIL PROTECTED] wrote:
  Such are the joys of writing polymorphic code. I don't really see how
  you can avoid this kind of confusion -- I could have given you some
  other mapping object that does weird stuff.

 You could pass a float in as well.  But if the function is documented
 as taking a dict, and the programmer expects a dict.. that now has to
 be changed to dict without a default.  Or they have to code
 defensively since d[key] may or may not raise KeyError, so they must
 avoid depending on it either way.

I'd like to see a real-life example of code that would break this way.
I believe that *most* code that takes a dict will work just fine if
that dict has a default factory.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Martin v. Löwis
Adam Olsen wrote:
 You could pass a float in as well.  But if the function is documented
 as taking a dict, and the programmer expects a dict.. that now has to
 be changed to dict without a default.  Or they have to code
 defensively since d[key] may or may not raise KeyError, so they must
 avoid depending on it either way.

Can you give an example of real, existing code that will break
if such a dict is passed?

Regards,
Martin


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Adam Olsen
On 2/17/06, Martin v. Löwis [EMAIL PROTECTED] wrote:
 Adam Olsen wrote:
  Still -1.  It's better, but it violates the principle of encapsulation
  by mixing how-you-use-it state with what-it-stores state.  In doing
  that it has the potential to break an API documented as accepting a
  dict.  Code that expects d[key] to raise an exception (and catches the
  resulting KeyError) will now silently succeed.

 Of course it will, and without quotes. That's the whole point.

Consider these two pieces of code:

if key in d:
  dosomething(d[key])
else:
  dosomethingelse()

try:
  dosomething(d[key])
except KeyError:
  dosomethingelse()

Before they were the same (assuming dosomething() won't raise
KeyError).  Now they would behave differently.
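
A minimal sketch of that difference, using collections.defaultdict (the
form in which this feature later shipped) as a stand-in for the dict
with default_factory under discussion. The helper names lbyl and eafp
are hypothetical, and the code is written for current Python rather
than the 2006 interpreter:

```python
from collections import defaultdict

def lbyl(d, key):
    # Look before you leap: membership test first, then the lookup.
    if key in d:
        return ("found", d[key])
    return ("missing", None)

def eafp(d, key):
    # Ask forgiveness: do the lookup and rely on KeyError.
    try:
        return ("found", d[key])
    except KeyError:
        return ("missing", None)

plain = {}
dd = defaultdict(list)

# With a plain dict the two spellings agree.
plain_results = (lbyl(plain, "k"), eafp(plain, "k"))

# With a default factory they diverge: the membership test sees no key,
# while d["k"] calls the factory, stores the result, and never raises.
dd_lbyl = lbyl(dd, "k")
dd_eafp = eafp(dd, "k")
```

The order of the last two calls matters: once eafp has run, "k" is
stored in dd and a later membership test would succeed.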

The latter is even the preferred form, since it only invokes a single
dict lookup:

On 2/16/06, Delaney, Timothy (Tim) [EMAIL PROTECTED] wrote:
 try:
     v = d[key]
 except:
     v = d[key] = value

Obviously this example could be changed to use default_factory, but I
find it hard to believe the only use of that pattern is to set default
keys.

Of course you could just assume that of all the people passing your
function a dict, none of them will ever use the default_factory when
they build the dict.  Should be easy, right?

--
Adam Olsen, aka Rhamphoryncus


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Ian Bicking
Martin v. Löwis wrote:
  I know *I* at least don't like code that mixes up access and
  modification.  Maybe not everyone does (or maybe not everyone thinks of
  getitem as access, but that's unlikely).  I will assert that it is
  Pythonic to keep access and modification separate, which is why methods
  and attributes are different things, and why assignment is not an
  expression, and why functions with side effects typically return None,
  or have names that are very explicit about the side effect, with names
  containing command verbs like "update" or "set".  All of these
  distinguish access from modification.
 
 
 Do you never write
 
  d[some_key].append(some_value)
 
 This is modification and access, all in a single statement, and all
 without assignment operator.

(d[some_key]) is access.  (...).append(some_value) is modification. 
Expressions are compound; of course you can mix both access and 
modification in a single expression.  d[some_key] is access that returns 
something, and .append(some_value) modifies that something, it doesn't 
modify d.

 I don't see the setting of the default value as a modification.
 The default value has been there, all the time. It only is incarnated
 lazily.

It is lazily incarnated for multidict, because there is no *noticeable*
side effect -- if there are any internal side effects, that is an
implementation detail.  However for default_factory=list, the result of
.keys(), .has_key(), and .items() changes when you do d[some_key].
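
A short sketch of that visible side effect, again assuming
collections.defaultdict behaves like the proposed default_factory
(has_key() no longer exists in current Python, so only keys() and
get() are shown):

```python
from collections import defaultdict

d = defaultdict(list)

keys_before = list(d.keys())
_ = d["some_key"]            # reads like access, but stores a new empty list
keys_after = list(d.keys())

# get() does not invoke the factory, so it leaves the dict unchanged.
value_via_get = d.get("other_key")
```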

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Ian Bicking
Adam Olsen wrote:
 The latter is even the preferred form, since it only invokes a single
 dict lookup:

 On 2/16/06, Delaney, Timothy (Tim) [EMAIL PROTECTED] wrote:

  try:
      v = d[key]
  except:
      v = d[key] = value
 
 
 Obviously this example could be changed to use default_factory, but I
 find it hard to believe the only use of that pattern is to set default
 keys.

I'd go further -- I doubt many cases where try:except KeyError: is used 
could be refactored to use default_factory -- default_factory can only 
be used to set default keys to something that can be determined sometime 
close to the time the dictionary is created, and that the default is not 
dependent on the context in which the key is fetched, and that default 
value will not cause unintended side effects if the dictionary leaks out 
of the code where it was initially used (like if the dictionary is 
returned to someone).  Any default factory is more often an algorithmic 
detail than truly part of the nature of the dictionary itself.

For instance, here is something I do often:

try:
    value = cache[key]
except KeyError:
    ... calculate value ...
    cache[key] = value

Realistically, factoring ... calculate value ... into a factory that 
calculates the value would be difficult, produce highly unreadable code, 
perform worse, and have more bugs.  For simple factories like list and 
dict the factory works okay.  For immutable values like 0 and None, 
the factory (lambda : 0 and lambda : None) is a wasteful way to create a 
default value (because storing the value in the dictionary is 
unnecessary).  For non-trivial factories the whole thing falls apart, 
and one can just hope that no one will try to use this feature and will 
instead stick with the try:except KeyError: technique.
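
To make the limitation concrete: a zero-argument default_factory never
sees the key, so a cache of computed values has to use a per-key hook
instead. The sketch below assumes that hook is spelled __missing__,
which is how on_missing eventually appeared in Python; SquareCache and
its squaring step are hypothetical stand-ins for "... calculate
value ...".

```python
class SquareCache(dict):
    """Hypothetical cache whose values are computed from the key."""

    def __init__(self):
        super().__init__()
        self.misses = 0

    def __missing__(self, key):
        # dict.__getitem__ calls this only when the key is absent.
        self.misses += 1
        value = key * key      # stand-in for the real calculation
        self[key] = value
        return value

cache = SquareCache()
first = cache[12]    # computed and stored
second = cache[12]   # served from the dict, __missing__ not called
```

Whether this reads better than the try/except form is a judgment call;
it mainly shows that the calculation has to live in a method rather
than inline at the point of use.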


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Michael Chermside
Martin v. Löwis writes:
 You are missing the rationale of the PEP process. The point is
 *not* documentation. The point of the PEP process is to channel
 and collect discussion, so that the BDFL can make a decision.
 The BDFL is not bound at all to the PEP process.

 To document things, we use (or should use) documentation.

You are oversimplifying significantly. The purpose of the PEP
process is to lay out and document the reasoning that went
into forming the decision. The BDFL is *allowed* to be
capricious, but he's sensible enough to choose not to: in
cases where it matters, he tries to document the reasoning
behind his decisions. In fact, he does better than that... he
gets the PEP author to document it for him!

The PEP (whether accepted, rejected, or in-progress) serves
as the official documentation of how the decision was made
(or of what option it is that is still undecided). If a
_trivial_ decision is already made, there's no need for a
PEP, but if a difficult decision has been made, then
documenting it in a PEP saves years of having to justify
it to newbies.

-- Michael Chermside



Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Martin v. Löwis
Adam Olsen wrote:
 Consider these two pieces of code:
 
 if key in d:
   dosomething(d[key])
 else:
   dosomethingelse()
 
 try:
   dosomething(d[key])
 except KeyError:
   dosomethingelse()
 
 Before they were the same (assuming dosomething() won't raise
 KeyError).  Now they would behave differently.

I personally think they should continue to do the same thing,
i.e. "in" should return True if there is a default; in the
current proposal, it should invoke the default factory.

But that's beside the point: Where is the real example
where this difference would matter? (I'm not asking for
a realistic example, I'm asking for a real one)

Regards,
Martin


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Martin v. Löwis
Ian Bicking wrote:
 It is lazily incarnated for multidict, because there is no *noticeable*
 side effect -- if there are any internal side effects, that is an
 implementation detail.  However for default_factory=list, the result of
 .keys(), .has_key(), and .items() changes when you do d[some_key].

That's why I think has_key and "in" should return True for any key.
This leaves keys(), items(), and values(). From a pure point of
view, they should return infinite sets. Practicality beats purity,
so yes, d[k] could be considered a modifying operation.

If you look carefully, you find that many access operations also
have side effects. For example, .read() on a file not only returns
some data, but also advances the file position. Queue.get not
only returns the next item, but also removes it from the queue.

Regards,
Martin


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Guido van Rossum
On 2/17/06, Martin v. Löwis [EMAIL PROTECTED] wrote:
 That's why I think has_key and "in" should return True for any key.
 This leaves keys(), items(), and values(). From a pure point of
 view, they should return infinite sets. Practicality beats purity,
 so yes, d[k] could be considered a modifying operation.

I think practicality beats purity even for has_key/in; IMO these
operations are more useful when they match keys() instead of always
returning True. But someone should start writing some code to play
with this.

I have a working patch (including a hack for PyDict_GetItem()):
python.org/sf/1433928

So there's no excuse to be practical now.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Adam Olsen
On 2/17/06, Martin v. Löwis [EMAIL PROTECTED] wrote:
 Adam Olsen wrote:
  You could pass a float in as well.  But if the function is documented
  as taking a dict, and the programmer expects a dict.. that now has to
  be changed to dict without a default.  Or they have to code
  defensively since d[key] may or may not raise KeyError, so they must
  avoid depending on it either way.

 Can you give an example of real, existing code that will break
 if such a dict is passed?

I only got halfway through the grep KeyError results, but..

Demo/metaclass/Meta.py:55
Demo/tkinter/guido/AttrDialog.py:121  # Subclasses override self.classes
Lib/ConfigParser.py:623
Lib/random.py:315
Lib/string.py:191
Lib/weakref.py:56  # Currently uses UserDict but I assume it will
                   # switch to dict eventually

And the pièce de résistance..
Doc/tools/anno-api.py:51

It has this:
    try:
        info = rcdict[s]
    except KeyError:
        sys.stderr.write("No refcount data for %s\n" % s)
    else:
        ...
rcdict is loaded from refcounts.load().  refcounts.load() calls
refcounts.loadfile(), which has this (inside a loop):
    try:
        entry = d[function]
    except KeyError:
        entry = d[function] = Entry(function)
A prime candidate for a default.

Perhaps the KeyError shouldn't ever get triggered in this case, I'm
not sure.  I think that's beside the point though.  The programmer
clearly expected it would.

--
Adam Olsen, aka Rhamphoryncus


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Nick Coghlan
Adam Olsen wrote:
 And the pièce de résistance..
 Doc/tools/anno-api.py:51
 
 It has this:
     try:
         info = rcdict[s]
     except KeyError:
         sys.stderr.write("No refcount data for %s\n" % s)
     else:
         ...
 rcdict is loaded from refcounts.load().  refcounts.load() calls
 refcounts.loadfile(), which has this (inside a loop):
     try:
         entry = d[function]
     except KeyError:
         entry = d[function] = Entry(function)
 A prime candidate for a default.

 Perhaps the KeyError shouldn't ever get triggered in this case, I'm
 not sure.  I think that's beside the point though.  The programmer
 clearly expected it would.

Assuming the following override:

   class EntryDict(dict):
       def on_missing(self, key):
           value = Entry(key)
           self[key] = value
           return value

Then what it means is that the behaviour of "missing functions get an empty
refcount entry" propagates to the rcdict code.

So the consequence is that the code in anno-api will never print an error 
message - all functions are deemed to have associated refcount data in 
refcount.dat.

But that would be a bug in refcounts.loadfile: if it returns an EntryDict 
instead of a normal dict it is, in effect, returning an *infinite* dictionary 
that contains refcount definitions for every possible function name (some of 
them are just populated on demand).

So *if* refcounts.loadfile was converted to use an EntryDict, it would need to 
return dict(d) instead of returning d directly.
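
A rough sketch of that "fill with a default, hand back a plain dict"
pattern. The load function and its input are hypothetical, and
collections.defaultdict is assumed here in place of the EntryDict
override above:

```python
from collections import defaultdict

def load(pairs):
    # Filling is convenient with a default factory...
    d = defaultdict(list)
    for name, value in pairs:
        d[name].append(value)
    # ...but callers get a plain dict, so a missing name still
    # raises KeyError instead of silently growing the table.
    return dict(d)

table = load([("a", 1), ("a", 2), ("b", 3)])

try:
    table["missing"]
    raised = False
except KeyError:
    raised = True
```

The copy costs one pass over the dict, which seems a small price for
keeping the defaulting behaviour an internal detail of the loader.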

And this is where the question of whether has_key/__contains__ return True or
False when default_factory is set is important. If they return False, then the
LBYL (if key in d:) and EAFP (try/except) approaches give *different answers*.

More importantly, LBYL will never have side effects, whereas EAFP may.

If the methods always return True (as Martin suggests), then we retain the 
current behaviour where there is no real difference between the two 
approaches. Given the amount of time spent in recent years explaining this 
fact, I don't think it is an equivalence that should be broken lightly (IOW, 
I've persuaded myself that I agree with Martin)

The alternative would be to have an additional query API "will_default" that
reflects whether or not a given key is actually present in the dictionary
("if key not in d.keys()" would serve a similar purpose, but requires building
the list of keys).

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Nick Coghlan
Guido van Rossum wrote:
 But there were
 several suggestions that this would be fine functionality to add to
 the standard dict type -- and then I really don't see any other way to
 do this.

Given the constructor problem, and the issue with "this function expects a
plain dictionary", I think your original instinct to use a subclass may have
been correct.

The constructor is much cleaner that way:

# bag like behavior
dd = collections.autodict(int)
for elem in collection:
    dd[elem] += 1

# setdefault-like behavior
dd = collections.autodict(list)
for page_number, page in enumerate(book):
    for word in page.split():
        dd[word].append(page_number)

And it can be a simple fact that for an autodict, "if key in d" and "d[key]"
may give different answers.

Much cleaner than making the semantics of normal dicts dependent on:
  a. whether or not on_missing has been overridden
  b. whether or not default_factory has been set

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Phillip J. Eby
At 11:58 AM 02/17/2006 -0800, Guido van Rossum wrote:
 I forgot to mention in my revised proposal that the API for setting
 the default_factory is slightly odd:

    d = {}   # or dict()
    d.default_factory = list

 rather than

    d = dict(default_factory=list)

 This is of course because we cut off that way when we defined what
 arbitrary keyword arguments to the dict constructor would do. My
 original proposal solved this by creating a subclass. But there were
 several suggestions that this would be fine functionality to add to
 the standard dict type -- and then I really don't see any other way to
 do this. (Yes, I could have a set_default_factory() method -- but a
 simple settable attribute seems more pythonic!)

Why not a classmethod constructor:

  d = dict.with_factory(list)

Admittedly, the name's not that great.  Actually, it's almost as bad as
setdefault in some ways.  But I'd rather set the default and create the
dictionary in one operation, since when reading it as two, you first think
'd is a dictionary', and then 'oh, but it has a default factory', as
opposed to 'd is a dict with a factory' in one thought.  But maybe that's
just me.  :)
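
For illustration only, here is roughly how such a classmethod
constructor could look on a subclass. FactoryDict is hypothetical and
not part of the standard library, with_factory is the name floated
above, and the __missing__ hook is assumed as the spelling of
on_missing:

```python
class FactoryDict(dict):
    """Hypothetical dict subclass with a one-step factory constructor."""

    default_factory = None

    @classmethod
    def with_factory(cls, factory, *args, **kwargs):
        # Build the dict and attach the factory in a single call, so
        # keyword arguments stay free to mean ordinary keys.
        d = cls(*args, **kwargs)
        d.default_factory = factory
        return d

    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        value = self[key] = self.default_factory()
        return value

# The keyword route is already taken: this creates a key, not a factory.
plain = dict(default_factory=list)

d = FactoryDict.with_factory(list, default_factory="an ordinary key")
d["words"].append("spam")
```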



Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Guido van Rossum
On 2/17/06, Nick Coghlan [EMAIL PROTECTED] wrote:
 And this is where the question of whether has_key/__contains__ return True or
 False when default_factory is set is important. If they return False, then the
 LBYL (if key in d:) and EAFP (try/except) approaches give *different
 answers*.

 More importantly, LBYL will never have side effects, whereas EAFP may.

 If the methods always return True (as Martin suggests), then we retain the
 current behaviour where there is no real difference between the two
 approaches. Given the amount of time spent in recent years explaining this
 fact, I don't think it is an equivalence that should be broken lightly (IOW,
 I've persuaded myself that I agree with Martin)

 The alternative would be to have an additional query API will_default that
 reflects whether or not a given key is actually present in the dictionary (if
 key not in d.keys() would serve a similar purpose, but requires building the
 list of keys).

Looking at it from the "which invariants hold" POV isn't always the
right perspective.

Reality is that some amount of code that takes a dict won't work if
you give it a dict with a default_factory. Well, that's nothing new.
Some code also breaks if you pass it a dict containing key or value
types it doesn't expect, or if you pass it an anydbm instance, or
os.environ on Windows (which implements case-insensitive keys).

From the POV of someone who decides to use a dict with a
default_factory (or overriding on_missing()), having the 'in' operator
always return True is d*mn annoying -- it means that any kind of
introspection of the dict doesn't work. Take for example the multiset
use case. Suppose you're aware that you're using a dict with this
special behavior. Now you've built up your multiset and now you want
to use it. Part of your app is interested in knowing the list of
values associated with each key. But another part may be interested
only in whether a particular key has *any* values associated. If "key
in d" returns whether that key is currently present, you can write

  if key in d:
      print whatever

But under Martin and your proposed semantics, you'd have to write

  if d.get(key):
      print whatever

or (worse)

  if d[key]: # inserts an empty list into the dict!
      print whatever

I'd much rather be able to write "if key in d" and get the result I want...

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Bengt Richter
On Sat, 18 Feb 2006 00:52:51 +0100, Martin v. Löwis
[EMAIL PROTECTED] wrote:

 Adam Olsen wrote:
  Consider these two pieces of code:

  if key in d:
    dosomething(d[key])
  else:
    dosomethingelse()

  try:
    dosomething(d[key])
  except KeyError:
    dosomethingelse()

  Before they were the same (assuming dosomething() won't raise
  KeyError).  Now they would behave differently.

 I personally think they should continue to do the same thing,
 i.e. "in" should return True if there is a default; in the
 current proposal, it should invoke the default factory.

 But that's beside the point: Where is the real example
 where this difference would matter? (I'm not asking for
 a realistic example, I'm asking for a real one)

My guess is that realistically default_factory will be used
to make clean code for filling a dict, and then turning the factory
off if it's to be passed into unknown contexts. Those contexts
can then use old code to do as above, or if worth it can
temporarily set a factory to do some work. Tightly coupled
code I guess could pass factory-enabled dicts between each other.

IOW, no code should break unless you pass a factory-enabled dict
where you shouldn't ;-)
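
A small sketch of the "fill, then turn the factory off" idea. It
assumes collections.defaultdict, where default_factory is a writable
attribute and setting it to None restores the usual KeyError; the word
list is made up for the example:

```python
from collections import defaultdict

index = defaultdict(list)
for position, word in enumerate("spam eggs spam".split()):
    index[word].append(position)

# Disable defaulting before the dict escapes into unknown code.
index.default_factory = None

try:
    index["ham"]
    raised = False
except KeyError:
    raised = True
```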

That said, maybe enabling/disabling could be separate from d.default_factory
(e.g., d.defaults_enabled) as that could allow e.g. foo(**kw) more options
in how to copy kw and what foo could do. Would total copy including
defaulting state be best? What other copies must be sanitized?
type('Foo',(), **{'this':'one?'})

It will be interesting to see what comes out of the woodwork ;-)

Regards,
Bengt Richter



Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Michael Urman
On 2/17/06, Adam Olsen [EMAIL PROTECTED] wrote:
 if key in d:
   dosomething(d[key])
 else:
   dosomethingelse()

 try:
   dosomething(d[key])
 except KeyError:
   dosomethingelse()

I agree with the gut feeling that these should still do the same
thing. Could we modify d.get() instead?

>>> class ddict(dict):
...     default_value_factory = None
...     def get(self, k, d=None):
...         v = super(ddict, self).get(k, d)
...         if (v is not None or d is not None or
...                 self.default_value_factory is None):
...             return v
...         return self.setdefault(k, self.default_value_factory())
...
>>> d = ddict()
>>> d.default_value_factory = list
>>> d.get('list', [])
[]
>>> d['list']
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 'list'
>>> d.get('list').append(5)
>>> d['list']
[5]

There was never an exception raised by d.get so this wouldn't change
(assuming the C is implemented more carefully than the python above).
What are the problems with this other than, like setdefault, it only
works on values with mutator methods (i.e., no counting dicts)? Is the
lack of counting dicts that d.__getitem__ supports a deal breaker?

>>> d.default_value_factory = int
>>> d.get('count') += 1
SyntaxError: can't assign to function call

How does the above either in dict or a subclass compare to five line
or smaller custom subclasses using something like the following?
    def append(self, k, val):
        self.setdefault(k, []).append(val)
or
    def accumulate(self, k, val):
        try: self[k] += val
        except KeyError: self[k] = val
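
Spelled out as complete classes, the two helpers might look like the
sketch below. The class names MultiDict and CountDict are made up for
the example; the method bodies are the ones given above.

```python
class MultiDict(dict):
    """Hypothetical dict of lists with an explicit append helper."""

    def append(self, k, val):
        # setdefault stores the empty list only when k is new.
        self.setdefault(k, []).append(val)


class CountDict(dict):
    """Hypothetical counting dict with an explicit accumulate helper."""

    def accumulate(self, k, val):
        try:
            self[k] += val
        except KeyError:
            self[k] = val


multi = MultiDict()
multi.append("word", 1)
multi.append("word", 7)

counts = CountDict()
for elem in "abca":
    counts.accumulate(elem, 1)
```

Both classes leave d[key] and "key in d" behaving exactly as for a
plain dict; the cost is that callers have to use the new method names.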

Michael
--
Michael Urman  http://www.tortall.net/mu/blog


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Ilya Sandler


On Fri, 17 Feb 2006, Phillip J. Eby wrote:

   d = {}   # or dict()
   d.default_factory = list

 Why not a classmethod constructor:

   d = dict.with_factory(list)

  But I'd rather set the default and create the
 dictionary in one operation, since when reading it as two, you first think
 'd is a dictionary', and then 'oh, but it has a default factory', as
 opposed to d is a dict with a factory in one thought.


Also, class method would mean less typing (esp if dictionary name
happens to be longer than a couple of characters ;-)

But I'd like to suggest a different  name:

d = dict.with_default(list)

Ilya

