Re: [Python-Dev] defaultdict and on_missing()

2006-03-02 Thread Aahz
On Wed, Mar 01, 2006, Guido van Rossum wrote:

 Operations with two or more arguments are often better expressed as
 function calls -- for example, map() and filter() don't make much
 sense as methods on callables or sequences.

OTOH, my personal style is to always use re.compile() because I can
never remember the order of arguments for re.match()/re.search().
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

19. A language that doesn't affect the way you think about programming,
is not worth knowing.  --Alan Perlis
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] defaultdict and on_missing()

2006-03-02 Thread Guido van Rossum
On 3/2/06, Barry Warsaw [EMAIL PROTECTED] wrote:
 On Thu, 2006-03-02 at 07:26 -0800, Aahz wrote:
  OTOH, my personal style is to always use re.compile() because I can
  never remember the order of arguments for re.match()/re.search().

 Agreed.

I don't have that problem, because the order is the same either way:

 re.compile(pattern).match(line)
 re.match(pattern, line)

:-)
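
A quick check of the point above, as a minimal sketch: both spellings take the pattern first and the text second. The sample pattern and line are made up for illustration.

```python
import re

# Hypothetical sample data; any pattern and line would do.
pattern = r"(\w+)=(\d+)"
line = "width=80"

# Compiled form: the pattern goes to compile(), the text to match().
compiled_match = re.compile(pattern).match(line)

# Module-level form: pattern first, then the text.
direct_match = re.match(pattern, line)

print(compiled_match.groups())  # ('width', '80')
print(direct_match.groups())    # ('width', '80')
```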

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] defaultdict and on_missing()

2006-03-02 Thread Aahz
On Thu, Mar 02, 2006, Guido van Rossum wrote:
 On 3/2/06, Barry Warsaw [EMAIL PROTECTED] wrote:
 On Thu, 2006-03-02 at 07:26 -0800, Aahz wrote:

 OTOH, my personal style is to always use re.compile() because I can
 never remember the order of arguments for re.match()/re.search().

 Agreed.
 
 I don't have that problem, because the order is the same either way:
 
  re.compile(pattern).match(line)
  re.match(pattern, line)

But that would require thinking!  ;-)  More seriously, much as I hate the
way ''.join() looks, I have never gotten mixed up about argument order as
I used to with string.join().
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

19. A language that doesn't affect the way you think about programming,
is not worth knowing.  --Alan Perlis


Re: [Python-Dev] defaultdict and on_missing()

2006-03-01 Thread Brett Cannon
On 2/28/06, Terry Reedy [EMAIL PROTECTED] wrote:

 Greg Ewing [EMAIL PROTECTED] wrote in message
 news:[EMAIL PROTECTED]
  And you don't think there are many different
  types of iterables? You might as well argue
  that we don't need len() because it only
  applies to sequences.

 Since you mention it..., many people *have* asked on c.l.p why len() is a
 builtin function rather than a method of sequences (and other collections)
 (as .len, not .__len__).  Some have suggested that it should be the latter.
 The answers justifying the status quo have been twofold.

 1.  Before 2.2, not all builtin sequence types had methods (str and tuple),
 so they could not have a .len method.  (This begs the question of why not,
 but that is moot now.)

Well, up until 2.2 you didn't have new-style classes which have a
common base class.  And if you wanted to do the length computation
only when requested, you needed a method.  But now with object, we
could add extra smarts to __getattr__ or __getattribute__ so that if
``spam.len`` is requested it calls ``spam.__len__()`` for you,
basically a poor-man's property.  Or, if ``spam.len`` is defined,
return that.
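
A minimal sketch of that idea, written for current Python 3. The mixin name LenAttr and the Spam/Eggs classes are hypothetical; nothing like this exists in the standard library. It relies on __getattr__ being consulted only after normal lookup fails, so a real ``len`` attribute takes precedence.

```python
class LenAttr:
    """Hypothetical mixin: expose len(obj) as the attribute obj.len."""

    def __getattr__(self, name):
        # Only reached when normal attribute lookup has already failed,
        # so an explicitly defined 'len' attribute is returned instead.
        if name == "len":
            return self.__len__()
        raise AttributeError(name)


class Spam(LenAttr, list):
    """List that picks up the computed .len attribute."""


class Eggs(LenAttr, list):
    len = "explicitly defined"  # wins over the __getattr__ fallback


spam = Spam([1, 2, 3])
eggs = Eggs([1, 2, 3])
print(spam.len)  # 3
print(eggs.len)  # explicitly defined
```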

But moving over to more attributes for how we access basic interfaces
seems great to me.

-Brett


Re: [Python-Dev] defaultdict and on_missing()

2006-03-01 Thread Guido van Rossum
On 3/1/06, Brett Cannon [EMAIL PROTECTED] wrote:
 But moving over to more attributes for how we access basic interfaces
 seems great to me.

We shouldn't approach this as methods good, functions bad -- nor the
opposite, for that matter.

When all things are equal, I'd much rather do usability studies to
find out what newbies (who haven't been given the OO kool-aid yet)
find easier. For example, math functions are typically written as
sin(x) -- I wouldn't want to switch that to x.sin() and I think most
people would agree.

But often, all things aren't equal. Sometimes, a single algorithm can
be applied to a large set of different types, and then making it a
method is a waste (forcing each type to reimplement the same
algorithm). sin() is actually an example of this (since it applies to
int, long and float). Other times, the same conceptual operation must
be implemented differently for each type, and then a method makes more
sense. I like to think of list.sort() as an example of this -- sorting
algorithms are tightly coupled to internal data representation, and a
generic sort function for mutable sequences would likely be of
theoretical interest only -- in practice it would be much slower than
an implementation that can make use of the internal representation
directly.

Operations with two or more arguments are often better expressed as
function calls -- for example, map() and filter() don't make much
sense as methods on callables or sequences.

str.join() is an interesting case, where usability studies may be
necessary. I've often been asked why this isn't a list method -- but
of course that would be less general, since joining strings applies
equally well to other types of sequences (and iterables). Making it a
string method is arguably the right thing to do, since this operation
only makes sense for strings. But many people (not me!) find the
", ".join(seq) notation hard to read; some other languages use a built-in
function join(seq, str) instead, which is arguably more readable. The
type of such a polymorphic function is easily specified:
join(sequence[T], T) -> T, where T is a string-ish type. (It should
make sense for T==bytes as well; I'm not so sure about T==list though.
:-)

One problem with the methods approach is that there's less pressure to
use the same API for all object types (especially with duck typing).
For an example of methods gone horribly wrong, look at Java, where you
have builtin-array.length, String.length(), and Collection.size().
Give me len() any day. I believe Ruby has similar confusing diversity
for looping (each/forEach).
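
To make the len() point concrete, here is a small sketch: one function covers the built-in containers and any user class that implements __len__. The Playlist class is hypothetical and exists only for this example.

```python
class Playlist:
    """Hypothetical container; implementing __len__ is all len() needs."""

    def __init__(self, *tracks):
        self._tracks = list(tracks)

    def __len__(self):
        return len(self._tracks)


# The same spelling works for a list, a str, a dict, and the custom class.
objects = ([1, 2, 3], "spam", {"a": 1}, Playlist("x", "y"))
sizes = [len(obj) for obj in objects]
print(sizes)  # [3, 4, 1, 2]
```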

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] defaultdict and on_missing()

2006-03-01 Thread Greg Ewing
Guido van Rossum wrote:

 str.join() is an interesting case...

  Making it a
 string method is arguably the right thing to do, since this operation
 only makes sense for strings. 

  The type of such a polymorphic function is easily specified:
 join(sequence[T], T) -> T, where T is a string-ish type.

I'd say it makes sense for any type that supports
concatenation (maybe that's what you mean by string-ish?)

This looks like a case where the xxx()/__xxx__() pattern
could be of benefit. Suppose there were a function

    def join(seq, sep):
        if hasattr(sep, '__join__'):
            return sep.__join__(seq)
        else:
            # generic implementation

Then you could get nice fast type-specific implementations
for strings, bytes, etc., without being limited to those
types.
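
One way the proposal could be fleshed out, as a sketch only. __join__ is the hypothetical special method from the message above; no built-in type defines it. The generic branch assumes the separator supports slicing and that items support +, and it concatenates pairwise, so it is quadratic and meant purely as an illustration. FastSep is an invented class showing the fast path.

```python
from functools import reduce


def join(seq, sep):
    """Join the items of seq with sep, preferring a type-specific hook."""
    # Fast path: defer to the (hypothetical) __join__ special method.
    if hasattr(sep, "__join__"):
        return sep.__join__(seq)
    items = list(seq)
    if not items:
        return sep[:0]  # empty value of the separator's own type
    # Generic fallback for any type that supports concatenation with +.
    return reduce(lambda acc, item: acc + sep + item, items[1:], items[0])


class FastSep(str):
    """Hypothetical separator type that supplies its own __join__."""

    def __join__(self, seq):
        return str.join(self, seq)


print(join(["a", "b", "c"], ", "))         # a, b, c
print(join([b"a", b"b"], b"-"))            # b'a-b'
print(join([[1], [2]], [0]))               # [1, 0, 2]
print(join(["usr", "lib"], FastSep("/")))  # usr/lib
```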

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiam! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+


Re: [Python-Dev] defaultdict and on_missing()

2006-02-28 Thread Nick Coghlan
Greg Ewing wrote:
 Nick Coghlan wrote:
 
 I wouldn't mind seeing one of the early ideas from PEP 340 being 
 resurrected some day, such that the signature for the special method 
 was __next__(self, input) and for the builtin next(iterator, 
 input=None)
 
 Aren't we getting an argument to next() anyway?
 Or was that idea dropped?

PEP 342 opted to extend the generator API instead (using send) and leave the 
iterator protocol alone for the time being.
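
For reference, a minimal sketch of the PEP 342 mechanism: a value passed to send() becomes the result of the yield expression inside the generator, while plain next() delivers None. The running_total generator is a made-up example, and the snippet uses the Python 3 next() builtin.

```python
def running_total():
    """Receive numbers via send() and yield the sum so far."""
    total = 0
    while True:
        value = yield total
        # A plain next() call resumes the generator with value set to None.
        if value is not None:
            total += value


gen = running_total()
start = next(gen)        # run to the first yield
after_5 = gen.send(5)    # 5 is delivered as the result of the yield
after_10 = gen.send(10)
unchanged = next(gen)    # no input, so the total stays the same
print(start, after_5, after_10, unchanged)  # 0 5 15 15
```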

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] defaultdict and on_missing()

2006-02-28 Thread Guido van Rossum
On 2/28/06, Nick Coghlan [EMAIL PROTECTED] wrote:
 Greg Ewing wrote:
  Nick Coghlan wrote:
 
  I wouldn't mind seeing one of the early ideas from PEP 340 being
  resurrected some day, such that the signature for the special method
  was __next__(self, input) and for the builtin next(iterator,
  input=None)
 
  Aren't we getting an argument to next() anyway?
  Or was that idea dropped?

 PEP 342 opted to extend the generator API instead (using send) and leave the
 iterator protocol alone for the time being.

One of the main reasons for this was the backwards compatibility
problems at the C level. The C implementation doesn't take an
argument. Adding an argument would cause all sorts of code breakage
and possible segfaults (if there's 3rd party code calling tp_iternext for
example).

In 3.0 we could fix this.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] defaultdict and on_missing()

2006-02-28 Thread Terry Reedy

Greg Ewing [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 And you don't think there are many different
 types of iterables? You might as well argue
 that we don't need len() because it only
 applies to sequences.

Since you mention it..., many people *have* asked on c.l.p why len() is a 
builtin function rather than a method of sequences (and other collections) 
(as .len, not .__len__).  Some have suggested that it should be the latter. 
The answers justifying the status quo have been twofold.

1.  Before 2.2, not all builtin sequence types had methods (str and tuple), 
so they could not have a .len method.  (This begs the question of why not, 
but that is moot now.)

- whereas .next came in with the universalization of methods.

2. Before the addition of list comprehensions, a function could be mapped 
much more easily than a method

- whereas now we do have list comps and even this works
>>> [i.__add__(2) for i in range(3)]
[2, 3, 4]

- I can imagine wanting to map len to, for instance, a list of strings more 
easily than I can imagine a reason to map next/.next to a list of 
iterators, and if I did, I am willing to use the list comp form.
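
A short sketch of the comparison, with invented sample data. It is written for Python 3, so it uses the next() builtin, which did not exist when this message was posted.

```python
words = ["spam", "eggs", "ni"]

# Mapping a builtin function over a list is direct.
lengths_via_map = list(map(len, words))

# The list comprehension form works just as well with a method call...
lengths_via_method = [w.__len__() for w in words]

# ...and also covers the rarer case of stepping a list of iterators.
iterators = [iter("ab"), iter([10, 20]), iter(range(3))]
first_items = [next(it) for it in iterators]

print(lengths_via_map)     # [4, 4, 2]
print(lengths_via_method)  # [4, 4, 2]
print(first_items)         # ['a', 10, 0]
```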

Terry Jan Reedy





Re: [Python-Dev] defaultdict and on_missing()

2006-02-27 Thread Greg Ewing
Nick Coghlan wrote:

 I wouldn't mind seeing one of the early ideas from PEP 340 being resurrected 
 some day, such that the signature for the special method was __next__(self, 
 input) and for the builtin next(iterator, input=None)

Aren't we getting an argument to next() anyway?
Or was that idea dropped?

Greg



Re: [Python-Dev] defaultdict and on_missing()

2006-02-25 Thread Guido van Rossum
FWIW this has now been checked in. Enjoy!

--Guido

On 2/23/06, Guido van Rossum [EMAIL PROTECTED] wrote:
 On 2/22/06, Michael Chermside [EMAIL PROTECTED] wrote:
  A minor related point about on_missing():
 
  Haven't we learned from regrets over the .next() method of iterators
  that all magically invoked methods should be named using the __xxx__
  pattern? Shouldn't it be named __on_missing__() instead?

 Good point. I'll call it __missing__. I've uploaded a new patch to
 python.org/sf/1433928.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] defaultdict and on_missing()

2006-02-24 Thread Greg Ewing
Raymond Hettinger wrote:
 Code that 
 uses next() is more understandable, friendly, and readable without the 
 walls of underscores.

There wouldn't be any walls of underscores, because

   y = x.next()

would become

   y = next(x)

The only time you would need to write underscores is
when defining a __next__ method. That would be no worse
than defining an __init__ or any other special method,
and has the advantage that it clearly marks the method
as being special.
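
A minimal sketch of how that reads in practice, using the spelling that Python 3 later adopted. The Countdown class is a hypothetical example; the underscores appear only in the class definition, never at the call site.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        # The special method is written once, in the class body.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


x = Countdown(3)
y = next(x)     # call sites use the plain function
rest = list(x)  # the for-loop machinery calls __next__ implicitly
print(y, rest)  # 3 [2, 1]
```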

--
Greg


Re: [Python-Dev] defaultdict and on_missing()

2006-02-23 Thread Guido van Rossum
On 2/22/06, Michael Chermside [EMAIL PROTECTED] wrote:
 A minor related point about on_missing():

 Haven't we learned from regrets over the .next() method of iterators
 that all magically invoked methods should be named using the __xxx__
 pattern? Shouldn't it be named __on_missing__() instead?

Good point. I'll call it __missing__. I've uploaded a new patch to
python.org/sf/1433928.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] defaultdict and on_missing()

2006-02-23 Thread Thomas Wouters
On Wed, Feb 22, 2006 at 01:13:28PM -0800, Michael Chermside wrote:

 Haven't we learned from regrets over the .next() method of iterators
 that all magically invoked methods should be named using the __xxx__
 pattern? Shouldn't it be named __on_missing__() instead?

I agree that on_missing should be __missing__ (or __missing_key__) but I
don't agree with the claim that all 'magically' invoked methods should be
two-way-double-underscored. __methods__ are methods that should only be
called 'magically', or by the object itself. 'next' has quite a few use cases
where it's desirable to call it directly (and I often do.)

-- 
Thomas Wouters [EMAIL PROTECTED]

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


Re: [Python-Dev] defaultdict and on_missing()

2006-02-23 Thread Walter Dörwald
Guido van Rossum wrote:

 On 2/22/06, Michael Chermside [EMAIL PROTECTED] wrote:
 A minor related point about on_missing():

 Haven't we learned from regrets over the .next() method of iterators
 that all magically invoked methods should be named using the __xxx__
 pattern? Shouldn't it be named __on_missing__() instead?
 
 Good point. I'll call it __missing__. I've uploaded a new patch to
 python.org/sf/1433928.

I always thought that __magic__ method calls are done by Python on 
objects it doesn't know about. The special method name ensures that it 
is indeed the protocol Python is talking about, not some random method 
(with next() being the exception). In the defaultdict case this isn't a 
problem, because defaultdict is calling its own method.

Bye,
Walter Dörwald



Re: [Python-Dev] defaultdict and on_missing()

2006-02-23 Thread Michael Chermside
Walter Dörwald writes:
 I always thought that __magic__ method calls are done by Python on
 objects it doesn't know about. The special method name ensures that it
 is indeed the protocol Python is talking about, not some random method
 (with next() being the exception). In the defaultdict case this isn't a
 problem, because defaultdict is calling its own method.

I, instead, felt that the __xxx__ convention served a few purposes. First,
it indicates that the method will be called by some means OTHER than
by name (generally, the interpreter invokes it directly, although in this
case it's a built-in method of dict that would invoke it). Secondly, it serves
to flag the method as being special -- true newbies can safely ignore
nearly all special methods aside from __init__(). And it serves to create
a separate namespace... writers of Python code know that names
beginning and ending with double-underscores are reserved for the
language. Of these, I always felt that special invocation was the most
important feature. The next() method of iterators was an interesting
object lesson. The original reasoning (I think) for using next() not
__next__() was that *sometimes* the method was called directly by
name (when stepping an iterator manually, which one frequently does
for perfectly good reasons). Since it was sometimes invoked by name
and sometimes by special mechanism, the choice was to use the
unadorned name, but later experience showed that it would have been
better the other way.

-- Michael Chermside


Re: [Python-Dev] defaultdict and on_missing()

2006-02-23 Thread Greg Ewing
Michael Chermside wrote:
 The next() method of iterators was an interesting
 object lesson. ... Since it was sometimes invoked by name
 and sometimes by special mechanism, the choice was to use the
 unadorned name, but later experience showed that it would have been
 better the other way.

Any thoughts about fixing this in 3.0?

--
Greg


Re: [Python-Dev] defaultdict and on_missing()

2006-02-23 Thread Greg Ewing
Thomas Wouters wrote:

 __methods__ are methods that should only be
 called 'magically', or by the object itself. 
  'next' has quite a few use cases where it's
 desirable to call it directly

That's why the proposal to replace .next() with
.__next__() comes along with a function next(obj)
which calls obj.__next__().

--
Greg


[Python-Dev] defaultdict and on_missing()

2006-02-22 Thread Raymond Hettinger



I'm concerned that the on_missing() part of the proposal is gratuitous.  The
main use cases for defaultdict have a simple factory that supplies a zero,
empty list, or empty set.  The on_missing() hook is only there to support
the rarer case of needing a key to compute a default value.  The hook is not
needed for the main use cases.

As it stands, we're adding a method to regular dicts that cannot be usefully
called directly.  Essentially, it is a framework method meant to be
overridden in a subclass.  So, it only makes sense in the context of
subclassing.  In the meantime, we've added an oddball method to the main
dict API, arguably the most important object API in Python.

To use the hook, you write something like this:

    class D(dict):
        def on_missing(self, key):
            return somefunc(key)

However, we can already do something like that without the hook:

    class D(dict):
        def __getitem__(self, key):
            try:
                return dict.__getitem__(self, key)
            except KeyError:
                self[key] = value = somefunc(key)
                return value

The latter form is already possible, doesn't require modifying a basic API,
and is arguably clearer about when it is called and what it does (the former
doesn't explicitly show that the returned value gets saved in the
dictionary).

Since we can already do the latter form, we can get some insight into
whether the need has ever actually arisen in real code.  I scanned the usual
sources (my own code, the standard library, and my most commonly used
third-party libraries) and found no instances of code like that.  The
closest approximation was safe_substitute() in string.Template where missing
keys returned themselves as a default value.  Other than that, I conclude
that there isn't sufficient need to warrant adding a funky method to the API
for regular dicts.

I wondered why the safe_substitute() example was unique.  I think the answer
is that we normally handle default computations through simple in-line code
("if k in d: do1() else do2()" or a try/except pair).  Overriding
on_missing() then is really only useful when you need to create a type that
can be passed to a client function that was expecting a regular dictionary.
So it does come-up but not much.

Aside:  Why on_missing() is an oddball among dict methods.  When teaching
dicts to beginner, all the methods are easily explainable except this one.
You don't call this method directly, you only use it when subclassing, you
have to override it to do anything useful, it hooks KeyError but only when
raised by __getitem__ and not other methods, etc.  I'm concerned that even
having this method in regular dict API will create confusion about when to
use dict.get(), when to use dict.setdefault(), when to catch a KeyError, or
when to LBYL.  Adding this one extra choice makes the choice more difficult.

My recommendation:  Dump the on_missing() hook.  That leaves the dict API
unmolested and allows a more straight-forward implementation/explanation of
collections.default_dict or whatever it ends-up being named.  The result is
delightfully simple and easy to understand/explain.


Raymond









Re: [Python-Dev] defaultdict and on_missing()

2006-02-22 Thread Fredrik Lundh
Raymond Hettinger wrote:

 Aside:  Why on_missing() is an oddball among dict methods.  When
 teaching dicts to beginner, all the methods are easily explainable
 except this one.  You don't call this method directly, you only use it
 when subclassing, you have to override it to do anything useful, it
 hooks KeyError but only when raised by __getitem__ and not
 other methods, etc.

agreed.

 My recommendation:  Dump the on_missing() hook.  That leaves
 the dict API unmolested and allows a more straight-forward
 implementation/explanation of collections.default_dict or whatever
 it ends-up being named.  The result is delightfully simple and easy
 to understand/explain.

agreed.

a separate type in collections, a template object (or factory) passed to
the constructor, and implementation inheritance, is more than good
enough.  and if I recall correctly, pretty much what Guido first proposed.
I trust his intuition a lot more than I trust the
design-by-committee-without-use-cases process.

/F 





Re: [Python-Dev] defaultdict and on_missing()

2006-02-22 Thread Greg Ewing
Raymond Hettinger wrote:
 I'm concerned that the on_missing() part of the proposal is gratuitous.  

I second all that. A clear case of YAGNI.

--
Greg


Re: [Python-Dev] defaultdict and on_missing()

2006-02-22 Thread Guido van Rossum
On 2/22/06, Raymond Hettinger [EMAIL PROTECTED] wrote:
 I'm concerned that the on_missing() part of the proposal is gratuitous.  The
 main use cases for defaultdict have a simple factory that supplies a zero,
 empty list, or empty set.  The on_missing() hook is only there to support
 the rarer case of needing a key to compute a default value.  The hook is not
 needed for the main use cases.

The on_missing() hook is there to take the action of inserting the
default value into the dict. For this it needs the key.

It seems attractive to collapse default_factory and on_missing into a
single attribute (my first attempt did this, and I was halfway through posting
about it before I realized the mistake). But on_missing() really needs
the key, and at the same time you don't want to lose the convenience
of being able to specify set, list, int etc. as default factories, so
default_factory() must be called without the key.

If you don't have on_missing, then the functionality of inserting the
value produced by default_factory would have to be in-lined in
__getitem__, which means the machinery put in place can't be reused
for other use cases -- several people have claimed to have a use case
for returning a value *without* inserting it into the dict.
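
A rough sketch of the split described above, written for current Python 3. It uses the __missing__ spelling, since that is the hook name dict.__getitem__ looks for today; the class names SketchDefaultDict and EchoDict are hypothetical and this is not the actual collections implementation.

```python
class SketchDefaultDict(dict):
    """Hypothetical stand-in showing the factory/hook division of labor."""

    def __init__(self, default_factory=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # Invoked by dict.__getitem__ only when key is absent.
        if self.default_factory is None:
            raise KeyError(key)
        # The factory takes no key, so list, set, and int work as-is...
        value = self.default_factory()
        # ...while the hook needs the key in order to store the value.
        self[key] = value
        return value


class EchoDict(dict):
    """Reuses the same hook to return a value without inserting it."""

    def __missing__(self, key):
        return key


groups = SketchDefaultDict(list)
groups["a"].append(1)
echo = EchoDict()
print(groups)                  # {'a': [1]}
print(echo["x"], "x" in echo)  # x False
```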

 As it stands, we're adding a method to regular dicts that cannot be usefully
 called directly.  Essentially, it is a framework method meant to be
 overridden in a subclass.  So, it only makes sense in the context of
 subclassing.  In the meantime, we've added an oddball method to the main
 dict API, arguably the most important object API in Python.

Which to me actually means it's a *good* place to put the hook
functionality, since it allows for maximum reuse.

 To use the hook, you write something like this:

 class D(dict):
     def on_missing(self, key):
         return somefunc(key)

Or, more likely,

    def on_missing(self, key):
        self[key] = value = somefunc()
        return value

 However, we can already do something like that without the hook:

 class D(dict):
     def __getitem__(self, key):
         try:
             return dict.__getitem__(self, key)
         except KeyError:
             self[key] = value = somefunc(key)
             return value

 The latter form is already possible, doesn't require modifying a basic API,
 and is arguably clearer about when it is called and what it does (the former
 doesn't explicitly show that the returned value gets saved in the
 dictionary).

This is exactly what Google's internal DefaultDict does. But it is
also its downfall, because now *all* __getitem__ calls are weighed
down by going through Python code; in a particular case that came up
at Google I had to recommend against using it for performance reasons.

 Since we can already do the latter form, we can get some insight into
 whether the need has ever actually arisen in real code.  I scanned the usual
 sources (my own code, the standard library, and my most commonly used
 third-party libraries) and found no instances of code like that.   The
 closest approximation was safe_substitute() in string.Template where missing
 keys returned themselves as a default value.  Other than that, I conclude
 that there isn't sufficient need to warrant adding a funky method to the API
 for regular dicts.

In this case I don't believe that the absence of real-life examples
says much (and BTW Google's DefaultDict *is* such a real life example;
it is used in other code). There is not much incentive for subclassing
dict and overriding __getitem__ if the alternative is that in a few
places you have to write two lines of code instead of one:

if key not in d: d[key] = set()  # this line would be unneeded
d[key].add(value)
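
For comparison, a small sketch of both spellings side by side, using collections.defaultdict as it eventually shipped. The sample pairs are invented.

```python
from collections import defaultdict

pairs = [("fruit", "apple"), ("veg", "leek"), ("fruit", "pear")]

# Plain dict: the guard line is needed before every add.
plain = {}
for key, value in pairs:
    if key not in plain:
        plain[key] = set()
    plain[key].add(value)

# defaultdict: the missing-key case is handled by the factory.
d = defaultdict(set)
for key, value in pairs:
    d[key].add(value)

print(plain == dict(d))  # True
```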

 I wondered why the safe_substitute() example was unique.  I think the answer
 is that we normally handle default computations through simple in-line code
 ("if k in d: do1() else do2()" or a try/except pair).  Overriding
 on_missing() then is really only useful when you need to create a type that
 can be passed to a client function that was expecting a regular dictionary.
 So it does come-up but not much.

I think the pattern hasn't been commonly known; people have been
struggling with setdefault() all these years.

 Aside:  Why on_missing() is an oddball among dict methods.  When teaching
 dicts to beginner, all the methods are easily explainable except this one.

You don't seriously teach beginners all dict methods do you?
setdefault(), update(), copy() are all advanced material, and so are
iteritems(), itervalues() and iterkeys() (*especially* the last since
it's redundant through "for i in d:").

 You don't call this method directly, you only use it when subclassing, you
 have to override it to do anything useful, it hooks KeyError but only when
 raised by __getitem__ and not other methods, etc.

The only other methods that raise KeyError are __delitem__, pop() and
popitem(). I don't see how these could use the same hook as

Re: [Python-Dev] defaultdict and on_missing()

2006-02-22 Thread Raymond Hettinger
[Guido van Rossum]
 If we removed on_missing() from dict, we'd have to override
 __getitem__ in defaultdict (regardless of whether we give
 defaultdict an on_missing() hook or in-line it).

You have another option.  Keep your current modifications to
dict.__getitem__ but do not include dict.on_missing().  Let it only
be called in a subclass IF it is defined; otherwise, raise KeyError.
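
A sketch of what this option looks like from the user's side, checked against current Python 3, where the hook is spelled __missing__. The Plain and Hooked class names are hypothetical. The base dict exposes no such method, and a subclass gets the behavior only by defining it.

```python
class Plain(dict):
    """Subclass that defines no hook: missing keys raise KeyError."""


class Hooked(dict):
    """Subclass that opts in by defining the hook."""

    def __missing__(self, key):
        return "default for %r" % (key,)


# The hook does not appear in the basic dict API.
dict_has_hook = hasattr(dict, "__missing__")

try:
    Plain()["x"]
    plain_raised = False
except KeyError:
    plain_raised = True

hooked_value = Hooked()["x"]
print(dict_has_hook, plain_raised, hooked_value)  # False True default for 'x'
```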

That keeps me happy since the basic dict API won't show on_missing(),
but it still allows a user to attach an on_missing method to a dict subclass
when or if needed.  I think all your test cases would still pass without
modification.  This approach is not much different than for other magic
methods which kick-in if defined or revert to a default behavior if not.

My core concern is to keep the dict API clean as a whistle.


Raymond 


Re: [Python-Dev] defaultdict and on_missing()

2006-02-22 Thread Guido van Rossum
On 2/22/06, Raymond Hettinger [EMAIL PROTECTED] wrote:
 [Guido van Rossum]
  If we removed on_missing() from dict, we'd have to override
  __getitem__ in defaultdict (regardless of whether we give
 defaultdict an on_missing() hook or in-line it).

 You have another option.  Keep your current modifications to
 dict.__getitem__ but do not include dict.on_missing().  Let it only
 be called in a subclass IF it is defined; otherwise, raise KeyError.

OK. I don't have time right now for another round of patches -- if you
do, please go ahead. The dict docs in my latest patch must be updated
somewhat (since they document on_missing()).

 That keeps me happy since the basic dict API won't show on_missing(),
 but it still allows a user to attach an on_missing method to a dict subclass
 when or if needed.  I think all your test cases would still pass without
 modification.

Except the ones that explicitly test for dict.on_missing()'s presence
and behavior. :-)

 This approach is not much different than for other magic methods which
 kick-in if defined or revert to a default behavior if not.

Right. Plenty of precedent there.

 My core concern is to keep the dict API clean as a whistle.

Understood.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] defaultdict and on_missing()

2006-02-22 Thread Edward C. Jones
Guido van Rossum wrote:
 I think the pattern hasn't been commonly known; people have been
 struggling with setdefault() all these years.

I use setdefault _only_ to speed up the following code pattern:

if akey not in somedict:
    somedict[akey] = list()
somedict[akey].append(avalue)

These lines of simple Python are much easier to read and write than

somedict.setdefault(akey, list()).append(avalue)
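
A small sketch confirming the two spellings build the same result. The function names and sample pairs are made up for the example. Note that the setdefault form evaluates list() on every call, even when the key is already present.

```python
def group_explicit(pairs):
    """Three-line form: test for the key, then append."""
    somedict = {}
    for akey, avalue in pairs:
        if akey not in somedict:
            somedict[akey] = list()
        somedict[akey].append(avalue)
    return somedict


def group_setdefault(pairs):
    """One-line form: setdefault returns the existing or the new list."""
    somedict = {}
    for akey, avalue in pairs:
        somedict.setdefault(akey, list()).append(avalue)
    return somedict


pairs = [("a", 1), ("b", 2), ("a", 3)]
print(group_explicit(pairs))    # {'a': [1, 3], 'b': [2]}
print(group_setdefault(pairs))  # {'a': [1, 3], 'b': [2]}
```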


Re: [Python-Dev] defaultdict and on_missing()

2006-02-22 Thread Michael Chermside
A minor related point about on_missing():

Haven't we learned from regrets over the .next() method of iterators
that all magically invoked methods should be named using the __xxx__
pattern? Shouldn't it be named __on_missing__() instead?

-- Michael Chermside
