Re: [Python-ideas] dict.setdefault_call(), or API variations thereupon

Steven D'Aprano Fri, 02 Nov 2018 17:07:06 -0700

On Fri, Nov 02, 2018 at 09:52:24AM -0700, Chris Barker wrote:
> On Thu, Nov 1, 2018 at 8:34 PM, Steven D'Aprano <st...@pearwood.info> wrote:
> 
> > The bottom line is, if I understand your proposal, the functionality
> > already exists. All you need do is subclass dict and give it a
> > __missing__ method which does what you want.
> 
> 
> or subclass dict and give it a "setdefault_call" method :-)


Well sure, if we're making up our own methods and calling them anything 
we like :-)

The status quo (as I see it):

dict.setdefault:
    - takes an explicit, but eagerly evaluated, default value;

dict.__missing__:
    - requires subclassing to make it work;
    - passes the missing key to the method, so the method can
      decide what value to return;

defaultdict:
    - takes a zero-argument factory function which is 
      unconditionally called when the key is missing.

Did I miss any?
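
For concreteness, here is a minimal sketch of the three options listed 
above. The names make_default and KeyAwareDict are made up for 
illustration; the call counter is only there to show when the default 
actually gets built.

```python
from collections import defaultdict

calls = []

def make_default():
    # Hypothetical helper: records each call so we can see when a default is built.
    calls.append("called")
    return []

# dict.setdefault: the default argument is evaluated eagerly on every call,
# even when the key is already present.
d = {}
d.setdefault("a", make_default()).append(1)
d.setdefault("a", make_default()).append(2)
eager_calls = len(calls)

# dict.__missing__: requires a subclass, but the method receives the key,
# so the default can depend on it.
class KeyAwareDict(dict):
    def __missing__(self, key):
        value = self[key] = [key]
        return value

k = KeyAwareDict()
k["x"].append(1)

# defaultdict: a zero-argument factory, called only when the key is missing.
calls.clear()
dd = defaultdict(make_default)
dd["a"].append(1)
dd["a"].append(2)
lazy_calls = len(calls)
```

Here setdefault builds two lists for one key, while the defaultdict 
factory runs once.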

What we don't have is a version of setdefault where the default is 
evaluated only on need. That would be a great use-case for Call-By-Name 
semantics and thunks, if Python supported such :-)

(That's just a half-baked thought, not a concrete proposal.)


> But as I think Guido was pointing out, the real difference here is that
> DefaultDict, or any other subclass, is specifying what the default callable
> is for the entire dict, rather than at time of use. 

As you show below, a default callable for the dict is precisely the use-case 
the OP gives:

    l = d.setdefault_call(somekey, list)

would be equivalent to d = defaultdict(list) followed by l = d[somekey].

(I think. Have I missed something?)
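
A rough sketch of that equivalence, assuming the proposed method behaves 
like the hypothetical CallDict.setdefault_call below (no such method 
exists on dict; this is just one guess at its semantics):

```python
from collections import defaultdict

class CallDict(dict):
    """Hypothetical dict subclass carrying the proposed method."""

    def setdefault_call(self, key, factory):
        # Call factory() only when the key is missing, then store the result.
        try:
            return self[key]
        except KeyError:
            value = self[key] = factory()
            return value

d = CallDict()
d.setdefault_call("k", list).append(1)
d.setdefault_call("k", list).append(2)

dd = defaultdict(list)
dd["k"].append(1)
dd["k"].append(2)

# The per-call form can pick a different factory for each key,
# which a single defaultdict factory cannot do.
d.setdefault_call("s", set).add(3)
```

For a single factory the two give the same result; the last line shows 
the case where the factory varies at the point of use.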

Nevertheless, Guido's point is reasonable -- if it comes up in practice 
often enough to care.


[...]
> As for the OP's justification:
> 
> """
> If it's not clear, the purpose is to eliminate the overhead of creating an
> empty list or similar in situations like this:
> 
> d = {}
> for i in range(1000000):  # some large loop
>      l = d.setdefault(somekey, [])
>      l.append(somevalue)
>
> # instead...
> 
> for i in range(1000000):
>     l = d.setdefault_call(somekey, list)
>     l.append(somevalue)
> 
> """

Are we sure that the overhead is significantly more than the cost of the 
name lookup of "list" and the expense of calling it?

You do demonstrate a speed difference with defaultdict (thanks for doing 
the timing tests) but the situation isn't precisely comparable to the 
proposed method, since you aren't looking up the name "list" each time 
through the outer loop.
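
One way to check would be something like the following sketch. It is 
not the timing test from earlier in the thread: the function names, the 
loop size N and the pure-Python emulation of setdefault_call are all 
assumptions, and the emulation cannot stand in for the cost of a real 
C-level method. It only puts the name lookup of "list" back inside the 
loop so the comparison is a little fairer.

```python
import timeit
from collections import defaultdict

N = 10_000  # assumed loop size, kept small so the sketch runs quickly

def with_setdefault():
    d = {}
    for i in range(N):
        # A fresh empty list is built on every iteration.
        d.setdefault(i % 10, []).append(i)
    return d

def with_defaultdict():
    d = defaultdict(list)  # "list" is looked up once, outside the loop
    for i in range(N):
        d[i % 10].append(i)
    return d

def with_emulated_setdefault_call():
    d = {}
    for i in range(N):
        key = i % 10
        factory = list  # name looked up every iteration, as an argument would be
        if key not in d:
            d[key] = factory()
        d[key].append(i)
    return d

timings = {
    fn.__name__: timeit.timeit(fn, number=3)
    for fn in (with_setdefault, with_defaultdict, with_emulated_setdefault_call)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f}s")
```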

Could construction of the empty list be optimized more? That might 
reduce the benefit even further (at least for the given case, but not 
for the general case of an arbitrarily expensive default).

We keep coming up against the issue of *eager evaluation* versus 
*delayed evaluation*, and I can't help feeling that rather than solving 
this problem on an ad-hoc basis each time it comes up, maybe we really 
do need a way to tell the interpreter "delay evaluating this expression 
until needed". Then we could use it anywhere it was important, without 
having to create a plethora of special case setdefault_call() methods 
and the like.
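
Today the closest we can get is to wrap the expression in a lambda by 
hand and have the callee call it. A minimal sketch, where 
setdefault_lazy and expensive_default are hypothetical names and the 
lambda plays the part of the thunk:

```python
def setdefault_lazy(mapping, key, thunk):
    """Hypothetical helper: evaluate thunk() only if key is missing."""
    if key not in mapping:
        mapping[key] = thunk()
    return mapping[key]

evaluations = 0

def expensive_default():
    # Stand-in for an arbitrarily expensive default; counts its own calls.
    global evaluations
    evaluations += 1
    return []

d = {}
# The lambda delays the call until setdefault_lazy decides it is needed.
setdefault_lazy(d, "k", lambda: expensive_default()).append(1)
setdefault_lazy(d, "k", lambda: expensive_default()).append(2)
```

The expensive default is evaluated once, on the first miss only. The 
cost is that every caller has to spell out the lambda, which is exactly 
the boilerplate interpreter support would remove.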



-- 
Steve
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
