[Python-ideas] Re: Generalized deferred computation in Python

Chris Angelico Sat, 25 Jun 2022 23:40:35 -0700

On Sun, 26 Jun 2022 at 16:18, Brendan Barnwell <brenb...@brenbarn.net> wrote:
>
> On 2022-06-25 13:41, Chris Angelico wrote:
> > On Sun, 26 Jun 2022 at 04:41, Brendan Barnwell <brenb...@brenbarn.net> wrote:
> >>         In contrast, what I would want out of deferred evaluation is precisely
> >> the ability to evaluate the deferred expression in the *evaluating*
> >> scope (not the definition scope) --- or in a custom provided namespace.
> >>   Whether this evaluation is implicit or explicit is less important to
> >> me than the ability to control the scope in which it occurs.  As others
> >> mentioned in early posts on this thread, this could complicate things
> >> too much to be feasible, but without it I don't really see the point.
> >
> > A custom-provided namespace can already be partly achieved, but
> > working in the evaluating scope is currently impossible and would
> > require some major deoptimizations to become possible.
> >
> > >>> expr = lambda: x + y
> > >>> expr.__code__.co_code
> > b't\x00t\x01\x17\x00S\x00'
> > >>> ns = {"x": 3, "y": 7}
> > >>> eval(expr.__code__, ns)
> > 10
> >
> > This works because the code object doesn't have any locals, so the
> > name references are encoded as global lookups, and eval() is happy to
> > use arbitrary globals. I say "partly achieved" because this won't work
> > if there are any accidental closure variables - you can't isolate the
> > lambda function from its original context and force everything to be a
> > global:
>
>         Yes, that is the blocker.  It is an important blocker for the query use
> case, because if you're building a query involving variables called
> `length` and `width` and so on, the code building this query and/or
> working with the results may often have its own variables with the same
> names.  So it needs to be possible to create a fully independent
> namespace that does not care what names happened to be defined in the
> surrounding scope.
>
>         Another complicating factor (which I didn't mention in my earlier post)
> is that you actually sometimes might want to explicitly pass through
> (that is, close over) variables in the enclosing scope.  For instance
> you might want to make a query like `column1 == threshold` where
> `threshold` is a variable in the definition scope, whose value you want
> to "freeze" at that moment as part of the deferred query expression.
> This would require some way to mark which values are to be frozen in
> this way (as pandas DataFrame.query does with "@"), which could get a
> bit hairy.


Hmm, that gets a bit messy, since it's entirely possible to want both
namespaces (closed-over names and free names) at the same time.
There's not going to be an easy fix. It might be safest to reject all
closures completely, but then have the ability to define a separate
set of constants that will be available in the expression.

This might be getting outside the scope (pun intended) of a language
proposal, but maybe there could be a general thing of "give me this
expression as a code object, NO closures", and then build something on
top of that to capture specific values.
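
As a rough illustration of that layering (not a proposal; the helper
name isolated_expr and its signature are made up for this sketch), the
"no closures" part can be approximated today by compiling from a
string, and the "specific values" part by a dict of frozen constants:

```python
def isolated_expr(source, constants=None):
    """Compile `source` on its own, so it cannot close over anything."""
    code = compile(source, "<deferred>", "eval")
    frozen = dict(constants or {})  # values captured at definition time

    def evaluate(namespace):
        # Names come from the evaluation namespace, except that the
        # frozen constants always win.
        env = dict(namespace)
        env.update(frozen)
        return eval(code, env)

    return evaluate


threshold = 5
query = isolated_expr("column1 >= threshold", {"threshold": threshold})
threshold = 100  # rebinding the outer name does not affect the query

print(query({"column1": 7}))  # True
print(query({"column1": 3}))  # False
```

Since the expression is compiled by itself, every name in it is looked
up in the namespace passed to eval(), never in the surrounding
function.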

> >>         This would also mean that such deferred objects could handle the
> >> late-bound default case, but the function would have to "commit" to
> >> explicit evaluation of such defaults.  Probably there could be a no-op
> >> "unwrapping" operation that would work on non-deferred objects (so that
> >> `unwrap([])` or whatever would just evaluate to the same regular list
> >> you passed in), so you could still pass in a plain list to an argument
> >> whose default was `deferred []`, but the function would still have to
> >> explicitly evaluate it in its body.  Again, I think I'm okay with this,
> >> partly because (as I mentioned in the other thread) I don't see PEP
> >> 671-style late-bound defaults as a particularly pressing need.
> >
> > That seems all very well, but it does incur a fairly huge cost for a
> > relatively simple benefit. Consider:
> >
> > def f(x=defer [], n=defer len(x)):
> >      unwrap(x); unwrap(n)
> >      print("You gave me", n, "elements to work with")
> >
> > f(defer (print := lambda *x: None))
> >
> > Is it correct for every late-bound argument default to also be a code
> > injection opportunity? And if so, then why should other functions
> > *not* have such an opportunity afforded to them? I mean, if we're
> > going to have spooky action at a distance, we may as well commit to
> > it. Okay, I jest, but still - giving callers the ability to put
> > arbitrary code into the function is going to be FAR harder to reason
> > about than simply having the code in the function header.
>
>         As I said, I don't really care so much about whether the deferred
> object has the ability to modify the scope in which it's evaluated.  The
> important part is that it has to be able to *read* that scope, in a way
> that doesn't depend in an implicit, non-configurable way on what
> variables happened to exist in the defining scope.

Reading that scope is probably fairly doable, but in that case it's
easiest to have a variant of exec().

>         In other words it really is very similar to what a lambda currently is,
> but with more fine-grained control over which variables are bound in
> which namespaces (definition vs. eval).  I'm not talking about "putting
> arbitrary code in the function" in the sense of inlining into the eval
> scope.  In fact, one of the things I dislike about PEP 671 is that it
> does exactly this with the late-bound defaults.  I find it even more
> egregious in that case for extra reasons, but yeah, spooky action at a
> distance is not the goal here.

PEP 671 doesn't put arbitrary code into the function. Only the
function itself can define what gets executed in it. It just
transforms this:

def f(x=NOT_PROVIDED):
    if x was not provided: x = EXPR

into this:

def f(x=>EXPR):

Either way, there's no "arbitrary code" being added to the function.
The function signature is as much a part of the function as the body
is.
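
For reference, the first form can be written in today's Python with a
sentinel. This is only a sketch: _NOT_PROVIDED is a made-up name, and
the empty list stands in for EXPR.

```python
_NOT_PROVIDED = object()  # hypothetical sentinel, private to this module


def f(x=_NOT_PROVIDED):
    if x is _NOT_PROVIDED:
        x = []  # stands in for EXPR; evaluated afresh on every call
    x.append(1)
    return x


first = f()
second = f()
print(first, second, first is second)  # [1] [1] False
```

The expression that computes the default lives inside f either way;
the caller can only choose whether to supply a value.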

The problem starts happening when deferred expressions have to be
provided from outside the function, such as:

_default = later EXPR
def f(x):
    if x was not provided: x = _default
    x = unlater x

which is the semantics of other argument defaults, and allows a
passed-in argument to inject code.
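
To make the concern concrete, here is a rough emulation using ordinary
callables. Later and unlater are stand-ins invented for this sketch;
they are not proposed APIs, and real deferred objects might behave
differently.

```python
class Later:
    """Hypothetical stand-in for `later EXPR`: wraps a zero-arg callable."""

    def __init__(self, thunk):
        self.thunk = thunk


def unlater(value):
    # No-op for ordinary values; evaluates Later objects.
    return value.thunk() if isinstance(value, Later) else value


_MISSING = object()
_default = Later(lambda: [])


def f(x=_MISSING):
    if x is _MISSING:
        x = _default
    x = unlater(x)
    return len(x)


calls = []


def sneaky():
    # Caller-supplied code that ends up running in the middle of f().
    calls.append("ran inside f")
    return [1, 2, 3]


print(f())               # 0
print(f([1, 2]))         # 2
print(f(Later(sneaky)))  # 3
print(calls)             # ['ran inside f']
```

The last call is the part that worries me: because f() has to unwrap
whatever it is given, any caller can hand it code to run.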

> >>         There are definitely some holes in my idea.  For one thing, with
> >> explicit evaluation required, it is much closer to a regular lambda.
> >> The only real difference is that it would involve more flexible scope
> >> control (rather than unalterably closing over the defining scope).
> >
> > TBH I think that that's quite useful, just not for PEP 671. For query
> > languages, it'd be very handy to be able to have a keyword that says
> > "isolate the parsing of this". I could imagine this being useful for
> > function annotations too, although they've been special-cased
> > somewhat, so that might be less of a concern.
>
>         Right, that's the point of this.  In fact there's a part of me that
> wants something even crazier, like making the deferred object retain
> info about its AST, so that the eval-ing code could manipulate that if
> needed.  R uses this kind of thing to do some pretty crazy stuff.
> Perhaps too crazy, which is why only part of me wants this.  But it can
> be pretty powerful.

TBH I wouldn't be averse to having some sort of syntax that takes
executable code and yields the AST. Trouble is, it would need really
really good syntax, otherwise it'll be simpler and safer to just use
compile() and provide the code as a string.
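
For comparison, this is roughly what the string route looks like with
the ast module today (the expression and the names x and y are just
placeholders):

```python
import ast

# Parse the expression text into an AST without evaluating it.
tree = ast.parse("x + y * 2", mode="eval")

# The evaluating code can inspect the tree, e.g. to list the names read.
names = sorted(
    node.id
    for node in ast.walk(tree)
    if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
)
print(names)  # ['x', 'y']

# The same tree can then be compiled and evaluated in any namespace.
code = compile(tree, "<deferred>", "eval")
print(eval(code, {"x": 3, "y": 7}))  # 17
```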

> > Yep; but the trouble is that referring to a name can also incur a
> > cost, especially when it comes to closures. So I think the explicit
> > namespace is going to be far safer than "evaluate in the caller's
> > context".
> >
> > That said: you can and should be able to prepopulate the evaluation
> > namespace with whatever you like, so using locals() as a "seed"
> > dictionary would basically give you what you want - a non-assignable
> > namespace that has all of these locals available for reference.
>
>         It sounds like what you're saying is that the hard part is referring to
> the "real" evaluation namespace, but it would be easy to refer to a copy
> of that namespace.  That again would probably be okay with me.

There are a few things that are hard, and it's entirely possible that
you don't need any of them.

1) Closures need to capture names.

>>> def f():
...     x = 1
...     def g():
...             print(locals())
...     return g
...
>>> f()()
{}

There's nothing inside g() that says that the name x is important, so
when f() returns, it's disposed of. My guess? This won't be a problem,
and the semantics of locals() will be fine.

2) As mentioned, mutations. Again, the semantics of locals() are
probably fine for your needs, although if you want to guarantee that
it ignores mutations, locals().copy() will ensure that.

3) Class namespaces are unusual. Nested class namespaces can be weird
and surprising if you don't think about them carefully.
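
A small sketch of points 1 and 2, using throwaway function names: a
name only shows up in the inner function's locals() if the inner
function mentions it, and a copy taken from locals() does not track
later rebinding.

```python
def f():
    x = 1
    y = 2  # never mentioned in g(), so g() does not capture it

    def g():
        x  # merely mentioning x makes it a closure variable of g
        return locals()

    return g


g = f()
print(g.__code__.co_freevars)  # ('x',)
print(g())                     # {'x': 1}


def h():
    a = 1
    snapshot = locals().copy()  # independent dict, taken while a == 1
    a = 2
    return snapshot


print(h())  # {'a': 1}
```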

> Like if
> you could do `eval(unevaluated_expression)` and it auto-filled the
> namespace with `locals()` (i.e., a read-only copy of the eval-ing
> namespace) that would be cool.  As I mentioned before, the point is not
> for the deferred expression to be inlined into the eval-ing namespace;
> the point is for the programmer to be able to choose at will which names
> in the deferred expression will have their values taken from the eval-ing
> namespace (as opposed to the defining namespace).

You can just use "eval(code_object, locals())" for that. Or
locals().copy() if you want the safety. The key is getting the right
code object, and I don't know of a way to do that without either (a)
starting from a string, or (b) making an unwanted closure. But a
variant of the lambda keyword could provide exactly that.
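
Here is a sketch of both routes as they stand today (the function
names are made up for illustration). A lambda that has picked up a
closure variable is rejected by eval(), while a code object compiled
from a string evaluates happily against a copy of locals():

```python
def make_closure():
    x = 1
    return lambda: x + y  # x is closed over; y is a global lookup


closed = make_closure()
print(closed.__code__.co_freevars)  # ('x',)

try:
    eval(closed.__code__, {"x": 3, "y": 7})
    outcome = "evaluated"
except TypeError:
    # eval() refuses code objects that contain free variables.
    outcome = "rejected"
print(outcome)  # rejected


def evaluate_here():
    x, y = 3, 7
    # Route (a): start from a string, so there is nothing to close over.
    code = compile("x + y", "<deferred>", "eval")
    return eval(code, locals().copy())


print(evaluate_here())  # 10
```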

ChrisA
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/S5S3XP7SFPXZQS4CQ3VRQMZCG7HCNV4Z/
Code of Conduct: http://python.org/psf/codeofconduct/
