[Python-ideas] Re: Generalized deferred computation in Python

Brendan Barnwell Sat, 25 Jun 2022 23:17:55 -0700

On 2022-06-25 13:41, Chris Angelico wrote:

On Sun, 26 Jun 2022 at 04:41, Brendan Barnwell <[email protected]> wrote:

        In contrast, what I would want out of deferred evaluation is precisely
the ability to evaluate the deferred expression in the *evaluating*
scope (not the definition scope) --- or in a custom provided namespace.
  Whether this evaluation is implicit or explicit is less important to
me than the ability to control the scope in which it occurs.  As others
mentioned in early posts on this thread, this could complicate things
too much to be feasible, but without it I don't really see the point.


A custom-provided namespace can already be partly achieved, but
working in the evaluating scope is currently impossible and would
require some major deoptimizations to become possible.

>>> expr = lambda: x + y
>>> expr.__code__.co_code
b't\x00t\x01\x17\x00S\x00'
>>> ns = {"x": 3, "y": 7}
>>> eval(expr.__code__, ns)
10

This works because the code object doesn't have any locals, so the
name references are encoded as global lookups, and eval() is happy to
use arbitrary globals. I say "partly achieved" because this won't work
if there are any accidental closure variables - you can't isolate the
lambda function from its original context and force everything to be a
global:

Yes, that is the blocker. It is an important blocker for the query use case, because if you're building a query involving variables called `length` and `width` and so on, the code building this query and/or working with the results may often have its own variables with the same names. So it needs to be possible to create a fully independent namespace that does not care what names happened to be defined in the surrounding scope.
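
For illustration only, here is a minimal sketch of that failure mode. The helper name `make_expr` is made up for this example. Once the lambda picks up a closure variable from its defining scope, CPython's eval() rejects the code object, so that name cannot be redirected to a custom namespace:

```python
def make_expr():
    # Hypothetical helper: `x` becomes an "accidental" closure variable.
    x = 1
    return lambda: x + y


expr = make_expr()
ns = {"x": 3, "y": 7}

try:
    eval(expr.__code__, ns)
except TypeError as exc:
    # eval() refuses code objects that contain free (closure) variables.
    error = exc
else:
    error = None

print(expr.__code__.co_freevars)  # names captured from the defining scope
print(type(error).__name__)
```

Only `y` is still a global lookup here; `x` is bound to the defining scope whether the consumer likes it or not.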

Another complicating factor (which I didn't mention in my earlier post) is that you actually sometimes might want to explicitly pass through (that is, close over) variables in the enclosing scope. For instance you might want to make a query like `column1 == threshold` where `threshold` is a variable in the definition scope, whose value you want to "freeze" at that moment as part of the deferred query expression. This would require some way to mark which values are to be frozen in this way (as pandas DataFrame.query does with "@"), which could get a bit hairy.
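
To make the "freeze" idea concrete, here is a rough sketch using today's Python. The class `DeferredQuery` and its keyword-argument convention for frozen values are assumptions invented for this example, not an existing or proposed API. It only shows the split between names frozen at definition time and names supplied at evaluation time:

```python
class DeferredQuery:
    """Hypothetical stand-in for a deferred expression; not a real feature."""

    def __init__(self, source, **frozen):
        self.code = compile(source, "<deferred>", "eval")
        # Values explicitly captured ("frozen") at definition time.
        self.frozen = dict(frozen)

    def evaluate(self, namespace):
        # Build a fresh namespace: the consumer's names first, then the
        # frozen values on top. Nothing else from the defining scope leaks in.
        ns = dict(namespace)
        ns.update(self.frozen)
        return eval(self.code, ns)


threshold = 10
query = DeferredQuery("column1 == threshold", threshold=threshold)
threshold = 99  # rebinding later does not affect the frozen value

print(query.evaluate({"column1": 10}))
print(query.evaluate({"column1": 99}))
```

Passing the expression as a string is exactly the awkwardness a real deferred-expression syntax would remove; the sketch is only about which namespace each name comes from.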

        This would also mean that such deferred objects could handle the
late-bound default case, but the function would have to "commit" to
explicit evaluation of such defaults.  Probably there could be a no-op
"unwrapping" operation that would work on non-deferred objects (so that
`unwrap([])` or whatever would just evaluate to the same regular list
you passed in), so you could still pass in a plain list to an argument
whose default was `deferred []`, but the function would still have to
explicitly evaluate it in its body.  Again, I think I'm okay with this,
partly because (as I mentioned in the other thread) I don't see PEP
671-style late-bound defaults as a particularly pressing need.
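
The no-op unwrapping described in the quoted paragraph could be approximated today along these lines. The names `Deferred` and `unwrap` are placeholders, and a lambda stands in for the hypothetical `deferred []` syntax:

```python
class Deferred:
    """Placeholder wrapper around a zero-argument callable."""

    def __init__(self, thunk):
        self.thunk = thunk


def unwrap(value):
    # Evaluate deferred objects; pass everything else through unchanged.
    if isinstance(value, Deferred):
        return value.thunk()
    return value


def f(x=Deferred(lambda: [])):
    # The function has to commit to explicit evaluation in its body.
    x = unwrap(x)
    x.append(1)
    return x


print(f())     # a fresh list on every call
print(f([5]))  # a plain list passes through unwrap() untouched
```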


That seems all very well, but it does incur a fairly huge cost for a
relatively simple benefit. Consider:

def f(x=defer [], n=defer len(x)):
     unwrap(x); unwrap(n)
     print("You gave me", n, "elements to work with")

f(defer (print := lambda *x: None))

Is it correct for every late-bound argument default to also be a code
injection opportunity? And if so, then why should other functions
*not* have such an opportunity afforded to them? I mean, if we're
going to have spooky action at a distance, we may as well commit to
it. Okay, I jest, but still - giving callers the ability to put
arbitrary code into the function is going to be FAR harder to reason
about than simply having the code in the function header.

As I said, I don't really care so much about whether the deferred object has the ability to modify the scope in which it's evaluated. The important part is that it has to be able to *read* that scope, in a way that doesn't depend in an implicit, non-configurable way on what variables happened to exist in the defining scope.

In other words it really is very similar to what a lambda currently is, but with more fine-grained control over which variables are bound in which namespaces (definition vs. eval). I'm not talking about "putting arbitrary code in the function" in the sense of inlining into the eval scope. In fact, one of the things I dislike about PEP 671 is that it does exactly this with the late-bound defaults. I find it even more egregious in that case for extra reasons, but yeah, spooky action at a distance is not the goal here.

        There are definitely some holes in my idea.  For one thing, with
explicit evaluation required, it is much closer to a regular lambda.
The only real difference is that it would involve more flexible scope
control (rather than unalterably closing over the defining scope).


TBH I think that that's quite useful, just not for PEP 671. For query
languages, it'd be very handy to be able to have a keyword that says
"isolate the parsing of this". I could imagine this being useful for
function annotations too, although they've been special-cased
somewhat, so that might be less of a concern.

Right, that's the point of this. In fact there's a part of me that wants something even crazier, like making the deferred object retain info about its AST, so that the eval-ing code could manipulate that if needed. R uses this kind of thing to do some pretty crazy stuff. Perhaps too crazy, which is why only part of me wants this. But it can be pretty powerful.
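
As a rough idea of what AST retention would allow, the standard `ast` module can already do this when the expression starts out as source text. The sketch below is only an approximation of the idea; the renaming of `width` to `height` is an arbitrary example of the consumer rewriting the expression before evaluating it:

```python
import ast

tree = ast.parse("length * width", mode="eval")

# Inspect: which names does the expression read?
names = sorted(
    node.id for node in ast.walk(tree) if isinstance(node, ast.Name)
)


class Rename(ast.NodeTransformer):
    # Manipulate: replace every reference to `width` with `height`.
    def visit_Name(self, node):
        if node.id == "width":
            return ast.copy_location(ast.Name(id="height", ctx=node.ctx), node)
        return node


new_tree = ast.fix_missing_locations(Rename().visit(tree))
code = compile(new_tree, "<deferred>", "eval")
result = eval(code, {"length": 4, "height": 5})

print(names, result)
```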

There is also the question of whether it would unacceptably slow down
name references because functions would no longer know which variables
were local; I think I would be okay with saying that the thunk could not
mutate the enclosing namespace (so, e.g., walruses inside the thunk
would only affect an internal thunk namespace).  The point here is for
the consumer to *evaluate* the thunk and get the result, not inline it
into the surrounding code.


Yep; but the trouble is that referring to a name can also incur a
cost, especially when it comes to closures. So I think the explicit
namespace is going to be far safer than "evaluate in the caller's
context".

That said: you can and should be able to prepopulate the evaluation
namespace with whatever you like, so using locals() as a "seed"
dictionary would basically give you what you want - a non-assignable
namespace that has all of these locals available for reference.

It sounds like what you're saying is that the hard part is referring to the "real" evaluation namespace, but it would be easy to refer to a copy of that namespace. That again would probably be okay with me. Like if you could do `eval(unevaluated_expression)` and it auto-filled the namespace with `locals()` (i.e., a read-only copy of the eval-ing namespace) that would be cool. As I mentioned before, the point is not for the deferred expression to be inlined into the eval-ing namespace; the point is for the programmer to be able to choose at will which names in the deferred expression will have their values taken from the eval-ing namespace (as opposed to the defining namespace).
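
Something close to that is possible now if the expression is compiled from a string rather than written with a dedicated syntax. A minimal sketch, with the function name `consumer` made up for the example: the evaluation namespace is seeded with a copy of `locals()`, so the expression can read the consumer's variables, while anything it assigns lands only in the copy.

```python
def consumer(expr_code):
    length, width = 4, 5
    # Seed the evaluation namespace with a *copy* of the current locals.
    snapshot = dict(locals())
    result = eval(expr_code, snapshot)
    # A walrus inside the expression binds `area` in the copy only,
    # never in this function's real local namespace.
    return result, snapshot.get("area"), "area" in locals()


code = compile("(area := length * width)", "<deferred>", "eval")
outcome = consumer(code)
print(outcome)
```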


--
Brendan Barnwell

"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail."

   --author unknown