[Python-ideas] Re: Generalized deferred computation in Python

2022-06-23 Thread Chris Angelico
On Fri, 24 Jun 2022 at 13:26, Joao S. O. Bueno  wrote:
>
>
>
> On Thu, Jun 23, 2022 at 2:53 AM Chris Angelico  wrote:
>>
>> On Thu, 23 Jun 2022 at 11:35, Joao S. O. Bueno  wrote:
>> >
>> > Martin Di Paola wrote:
>> > > Three cases: Dask/PySpark, Django's ORM and selectq. All of them
>> > > implement deferred expressions but all of them "compute" them in very
>> > > specific ways (aka, they plan and execute the computation differently).
>> >
>> >
>> > So - I've been hit with the "transparent execution of deferred code"
>> > dilemma before.
>> >
>> > What happens is that Python, at some point, will have to "use" an object,
>> > and that use is through calling one of the dunder methods. Up to that
>> > time, just writing the object name in a no-operation line does nothing
>> > (unless the line is in a REPL, which will then call the __repr__ method
>> > of the object).
>>
>> Why are dunder methods special? Does being passed to some other
>> function also do nothing? What about a non-dunder attribute?
>
>
> Non-dunder attributes go through obj.__getattribute__, at which point
> evaluation is triggered anyway.

Hmm, do they actually, or is that only if it's defined? But okay. In
that case, simply describe it as "accessing any attribute".

>> Especially, does being involved in an 'is' check count as using an object?
>
>
> "is" is not "using", and will always be false or true, as for any other object.
> Under this approach, the delayed object is a proxy, and remains a proxy,
> so this would have side-effects in code consuming the object.
> (extensions expecting strict built-in types might not work with a
> proxy for an int or str) - but "is" comparison should bring 0 surprises.

At this point, I'm wondering if the proposal's been watered down to
being nearly useless. You don't get the actual object, it's always a
proxy, and EVERY attribute lookup on EVERY object has to first check
to see if it's a special proxy.

>> dflt = fetch_cached_object("default")
>> mine = later fetch_cached_object(user.keyword)
>> ...
>> if mine is dflt: ... # "using" mine? Or not?
>>
>> Does it make a difference whether the object has previously been poked
>> in some other way?
>
>
> In this case, "mine" should be a proxy for the evaluation of the call
> of "fetch_cached_object" which clearly IS NOT the returned
> object stored in "dflt".
>
> This is as little, or as much, surprising as verifying that "bool([])"
> yields False: it just follows the language's inner workings, with no
> special casing.

If it's defined as a proxy, then yes, that's the case - it will never
be that object, neither before nor after the undeferral. But that
means that a "later" expression will never truly become the actual
object, so you always have to keep that in mind. I foresee a large
number of style guides decrying the use of identity checks because
they "won't work" with deferred objects.
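
A small sketch of that identity behaviour, under the proxy model only. The `Later` class, its `resolve()` method and the cache are hypothetical stand-ins, not the proposed syntax or any existing API:

```python
class Later:
    """Hypothetical stand-in for a deferred proxy; not the proposed syntax."""

    def __init__(self, func):
        self._func = func
        self._resolved = False
        self._value = None

    def resolve(self):
        # Run the wrapped callable at most once and cache its result.
        if not self._resolved:
            self._value = self._func()
            self._resolved = True
        return self._value


_cache = {"default": object()}


def fetch_cached_object(key):
    # Stand-in for the function in the example above; always returns the
    # same cached object for a given key.
    return _cache[key]


dflt = fetch_cached_object("default")
mine = Later(lambda: fetch_cached_object("default"))

before = mine is dflt                  # proxy compared with the real object
same_target = mine.resolve() is dflt   # the resolved value IS the cached object
after = mine is dflt                   # but the proxy itself is still a proxy
print(before, same_target, after)
```

Even though the deferred call returns the very same cached object, the identity check on the proxy is false both before and after resolution.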

> Of course, this is if this proposal goes forward - I am just pointing out
> that the existing mechanisms in the language can already support it, with
> no modification. If "is" triggering the resolve is desired, or if it is
> desired that the delayed object be replaced "in place" instead of using
> a proxy, another approach would be needed - and
> I'd favor the "already working" proxy approach I presented here.
>
> (I won't dare touch the bike-shedding about the syntax on this, though)
>

Right, but if the existing mechanisms are sufficient, why not just use
them? We *have* lambda expressions. It wouldn't be THAT hard to define
a small wrapper - okay, the syntax is a bit clunky, but bear with me:

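class later:
    def __init__(self, func):
        self.func = func
        self.__is_real = False
    def __getattribute__(self, attr):
        later.__makereal(self)
        return getattr(object.__getattribute__(self, "_later__wrapped"), attr)
    def __makereal(self):
        # Read internal state through object.__getattribute__; going through
        # self.<name> here would re-enter __getattribute__ and recurse forever.
        get = object.__getattribute__
        if get(self, "_later__is_real"): return
        self.__wrapped = get(self, "func")()
        self.__is_real = True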

x = later(lambda: expensive+expression()*to/calc)

And we don't see a lot of this happening. Why? I don't know for sure,
but I can guess at a few possible reasons:

1) It's not part of the standard library, so you have to go fetch a
thing to do it. If that's significant enough, this is solvable by
adding it to the stdlib, or even a new builtin.

2) "later(lambda: expr)" is clunky. Very clunky. Your proposal solves
that, by making "later expr" do that job, but at the price of creating
some weird edge cases (for instance, you *cannot* parenthesize the
expression - this is probably the only place where that's possible, as
even non-expressions can often be parenthesized, eg import and with
statements).

3) It's never actually the result of the expression, but always this proxy.

4) There's no (clean) way to get at the true object, which means that
all the penalties are permanent.

5) Maybe the need just isn't that strong.

How much benefit would this be? You're proposing a syntactic construct
for something that 

[Python-ideas] Re: Generalized deferred computation in Python

2022-06-23 Thread Joao S. O. Bueno
On Thu, Jun 23, 2022 at 2:53 AM Chris Angelico  wrote:

> On Thu, 23 Jun 2022 at 11:35, Joao S. O. Bueno 
> wrote:
> >
> > Martin Di Paola wrote:
> > > Three cases: Dask/PySpark, Django's ORM and selectq. All of them
> > > implement deferred expressions but all of them "compute" them in very
> > > specific ways (aka, they plan and execute the computation differently).
> >
> >
> > So - I've been hit with the "transparent execution of deferred code"
> > dilemma before.
> >
> > What happens is that Python, at some point, will have to "use" an object,
> > and that use is through calling one of the dunder methods. Up to that
> > time, just writing the object name in a no-operation line does nothing
> > (unless the line is in a REPL, which will then call the __repr__ method
> > of the object).
>
> Why are dunder methods special? Does being passed to some other
> function also do nothing? What about a non-dunder attribute?
>

Non-dunder attributes go through obj.__getattribute__, at which point
evaluation is triggered anyway.


>
> Especially, does being involved in an 'is' check count as using an object?
>

"is" is not "using", and will always be false or true, as for any other
object.
Under this approach, the delayed object is a proxy, and remains a proxy,
so this would have side-effects in code consuming the object.
(extensions expecting strict built-in types might not work with a
proxy for an int or str) - but "is" comparison should bring 0 surprises.

>
> dflt = fetch_cached_object("default")
> mine = later fetch_cached_object(user.keyword)
> ...
> if mine is dflt: ... # "using" mine? Or not?
>
> Does it make a difference whether the object has previously been poked
> in some other way?
>

In this case, "mine" should be a proxy for the evaluation of the call
of "fetch_cached_object" which clearly IS NOT the returned
object stored in "dflt".

This is as little, or as much, surprising as verifying that "bool([])"
yields False: it just follows the language's inner workings, with no
special casing.

Of course, this is if this proposal goes forward - I am just pointing out
that the existing mechanisms in the language can already support it, with
no modification. If "is" triggering the resolve is desired, or if it is
desired that the delayed object be replaced "in place" instead of using
a proxy, another approach would be needed - and
I'd favor the "already working" proxy approach I presented here.

(I won't dare touch the bike-shedding about the syntax on this, though)



> ChrisA
> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/HUJ36AA34SZU7D5Q4G6N5UFFKYUOGOFT/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/EY3HWBNPVSY5IVZPYN75BXSAFQDEMPNM/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: dataclass field argument to allow converting value on init

2022-06-23 Thread Thomas Kehrenberg
This is what attrs' converter functionality is, right?
https://www.attrs.org/en/stable/init.html#converters
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/M52CIMZECB5S7YCSNVO4AKUFB6ZSEGYX/


[Python-ideas] Re: dataclass field argument to allow converting value on init

2022-06-23 Thread Steve Jorgensen
Sorry for typos.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2RMKRA2GEQ3HADXG4TXYTCLRUX2CR5QG/


[Python-ideas] Re: dataclass field argument to allow converting value on init

2022-06-23 Thread Steve Jorgensen
Dexter Hill wrote:
> The idea is to have a `default_factory` like argument (either in the `field` 
> function, or a new function entirely) that takes a function as an argument, 
> and that function, with the value provided by `__init__`, is called and the 
> return value is used as the value for the respective field. For example:
> ```py
> @dataclass
> class Foo:
>     x: str = field(init_fn=chr)
> f = Foo(65)
> f.x # "A"
> ```
> The `chr` function is called, given the value `65` and `x` is set to its 
> return value of `"A"`. I understand that there is both `__init__` and 
> `__post_init__` which can be used for this purpose, but sometimes it isn't 
> ideal to override them. If you overrode `__init__` and were using 
> `__post_init__`, you would need to manually call it, and in my case, 
> `__post_init__` is implemented on a base class, which all other classes 
> inherit, and so overriding it would require re-implementing the logic from 
> it (and that's ignoring the fact that you also need to type the field with 
> `InitVar` to even have it passed to `__post_init__` in the first place).
> I've created a proof of concept, shown below:
> ```py
> def initfn(fn, default=None):
>     class Inner:
>         def __set_name__(_, owner_cls, owner_name):
>             old_setattr = getattr(owner_cls, "__setattr__")
>             def __setattr__(self, attr_name, value):
>                 if attr_name == owner_name:
>                     # Bypass `__setattr__`
>                     self.__dict__[attr_name] = fac(value)
>                 else:
>                     old_setattr(self, attr_name, value)
>             setattr(owner_cls, "__setattr__", __setattr__)
>     def fac(value):
>         if isinstance(value, Inner):
>             return default
>         return fn(value)
>     return field(default=Inner())
> ```
> It makes use of the fact that providing `default` as an argument to `field` 
> means it checks the value for a `__set_name__` function, and calls it with 
> the class and field name as arguments. Overriding `__setattr__` is just used 
> to catch when a value is being assigned to a field, and if that field's name 
> matches the name given to `__set_name__`, it calls the function on the value, 
> and sets the field to that instead.
> It can be used like so:
> ```py
> @dataclass
> class Foo:
>     x: str = initfn(fn=chr, default="Z")
> f = Foo(65)
> f2 = Foo()
> f.x # "A"
> f2.x # "Z"
> ```
> It adds a little overhead, especially with having to override `__setattr__`; 
> however, I believe it would have very little overhead if directly implemented 
> in the dataclass library.
> Even in the case of being able to override one of the init functions, I still 
> think it would be nice to have as a quality of life feature as I feel calling 
> a function is too simple to want to override the functions, if that makes 
> sense.
> Thanks.
> Dexter

What if, instead, the `init` parameter could accept either a boolean (as it 
does now) or a type? When given a type, that would mean to create the 
property and accept the argument, but to pass the argument to `__post_init__` 
rather than using it to initialize the property directly. The type passed to 
`init` would become the type hint for the argument.
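
For comparison, a minimal sketch of how the same effect can be spelled with what `dataclasses` offers today, using `field(init=False)` plus an `InitVar` pseudo-field. The name `x_code` is made up for this example; the suggestion above would fold the two declarations into one.

```python
from dataclasses import InitVar, dataclass, field


@dataclass
class Foo:
    # The stored field is excluded from the generated __init__ ...
    x: str = field(init=False)
    # ... and an init-only pseudo-field of a different type takes its place.
    # The name x_code is hypothetical, chosen only for this sketch.
    x_code: InitVar[int]

    def __post_init__(self, x_code: int) -> None:
        # Convert the init-only argument into the stored field.
        self.x = chr(x_code)


f = Foo(65)
print(f.x)
```

Here the `__init__` parameter is typed `int` while the stored field is typed `str`, at the cost of having to write `__post_init__` by hand.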
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/YERVGXA5QJUHOQW357GVN7JERB2AJT6P/


[Python-ideas] Re: Generalized deferred computation in Python

2022-06-23 Thread Barry


> On 23 Jun 2022, at 08:27, Stephen J. Turnbull  
> wrote:
> 
> Barry Scott writes:
> 
>> I can think of ways to implement evaluation-on-reference, but they
>> all have the effect of making python slower.
> 
> Probably.
> 
>> The simple
>> 
>>     a = b
>> 
>> will need to slow down so that the object in b can be checked to see
>> if it needs evaluating.
> 
> No, it doesn't.  Binding a name is special in many ways, why not this
> one too?  Or "a = a" could be the idiom for "resolve a deferred now",
> which would require the check for __evaluate_me_now__ as you say.  But
> such simple "a = b" assignments are not so common that they would be a
> major slowdown.  I would think the real problem would be the "oops" of
> doing "a = b" and evaluating a deferred you don't want to evaluate.
> But this isn't a completely new problem, it's similar to a = b = []
> and expecting a is not b.

Interesting idea, that a reference does not auto-evaluate in all cases.
I was wondering what the compiler/runtime can do to avoid the cost
of checking whether an evaluation is needed.
> 
> Now consider a = b + 0.  b.__add__ will be invoked in the usual way.
> Only if b is a deferred will evaluation take place.

But the act of checking if b is deferred is a cost I am concerned about.

> 
> So I don't really see the rest of Python slowing down much.

Once the PEP addresses its semantics in detail, we can estimate the costs.

I would think that it's not that hard to add the expected check to Python's
ceval.c and benchmark the impact of the checks. This would not need a full
implementation of the deferred mechanism.

Barry

___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/DEBOPY6OIRRLLCO2SQDYXXM7UMXZYRMZ/


[Python-ideas] Re: dataclass field argument to allow converting value on init

2022-06-23 Thread Dexter Hill
Interesting point, it's not something I had thought of. One solution, as mentioned 
by Simão and what I had in mind, is to pull the type from the first parameter 
of the function. We know that the function will always have a minimum of one 
parameter, and the value is always passed as the first argument.
One downside is that it isn't very transparent to the user: they might not 
understand that the type is being taken from the first parameter of the 
function, and wonder where it is coming from. In that case, the other solution 
would be to do something like `InitVar`, but one that takes two types (the return 
type and the init type); something like `var: InitFn[int, str]` for the `chr` 
example.
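
A rough sketch of the first option, assuming the converter function carries annotations. The helper name `first_param_type` and the wrapper `to_char` are made up for illustration; `inspect.signature` is the only library call used.

```python
import inspect


def first_param_type(fn):
    """Return the annotation of fn's first parameter, or None if it has none."""
    params = list(inspect.signature(fn).parameters.values())
    if not params:
        raise TypeError("converter must accept at least one argument")
    annotation = params[0].annotation
    # inspect marks a missing annotation with the Parameter.empty sentinel.
    return None if annotation is inspect.Parameter.empty else annotation


def to_char(num: int) -> str:
    # Annotated wrapper around chr(), used because the annotation has to
    # come from somewhere for this approach to work.
    return chr(num)


print(first_param_type(to_char))
print(first_param_type(lambda value: value))
```

The second call shows the transparency problem in practice: when the function has no annotation on its first parameter, there is nothing to derive the `__init__` type hint from.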
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/I6Y2IAYMB3TTJQGFFAMBQXM2G6Z7ZWOG/


[Python-ideas] Re: dataclass field argument to allow converting value on init

2022-06-23 Thread Simão Afonso
On 2022-06-23 00:03:03, Paul Bryan wrote:
> What type hint will be exposed for the __init__ parameter? Clearly,
> it's not a `str` type in your example; you're passing it an `int` value
> in your example. Presumably to overcome this, you'd need yet another
> `field` function parameter to provide the type hint for the `__init__`
> param?

Can't this be derived from the type of the `default_factory` function?

`chr` is something like:

> def chr(num: int) -> str:

So that type should be `int`.
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/S4YKLANPZAXC2HCMX5Z2ZPROPPIZK244/


[Python-ideas] Re: Generalized deferred computation in Python

2022-06-23 Thread Stephen J. Turnbull
Barry Scott writes:

 > I can think of ways to implement evaluation-on-reference, but they
 > all have the effect of making python slower.

Probably.

 > The simple
 > 
 >  a = b
 > 
 > will need to slow down so that the object in b can be checked to see
 > if it needs evaluating.

No, it doesn't.  Binding a name is special in many ways, why not this
one too?  Or "a = a" could be the idiom for "resolve a deferred now",
which would require the check for __evaluate_me_now__ as you say.  But
such simple "a = b" assignments are not so common that they would be a
major slowdown.  I would think the real problem would be the "oops" of
doing "a = b" and evaluating a deferred you don't want to evaluate.
But this isn't a completely new problem, it's similar to a = b = []
and expecting a is not b.

Now consider a = b + 0.  b.__add__ will be invoked in the usual way.
Only if b is a deferred will evaluation take place.

So I don't really see the rest of Python slowing down much.
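
A minimal sketch of that model, using an ordinary class rather than any proposed syntax. The `Deferred` class and its `_force` method are hypothetical names for illustration: plain name binding leaves the deferred untouched, while `b + 1` dispatches to `__add__`, which is where evaluation happens.

```python
class Deferred:
    """Hypothetical deferred value that evaluates only when operated on."""

    def __init__(self, thunk):
        self._thunk = thunk
        self._done = False
        self._value = None

    def _force(self):
        # Evaluate the thunk at most once and cache the result.
        if not self._done:
            self._value = self._thunk()
            self._done = True
        return self._value

    def __add__(self, other):
        # Normal operator dispatch ends up here; only deferreds pay for it.
        return self._force() + other


calls = []


def compute():
    calls.append("evaluated")
    return 41


b = Deferred(compute)
a = b                                # plain binding: nothing is evaluated
evaluated_after_binding = list(calls)
c = b + 1                            # __add__ runs and forces the evaluation
print(evaluated_after_binding, c, calls)
```

In this sketch no extra check is added to code paths that never touch a `Deferred`; whether a built-in implementation could keep that property is the open question in this thread.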

___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/XYCOZYJSNDZTRMMBDNIL4E62SVIYHRBR/


[Python-ideas] Re: dataclass field argument to allow converting value on init

2022-06-23 Thread Paul Bryan
What type hint will be exposed for the __init__ parameter? Clearly,
it's not a `str` type in your example; you're passing it an `int` value
in your example. Presumably to overcome this, you'd need yet another
`field` function parameter to provide the type hint for the `__init__`
param?
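
For reference, a small sketch (not from the thread) of the current behaviour behind this question: the generated `__init__` takes its parameter annotation straight from the field annotation, so a converting field would need some other source for the parameter's type hint.

```python
import inspect
from dataclasses import dataclass


@dataclass
class Foo:
    x: str


# The signature of the class reflects the generated __init__.
init_hint = inspect.signature(Foo).parameters["x"].annotation
print(init_hint)
```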


On Wed, 2022-06-22 at 20:43 +, Dexter Hill wrote:
> The idea is to have a `default_factory` like argument (either in the
> `field` function, or a new function entirely) that takes a function
> as an argument, and that function, with the value provided by
> `__init__`, is called and the return value is used as the value for
> the respective field. For example:
> ```py
> @dataclass
> class Foo:
>     x: str = field(init_fn=chr)
> 
> f = Foo(65)
> f.x # "A"
> ```
> The `chr` function is called, given the value `65` and `x` is set to
> its return value of `"A"`. I understand that there is both `__init__`
> and `__post_init__` which can be used for this purpose, but sometimes
> it isn't ideal to override them. If you overrode `__init__` and
> were using `__post_init__`, you would need to manually call it, and
> in my case, `__post_init__` is implemented on a base class, which all
> other classes inherit, and so overriding it would require
> re-implementing the logic from it (and that's ignoring the fact that you
> also need to type the field with `InitVar` to even have it passed to
> `__post_init__` in the first place).
> 
> I've created a proof of concept, shown below:
> ```py
> def initfn(fn, default=None):
>     class Inner:
>         def __set_name__(_, owner_cls, owner_name):
>             old_setattr = getattr(owner_cls, "__setattr__")
> 
>             def __setattr__(self, attr_name, value):
>                 if attr_name == owner_name:
>                     # Bypass `__setattr__`
>                     self.__dict__[attr_name] = fac(value)
>                 else:
>                     old_setattr(self, attr_name, value)
> 
>             setattr(owner_cls, "__setattr__", __setattr__)
> 
>     def fac(value):
>         if isinstance(value, Inner):
>             return default
> 
>         return fn(value)
> 
>     return field(default=Inner())
> ```
> It makes use of the fact that providing `default` as an argument to
> `field` means it checks the value for a `__set_name__` function, and
> calls it with the class and field name as arguments. Overriding
> `__setattr__` is just used to catch when a value is being assigned to
> a field, and if that field's name matches the name given to
> `__set_name__`, it calls the function on the value, and sets the field
> to that instead.
> It can be used like so:
> ```py
> @dataclass
> class Foo:
>     x: str = initfn(fn=chr, default="Z")
> 
> f = Foo(65)
> f2 = Foo()
> 
> f.x # "A"
> f2.x # "Z"
> ```
> It adds a little overhead, especially with having to override
> `__setattr__`; however, I believe it would have very little overhead
> if directly implemented in the dataclass library.
> 
> Even in the case of being able to override one of the init functions,
> I still think it would be nice to have as a quality of life feature
> as I feel calling a function is too simple to want to override the
> functions, if that makes sense.
> 
> Thanks.
> Dexter
> ___
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/4SM5EVP6MMGGHQMZSJXBML74PWWDHEWV/

___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/N2JQWHBBKVDK3VJAFVUY5YCT5MZOTPPN/