[Python-ideas] Re: Looking for people interested in a Python register virtual machine project

2021-03-22 Thread Skip Montanaro
Thanks for the response. I will try to address your comments inline.

> I guess it should be a good idea to answer what's the scope of this
> project - is it research one or "production" one? If it's research one,
> why be concerned with the churn of over-modern CPython versions?
> Wouldn't it be better to just use some scalable, incremental
> implementation which would allow to forward-port it to a newer version,
> if it ever comes to that?

The motivation for revisiting this idea was/is largely personal. As I
indicated, I first messed around with it over 20 years ago and it's
been in the back of my mind ever since. Somehow I never lost the code,
even though I'm not sure how many computers came and went and the
code was never uploaded to any sort of distributed version control
system. I decided to pick things up again, mostly as a way to keep my
head in the game after I retired. So, neither "research" nor
"production" seems to be a correct descriptor. Still, I realized
pretty quickly that to take it to functional completion (functional
enough for performance testing and application to more than just toy
scripts) I'd need help.

> Otherwise, if it's "production", who's the "customer" and how they
> "compensate" you for doing work (chasing the moving target) which is
> clearly of little interest to you and conflicts with the goal of the
> project?

Nobody is compensating me. I have no desire to try and turn it into
something I do for hire. Maybe I misunderstood your question?

> > This PEP proposes the addition of register-based instructions to the
> > existing Python virtual machine, with the intent that they eventually
> > replace the existing stack-based opcodes.
>
> Sorry, what? The purpose of register-based instructions is to just
> replace stack-based instructions? That's not what I'd like to hear as
> the intro phrase. You probably want to replace one with the other
> because register-based ones offer some benefit, faster execution
> perhaps? That's what I'd like to hear instead of "deciphering" that
> between the lines.

Replacing stack-based instructions would be a reasonable initial goal,
I think. Victor reported performance improvements in his
implementation (also a translator). As I indicated in the "PEP" (I use
that term rather loosely, as I have no plans at the moment to submit
it for consideration, certainly not in its current, incomplete state),
a better ultimate way to go would be to generate register instructions
directly from the AST. The current translation scheme allows me to
write simple test case functions, generate register instructions, then
check that, when called, the two versions produce the same result.

> > They [2 instruction sets] are almost completely distinct.
>
> That doesn't correspond to the mental image I would have. In my list,
> the 2 sets would be exactly the same, except that stack-based encode
> argument locations implicitly, while register-based - explicitly. Would
> be interesting to read (in the following "pep" sections) what makes them
> "almost completely distinct".

Well, sure. The main difference is the way a pair of corresponding
instructions (say, BINARY_ADD vs BINARY_ADD_REG) get their operands
and save their result. You still have to be able to add two objects,
call functions, etc.
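
To make the operand difference concrete, here is a toy sketch in pure
Python. It is not the CPython implementation: the two tiny interpreters,
the tuple encoding of instructions and the (destination, source, source)
operand order for the _REG opcodes are all assumptions made for
illustration.

```python
def run_stack(code, local_vars):
    """Toy stack machine: operands are implicit (top of the stack)."""
    stack = []
    for op, *args in code:
        if op == "LOAD_FAST":
            stack.append(local_vars[args[0]])
        elif op == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "RETURN_VALUE":
            return stack.pop()
    raise RuntimeError("fell off the end of the code")


def run_register(code, registers):
    """Toy register machine: every operand location is explicit."""
    for op, *args in code:
        if op == "BINARY_ADD_REG":
            dst, src1, src2 = args  # assumed operand order
            registers[dst] = registers[src1] + registers[src2]
        elif op == "RETURN_VALUE_REG":
            return registers[args[0]]
    raise RuntimeError("fell off the end of the code")


# "return a + b" with a in slot 0 and b in slot 1.
stack_code = [("LOAD_FAST", 0), ("LOAD_FAST", 1),
              ("BINARY_ADD",), ("RETURN_VALUE",)]
register_code = [("BINARY_ADD_REG", 2, 0, 1), ("RETURN_VALUE_REG", 2)]

stack_result = run_stack(stack_code, [2, 3])
register_result = run_register(register_code, [2, 3, None])
```

Both versions compute the same sum; the register version needs two
instructions instead of four because the loads are folded into the
operands of the add.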

> > Within a single function only one set of opcodes or the other will
> > be used at any one time.
>
> That would be the opposite of "scalable, incremental" development
> approach mentioned above. Why not allow 2 sets to freely co-exist, and
> migrate codegeneration/implement code translation gradually?

The fact that I treat the current frame's stack space as registers
makes it pretty much impossible to execute both stack and register
instructions within the same frame. Victor's implementation did things
differently in this regard. I believe he just allocated extra space
for 256 registers at the end of each frame, so (in theory, I suppose),
you could have instructions from both executed in the same frame.

> > ## Motivation
>
> I'm not sure the content of the section corresponds much to its title.
> It jumps from background survey of the different Python VM optimizations
> to (some) implementation details of register VM - leaving "motivation"
> somewhere "between the lines".
>
> > Despite all that effort, opcodes which do nothing more than move data
> > onto or off of the stack (LOAD_FAST, LOAD_GLOBAL, etc) still account
> > for nearly half of all opcodes executed.
>
> ... And - you intend to change that with a register VM? In which way and
> how? As an example, LOAD_GLOBAL isn't going anywhere - it loads a
> variable by *symbolic* name into a register.

Certainly, if you have data which isn't already on the stack, you are
going to have to move data. As the appendix shows though, a fairly
large chunk of the current virtual machine does nothing more than
manipulate the stack (LOAD_FAST, STORE_FAST, POP_TOP, etc).
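
For a rough feel of how much of a function is pure stack traffic, the
standard `dis` module can be used as sketched below. Note the
assumptions: this counts instructions statically in one small function
(it is not the executed-opcode measurement from the appendix), the
helper name `stack_traffic` is made up for this example, and opcode
names differ between CPython versions, which is why names are matched
by prefix.

```python
import dis

# Opcodes that only move data onto or off of the stack. Matching by
# prefix also catches version-specific variants of LOAD_FAST.
STACK_ONLY_PREFIXES = ("LOAD_FAST", "STORE_FAST", "POP_TOP")


def add(a, b):
    c = a + b
    return c


def stack_traffic(func):
    """Return (stack-only instruction count, total instruction count)."""
    names = [ins.opname for ins in dis.get_instructions(func)]
    moves = [name for name in names if name.startswith(STACK_ONLY_PREFIXES)]
    return len(moves), len(names)


moves, total = stack_traffic(add)
print(f"{moves} of {total} instructions only move data to or from the stack")
```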

> > Running Pyperformance using a development version of Python 3.9
> > showed 

[Python-ideas] Re: Looking for people interested in a Python register virtual machine project

2021-03-22 Thread Skip Montanaro
> Yeah, that is old writing, so is probably less clear (no pun intended)
> than it should be. In frame_dealloc, Py_CLEAR is called for
> stack/register slots instead of just Py_XDECREF. Might not be
> necessary.

Also, the intent is not to change any semantics here. The
implementation of RETURN_VALUE_REG still Py_INCREFs the to-be-returned
value. It's not like the data can get reclaimed before the caller
receives it.

S
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/PMBNH3L6WEV7TRVKQYOQVSTNJQHH6YYB/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Looking for people interested in a Python register virtual machine project

2021-03-22 Thread Skip Montanaro
> In the "Object Lifetime" section you say "registers should be cleared upon 
> last reference". That isn't safe, since there can be hidden dependencies on 
> side effects of __del__, e.g.:
>
> process_objects = create_pipeline()
> output_process = process_objects[-1]
> return output_process.wait()
>
> If the process class terminates the process in __del__ (PyQt5's QProcess 
> does), then implicitly deleting process_objects after the second line will 
> break the code.

Yeah, that is old writing, so is probably less clear (no pun intended)
than it should be. In frame_dealloc, Py_CLEAR is called for
stack/register slots instead of just Py_XDECREF. Might not be
necessary.

Skip
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/2SHROF3OEMZ7G3KBA3L2EKMGWYRWS5LV/


[Python-ideas] Re: Allow syntax "func(arg=x if condition)"

2021-03-22 Thread Joao S. O. Bueno
I've missed this feature on occasion as well. +1, for whatever that counts.

On Mon, 22 Mar 2021 at 17:30, Caleb Donovick wrote:

> Never needed this for lists but definitely had the pain for kwargs.  Seems
> very reasonable for that use case, +0.5.
>
> In libraries I control, I can make sure to use the same default values for
> functions and their wrappers. However, when wrapping functions I don't
> control, there is no great way to do this, and I end up incrementally
> building up a kwargs dict. I suppose the same thing could occur with *args
> lists, so it makes sense for both positional and keyword arguments.
>
> Yes one could do something like:
> ```
> import inspect
>
> def fun(a, b=0): ...
> def wraps_fun(args, b=inspect.signature(fun).parameters['b'].default): ...
> ```
> But I would hardly call that clear.  Further, it is not robust, as it would
> fail if `fun` is itself wrapped in a way that destroys its signature.  E.g.:
> ```
> def destroy_signature(f):
>     # should decorate here with functools.wraps(f)
>     def wrapper(*args, **kwargs):
>         return f(*args, **kwargs)
>     return wrapper
> ```
>
> Caleb
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/TWZVNIEE3R2ZLMSLPUCC27DJI5XH6MY3/


[Python-ideas] Re: Allow syntax "func(arg=x if condition)"

2021-03-22 Thread Caleb Donovick
Never needed this for lists but definitely had the pain for kwargs.  Seems
very reasonable for that use case, +0.5.

In libraries I control, I can make sure to use the same default values for
functions and their wrappers. However, when wrapping functions I don't
control, there is no great way to do this, and I end up incrementally
building up a kwargs dict. I suppose the same thing could occur with *args
lists, so it makes sense for both positional and keyword arguments.
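
As a minimal sketch of that workaround (the `None` sentinel and the
function bodies are assumptions for illustration, not code from any
particular library):

```python
def fun(a, b=0):
    return a + b


def wraps_fun(a, b=None):
    # Only forward b when the caller supplied it, so that fun's own
    # default applies otherwise. None is an assumed "not given" marker.
    kwargs = {}
    if b is not None:
        kwargs["b"] = b
    return fun(a, **kwargs)


default_result = wraps_fun(1)
explicit_result = wraps_fun(1, b=2)
```

The proposed syntax would let the wrapper pass `b` conditionally in the
call itself instead of building the dict by hand.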

Yes one could do something like:
```
import inspect

def fun(a, b=0): ...
def wraps_fun(args, b=inspect.signature(fun).parameters['b'].default): ...
```
But I would hardly call that clear.  Further, it is not robust, as it would
fail if `fun` is itself wrapped in a way that destroys its signature.  E.g.:
```
def destroy_signature(f):
    # should decorate here with functools.wraps(f)
    def wrapper(*args, **kwargs):
        return f(*args, **kwargs)
    return wrapper
```
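
For comparison, a small sketch of what `inspect.signature` reports in
each case. The `keep_signature` decorator is a hypothetical counterpart
added for this example; it relies on `inspect.signature` following the
`__wrapped__` attribute that `functools.wraps` sets.

```python
import functools
import inspect


def fun(a, b=0):
    return a + b


def destroy_signature(f):
    def wrapper(*args, **kwargs):
        return f(*args, **kwargs)
    return wrapper


def keep_signature(f):
    @functools.wraps(f)  # sets wrapper.__wrapped__ = f
    def wrapper(*args, **kwargs):
        return f(*args, **kwargs)
    return wrapper


lost = inspect.signature(destroy_signature(fun))
kept = inspect.signature(keep_signature(fun))
print(lost)  # (*args, **kwargs)
print(kept)  # (a, b=0)
```

With the signature destroyed, looking up `parameters['b']` raises
`KeyError`, which is the failure mode described above.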

Caleb
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/2EHOQDIIK7BMAY54KG44Z45IYWDDSZSW/


[Python-ideas] Re: Looking for people interested in a Python register virtual machine project

2021-03-22 Thread Guido van Rossum
As I wrote, Skip’s proto-PEP is not proposing to delete locals that are not
used in the rest of the function, only registers. So the voiced concerns
don’t apply.

On Sun, Mar 21, 2021 at 23:59 Chris Angelico wrote:

> On Mon, Mar 22, 2021 at 5:37 PM Ben Rudiak-Gould wrote:
> >
> > On Sun, Mar 21, 2021 at 11:10 PM Chris Angelico wrote:
> >>
> >> At what point does the process_objects list cease to be referenced?
> >> After the last visible use of it, or at the end of the function?
> >
> >
> > In Python as it stands, at the end of the function, as you say.
> >
> > Skip Montanaro's PEP suggested that in his register machine, locals
> > would be dereferenced after their last visible use. I don't think that's
> > intrinsically a bad idea, but it's not backward compatible. The thing with
> > the process objects was just an example of currently working code that
> > would break.
> >
> > The example has nothing to do with PyQt5 really. I just happen to know
> > that QProcess objects kill the controlled process when they're collected. I
> > think it's a bad design, but that's the way it is.
> >
> > Another example would be something like
> >
> > td = tempfile.TemporaryDirectory()
> > p = subprocess.Popen([..., td.name, ...], ...)
> > p.wait()
> >
> > where the temporary directory will hang around until the process exits
> > with current semantics, but not if td is deleted after the second line. Of
> > course you should use a with statement in this kind of situation, but
> > there's probably a lot of code that doesn't.
> >
>
> Thanks for the clarification. I think the tempfile example will be a
> lot easier to explain this with, especially since it requires only the
> stdlib and isn't implying that there's broken code in a third-party
> library.
>
> I don't like this. In a bracey language (eg C++), you can declare that
> a variable should expire prior to the end of the function by including
> it in a set of braces; in Python, you can't do that, and the normal
> idiom is to reassign the variable or 'del' it. Changing the semantics
> of when variables cease to be referenced could potentially break a LOT
> of code. Maybe, if Python were a brand new language today, you could
> define the semantics that way (and require "with" blocks for anything
> that has user-visible impact, reserving __del__ for resource disposal
> ONLY), but as it is, that's a very very sneaky change that will break
> code in subtle and hard-to-debug ways.
>
> (Not sure why this change needs to go alongside the register-based VM,
> as it seems to my inexpert mind to be quite orthogonal to it; but
> whatever, I guess there's a good reason.)
>
> ChrisA
-- 
--Guido (mobile)
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/5TBFVUF6OTDFWQHEAC6ZJ4LQ64D5DZWA/


[Python-ideas] Re: Itertools generator injection

2021-03-22 Thread Dennis Sweeney
Hm maybe there is something worthwhile here. I implemented a Python version 
with the same semantics as itertools.product, except only consuming the 
iterators when absolutely necessary; still only consuming each iterator once, 
but building its pools incrementally.  See 
https://gist.github.com/sweeneyde/fb6734d7b9f7d17e132c28af9ecb6270

from itertools import count
it = LazyProductObject(count(0), count(0), "ab", repeat=3)
assert next(it) == (0, 0, "a", 0, 0, "a", 0, 0, "a")
assert next(it) == (0, 0, "a", 0, 0, "a", 0, 0, "b")
assert next(it) == (0, 0, "a", 0, 0, "a", 0, 1, "a")
assert next(it) == (0, 0, "a", 0, 0, "a", 0, 1, "b")
assert next(it) == (0, 0, "a", 0, 0, "a", 0, 2, "a")
assert next(it) == (0, 0, "a", 0, 0, "a", 0, 2, "b")

It looks like it could be made to have a tight loop similar to the current
implementation's if we decided to rewrite it in C.
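
For readers who don't want to open the gist, the idea can be
illustrated with a much simplified recursive sketch. This is not the
gist's code: `lazy_product` and its helpers are hypothetical names, and
the recursive generators are chosen for brevity rather than speed. Each
input iterator is consumed at most once, and its pool only grows when a
position actually needs the next item.

```python
from itertools import count, product


def lazy_product(*iterables, repeat=1):
    """Like itertools.product, but fills its pools incrementally."""
    iterators = [iter(it) for it in iterables]
    pools = [[] for _ in iterators]
    n = len(iterators)

    def items(k):
        # Yield pool k's items in order, pulling from the underlying
        # iterator only once the already-seen items run out.
        i = 0
        while True:
            if i == len(pools[k]):
                try:
                    pools[k].append(next(iterators[k]))
                except StopIteration:
                    return
            yield pools[k][i]
            i += 1

    def rec(pos, prefix):
        if pos == n * repeat:
            yield tuple(prefix)
            return
        # Positions that repeat the same input share one pool.
        for item in items(pos % n):
            yield from rec(pos + 1, prefix + [item])

    return rec(0, [])


it = lazy_product(count(0), count(0), "ab", repeat=3)
first = next(it)
second = next(it)
```

On finite inputs it should yield the same tuples in the same order as
`itertools.product`, which makes for an easy cross-check.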
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/HCX35TWZRDUY7LMGLOERWPVX6ZRICWRS/


[Python-ideas] Re: Looking for people interested in a Python register virtual machine project

2021-03-22 Thread Chris Angelico
On Mon, Mar 22, 2021 at 5:37 PM Ben Rudiak-Gould wrote:
>
> On Sun, Mar 21, 2021 at 11:10 PM Chris Angelico wrote:
>>
>> At what point does the process_objects list cease to be referenced?
>> After the last visible use of it, or at the end of the function?
>
>
> In Python as it stands, at the end of the function, as you say.
>
> Skip Montanaro's PEP suggested that in his register machine, locals would be 
> dereferenced after their last visible use. I don't think that's intrinsically 
> a bad idea, but it's not backward compatible. The thing with the process 
> objects was just an example of currently working code that would break.
>
> The example has nothing to do with PyQt5 really. I just happen to know that 
> QProcess objects kill the controlled process when they're collected. I think 
> it's a bad design, but that's the way it is.
>
> Another example would be something like
>
> td =  tempfile.TemporaryDirectory()
> p = subprocess.Popen([..., td.name, ...], ...)
> p.wait()
>
> where the temporary directory will hang around until the process exits with 
> current semantics, but not if td is deleted after the second line. Of course 
> you should use a with statement in this kind of situation, but there's 
> probably a lot of code that doesn't.
>

Thanks for the clarification. I think the tempfile example will be a
lot easier to explain this with, especially since it requires only the
stdlib and isn't implying that there's broken code in a third-party
library.

I don't like this. In a bracey language (eg C++), you can declare that
a variable should expire prior to the end of the function by including
it in a set of braces; in Python, you can't do that, and the normal
idiom is to reassign the variable or 'del' it. Changing the semantics
of when variables cease to be referenced could potentially break a LOT
of code. Maybe, if Python were a brand new language today, you could
define the semantics that way (and require "with" blocks for anything
that has user-visible impact, reserving __del__ for resource disposal
ONLY), but as it is, that's a very very sneaky change that will break
code in subtle and hard-to-debug ways.

(Not sure why this change needs to go alongside the register-based VM,
as it seems to my inexpert mind to be quite orthogonal to it; but
whatever, I guess there's a good reason.)

ChrisA
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/C3CUQYW3TQGJHC7SP5B4QJXFDV2XTEXB/


[Python-ideas] Re: Looking for people interested in a Python register virtual machine project

2021-03-22 Thread Ben Rudiak-Gould
On Sun, Mar 21, 2021 at 11:10 PM Chris Angelico wrote:

> At what point does the process_objects list cease to be referenced?
> After the last visible use of it, or at the end of the function?


In Python as it stands, at the end of the function, as you say.

Skip Montanaro's PEP suggested that in his register machine, locals would
be dereferenced after their last visible use. I don't think that's
intrinsically a bad idea, but it's not backward compatible. The thing with
the process objects was just an example of currently working code that
would break.

The example has nothing to do with PyQt5 really. I just happen to know that
QProcess objects kill the controlled process when they're collected. I
think it's a bad design, but that's the way it is.

Another example would be something like

td =  tempfile.TemporaryDirectory()
p = subprocess.Popen([..., td.name, ...], ...)
p.wait()

where the temporary directory will hang around until the process exits with
current semantics, but not if td is deleted after the second line. Of
course you should use a with statement in this kind of situation, but
there's probably a lot of code that doesn't.
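
A sketch of the with-statement form, which ties the directory's
lifetime to a block rather than to when td stops being referenced. To
keep the example self-contained, `child_stand_in` is a hypothetical
function standing in for the subprocess; it just writes a file into the
directory.

```python
import os
import tempfile


def child_stand_in(directory):
    # Hypothetical stand-in for the external process: it needs the
    # directory to exist for as long as it runs.
    path = os.path.join(directory, "output.txt")
    with open(path, "w") as fh:
        fh.write("done")
    return os.path.exists(path)


with tempfile.TemporaryDirectory() as td_name:
    worked = child_stand_in(td_name)
    existed_inside = os.path.isdir(td_name)

# The directory is removed when the block exits, regardless of when
# the interpreter would otherwise have dropped the last reference.
removed_after = not os.path.exists(td_name)
```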
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/EQ372ZPJWQW2GKCLPGXX2A6VKMRZRB36/


[Python-ideas] Re: Looking for people interested in a Python register virtual machine project

2021-03-22 Thread Chris Angelico
On Mon, Mar 22, 2021 at 3:14 PM Guido van Rossum wrote:
>
> On Sun, Mar 21, 2021 at 3:35 PM Chris Angelico wrote:
>>
>> On Mon, Mar 22, 2021 at 7:49 AM Ben Rudiak-Gould wrote:
>> >
>> > In the "Object Lifetime" section you say "registers should be cleared upon 
>> > last reference". That isn't safe, since there can be hidden dependencies 
>> > on side effects of __del__, e.g.:
>> >
>> > process_objects = create_pipeline()
>> > output_process = process_objects[-1]
>> > return output_process.wait()
>> >
>> > If the process class terminates the process in __del__ (PyQt5's QProcess 
>> > does), then implicitly deleting process_objects after the second line will 
>> > break the code.
>> >
>>
>> Hang on hang on hang on. After the second line, there are two
>> references to the last object, and one to everything else. (If
>> create_pipeline returns two objects, one for each end of the pipe,
>> then there are two references to the second one, and one to the
>> first.) Even if you dispose of process_objects itself on the basis
>> that it's not used any more (which I would disagree with, since it's
>> very difficult to manage that well), it shouldn't terminate the
>> process, because one of the objects is definitely still alive.
>
>
> In the hypothetical scenario, presumably create_pipeline() returns a list of 
> process objects, where the process class somehow kills the process when it is 
> finalized. In that case dropping the last reference to process_objects[0] 
> would kill the first process in the pipeline. I don't know if that's good API 
> design, but Ben states that PyQt5 does this, and it could stand in for any 
> number of other APIs that legitimately destroy an external resource when the 
> last reference is dropped. (E.g., stdlib temporary files.)
>

The question is really whether process_objects ceases to exist after
the last time it's referenced. I may have misinterpreted the thin
example here, but let's just focus on process_objects[0] (hereunder
"po0" for simplicity), and assume that there are at least two elements
in the list.

A list begins to exist somewhere inside create_pipeline(), and at the
point where that list is returned, it has a reference to po0. That
list is returned, and assigned to process_objects, which we assume is
a function-local variable. So the function's locals reference
process_objects, which references po0. So far, so good.

Then we get a new variable output_process, and we lift something
unrelated from the list.

Then we call a method on an unrelated object, and return from the function.

At what point does the process_objects list cease to be referenced?
After the last visible use of it, or at the end of the function? My
understanding of Python's semantics is that the list object MUST
continue to exist all the way up until the function exits; or, to word
that another way, the function's call frame holds a reference to
ALL of its locals, not just the ones that can visibly be seen to be
used.

Allowing an object to be disposed of early if there are no future uses
of it would be quite surprising.

It would be different if, before the return statement,
"process_objects = None" were inserted. Then the list would cease to
be referenced, and po0 would cease to be referenced, and regardless of
the exact type of GC being used, it would be legit to ditch it before
the wait() call. If *that* version is broken, then there's a problem
with the objects in the list depending on each other in a
non-Python-visible way, and that's a bug in the library.
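
A small sketch of the two cases, assuming CPython's reference counting
(other implementations may finalize later). The `Resource` class and
the `events` log are made up for this example.

```python
events = []


class Resource:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        events.append(f"finalize {self.name}")


def keep_until_exit():
    res = Resource("kept")
    # res is never used again, but the frame still references it.
    events.append("last statement")


def release_early():
    res = Resource("early")
    res = None  # explicit rebinding drops the only reference
    events.append("last statement")


keep_until_exit()
release_early()
print(events)
```

In the first function the finalizer runs only after the last statement,
when the frame goes away; in the second it runs at the rebinding,
before the last statement.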

Can a PyQt user clarify, please?

ChrisA
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/FDS3ZZX6F3CB5BSEF6LC7R4UOHKOFBPO/