[Python-ideas] Re: Looking for people interested in a Python register virtual machine project
Thanks for the response. I will try to address your comments inline.

> I guess it should be a good idea to answer what's the scope of this project - is it research one or "production" one? If it's research one, why be concerned with the churn of over-modern CPython versions? Wouldn't it be better to just use some scalable, incremental implementation which would allow to forward-port it to a newer version, if it ever comes to that?

The motivation for revisiting this idea was/is largely personal. As I indicated, I first messed around with it over 20 years ago and it's been in the back of my mind ever since. Somehow I never lost the code, despite however many computers came and went and the fact that the code was never uploaded to any sort of distributed version control system. I decided to pick things up again mostly as a way to keep my head in the game after I retired. So, neither "research" nor "production" seems to be the correct descriptor. Still, I realized pretty quickly that if it's taken to functional completion (functional enough for performance testing and application to more than just toy scripts) I'd need help.

> Otherwise, if it's "production", who's the "customer" and how they "compensate" you for doing work (chasing the moving target) which is clearly of little interest to you and conflicts with the goal of the project?

Nobody is compensating me. I have no desire to try to turn it into something I do for hire. Maybe I misunderstood your question?

> > This PEP proposes the addition of register-based instructions to the existing Python virtual machine, with the intent that they eventually replace the existing stack-based opcodes.
>
> Sorry, what? The purpose of register-based instructions is to just replace stack-based instructions? That's not what I'd like to hear as the intro phrase. You probably want to replace one with the other because register-based ones offer some benefit, faster execution perhaps?
> That's what I'd like to hear instead of "deciphering" that between the lines.

Replacing stack-based instructions would be a reasonable initial goal, I think. Victor reported performance improvements in his implementation (also a translator). As I indicated in the "PEP" (I use that term rather loosely, as I have no plans at the moment to submit it for consideration, certainly not in its current, incomplete state), a better ultimate way to go would be to generate register instructions directly from the AST. The current translation scheme allows me to write simple test case functions, generate register instructions, then check that, when called, the two versions produce the same result.

> > They [2 instruction sets] are almost completely distinct.
>
> That doesn't correspond to the mental image I would have. In my list, the 2 sets would be exactly the same, except that stack-based encode argument locations implicitly, while register-based - explicitly. Would be interesting to read (in the following "pep" sections) what makes them "almost completely distinct".

Well, sure. The main difference is the way the two instructions in a pair (say, BINARY_ADD vs BINARY_ADD_REG) get their operands and save their result. You still have to be able to add two objects, call functions, etc.

> > Within a single function only one set of opcodes or the other will be used at any one time.
>
> That would be the opposite of "scalable, incremental" development approach mentioned above. Why not allow 2 sets to freely co-exist, and migrate codegeneration/implement code translation gradually?

The fact that I treat the current frame's stack space as registers makes it pretty much impossible to execute both stack and register instructions within the same frame. Victor's implementation did things differently in this regard.
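The implicit-vs-explicit operand distinction discussed above can be sketched in plain Python. This is an illustration only: the register instruction names and their operand layout are assumptions based on this thread, not CPython's actual implementation.

```python
# Sketch only: evaluating c = a + b on a stack machine versus a
# register machine. Instruction names in comments mirror the thread;
# BINARY_ADD_REG's operand layout is an assumption.

def run_stack(a, b):
    stack = []
    stack.append(a)              # LOAD_FAST a  (operands implicit)
    stack.append(b)              # LOAD_FAST b
    rhs = stack.pop()
    lhs = stack.pop()
    stack.append(lhs + rhs)      # BINARY_ADD   (pops two, pushes result)
    c = stack.pop()              # STORE_FAST c
    return c                     # LOAD_FAST c; RETURN_VALUE

def run_register(a, b):
    regs = [a, b, None]          # frame locals doubling as registers
    regs[2] = regs[0] + regs[1]  # BINARY_ADD_REG r2, r0, r1 (explicit)
    return regs[2]               # RETURN_VALUE_REG r2

print(run_stack(2, 3), run_register(2, 3))   # 5 5
```

Both functions compute the same value; the register form simply names its source and destination slots instead of relying on stack position, which is why the two opcode pairs differ mainly in how operands are fetched and results stored.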
I believe he just allocated extra space for 256 registers at the end of each frame, so (in theory, I suppose) you could have instructions from both sets executed in the same frame.

> > ## Motivation
>
> I'm not sure the content of the section corresponds much to its title. It jumps from background survey of the different Python VM optimizations to (some) implementation details of register VM - leaving "motivation" somewhere "between the lines".
>
> > Despite all that effort, opcodes which do nothing more than move data onto or off of the stack (LOAD_FAST, LOAD_GLOBAL, etc) still account for nearly half of all opcodes executed.
>
> ... And - you intend to change that with a register VM? In which way and how? As an example, LOAD_GLOBAL isn't going anywhere - it loads a variable by *symbolic* name into a register.

Certainly, if you have data which isn't already on the stack, you are going to have to move it. As the appendix shows, though, a fairly large chunk of the current virtual machine does nothing more than manipulate the stack (LOAD_FAST, STORE_FAST, POP_TOP, etc).

> > Running Pyperformance using a development version of Python 3.9 showed
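The "nearly half of all opcodes" observation can be spot-checked on a toy function with the dis module. Opcode names vary across CPython versions (newer interpreters fuse some loads into superinstructions), so the exact split will differ from run to run of this sketch:

```python
import dis
from collections import Counter

def f(a, b):
    c = a + b
    d = c * a
    return d

# Tally instructions that only shuttle values between locals and the
# evaluation stack, versus everything else (arithmetic, return, ...).
counts = Counter(ins.opname for ins in dis.get_instructions(f))
moves = sum(n for op, n in counts.items()
            if op.startswith(("LOAD_FAST", "STORE_FAST", "POP_TOP")))
total = sum(counts.values())
print(f"{moves} of {total} instructions are pure data movement")
```

Even on this tiny function, a sizable fraction of the emitted instructions do nothing but move data, which is the overhead a register encoding aims to cut.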
[Python-ideas] Re: Looking for people interested in a Python register virtual machine project
> Yeah, that is old writing, so is probably less clear (no pun intended) than it should be. In frame_dealloc, Py_CLEAR is called for stack/register slots instead of just Py_XDECREF. Might not be necessary.

Also, the intent is not to change any semantics here. The implementation of RETURN_VALUE_REG still Py_INCREFs the to-be-returned value. It's not like the data can get reclaimed before the caller receives it.

S
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/PMBNH3L6WEV7TRVKQYOQVSTNJQHH6YYB/
Code of Conduct: http://python.org/psf/codeofconduct/
[Python-ideas] Re: Looking for people interested in a Python register virtual machine project
> In the "Object Lifetime" section you say "registers should be cleared upon last reference". That isn't safe, since there can be hidden dependencies on side effects of __del__, e.g.:
>
>     process_objects = create_pipeline()
>     output_process = process_objects[-1]
>     return output_process.wait()
>
> If the process class terminates the process in __del__ (PyQt5's QProcess does), then implicitly deleting process_objects after the second line will break the code.

Yeah, that is old writing, so is probably less clear (no pun intended) than it should be. In frame_dealloc, Py_CLEAR is called for stack/register slots instead of just Py_XDECREF. Might not be necessary.

Skip

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/2SHROF3OEMZ7G3KBA3L2EKMGWYRWS5LV/
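Ben's pipeline hazard can be reproduced without PyQt5. In the sketch below every name (FakeProcess, create_pipeline, the killed list) is invented for the demonstration; the class mimics an object whose __del__ has a visible side effect, the way QProcess reportedly kills its child process:

```python
# Stand-in demonstration: none of these names come from a real
# library. FakeProcess mimics an object (like PyQt5's QProcess)
# whose __del__ has a visible side effect.

killed = []

class FakeProcess:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        killed.append(self.name)   # side effect: "terminate" the process

    def wait(self):
        return list(killed)        # what has been killed so far

def create_pipeline():
    return [FakeProcess("p0"), FakeProcess("p1")]

def run():
    process_objects = create_pipeline()
    output_process = process_objects[-1]
    # Under current CPython semantics process_objects is still alive
    # here, so no finalizer has run when wait() executes.
    return output_process.wait()

result = run()
print(result)   # [] -- nothing was killed before wait()
```

On a VM that cleared process_objects after its last visible use, p0's finalizer could run before wait() returned, which is exactly the breakage being described.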
[Python-ideas] Re: Allow syntax "func(arg=x if condition)"
I've missed this feature on occasion as well. +1, for whatever that counts.

On Mon, 22 Mar 2021 at 17:30, Caleb Donovick wrote:
> Never needed this for lists but definitely had the pain for kwargs. Seems very reasonable for that use case, +0.5.
>
> In libraries I control I can make sure to use the same default values for functions and their wrappers. However when wrapping functions I don't control there is not a great way to do this. And I end up incrementally building up a kwargs dict. I suppose the same thing could occur with *args lists so it makes sense for both positional and keyword arguments.
>
> Yes one could do something like:
>
> ```
> def fun(a, b=0): ...
> def wraps_fun(args, b=inspect.signature(fun).parameters['b'].default): ...
> ```
>
> But I would hardly call that clear. Further it is not robust as it would fail if `fun` is itself wrapped in a way that destroys its signature. E.g.:
>
> ```
> def destroy_signature(f):
>     # should decorate here with functools.wraps(f)
>     def wrapper(*args, **kwargs):
>         return f(*args, **kwargs)
>     return wrapper
> ```
>
> Caleb

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/TWZVNIEE3R2ZLMSLPUCC27DJI5XH6MY3/
[Python-ideas] Re: Allow syntax "func(arg=x if condition)"
Never needed this for lists but definitely had the pain for kwargs. Seems very reasonable for that use case, +0.5.

In libraries I control I can make sure to use the same default values for functions and their wrappers. However, when wrapping functions I don't control there is not a great way to do this, and I end up incrementally building up a kwargs dict. I suppose the same thing could occur with *args lists, so it makes sense for both positional and keyword arguments.

Yes, one could do something like:

```
def fun(a, b=0): ...
def wraps_fun(args, b=inspect.signature(fun).parameters['b'].default): ...
```

But I would hardly call that clear. Further, it is not robust, as it would fail if `fun` is itself wrapped in a way that destroys its signature. E.g.:

```
def destroy_signature(f):
    # should decorate here with functools.wraps(f)
    def wrapper(*args, **kwargs):
        return f(*args, **kwargs)
    return wrapper
```

Caleb

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/2EHOQDIIK7BMAY54KG44Z45IYWDDSZSW/
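For concreteness, the incremental-kwargs workaround Caleb describes looks something like the sketch below. fun and its parameters are invented for illustration; the point is that the wrapper never has to replicate fun's default values:

```python
def fun(a, b=0, timeout=None):
    # stand-in for a function whose defaults we don't control
    return (a, b, timeout)

# Today's idiom: only pass an argument when we actually have a value,
# so fun's own defaults apply otherwise.
def wraps_fun(a, b=None, timeout=None):
    kwargs = {}
    if b is not None:
        kwargs["b"] = b
    if timeout is not None:
        kwargs["timeout"] = timeout
    return fun(a, **kwargs)

print(wraps_fun(1))        # (1, 0, None) -- fun's own default for b
print(wraps_fun(1, b=5))   # (1, 5, None)

# The proposed syntax would collapse each if-statement into the call:
#     fun(a, b=b if b is not None, timeout=timeout if timeout is not None)
```

The commented-out last line is the hypothetical syntax from this thread's subject; it is not valid Python today.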
[Python-ideas] Re: Looking for people interested in a Python register virtual machine project
As I wrote, Skip's proto-PEP is not proposing to delete locals that are not used in the rest of the function, only registers. So the voiced concerns don't apply.

On Sun, Mar 21, 2021 at 23:59 Chris Angelico wrote:
> On Mon, Mar 22, 2021 at 5:37 PM Ben Rudiak-Gould wrote:
> > On Sun, Mar 21, 2021 at 11:10 PM Chris Angelico wrote:
> > > At what point does the process_objects list cease to be referenced? After the last visible use of it, or at the end of the function?
> >
> > In Python as it stands, at the end of the function, as you say.
> >
> > Skip Montanaro's PEP suggested that in his register machine, locals would be dereferenced after their last visible use. I don't think that's intrinsically a bad idea, but it's not backward compatible. The thing with the process objects was just an example of currently working code that would break.
> >
> > The example has nothing to do with PyQt5 really. I just happen to know that QProcess objects kill the controlled process when they're collected. I think it's a bad design, but that's the way it is.
> >
> > Another example would be something like
> >
> >     td = tempfile.TemporaryDirectory()
> >     p = subprocess.Popen([..., td.name, ...], ...)
> >     p.wait()
> >
> > where the temporary directory will hang around until the process exits with current semantics, but not if td is deleted after the second line. Of course you should use a with statement in this kind of situation, but there's probably a lot of code that doesn't.
>
> Thanks for the clarification. I think the tempfile example will be a lot easier to explain this with, especially since it requires only the stdlib and isn't implying that there's broken code in a third-party library.
>
> I don't like this. In a bracey language (eg C++), you can declare that a variable should expire prior to the end of the function by including it in a set of braces; in Python, you can't do that, and the normal idiom is to reassign the variable or 'del' it. Changing the semantics of when variables cease to be referenced could potentially break a LOT of code. Maybe, if Python were a brand new language today, you could define the semantics that way (and require "with" blocks for anything that has user-visible impact, reserving __del__ for resource disposal ONLY), but as it is, that's a very very sneaky change that will break code in subtle and hard-to-debug ways.
>
> (Not sure why this change needs to go alongside the register-based VM, as it seems to my inexpert mind to be quite orthogonal to it; but whatever, I guess there's a good reason.)
>
> ChrisA

--Guido (mobile)

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/5TBFVUF6OTDFWQHEAC6ZJ4LQ64D5DZWA/
[Python-ideas] Re: Itertools generator injection
Hm maybe there is something worthwhile here. I implemented a Python version with the same semantics as itertools.product, except only consuming the iterators when absolutely necessary; still only consuming each iterator once, but building its pools incrementally. See https://gist.github.com/sweeneyde/fb6734d7b9f7d17e132c28af9ecb6270

    from itertools import count

    it = LazyProductObject(count(0), count(0), "ab", repeat=3)
    assert next(it) == (0, 0, "a", 0, 0, "a", 0, 0, "a")
    assert next(it) == (0, 0, "a", 0, 0, "a", 0, 0, "b")
    assert next(it) == (0, 0, "a", 0, 0, "a", 0, 1, "a")
    assert next(it) == (0, 0, "a", 0, 0, "a", 0, 1, "b")
    assert next(it) == (0, 0, "a", 0, 0, "a", 0, 2, "a")
    assert next(it) == (0, 0, "a", 0, 0, "a", 0, 2, "b")

It looks like it could be made to have a similar tight loop as the current implementation if we decided to re-write it in C.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/HCX35TWZRDUY7LMGLOERWPVX6ZRICWRS/
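The behavior being improved on is easy to observe with a logging wrapper: in CPython, itertools.product drains all of its input iterables up front, when the product object is constructed, before a single output element is requested (the logged helper here is just for the demonstration):

```python
from itertools import product

consumed = []

def logged(iterable, name):
    # record each element as itertools.product pulls it
    for x in iterable:
        consumed.append((name, x))
        yield x

p = product(logged(range(3), "a"), logged(range(2), "b"))

# All five input elements have already been pulled, at construction
# time, before any call to next(p).
print(len(consumed))

first = next(p)
print(first)   # (0, 0)
```

This eager consumption is why `product(count(0), ...)` hangs today, and what an incrementally-pooled variant like the gist above avoids.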
[Python-ideas] Re: Looking for people interested in a Python register virtual machine project
On Mon, Mar 22, 2021 at 5:37 PM Ben Rudiak-Gould wrote:
> On Sun, Mar 21, 2021 at 11:10 PM Chris Angelico wrote:
> > At what point does the process_objects list cease to be referenced? After the last visible use of it, or at the end of the function?
>
> In Python as it stands, at the end of the function, as you say.
>
> Skip Montanaro's PEP suggested that in his register machine, locals would be dereferenced after their last visible use. I don't think that's intrinsically a bad idea, but it's not backward compatible. The thing with the process objects was just an example of currently working code that would break.
>
> The example has nothing to do with PyQt5 really. I just happen to know that QProcess objects kill the controlled process when they're collected. I think it's a bad design, but that's the way it is.
>
> Another example would be something like
>
>     td = tempfile.TemporaryDirectory()
>     p = subprocess.Popen([..., td.name, ...], ...)
>     p.wait()
>
> where the temporary directory will hang around until the process exits with current semantics, but not if td is deleted after the second line. Of course you should use a with statement in this kind of situation, but there's probably a lot of code that doesn't.

Thanks for the clarification. I think the tempfile example will be a lot easier to explain this with, especially since it requires only the stdlib and isn't implying that there's broken code in a third-party library.

I don't like this. In a bracey language (eg C++), you can declare that a variable should expire prior to the end of the function by including it in a set of braces; in Python, you can't do that, and the normal idiom is to reassign the variable or 'del' it. Changing the semantics of when variables cease to be referenced could potentially break a LOT of code. Maybe, if Python were a brand new language today, you could define the semantics that way (and require "with" blocks for anything that has user-visible impact, reserving __del__ for resource disposal ONLY), but as it is, that's a very very sneaky change that will break code in subtle and hard-to-debug ways.

(Not sure why this change needs to go alongside the register-based VM, as it seems to my inexpert mind to be quite orthogonal to it; but whatever, I guess there's a good reason.)

ChrisA

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/C3CUQYW3TQGJHC7SP5B4QJXFDV2XTEXB/
[Python-ideas] Re: Looking for people interested in a Python register virtual machine project
On Sun, Mar 21, 2021 at 11:10 PM Chris Angelico wrote:
> At what point does the process_objects list cease to be referenced? After the last visible use of it, or at the end of the function?

In Python as it stands, at the end of the function, as you say.

Skip Montanaro's PEP suggested that in his register machine, locals would be dereferenced after their last visible use. I don't think that's intrinsically a bad idea, but it's not backward compatible. The thing with the process objects was just an example of currently working code that would break.

The example has nothing to do with PyQt5 really. I just happen to know that QProcess objects kill the controlled process when they're collected. I think it's a bad design, but that's the way it is.

Another example would be something like

    td = tempfile.TemporaryDirectory()
    p = subprocess.Popen([..., td.name, ...], ...)
    p.wait()

where the temporary directory will hang around until the process exits with current semantics, but not if td is deleted after the second line. Of course you should use a with statement in this kind of situation, but there's probably a lot of code that doesn't.

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/EQ372ZPJWQW2GKCLPGXX2A6VKMRZRB36/
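The semantic point under discussion — that CPython keeps a local alive until the frame exits, even after its last visible use — can be demonstrated directly with a weak reference. The Resource class below is a stand-in invented for the demonstration:

```python
import weakref

class Resource:
    """Stand-in for anything whose finalizer matters
    (a temporary directory, a process handle, ...)."""

def use_resource():
    r = Resource()
    ref = weakref.ref(r)
    # r is never mentioned again, but under current CPython semantics
    # the frame keeps it alive until the function returns, so the
    # weak reference still resolves here.
    return ref() is not None

print(use_resource())   # True -- r outlived its last visible use
```

A VM that cleared `r` after its last visible use would make this function return False, which is precisely the backward-compatibility break being debated.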
[Python-ideas] Re: Looking for people interested in a Python register virtual machine project
On Mon, Mar 22, 2021 at 3:14 PM Guido van Rossum wrote:
> On Sun, Mar 21, 2021 at 3:35 PM Chris Angelico wrote:
> > On Mon, Mar 22, 2021 at 7:49 AM Ben Rudiak-Gould wrote:
> > > In the "Object Lifetime" section you say "registers should be cleared upon last reference". That isn't safe, since there can be hidden dependencies on side effects of __del__, e.g.:
> > >
> > >     process_objects = create_pipeline()
> > >     output_process = process_objects[-1]
> > >     return output_process.wait()
> > >
> > > If the process class terminates the process in __del__ (PyQt5's QProcess does), then implicitly deleting process_objects after the second line will break the code.
> >
> > Hang on hang on hang on. After the second line, there are two references to the last object, and one to everything else. (If create_pipeline returns two objects, one for each end of the pipe, then there are two references to the second one, and one to the first.) Even if you dispose of process_objects itself on the basis that it's not used any more (which I would disagree with, since it's very difficult to manage that well), it shouldn't terminate the process, because one of the objects is definitely still alive.
>
> In the hypothetical scenario, presumably create_pipeline() returns a list of process objects, where the process class somehow kills the process when it is finalized. In that case dropping the last reference to process_objects[0] would kill the first process in the pipeline. I don't know if that's good API design, but Ben states that PyQt5 does this, and it could stand in for any number of other APIs that legitimately destroy an external resource when the last reference is dropped. (E.g., stdlib temporary files.)

The question is really whether process_objects ceases to exist after the last time it's referenced. I may have misinterpreted the thin example here, but let's just focus on process_objects[0] (hereunder "po0" for simplicity), and assume that there are at least two elements in the list.

A list begins to exist somewhere inside create_pipeline(), and at the point where that list is returned, it has a reference to po0. That list is returned, and assigned to process_objects, which we assume is a function-local variable. So the function's locals reference process_objects, which references po0. So far, so good. Then we get a new variable output_process, and we lift something unrelated from the list. Then we call a method on an unrelated object, and return from the function.

At what point does the process_objects list cease to be referenced? After the last visible use of it, or at the end of the function? My understanding of Python's semantics is that the list object MUST continue to exist all the way up until the function exits, or, wording that another way, that the function's call frame has a reference to ALL of its locals, not just the ones that can visibly be seen to be used. Allowing an object to be disposed of early if there are no future uses of it would be quite surprising.

It would be different if, before the return statement, "process_objects = None" were inserted. Then the list would cease to be referenced, and po0 would cease to be referenced, and regardless of the exact type of GC being used, it would be legit to ditch it before the wait() call. If *that* version is broken, then there's a problem with the objects in the list depending on each other in a non-Python-visible way, and that's a bug in the library. Can a PyQt user clarify, please?

ChrisA

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/FDS3ZZX6F3CB5BSEF6LC7R4UOHKOFBPO/