Hi Petr,

On 21/10/2020 11:49 am, Petr Viktorin wrote:
Let me explain an impression I'm getting. It is *just one aspect* of my opinion, one that doesn't make sense to me. Please tell me where it is wrong.


In the C API, there's a somewhat controversial refactoring going on, which involves passing around tstate arguments. I'm not saying [the first discussion] was perfect, and there are still issues, but, however flawed the "do-ocracy" process is, it is the best way we found to move forward. No one who can/wants to do the work has a better solution.

Later, Mark says there is an even better way – or at least, a less intrusive one! In [the second discussion], he hints at it vaguely (from the limited info I have, it involves switching to C11 and/or using compiler-specific extensions -- not an easy change to do). But frustratingly, Mark doesn't reveal any actual details, and a lot of the complaints are about churn and merge conflicts. And now, there's news -- the better solution won't be revealed unless the PSF pays for it!

There's no secret. C thread locals are well documented.
I even provided a code example last time we discussed it.

You reminded me of it yesterday ;)
https://godbolt.org/z/dpSo-Q
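
For readers who have not used C11 thread locals, here is a minimal sketch of the idea. It is not the code behind the link above, and it is not CPython's implementation; `tl_tstate`, `tl_tstate_get()`, `tl_tstate_set()` and `tl_demo()` are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real thread state struct. */
typedef struct {
    int id;
} tl_tstate;

/* One pointer per thread; each thread sees its own copy (C11 _Thread_local). */
static _Thread_local tl_tstate *tl_current = NULL;

/* Accessor: callers never need to know how the pointer is stored. */
tl_tstate *tl_tstate_get(void)
{
    return tl_current;
}

void tl_tstate_set(tl_tstate *ts)
{
    tl_current = ts;
}

/* Installs a thread state, reads it back through the accessor, then clears it. */
int tl_demo(void)
{
    tl_tstate ts = { 42 };
    tl_tstate_set(&ts);
    assert(tl_tstate_get() == &ts);
    int id = tl_tstate_get()->id;
    tl_tstate_set(NULL);
    return id;
}
```

Nothing outside the accessor touches `tl_current`, so the storage strategy could be swapped without changing any caller.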

The "even faster" solution I mentioned yesterday is, as I stated then, to use an aligned stack.
If you wanted more info, you could have asked :)

First, you ensure that the stack is in a 2**N aligned block.
Assuming that the C stack grows down from the top, then the threadstate struct goes at the bottom. It's probably a good idea to put a guard page between the C stack and the threadstate struct.

The struct's address can then be found by masking off the bottom N bits from the stack pointer. This approach uses 0 registers and costs 1 ALU instruction. Can't get cheaper than that :)

It's not portable and probably a pain to implement, but it is fast.
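
The address arithmetic can be sketched as follows. This is only an illustration under stated assumptions: the block size (N = 20, so 1 MiB) is arbitrary, all names are hypothetical, and the "stack pointer" is simulated as an address inside an aligned heap block rather than a real thread stack. A real implementation would also have to run the thread on that block and protect a guard page, both of which are platform specific and omitted here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define STACK_BITS  20                            /* N: assumed block size of 2**20 bytes */
#define STACK_BLOCK ((uintptr_t)1 << STACK_BITS)

/* Hypothetical stand-in for the real thread state struct. */
typedef struct {
    int id;
} stack_tstate;

/* Mask off the low N bits of any address inside the block to get the
   block's base, which is where the thread state struct lives. */
stack_tstate *tstate_from_sp(const void *sp)
{
    return (stack_tstate *)((uintptr_t)sp & ~(STACK_BLOCK - 1));
}

/* Allocate a 2**N-aligned block and put the thread state at its lowest
   address, i.e. at the bottom of a stack that grows down from the top. */
stack_tstate *new_stack_block(int id)
{
    void *block = aligned_alloc(STACK_BLOCK, STACK_BLOCK);
    if (block == NULL) {
        return NULL;
    }
    stack_tstate *ts = block;
    ts->id = id;
    return ts;
}

/* Simulate a stack pointer `depth` bytes below the top of the block and
   recover the thread state id from that address alone. */
int lookup_id_at_depth(stack_tstate *ts, size_t depth)
{
    assert(depth > 0 && depth < STACK_BLOCK - sizeof(stack_tstate));
    const char *top = (const char *)ts + STACK_BLOCK;
    const char *fake_sp = top - depth;
    return tstate_from_sp(fake_sp)->id;
}
```

The lookup in `tstate_from_sp()` is a single AND with a constant, which is the one ALU instruction mentioned above.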

But it doesn't matter how it's implemented. The implementation is hidden behind `PyThreadState_GET()`; it can be changed to use a thread local,
or some fancy aligned stack, without the rest of the codebase changing.


That's a very bad situation to be in for having discussions: basically, either we disregard Mark and go with the not-ideal solution, or virtually all work on changing the C API and internal structures is blocked.

The existence of multiple interpreters should be orthogonal to speeding up those interpreters, provided the separation is clean and well designed.
But it should be clean and well designed anyway, IMO.


I sense a similar thing happening here: https://github.com/ericsnowcurrently/multi-core-python/issues/69 --

The title of that issue is 'Clarify what is a "sub-interpreter" and what is an "interpreter"'?

there's a vague proposal to do things very differently, but I find it

This?
https://github.com/ericsnowcurrently/multi-core-python/issues/69#issuecomment-712837899

hard to find anything actionable. I would like to change my plans to align with Mark's fork, or to better explain some of the non-performance reasons for recent/planned changes. But I can't, because details are behind a paywall.

Let's make this very clear.
My objections to the way multiple interpreters are being implemented have very little to do with speeding up the interpreter and everything to do with long-term maintenance and the ultimate success of the project.

Obviously, I would like it if multiple interpreters didn't slow down CPython.
But that has always been the case.

Cheers,
Mark.



[the first discussion]: https://mail.python.org/archives/list/python-dev@python.org/thread/PQBGECVGVYFTVDLBYURLCXA3T7IPEHHO/#Q4IPXMQIM5YRLZLHADUGSUT4ZLXQ6MYY

[the second discussion]: https://mail.python.org/archives/list/python-dev@python.org/thread/KGBXVVJQZJEEZD7KDS5G3GLBGZ6XNJJX/#WOKAUQYDJDVRA7SJRJDEAHXTRXSVPNMO


On 10/20/20 2:53 PM, Mark Shannon wrote:
Hi everyone,

CPython is slow. We all know that, yet little is done to fix it.

I'd like to change that.
I have a plan to speed up CPython by a factor of five over the next few years. But it needs funding.

I am aware that there have been several promised speed ups in the past that have failed. You might wonder why this is different.

Here are three reasons:
1. I already have working code for the first stage.
2. I'm not promising a silver bullet. I recognize that this is a substantial amount of work and needs funding.
3. I have extensive experience in VM implementation, not to mention a PhD in the subject.

My ideas for possible funding, as well as the actual plan of development, can be found here:

https://github.com/markshannon/faster-cpython

I'd love to hear your thoughts on this.

Cheers,
Mark.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/RDXLCH22T2EZDRCBM6ZYYIUTBWQVVVWH/
Code of Conduct: http://python.org/psf/codeofconduct/
