Hi Petr,
On 21/10/2020 11:49 am, Petr Viktorin wrote:
> Let me explain an impression I'm getting. It is *just one aspect* of my
> opinion, one that doesn't make sense to me. Please tell me where it is
> wrong.
> In the C API, there's a somewhat controversial refactoring going on,
> which involves passing around tstate arguments. I'm not saying [the
> first discussion] was perfect, and there are still issues, but, however
> flawed the "do-ocracy" process is, it is the best way we found to move
> forward. No one who can/wants to do the work has a better solution.
> Later, Mark says there is an even better way – or at least, a less
> intrusive one! In [the second discussion], he hints at it vaguely (from
> that limited info I have, it involves switching to C11 and/or using
> compiler-specific extensions -- not an easy change to do). But
> frustratingly, Mark doesn't reveal any actual details, and a lot of the
> complaints are about churn and merge conflicts.
> And now, there's news -- the better solution won't be revealed unless
> the PSF pays for it!
There's no secret. C thread locals are well documented.
I even provided a code example last time we discussed it.
You reminded me of it yesterday ;)
https://godbolt.org/z/dpSo-Q
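As a rough illustration of the thread-local approach (a sketch only, and not the contents of the godbolt link above): each thread keeps its own pointer to its state, so nothing needs to be passed as an argument. The names `tl_tstate`, `tl_tstate_set` and `tl_tstate_get` are made up for this example and are not CPython API; it assumes a C11 compiler that supports `_Thread_local`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real thread state struct. */
typedef struct {
    int id;
} tl_tstate;

/* One pointer per thread (C11 thread storage duration). */
static _Thread_local tl_tstate *tl_current = NULL;

/* Record the thread state for the calling thread. */
static void tl_tstate_set(tl_tstate *ts)
{
    assert(ts != NULL);
    tl_current = ts;
}

/* Fetch the calling thread's state without a tstate argument. */
static tl_tstate *tl_tstate_get(void)
{
    return tl_current;
}
```

A thread would call `tl_tstate_set()` once when it starts running Python code, and everything below it would call `tl_tstate_get()`.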
The "even faster" solution I mentioned yesterday is, as I stated then,
to use an aligned stack.
If you wanted more info, you could have asked :)
First, you ensure that the stack is in a block that is 2**N bytes in
size and aligned to 2**N.
Assuming that the C stack grows down from the top, then the threadstate
struct goes at the bottom. It's probably a good idea to put a guard page
between the C stack and the threadstate struct.
The struct's address can then be found by masking off the bottom N bits
from the stack pointer.
This approach uses 0 registers and costs 1 ALU instruction. Can't get
cheaper than that :)
It's not portable and probably a pain to implement, but it is fast.
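The masking step can be sketched like this. It is only an illustration of the arithmetic: the block is allocated with C11 `aligned_alloc()` rather than being installed as a real thread stack, there is no guard page, and N = 20 is an arbitrary choice. All `stk_`/`STK_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define STK_BITS 20                                  /* N, assumed for the example */
#define STK_BLOCK_SIZE ((uintptr_t)1 << STK_BITS)    /* 2**N bytes */

/* Hypothetical stand-in for the real thread state struct. */
typedef struct {
    int id;
} stk_tstate;

/* Allocate a 2**N-aligned block and put the thread state at its bottom
 * (lowest address). A real implementation would run the C stack in the
 * rest of the block, growing down from the top. */
static void *stk_alloc_block(int id)
{
    void *block = aligned_alloc(STK_BLOCK_SIZE, STK_BLOCK_SIZE);
    if (block != NULL) {
        assert(((uintptr_t)block & (STK_BLOCK_SIZE - 1)) == 0);
        ((stk_tstate *)block)->id = id;
    }
    return block;
}

/* Given any address inside the block (e.g. the stack pointer), clear the
 * bottom N bits to recover the address of the thread state. */
static stk_tstate *stk_tstate_from_address(const void *addr)
{
    return (stk_tstate *)((uintptr_t)addr & ~(STK_BLOCK_SIZE - 1));
}
```

In a real implementation the argument would be the stack pointer itself, which is where the single AND instruction comes from.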
But it doesn't matter how it's implemented. The implementation is hidden
behind `PyThreadState_GET()`, so it can be changed to use a thread local,
or some fancy aligned stack, without the rest of the codebase changing.
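To illustrate the point about hiding the implementation, here is a sketch of an accessor whose callers do not change when the mechanism does. `PyThreadState_GET()` is the real name mentioned above; everything prefixed `hid_`/`HID_` is hypothetical, including the `HID_USE_ALIGNED_STACK` build flag, and the aligned-stack branch assumes the thread is actually running on a suitably aligned block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real thread state struct. */
typedef struct {
    int id;
} hid_tstate;

#ifdef HID_USE_ALIGNED_STACK
/* Variant 1: mask the address of a local variable (assumes the C stack
 * lives in a 2**HID_STACK_BITS aligned block with the state at its base). */
#define HID_STACK_BITS 20
static hid_tstate *hid_tstate_get_impl(void)
{
    char marker;
    uintptr_t mask = ~(((uintptr_t)1 << HID_STACK_BITS) - 1);
    return (hid_tstate *)((uintptr_t)&marker & mask);
}
#else
/* Variant 2 (default): a C11 thread local. */
static _Thread_local hid_tstate *hid_current = NULL;
static void hid_tstate_set(hid_tstate *ts)
{
    assert(ts != NULL);
    hid_current = ts;
}
static hid_tstate *hid_tstate_get_impl(void)
{
    return hid_current;
}
#endif

/* The only name the rest of the codebase uses. */
#define HID_TSTATE_GET() (hid_tstate_get_impl())

/* Example caller: identical source under either variant. */
static int hid_current_thread_id(void)
{
    hid_tstate *ts = HID_TSTATE_GET();
    return ts != NULL ? ts->id : -1;
}
```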
> That's a very bad situation to be in for having discussions: basically,
> either we disregard Mark and go with the not-ideal solution, or
> virtually all work on changing the C API and internal structures is
> blocked.
The existence of multiple interpreters should be orthogonal to speeding
up those interpreters, provided the separation is clean and well designed.
But it should be clean and well designed anyway, IMO.
> I sense a similar thing happening here:
> https://github.com/ericsnowcurrently/multi-core-python/issues/69 --
The title of that issue is 'Clarify what is a "sub-interpreter" and what
is an "interpreter"'?
> there's a vague proposal to do things very differently, but I find it
This?
https://github.com/ericsnowcurrently/multi-core-python/issues/69#issuecomment-712837899
> hard to find anything actionable. I would like to change my plans to
> align with Mark's fork, or to better explain some of the non-performance
> reasons for recent/planned changes. But I can't, because details are
> behind a paywall.
Let's make this very clear.
My objections to the way multiple interpreters are being implemented have
very little to do with speeding up the interpreter and everything to do
with the long-term maintenance and ultimate success of the project.
Obviously, I would like it if multiple interpreters didn't slow down CPython.
But that has always been the case.
Cheers,
Mark.
> [the first discussion]:
> https://mail.python.org/archives/list/python-dev@python.org/thread/PQBGECVGVYFTVDLBYURLCXA3T7IPEHHO/#Q4IPXMQIM5YRLZLHADUGSUT4ZLXQ6MYY
> [the second discussion]:
> https://mail.python.org/archives/list/python-dev@python.org/thread/KGBXVVJQZJEEZD7KDS5G3GLBGZ6XNJJX/#WOKAUQYDJDVRA7SJRJDEAHXTRXSVPNMO
> On 10/20/20 2:53 PM, Mark Shannon wrote:
>> Hi everyone,
>> CPython is slow. We all know that, yet little is done to fix it.
>> I'd like to change that.
>> I have a plan to speed up CPython by a factor of five over the next
>> few years. But it needs funding.
>> I am aware that there have been several promised speed ups in the past
>> that have failed. You might wonder why this is different.
>> Here are three reasons:
>> 1. I already have working code for the first stage.
>> 2. I'm not promising a silver bullet. I recognize that this is a
>> substantial amount of work and needs funding.
>> 3. I have extensive experience in VM implementation, not to mention a
>> PhD in the subject.
>> My ideas for possible funding, as well as the actual plan of
>> development, can be found here:
>> https://github.com/markshannon/faster-cpython
>> I'd love to hear your thoughts on this.
>> Cheers,
>> Mark.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/RDXLCH22T2EZDRCBM6ZYYIUTBWQVVVWH/
Code of Conduct: http://python.org/psf/codeofconduct/
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/G3VXADDJ5OYEQHMHXG3GDEWCU733JMOT/