Barry Scott wrote on 27.03.22 at 22:23:
On 22 Mar 2022, at 15:57, Jonathan Fine wrote:
As you may have seen, AMD has recently announced CPUs that have much larger L3 
caches. Does anyone know of any work that's been done to research or make 
critical Python code and data smaller so that more of it fits in the CPU cache? 
I'm particularly interested in measured benefits.

A few years ago (5? 10?) there was a blog post about making the Python eval loop fit 
into the L1 cache.
The author gave up on the work because, he claimed, it was too hard to contribute any 
changes to Python at the time.
Sadly, I did not keep a link to the blog post.

What I recall is that the author found that GCC was producing far more code 
than was required to implement sections of ceval.c.
As I recall, the claim was that fixing that would shrink the ceval code by 50%. He had a 
PoC that showed the improvements.

It might be worth testing whether "gcc -Os" changes anything for ceval.c. The same optimisation level can also be enabled temporarily, for part of a file, with a pragma (and MSVC has a similar option).

We use it in Cython for the (run once) module init code to reduce the binary module size, but it might have an impact on cache usage as well.
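
As a rough sketch of the pragma approach mentioned above (not taken from ceval.c or from Cython's generated code), size optimisation can be switched on around a single function. The function name run_once_init and its body are hypothetical placeholders for some run-once setup code. Clang does not implement the GCC optimize pragma, so the guard excludes it.

```c
/* Minimal sketch: optimise one function for size, leave the rest of the
 * file at the command-line optimisation level.
 * run_once_init is a hypothetical stand-in for run-once init code. */

#if defined(__GNUC__) && !defined(__clang__)
  #pragma GCC push_options        /* remember the current options */
  #pragma GCC optimize ("Os")     /* optimise for size from here on */
#elif defined(_MSC_VER)
  #pragma optimize("s", on)       /* MSVC: favour small code */
#endif

/* Fills table[0..n-1] with squares and returns their sum. */
static int run_once_init(int *table, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        table[i] = i * i;
        sum += table[i];
    }
    return sum;
}

#if defined(__GNUC__) && !defined(__clang__)
  #pragma GCC pop_options         /* restore the saved options */
#elif defined(_MSC_VER)
  #pragma optimize("", on)        /* MSVC: back to command-line settings */
#endif
```

To see whether this makes any difference, one could compile the file once with -O2 and once with -Os and compare the object files with the "size" tool. Whether smaller code actually improves cache behaviour would still have to be measured.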

Stefan

Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/QQVYUUKOKN472N4OLNCAA76HLVFXMKLB/