>> 1. CPU cache lines (64 bytes on x86) containing the beginning of a
>> PyObject are very often invalidated, resulting in losing many chances
>> to use the CPU caches
>
> Mutating data doesn't invalidate a cache line. It just makes it
> necessary to write it back to memory at some point.
>
I think he's referring to the multi-core case. In MESI terminology, a
write leaves the cache line in the Modified state in the current core's
cache (current thread), but Invalid in the other cores' caches. But given
that access to objects is serialized by the GIL (which will issue a memory
barrier anyway), I'm not sure that the performance impact will be
noticeable. Furthermore, given that threads are effectively serialized, I
suspect that the scheduler naturally tends to keep them on the same CPU.

>> 2. The copy-on-write after fork() optimization (Linux) is almost
>> useless in CPython, because even if you don't modify data directly,
>> refcounts are modified, and PyObjects with refcounts inside are spread
>> all over the process's memory (and one small refcount modification causes
>> the whole page - 4kB - to be copied into a child process).
>
> Indeed.
>

There was a bug report a couple of months ago from someone using large
datasets for a scientific application. He suggested adding support for
Linux's MADV_MERGEABLE, but the root cause is really that the reference
count is incremented even when objects are treated as read-only.

For the record, it's http://bugs.python.org/issue9942 (and this idea was
brought up here).

cf
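
To make the refcount point above concrete, here is a minimal sketch
showing that merely storing a reference to an object, without mutating
it, changes the count kept in the object's header. The function name is
made up for illustration, and the result assumes a standard CPython build;
other implementations may report different numbers.

```python
import sys


def refcount_delta_for_readonly_use():
    """Return the change in refcount caused by taking one extra reference."""
    # Use a fresh object rather than a shared singleton such as None,
    # whose refcount may be handled specially by the interpreter.
    payload = object()
    before = sys.getrefcount(payload)
    # "Read-only" use: the object itself is not mutated, but the list
    # now holds a reference, so the refcount field in its header is written.
    container = [payload]
    after = sys.getrefcount(payload)
    return after - before


if __name__ == "__main__":
    print(refcount_delta_for_readonly_use())
```

After fork(), a write like this in the child touches the page holding the
object, which is what forces the kernel to copy that page even though the
program never modified the data logically.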