Re: [Python-Dev] CPython optimization: storing reference counters outside of objects
Hello,

On Sun, 22 May 2011 01:57:55 +0200 Artur Siekielski wrote:
> 1. CPU cache lines (64 bytes on X86) containing a beginning of a
> PyObject are very often invalidated, resulting in losing many chances
> to use the CPU caches

Mutating data doesn't invalidate a cache line. It just makes it necessary to write it back to memory at some point.

> 2. The copy-on-write after fork() optimization (Linux) is almost
> useless in CPython, because even if you don't modify data directly,
> refcounts are modified, and PyObjects with refcounts inside are spread
> all over process' memory (and one small refcount modification causes
> the whole page - 4kB - to be copied into a child process).

Indeed.

> I'm not a compiler/profiling expert so the main question is if such
> design can work, and maybe someone was thinking about something
> similar? And if CPython was profiled for CPU cache usage?

This has already been proposed a couple of times. I guess what's needed is for someone to experiment and post benchmark results.

Regards

Antoine.

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] CPython optimization: storing reference counters outside of objects
>> 1. CPU cache lines (64 bytes on X86) containing a beginning of a
>> PyObject are very often invalidated, resulting in losing many chances
>> to use the CPU caches
>
> Mutating data doesn't invalidate a cache line. It just makes it
> necessary to write it back to memory at some point.

I think he's referring to the multi-core case. In MESI terminology, the cache line will become modified in the cache of the core running the current thread, but invalid in other cores' caches. But given that object accesses are serialized by the GIL (which will issue a memory barrier anyway), I'm not sure that the performance impact will be noticeable. Furthermore, given that threads are actually serialized, I suspect that the scheduler tends to bind them naturally to the same CPU.

>> 2. The copy-on-write after fork() optimization (Linux) is almost
>> useless in CPython, because even if you don't modify data directly,
>> refcounts are modified, and PyObjects with refcounts inside are spread
>> all over process' memory (and one small refcount modification causes
>> the whole page - 4kB - to be copied into a child process).
>
> Indeed.

There was a bug report a couple of months ago from someone using large datasets for a scientific application. He suggested adding support for Linux's MADV_MERGEABLE, but the root cause is really the reference count being incremented even when objects are treated as read-only. For the record, it's http://bugs.python.org/issue9942 (and this idea was brought up here).

cf
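To make the copy-on-write point concrete, here is a minimal sketch that is not taken from the thread. It assumes CPython, where id() happens to return an object's address and sys.getrefcount() reports the counter kept in the object header, and it assumes the 4 kB page size quoted above. The helper name pages_touched is hypothetical.

```python
import sys

PAGE_SIZE = 4096  # assumption: the 4 kB page size quoted in the thread


def pages_touched(objects):
    """Count distinct pages holding the headers (and so the refcounts) of objects.

    Relies on the CPython implementation detail that id() is the memory address.
    """
    return len({id(obj) // PAGE_SIZE for obj in objects})


# Taking one more reference, without changing the data, bumps the counter.
payload = bytearray(16)
before = sys.getrefcount(payload)
alias = payload
after = sys.getrefcount(payload)
print("refcount change from one extra reference:", after - before)

# Each page counted here is a candidate for being copied after fork() as soon
# as either process changes the refcount of any object living on it.
dataset = [str(i) * 10 for i in range(10_000)]
print("pages holding object headers:", pages_touched(dataset))
```

The page count depends on the allocator and the platform, so treat the second number as indicative only.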
Re: [Python-Dev] Stable buildbots update
Tarek Ziadé wrote:
> Yes, I am aware of this. I have fixed today most remaining issues, and
> fixing the final ones right now.

Just FYI: the "AMD64 Snow Leopard" buildbot and "PPC Leopard" buildbots are now green, but the "PPC Tiger" buildbot is still failing for all branches because of packaging errors:

======================================================================
FAIL: test_user_site (packaging.tests.test_command_install_dist.InstallTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/packaging/tests/test_command_install_dist.py", line 95, in test_user_site
    self._test_user_site()
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/packaging/tests/test_command_install_dist.py", line 124, in _test_user_site
    self.assertTrue(os.path.exists(self.user_base))
AssertionError: False is not true

======================================================================
FAIL: test_get_outputs (packaging.tests.test_command_install_lib.InstallLibTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/packaging/tests/test_command_install_lib.py", line 71, in test_get_outputs
    self.assertEqual(len(cmd.get_outputs()), 4)
AssertionError: 2 != 4

Bill
Re: [Python-Dev] CPython optimization: storing reference counters outside of objects
> I'm not a compiler/profiling expert so the main question is if such
> design can work, and maybe someone was thinking about something
> similar?

My expectation is that your approach would likely make the issues worse in a multi-CPU setting. If you put multiple reference counters into a contiguous block of memory, unrelated reference counters will live in the same cache line. Consequently, changing one reference counter on one CPU will invalidate the cached reference counters of that cache line on other CPUs, making your problem a) actually worse.

Regards,
Martin
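A back-of-the-envelope sketch of the concern above, not taken from the thread: assuming 64-byte cache lines and counters the width of a C ssize_t packed back to back in a hypothetical external table that starts on a line boundary, it works out how many unrelated counters would end up sharing one cache line.

```python
import ctypes

CACHE_LINE = 64  # bytes; the x86 figure quoted earlier in the thread
COUNTER_SIZE = ctypes.sizeof(ctypes.c_ssize_t)  # width of one reference counter


def cache_line_of(index):
    """Cache line number of the index-th counter in a packed table.

    Hypothetical layout: counters stored contiguously from a line boundary.
    """
    return (index * COUNTER_SIZE) // CACHE_LINE


counters_per_line = CACHE_LINE // COUNTER_SIZE
print("unrelated counters sharing one cache line:", counters_per_line)
print("counter 0 and counter", counters_per_line - 1, "share line",
      cache_line_of(0))
```

Under these assumptions a write to any one counter would invalidate the line that also holds its neighbours, which is the effect described above.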
Re: [Python-Dev] CPython optimization: storing reference counters outside of objects
2011/5/23 "Martin v. Löwis"
> > I'm not a compiler/profiling expert so the main question is if such
> > design can work, and maybe someone was thinking about something
> > similar?
>
> My expectation is that your approach would likely make the issues
> worse in a multi-CPU setting. If you put multiple reference counters
> into a contiguous block of memory, unrelated reference counters will
> live in the same cache line. Consequentially, changing one reference
> counter on one CPU will invalidate the cached reference counters of
> that cache line on other CPU, making your problem a) actually worse.
>
> Regards,
> Martin

I don't think that moving ob_refcnt to a proper memory pool will solve the problem of cache pollution anyway. ob_refcnt is obviously the most stressed field in PyObject, but it's not the only one. We have ob_type, which is needed to model each object's (instance's) "behavior" and is heavily accessed too, so a cache line will be loaded anyway when the object is used.

Also, only a few simple objects have just ob_refcnt and ob_type. Most of them have other fields too, and accessing them means a cache line load.

Regards,
Cesare

P.S. Memory allocation granularity can help sometimes, leaving some data (ob_refcnt and/or ob_type) on one cache line, and the rest on the next one.
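A small sketch of the layout argument, not taken from the thread: it mirrors only the two header fields named above with ctypes, assuming a regular (non-debug, GIL) CPython build where ob_refcnt is followed directly by ob_type, and a 64-byte cache line. The class name PyObjectHeader is hypothetical.

```python
import ctypes

CACHE_LINE = 64  # bytes; assumption carried over from the thread


class PyObjectHeader(ctypes.Structure):
    """Mirror of the two header fields named in the thread (assumed layout)."""
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),  # reference counter
        ("ob_type", ctypes.c_void_p),     # pointer to the type object
    ]


refcnt_offset = PyObjectHeader.ob_refcnt.offset
type_offset = PyObjectHeader.ob_type.offset
# For a header that starts on a cache-line boundary, both fields fall in the same line.
same_line = refcnt_offset // CACHE_LINE == type_offset // CACHE_LINE
print("ob_refcnt offset:", refcnt_offset)
print("ob_type offset:", type_offset)
print("both on the first cache line:", same_line)
```

If the header is laid out this way, reading ob_type pulls in the line that holds ob_refcnt regardless of where the counter's value is kept, which is the point being made about cache pollution.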
Re: [Python-Dev] looking for a contact at Google on the Blogger team
On Sat, May 21, 2011 at 7:47 AM, "Martin v. Löwis" wrote:
>> As Jesse has said, there is an RFP in development to improve
>> python.org to the point where we can self-host blogs and the like and
>> deal with the associated user account administration appropriately.
>
> To run a blog on www.python.org, a PEP is not needed. If anybody would
> volunteer to set this up, it could be done in no time.

If I understand correctly, the RFP is more about improving the entire python.org toolchain to make it something that non-programmers can easily provide content for (and even *programmers* don't particularly like the current toolchain).

Cheers,
Nick.

--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] Hello!
On Sat, May 21, 2011 at 9:59 PM, Antoine Pitrou wrote:
> On Fri, 20 May 2011 19:01:26 +0200
> Charles-François Natali wrote:
>
>> Hi,
>>
>> My name is Charles-François Natali, I've been using Python for a
>> couple of years, and I've recently been granted commit privilege.
>> I just wanted to say hi to everyone on this list, and let you know
>> that I'm really happy and proud of joining this great community.
>
> Welcome, and keep up the good work.

Indeed!

Cheers,
Nick.

--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] The socket HOWTO
On Sun, May 22, 2011 at 3:38 AM, Georg Brandl wrote:
> On 05/21/11 18:01, Senthil Kumaran wrote:
>> So a rewrite with good pointers would be more appropriate.
>
> Even then, it's better off in the Wiki until the rewrite is complete.

Perhaps replacing it with a placeholder page that refers to the Wiki would be appropriate? A simple summary saying that the HOWTO had not aged well, and hence had been removed from the official documentation until it had been updated on the Wiki would allow people looking for it to better understand the situation, and also how to help improve it.

Cheers,
Nick.

--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] Stable buildbots update
On Mon, May 23, 2011 at 3:00 AM, Bill Janssen wrote:
> Tarek Ziadé wrote:
>
>> Yes, I am aware of this. I have fixed today most remaining issues, and
>> fixing the final ones right now.
>
> Just FYI: the "AMD64 Snow Leopard" buildbot and "PPC Leopard" buildbots
> are now green, but the "PPC Tiger" buildbot is still failing for all
> branches because of packaging errors:

All the Linux and Windows stable slaves are now green, and I have a few issues left to fix for all Solaris flavors and for the two failures you are showing, which also occur under FreeBSD.

Thanks
Tarek

--
Tarek Ziadé | http://ziade.org
