Re: [Python-Dev] CPython optimization: storing reference counters outside of objects

2011-05-22 Thread Antoine Pitrou

Hello,

On Sun, 22 May 2011 01:57:55 +0200
Artur Siekielski  wrote:
> 1. CPU cache lines (64 bytes on X86) containing a beginning of a
> PyObject are very often invalidated, resulting in losing many chances
> to use the CPU caches

Mutating data doesn't invalidate a cache line. It just makes it
necessary to write it back to memory at some point.

> 2. The copy-on-write after fork() optimization (Linux) is almost
> useless in CPython, because even if you don't modify data directly,
> refcounts are modified, and PyObjects with refcounts inside are spread
> all over process' memory (and one small refcount modification causes
> the whole page - 4kB - to be copied into a child process).

Indeed.

> I'm not a compiler/profiling expert so the main question is if such
> design can work, and maybe someone was thinking about something
> similar? And if CPython was profiled for CPU cache usage?

This has already been proposed a couple of times. I guess what's needed
is for someone to experiment and post benchmark results.

Regards

Antoine.


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] CPython optimization: storing reference counters outside of objects

2011-05-22 Thread Charles-François Natali
>> 1. CPU cache lines (64 bytes on X86) containing a beginning of a
>> PyObject are very often invalidated, resulting in losing many chances
>> to use the CPU caches
>
> Mutating data doesn't invalidate a cache line. It just makes it
> necessary to write it back to memory at some point.
>

I think he's referring to the multi-core case.
In MESI terminology, the cache line will become modified in the
current cache (that of the core running the current thread), but
invalid in other cores' caches.
But given that access to objects is serialized by the GIL (which will
issue a memory barrier anyway), I'm not sure that the performance
impact will be noticeable. Furthermore, given that threads are
actually serialized, I suspect that the scheduler tends to bind them
naturally to the same CPU.
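
To make the MESI point concrete, here is a toy model of what a store on
one core does to the copies of the same line held by other cores. It is
only a sketch, not a real cache simulator: the function name and the
list-of-letters representation are illustrative assumptions, and the
bus traffic that real hardware performs is ignored.

```python
def write_line(states, writer):
    """Return the per-core MESI states after `writer` stores to the line.

    `states` holds one of 'M', 'E', 'S', 'I' per core. Simplified model:
    the writing core ends up Modified, every other copy becomes Invalid.
    """
    return ['M' if core == writer else 'I' for core in range(len(states))]

# Two cores share the line holding a refcount; core 0 increments it.
after = write_line(['S', 'S'], writer=0)
print(after)  # ['M', 'I']
```

If the GIL keeps Python threads on one core most of the time, the
invalidated copy on the other core is rarely needed again, which is the
reason for doubting a noticeable impact.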

>> 2. The copy-on-write after fork() optimization (Linux) is almost
>> useless in CPython, because even if you don't modify data directly,
>> refcounts are modified, and PyObjects with refcounts inside are spread
>> all over process' memory (and one small refcount modification causes
>> the whole page - 4kB - to be copied into a child process).
>
> Indeed.
>

There was a bug report a couple of months ago from someone using large
datasets for a scientific application. He suggested adding
support for Linux's MADV_MERGEABLE, but the root cause is really that
the reference count is incremented even when objects are treated as
read-only.
For the record, it's http://bugs.python.org/issue9942 (and this idea
was brought up here).
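
A minimal sketch of that root cause, assuming CPython (the variable
names are illustrative): binding a second name to an object never
touches its contents, yet the counter stored in the object's header is
written, which is enough to dirty the page after fork().

```python
import sys

# Built at runtime so the object is an ordinary, non-shared list.
data = list(range(1000))

before = sys.getrefcount(data)
alias = data                    # "read-only" use: contents are untouched
after = sys.getrefcount(data)

# The refcount in the object's header changed all the same.
print(after - before)
```

On CPython the difference printed is 1; the absolute values returned by
sys.getrefcount() include a temporary reference held by the call itself.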

cf


Re: [Python-Dev] Stable buildbots update

2011-05-22 Thread Bill Janssen
Tarek Ziadé  wrote:

> Yes, I am aware of this. I fixed most of the remaining issues today,
> and am fixing the final ones right now.

Just FYI:  the "AMD64 Snow Leopard" buildbot and "PPC Leopard" buildbots
are now green, but the "PPC Tiger" buildbot is still failing for all
branches because of packaging errors:

======================================================================
FAIL: test_user_site (packaging.tests.test_command_install_dist.InstallTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/packaging/tests/test_command_install_dist.py", line 95, in test_user_site
    self._test_user_site()
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/packaging/tests/test_command_install_dist.py", line 124, in _test_user_site
    self.assertTrue(os.path.exists(self.user_base))
AssertionError: False is not true

======================================================================
FAIL: test_get_outputs (packaging.tests.test_command_install_lib.InstallLibTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/packaging/tests/test_command_install_lib.py", line 71, in test_get_outputs
    self.assertEqual(len(cmd.get_outputs()), 4)
AssertionError: 2 != 4

Bill


Re: [Python-Dev] CPython optimization: storing reference counters outside of objects

2011-05-22 Thread Martin v. Löwis
> I'm not a compiler/profiling expert so the main question is if such
> design can work, and maybe someone was thinking about something
> similar?

My expectation is that your approach would likely make the issues
worse in a multi-CPU setting. If you put multiple reference counters
into a contiguous block of memory, unrelated reference counters will
live in the same cache line. Consequently, changing one reference
counter on one CPU will invalidate the cached reference counters of
that cache line on other CPUs, making your problem a) actually worse.
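
The arithmetic behind this can be sketched as follows. The 64-byte line
size is the x86 figure quoted earlier in the thread; the 8-byte counter
size and an array that starts on a cache-line boundary are assumptions
made for the illustration.

```python
CACHE_LINE = 64    # bytes, the x86 figure quoted in the thread
COUNTER_SIZE = 8   # bytes, assuming a 64-bit Py_ssize_t

def cache_line_of(index):
    """Cache line number holding counter `index` in a packed, aligned array."""
    return (index * COUNTER_SIZE) // CACHE_LINE

counters_per_line = CACHE_LINE // COUNTER_SIZE
print(counters_per_line)                      # 8
print(cache_line_of(0) == cache_line_of(7))   # True: same line
print(cache_line_of(7) == cache_line_of(8))   # False: next line
```

Under these assumptions, a write to any one of eight unrelated counters
invalidates the cached copies of the other seven on every other CPU.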

Regards,
Martin


Re: [Python-Dev] CPython optimization: storing reference counters outside of objects

2011-05-22 Thread Cesare Di Mauro
2011/5/23 "Martin v. Löwis" 

> > I'm not a compiler/profiling expert so the main question is if such
> > design can work, and maybe someone was thinking about something
> > similar?
>
> My expectation is that your approach would likely make the issues
> worse in a multi-CPU setting. If you put multiple reference counters
> into a contiguous block of memory, unrelated reference counters will
> live in the same cache line. Consequently, changing one reference
> counter on one CPU will invalidate the cached reference counters of
> that cache line on other CPUs, making your problem a) actually worse.
>
> Regards,
> Martin
>

I don't think that moving ob_refcnt to a separate memory pool will solve
the problem of cache pollution anyway.

ob_refcnt is obviously the most stressed field in PyObject, but it's not the
only one. We have ob_type, which is needed to model each object's (instance's)
"behavior" and is heavily accessed too, so a cache line will be loaded
anyway when the object is used.

Also, only a few simple objects have just ob_refcnt and ob_type. Most of
them have other fields too, and accessing them means a cache line load.
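
As a rough illustration of the layout being discussed, the two header
fields can be mirrored with ctypes. The structure below is an assumption
modelled on the classic PyObject header, not a definition taken from
CPython's own headers, and some builds lay the header out differently.

```python
import ctypes

class PyObjectHead(ctypes.Structure):
    # Illustrative mirror of the classic object header (an assumption).
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),  # reference counter
        ("ob_type", ctypes.c_void_p),     # pointer to the type object
    ]

# Byte offsets of the two fields, then the total header size.
print(PyObjectHead.ob_refcnt.offset, PyObjectHead.ob_type.offset)
print(ctypes.sizeof(PyObjectHead))
```

On a typical 64-bit platform the header occupies 16 bytes, so both
fields share one 64-byte cache line whenever the object does not
straddle a line boundary; relocating ob_refcnt alone would not avoid
loading that line to reach ob_type.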

Regards,
Cesare

P.S. Memory allocation granularity can help sometimes, leaving some data
(ob_refcnt and/or ob_type) on one cache line, and the other on the next one.


Re: [Python-Dev] looking for a contact at Google on the Blogger team

2011-05-22 Thread Nick Coghlan
On Sat, May 21, 2011 at 7:47 AM, "Martin v. Löwis"  wrote:
>> As Jesse has said, there is an RFP in development to improve
>> python.org to the point where we can self-host blogs and the like and
>> deal with the associated user account administration appropriately.
>
> To run a blog on www.python.org, a PEP is not needed. If anybody would
> volunteer to set this up, it could be done in no time.

If I understand correctly, the RFP is more about improving the entire
python.org toolchain to make it something that non-programmers can
easily provide content for (and even *programmers* don't particularly
like the current toolchain).

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] Hello!

2011-05-22 Thread Nick Coghlan
On Sat, May 21, 2011 at 9:59 PM, Antoine Pitrou  wrote:
> On Fri, 20 May 2011 19:01:26 +0200
> Charles-François Natali  wrote:
>
>> Hi,
>>
>> My name is Charles-François Natali, I've been using Python for a
>> couple of years, and I've recently been granted commit privileges.
>> I just wanted to say hi to everyone on this list, and let you know
>> that I'm really happy and proud of joining this great community.
>
> Welcome, and keep up the good work.

Indeed!

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] The socket HOWTO

2011-05-22 Thread Nick Coghlan
On Sun, May 22, 2011 at 3:38 AM, Georg Brandl  wrote:
> On 05/21/11 18:01, Senthil Kumaran wrote:
>> So a rewrite with good pointers would be more appropriate.
>
> Even then, it's better off in the Wiki until the rewrite is complete.

Perhaps replacing it with a placeholder page that refers to the Wiki
would be appropriate? A simple summary saying that the HOWTO had not
aged well, and hence had been removed from the official documentation
until it had been updated on the Wiki would allow people looking for
it to better understand the situation, and also how to help improve
it.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] Stable buildbots update

2011-05-22 Thread Tarek Ziadé
On Mon, May 23, 2011 at 3:00 AM, Bill Janssen  wrote:
> Tarek Ziadé  wrote:
>
>> Yes, I am aware of this. I fixed most of the remaining issues today,
>> and am fixing the final ones right now.
>
> Just FYI:  the "AMD64 Snow Leopard" buildbot and "PPC Leopard" buildbots
> are now green, but the "PPC Tiger" buildbot is still failing for all
> branches because of packaging errors:

All the Linux and Windows stable slaves are now green, and I have a
few issues left to fix for all Solaris flavors, plus the two you
are showing, which are also failing under FreeBSD.

Thanks
Tarek

-- 
Tarek Ziadé | http://ziade.org