On Wed, Dec 15, 2021 at 2:21 AM Antoine Pitrou <anto...@python.org> wrote:
>
> On Wed, 15 Dec 2021 10:42:17 +0100
> Christian Heimes <christ...@python.org> wrote:
> > On 14/12/2021 19.19, Eric Snow wrote:
> > > A while back I concluded that neither approach would work for us.  The
> > > approach I had taken would have significant cache performance
> > > penalties in a per-interpreter GIL world.  The approach that modifies
> > > Py_INCREF() has a significant performance penalty due to the extra
> > > branch on such a frequent operation.
> >
> > Would it be possible to write the Py_INCREF() and Py_DECREF() macros in
> > a way that does not depend on branching? For example we could use the
> > highest bit of the ref count as an immutable indicator and do something like
> >
> >      ob_refcnt += !(ob_refcnt >> 63)
> >
> > instead of
> >
> >      ob_refcnt++
>
> Probably, but that would also issue spurious writes to immortal
> refcounts from different threads at once, so might end up worse
> performance-wise.

Unless the CPU is clever enough to skip claiming the cacheline in
exclusive-mode for a "+= 0". Which I guess is something you'd have to
check empirically on every microarch and instruction pattern you care
about, because there's no way it's documented. But maybe? CPUs are
very smart, except when they aren't.
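
For concreteness, here is a minimal sketch of the two variants under
discussion, not CPython's actual macros. It assumes a 64-bit unsigned
refcount whose top bit marks an immortal object; IMMORTAL_BIT and the
function names are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag: top bit of a 64-bit refcount marks "immortal". */
#define IMMORTAL_BIT (UINT64_C(1) << 63)

/* Branching variant: immortal objects are never written to,
 * at the cost of a test on every incref. */
static inline void incref_branch(uint64_t *refcnt) {
    if (!(*refcnt & IMMORTAL_BIT)) {
        (*refcnt)++;
    }
}

/* Branchless variant from the quoted suggestion: adds 1 for a
 * mortal object and 0 for an immortal one. As written in the
 * source, the "+= 0" case is still a read-modify-write. */
static inline void incref_branchless(uint64_t *refcnt) {
    *refcnt += !(*refcnt >> 63);
}

/* Matching branchless decrement; a mortal count of 0 would be a bug. */
static inline void decref_branchless(uint64_t *refcnt) {
    assert(*refcnt != 0);
    *refcnt -= !(*refcnt >> 63);
}
```

Both increments leave an immortal refcount unchanged in value. The
open question is only whether the store in the branchless version
still pulls the cacheline into exclusive state, and that can't be
read off the C source; it would have to be measured.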

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZLARVPQCPZXWVHGYOZNSDRTCNNJ67ANM/
Code of Conduct: http://python.org/psf/codeofconduct/
