On 2013-10-16 22:39:07 +0200, Tom Lane wrote:
> Andres Freund <and...@2ndquadrant.com> writes:
> > On 2013-10-16 14:30:34 -0400, Robert Haas wrote:
> >>> But _and, _or are really useful because they can be used to atomically
> >>> clear and set flag bits.
> >> Until we have some concrete application that requires this
> >> functionality for adequate performance, I'd be inclined not to bother.
> >> I think we have bigger fish to fry, and (to extend my nautical
> >> metaphor) trying to get this working everywhere might amount to trying
> >> to boil the ocean.
> > Well, I actually have a very, very preliminary patch using them in the
> > course of removing the spinlocks in PinBuffer() (via LockBufHdr()).
> I think we need to be very, very wary of making our set of required
> atomics any larger than it absolutely has to be.  The more stuff we
> require, the closer we're going to be to making PG a program that only
> runs well on Intel.

Well, essentially I am advocating to support basically three operations:
* compare and swap
* atomic add (and by that sub)
* test and set

The other operations are just porcelain around that.

With compare and swap both the others can be easily implemented if

Note that e.g. linux - running on all platforms we're talking about but
VAX - exensively and unconditionally uses atomic operations widely. So I
think we don't have to be too afraid about non-intel performance here.

>  I am not comforted by the "gcc will provide good
> implementations of the atomics everywhere" argument, because in fact
> it won't.  See for example the comment associated with our only existing
> use of gcc atomics:
>  * On ARM, we use __sync_lock_test_and_set(int *, int) if available, and if
>  * not fall back on the SWPB instruction.  SWPB does not work on ARMv6 or
>  * later, so the compiler builtin is preferred if available.  Note also that
>  * the int-width variant of the builtin works on more chips than other widths.
>    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> That's not a track record that should inspire much faith that complete
> sets of atomics will exist elsewhere.  What's more, we don't just need
> atomics that *work*, we need atomics that are *fast*, else the whole
> exercise turns into pessimization not improvement.  The more atomic ops
> we depend on, the more likely it is that some of them will involve kernel
> support on some chips, destroying whatever performance improvement we
> hoped to get.

Hm. I can't really follow. We *prefer* to use __sync_lock_test_and_set
in contrast to our own open-coded implementation, right? And the comment
about some hardware only supporting "int-width" matches that I only want to
require u32 atomics support, right?

I completely agree that we cannot rely on 8byte math or similar (16byte
cmpxchg) to be supported by all platforms. That indeed would require
kernel fallbacks on e.g. ARM.

> > ... The only thing I'd touch around platforms in that patch is
> > adding a generic fallback pg_atomic_test_and_set() to s_lock.h and
> > remove the special case usages of __sync_lock_test_and_set() (Arm64,
> > some 32 bit arms).
> Um, if we had a "generic" pg_atomic_test_and_set(), s_lock.h would be
> about one line long.  The above quote seems to me to be exactly the kind
> of optimism that is not warranted in this area.

Well, I am somewhat hesitant to change s_lock for more platforms than
necessary. So I proposed to restructure it in a way that leaves all
existing implementations in place that do not already rely on

There's also SPIN_DELAY btw, which I don't see any widely provided
intrinsics for.


Andres Freund

 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to