On 2013-10-16 22:39:07 +0200, Tom Lane wrote: > Andres Freund <and...@2ndquadrant.com> writes: > > On 2013-10-16 14:30:34 -0400, Robert Haas wrote: > >>> But _and, _or are really useful because they can be used to atomically > >>> clear and set flag bits. > > >> Until we have some concrete application that requires this > >> functionality for adequate performance, I'd be inclined not to bother. > >> I think we have bigger fish to fry, and (to extend my nautical > >> metaphor) trying to get this working everywhere might amount to trying > >> to boil the ocean. > > > Well, I actually have a very, very preliminary patch using them in the > > course of removing the spinlocks in PinBuffer() (via LockBufHdr()). > > I think we need to be very, very wary of making our set of required > atomics any larger than it absolutely has to be. The more stuff we > require, the closer we're going to be to making PG a program that only > runs well on Intel.
Well, essentially I am advocating to support basically three operations: * compare and swap * atomic add (and by that sub) * test and set The other operations are just porcelain around that. With compare and swap both the others can be easily implemented if neccessary. Note that e.g. linux - running on all platforms we're talking about but VAX - exensively and unconditionally uses atomic operations widely. So I think we don't have to be too afraid about non-intel performance here. > I am not comforted by the "gcc will provide good > implementations of the atomics everywhere" argument, because in fact > it won't. See for example the comment associated with our only existing > use of gcc atomics: > > * On ARM, we use __sync_lock_test_and_set(int *, int) if available, and if > * not fall back on the SWPB instruction. SWPB does not work on ARMv6 or > * later, so the compiler builtin is preferred if available. Note also that > * the int-width variant of the builtin works on more chips than other widths. > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > That's not a track record that should inspire much faith that complete > sets of atomics will exist elsewhere. What's more, we don't just need > atomics that *work*, we need atomics that are *fast*, else the whole > exercise turns into pessimization not improvement. The more atomic ops > we depend on, the more likely it is that some of them will involve kernel > support on some chips, destroying whatever performance improvement we > hoped to get. Hm. I can't really follow. We *prefer* to use __sync_lock_test_and_set in contrast to our own open-coded implementation, right? And the comment about some hardware only supporting "int-width" matches that I only want to require u32 atomics support, right? I completely agree that we cannot rely on 8byte math or similar (16byte cmpxchg) to be supported by all platforms. That indeed would require kernel fallbacks on e.g. ARM. > > ... The only thing I'd touch around platforms in that patch is > > adding a generic fallback pg_atomic_test_and_set() to s_lock.h and > > remove the special case usages of __sync_lock_test_and_set() (Arm64, > > some 32 bit arms). > > Um, if we had a "generic" pg_atomic_test_and_set(), s_lock.h would be > about one line long. The above quote seems to me to be exactly the kind > of optimism that is not warranted in this area. Well, I am somewhat hesitant to change s_lock for more platforms than necessary. So I proposed to restructure it in a way that leaves all existing implementations in place that do not already rely on __sync_lock_test_and_set(). There's also SPIN_DELAY btw, which I don't see any widely provided intrinsics for. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers