On Wed, Feb 7, 2018 at 1:41 PM, Andres Freund <and...@anarazel.de> wrote:
> Well, it's more than just systems like that - for 64bit atomics we
> sometimes do fall back to spinlock based atomics on 32bit systems, even
> if they support 32 bit atomics.
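Right -- and the counters in play here are exactly that case: with the
per-queue spinlock gone, mq_bytes_read and mq_bytes_written are 64-bit
atomics, so on a 32-bit build without native 8-byte atomics every access
to them can go through the emulated, spinlock-backed path.  Just to
illustrate what that exercises -- this is a sketch only, with a made-up
helper name, not the actual patch text:

#include "postgres.h"
#include "port/atomics.h"

/*
 * Sketch of how the space accounting reads the 64-bit counters once the
 * spinlock is gone.  On 32-bit builds lacking native 8-byte atomics,
 * each pg_atomic_read_u64() here may be backed by a spinlock-based
 * emulation rather than a plain load.
 */
static uint64
sketch_used_ring_bytes(pg_atomic_uint64 *mq_bytes_read,
					   pg_atomic_uint64 *mq_bytes_written)
{
	uint64		rb = pg_atomic_read_u64(mq_bytes_read);
	uint64		wb = pg_atomic_read_u64(mq_bytes_written);

	/* bytes written never lags bytes read, so this can't underflow */
	return wb - rb;
}

So the question is how much that emulation actually costs on such a
build.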
I built with -m32 on my laptop and tried "select aid, count(*) from
pgbench_accounts group by 1 having count(*) > 1" on pgbench at scale
factor 100 with pgbench_accounts_pkey dropped and
max_parallel_workers_per_gather set to 10, on (a) commit
0b5e33f667a2042d7022da8bef31a8be5937aad1 (I know this is a little old,
but I think it doesn't matter), (b) same plus shm-mq-less-spinlocks-v3,
and (c) same plus shm-mq-less-spinlocks-v3 and
shm-mq-reduce-receiver-latch-set-v2.

(a) 16563.790 ms, 16625.257 ms, 16496.062 ms
(b) 17217.051 ms, 17157.745 ms, 17225.755 ms
    [median to median, +3.9% vs. (a)]
(c) 15491.947 ms, 15455.840 ms, 15452.649 ms
    [median to median, -7.0% vs. (a), -10.2% vs. (b)]

Do you think that's a problem?  If it is, what do you think we should do
about it?  It seems to me that it's probably OK because (1) with both
patches we still come out ahead and (2) 32-bit systems will presumably
continue to become rarer as time goes on, but you might disagree.

>> > Hm, do all paths here guarantee that mq->mq_detached won't be stored on
>> > the stack / register in the first iteration, and then not reread any
>> > further? I think it's fine because every branch of the if below ends up
>> > in a syscall / barrier, but it might be worth noting that here.
>>
>> Aargh. I hate compilers. I added a comment. Should I think about
>> changing mq_detached to pg_atomic_uint32 instead?
>
> I think a pg_compiler_barrier() would suffice to alleviate my concern,
> right? If you wanted to go for an atomic, using pg_atomic_flag would
> probably be more appropriate than pg_atomic_uint32.

Hmm, all right, I'll add pg_compiler_barrier().

>> >> -		/* Write as much data as we can via a single memcpy(). */
>> >> +		/*
>> >> +		 * Write as much data as we can via a single memcpy(). Make sure
>> >> +		 * these writes happen after the read of mq_bytes_read, above.
>> >> +		 * This barrier pairs with the one in shm_mq_inc_bytes_read.
>> >> +		 */
>> >
>> > s/above/above. Otherwise a newer mq_bytes_read could become visible
>> > before the corresponding reads have fully finished./?
>>
>> I don't find that very clear.  A newer mq_bytes_read could become
>> visible whenever, and a barrier doesn't prevent that from happening.
>
> Well, my point was that the barrier prevents the write to mq_bytes_read
> becoming visible before the corresponding reads have finished.  Which
> then would mean the memcpy() could overwrite them.  And a barrier *does*
> prevent that from happening.

I think we're talking about the same thing, but not finding each other's
explanations very clear.

>> Hmm, I'm not sure I understand what you're suggesting, here.  In
>> general, we return with the data for the current message unconsumed,
>> and then consume it the next time the function is called, so that
>> (except when the message wraps the end of the buffer) we can return a
>> pointer directly into the buffer rather than having to memcpy().  What
>> this patch does is postpone consuming the data further, either until
>> we can free up at least a quarter of the ring buffer or until we need
>> to wait for more data.  It seemed worthwhile to free up space in the
>> ring buffer occasionally even if we weren't to the point of waiting
>> yet, so that the sender has an opportunity to write new data into that
>> space if it happens to still be running.
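To make sure we're talking about the same mechanism, the receiver-side
deferral I'm describing amounts to roughly the following -- a paraphrase
of the logic inside shm_mq.c, not the literal patch, and the helper
names are invented for illustration:

/*
 * Instead of calling shm_mq_inc_bytes_read() for every chunk we
 * consume, the handle just accumulates the count in
 * mqh_consume_pending.
 */
static void
sketch_note_bytes_consumed(shm_mq_handle *mqh, Size nbytes)
{
	mqh->mqh_consume_pending += nbytes;
}

/*
 * Settle up -- i.e. advance mq_bytes_read and poke the sender's latch --
 * only once the pending count reaches a quarter of the ring, or when
 * we're about to sleep waiting for more data anyway.
 */
static void
sketch_settle_up_if_needed(shm_mq_handle *mqh, bool about_to_wait)
{
	shm_mq	   *mq = mqh->mqh_queue;

	if (mqh->mqh_consume_pending > mq->mq_ring_size / 4 ||
		(about_to_wait && mqh->mqh_consume_pending > 0))
	{
		shm_mq_inc_bytes_read(mq, mqh->mqh_consume_pending);
		mqh->mqh_consume_pending = 0;
	}
}

The quarter-of-the-ring threshold is only there so that the sender gets
some space back periodically without us doing the SetLatch() dance on
every read.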
> What I'm trying to suggest is that instead of postponing an update of
> mq_bytes_read (by storing the amount of already-performed reads in
> mqh_consume_pending), we continue to update mq_bytes_read but only set
> the latch if your above thresholds are crossed. That way a burst of
> writes can fully fill the ringbuffer, but the cost of doing a SetLatch()
> is amortized. In my testing SetLatch() was the expensive part, not the
> necessary barriers in shm_mq_inc_bytes_read().

OK, I'll try to check how feasible that would be.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company