Re: LogwrtResult contended spinlock

2024-04-08 Thread Jeff Davis
On Mon, 2024-04-08 at 10:24 +0200, Alvaro Herrera wrote: > My trouble with the "copy" term is that we don't use that term > anywhere > in relation to WAL. I got the term from CopyXLogRecordToWAL(). > This "copy" is in > reality just the insertion, after it's finished.  The "Result" suffix > is

Re: LogwrtResult contended spinlock

2024-04-08 Thread Alvaro Herrera
On 2024-Apr-07, Jeff Davis wrote: > On Sun, 2024-04-07 at 14:19 +0200, Alvaro Herrera wrote: > > I pushed the "copy" pointer now, except that I renamed it to > > "insert", > > which is what we call the operation being tracked.  I also added some > > comments. > > The "copy" pointer, as I called

Re: LogwrtResult contended spinlock

2024-04-07 Thread Jeff Davis
On Sun, 2024-04-07 at 14:19 +0200, Alvaro Herrera wrote: > I pushed the "copy" pointer now, except that I renamed it to > "insert", > which is what we call the operation being tracked.  I also added some > comments. The "copy" pointer, as I called it, is not the same as the "insert" pointer (as

Re: LogwrtResult contended spinlock

2024-04-07 Thread Alvaro Herrera
I pushed the "copy" pointer now, except that I renamed it to "insert", which is what we call the operation being tracked. I also added some comments. One perhaps significant change from what Bharath submitted is that we now use the return value from monotonic advance to return an updated value

Re: LogwrtResult contended spinlock

2024-04-06 Thread Jeff Davis
On Sat, 2024-04-06 at 18:13 +0200, Alvaro Herrera wrote: > my understanding of pg_atomic_compare_exchange_u64 is that > *expected is set to the value that's stored in *ptr prior to the > exchange. Sorry, my mistake. Your version looks good. Regards, Jeff Davis

Re: LogwrtResult contended spinlock

2024-04-06 Thread Alvaro Herrera
On 2024-Apr-05, Jeff Davis wrote: > Minor comments: > * Also, I assume that the Max() call in > pg_atomic_monotonic_advance_u64() is just for clarity? Surely the > currval cannot be less than _target when it returns. I'd probably just > do Assert(currval >= _target) and then return currval. Uhh

Re: LogwrtResult contended spinlock

2024-04-06 Thread Bharath Rupireddy
On Sat, Apr 6, 2024 at 9:21 AM Thomas Munro wrote: > > On Sat, Apr 6, 2024 at 6:55 AM Alvaro Herrera wrote: > > Pushed 0001. > > Could that be related to the 3 failures on parula that look like this? > > TRAP: failed Assert("node->next == 0 && node->prev == 0"), File: >

Re: LogwrtResult contended spinlock

2024-04-05 Thread Thomas Munro
On Sat, Apr 6, 2024 at 6:55 AM Alvaro Herrera wrote: > Pushed 0001. Could that be related to the 3 failures on parula that look like this? TRAP: failed Assert("node->next == 0 && node->prev == 0"), File: "../../../../src/include/storage/proclist.h", Line: 63, PID: 29119 2024-04-05 16:16:26.812

Re: LogwrtResult contended spinlock

2024-04-05 Thread Jeff Davis
On Fri, 2024-04-05 at 13:54 +0200, Alvaro Herrera wrote: > Couldn't push: I tested with --disable-atomics --disable-spinlocks > and > the tests fail because the semaphore for the atomic variables is not > always initialized.  This is weird -- it's like a client process is > running at a time when

Re: LogwrtResult contended spinlock

2024-04-05 Thread Alvaro Herrera
Pushed 0001. Here's the patch that adds the Copy position one more time, with the monotonic_advance function returning the value. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ >From 3f5c860576245b92701e7bfc517947c418c68510 Mon Sep 17 00:00:00 2001 From: Alvaro

Re: LogwrtResult contended spinlock

2024-04-05 Thread Alvaro Herrera
Couldn't push: I tested with --disable-atomics --disable-spinlocks and the tests fail because the semaphore for the atomic variables is not always initialized. This is weird -- it's like a client process is running at a time when StartupXLOG has not initialized the variable ... so the

Re: LogwrtResult contended spinlock

2024-04-05 Thread Alvaro Herrera
On 2024-Apr-05, Bharath Rupireddy wrote: > 1. > /* > * Update local copy of shared XLogCtl->log{Write,Flush}Result > + * > + * It's critical that Flush always trails Write, so the order of the reads is > + * important, as is the barrier. > */ > #define RefreshXLogWriteResult(_target) \ >

Re: LogwrtResult contended spinlock

2024-04-04 Thread Bharath Rupireddy
On Fri, Apr 5, 2024 at 4:21 AM Jeff Davis wrote: > > > 4. If we're not modifying any callers of WALReadFromBuffers(), then > > AFAICS the added check that the request is not past the Copy pointer > > is > > useless. In a quick look at that code, I think we only try to read > > data > > that's

Re: LogwrtResult contended spinlock

2024-04-04 Thread Jeff Davis
On Thu, 2024-04-04 at 19:45 +0200, Alvaro Herrera wrote: > 1. Using pg_atomic_write_membarrier_u64 is useless and it imposes > mora > barriers than we actually need.  So I switched back to > pg_atomic_write_u64 and add one barrier between the two writes.  Same > for reads. +1. This looks correct

Re: LogwrtResult contended spinlock

2024-04-04 Thread Alvaro Herrera
I've noticed a few things here, v16 attached with some rather largish changes. 1. Using pg_atomic_write_membarrier_u64 is useless and it imposes mora barriers than we actually need. So I switched back to pg_atomic_write_u64 and add one barrier between the two writes. Same for reads. 2. Using

Re: LogwrtResult contended spinlock

2024-04-03 Thread Jeff Davis
On Wed, 2024-04-03 at 13:19 +0200, Alvaro Herrera wrote: > So what I do in the attached 0001 is stop using the XLogwrtResult > struct > in XLogCtl and replace it with separate Write and Flush values, and > add > the macro XLogUpdateLocalLogwrtResult() that copies the values of > Write > and Flush

Re: LogwrtResult contended spinlock

2024-04-03 Thread Alvaro Herrera
On 2024-Apr-03, Bharath Rupireddy wrote: > On Wed, Apr 3, 2024 at 4:49 PM Alvaro Herrera wrote: > > So what I do in the attached 0001 is stop using the XLogwrtResult struct > > in XLogCtl and replace it with separate Write and Flush values, and add > > the macro XLogUpdateLocalLogwrtResult()

Re: LogwrtResult contended spinlock

2024-04-03 Thread Bharath Rupireddy
On Wed, Apr 3, 2024 at 4:49 PM Alvaro Herrera wrote: > > Thanks for keeping this moving forward. I gave your proposed patches a > look. One thing I didn't like much is that we're adding a new member > (Copy) to XLogwrtAtomic -- but this struct is supposed to be a mirror of > XLogwrtResult for

Re: LogwrtResult contended spinlock

2024-04-03 Thread Alvaro Herrera
On 2024-Apr-03, Alvaro Herrera wrote: > So what I do in the attached 0001 is stop using the XLogwrtResult struct > in XLogCtl and replace it with separate Write and Flush values, and add > the macro XLogUpdateLocalLogwrtResult() that copies the values of Write > and Flush from the shared XLogCtl

Re: LogwrtResult contended spinlock

2024-04-03 Thread Alvaro Herrera
Thanks for keeping this moving forward. I gave your proposed patches a look. One thing I didn't like much is that we're adding a new member (Copy) to XLogwrtAtomic -- but this struct is supposed to be a mirror of XLogwrtResult for use with atomic access. Since this new member is not added to

Re: LogwrtResult contended spinlock

2024-03-14 Thread Bharath Rupireddy
On Mon, Mar 4, 2024 at 9:15 PM Bharath Rupireddy wrote: > > > 0003: > > > > * We need to maintain the invariant that Copy >= Write >= Flush. I > > believe that's always satisfied, because the > > XLogWaitInsertionsToFinish() is always called before XLogWrite(). But > > we should add an assert or

Re: LogwrtResult contended spinlock

2024-03-04 Thread Bharath Rupireddy
Thanks for looking into this. On Thu, Feb 22, 2024 at 1:54 AM Jeff Davis wrote: > > > 3. > > @@ -6371,7 +6373,9 @@ GetFlushRecPtr(TimeLineID *insertTLI) > > If at all, a read > > barrier is warranted here, we can use atomic read with full barrier > > I don't think we need a full barrier but I'm

Re: LogwrtResult contended spinlock

2024-02-22 Thread Robert Haas
On Fri, Feb 23, 2024 at 1:18 AM Jeff Davis wrote: > I don't see the global non-shared variable as a huge problem, so if it > serves a purpose then I'm fine keeping it. Perhaps we could make it a > bit safer by using some wrapper functions. I actually really hate these kinds of variables. I think

Re: LogwrtResult contended spinlock

2024-02-22 Thread Jeff Davis
On Thu, 2024-02-22 at 10:17 +0530, Robert Haas wrote: > I think the problems tend to be worst when you have some bit of data > that's being frequently modified by multiple backends. Every backend > that wants to modify the value needs to steal the cache line, and > eventually you spend most of

Re: LogwrtResult contended spinlock

2024-02-21 Thread Robert Haas
On Thu, Feb 22, 2024 at 1:54 AM Jeff Davis wrote: > On Wed, 2024-02-21 at 20:00 +0530, Bharath Rupireddy wrote: > > The local copy of LogwrtResult is so frequently used in the backends, > > if we were to replace it with atomic accesses, won't the atomic reads > > be costly and start showing up in

Re: LogwrtResult contended spinlock

2024-02-21 Thread Jeff Davis
On Wed, 2024-02-21 at 20:00 +0530, Bharath Rupireddy wrote: > The local copy of LogwrtResult is so frequently used in the backends, > if we were to replace it with atomic accesses, won't the atomic reads > be costly and start showing up in perf profiles? I don't see exactly where the extra cost

Re: LogwrtResult contended spinlock

2024-02-21 Thread Bharath Rupireddy
On Sat, Feb 17, 2024 at 2:24 AM Jeff Davis wrote: > > Though it looks like we can remove the non-shared LogwrtResult > entirely. Andres expressed some concern here: > > https://www.postgresql.org/message-id/20210130020211.rtu5ir3dpjrbi...@alap3.anarazel.de > > But then seemed to favor removing it

Re: LogwrtResult contended spinlock

2024-02-16 Thread Jeff Davis
On Mon, 2024-02-12 at 17:44 -0800, Jeff Davis wrote: > It looks like there's some renewed interest in this patch: After rebasing (attached as 0001), I'm seeing some test failures. It looks like the local LogwrtResult is not being updated in as many places, and that's hitting the Assert that I

Re: LogwrtResult contended spinlock

2024-02-12 Thread Jeff Davis
On Fri, 2022-09-23 at 10:49 +0200, Alvaro Herrera wrote: > On 2022-Jul-28, Alvaro Herrera wrote: > > > v10 is just a trivial rebase.  No changes.  Moved to next > > commitfest. > > I realized that because of commit e369f3708636 this change is no > longer > as critical as it used to be, so I'm

Re: LogwrtResult contended spinlock

2022-09-23 Thread Alvaro Herrera
On 2022-Jul-28, Alvaro Herrera wrote: > v10 is just a trivial rebase. No changes. Moved to next commitfest. I realized that because of commit e369f3708636 this change is no longer as critical as it used to be, so I'm withdrawing this patch from the commitfest. -- Álvaro Herrera

Re: LogwrtResult contended spinlock

2022-07-28 Thread Alvaro Herrera
v10 is just a trivial rebase. No changes. Moved to next commitfest. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ >From 86655a0231e277c3f1bc907a0f0eb669943d4c71 Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Tue, 2 Feb 2021 14:03:43 -0300 Subject:

Re: LogwrtResult contended spinlock

2022-04-07 Thread Alvaro Herrera
On 2022-Apr-05, Alvaro Herrera wrote: > Apologies -- I selected the wrong commit to extract the commit message > from. Here it is again. I also removed an obsolete /* XXX */ comment. I spent a lot of time staring at this to understand the needs for memory barriers in the interactions. In the

Re: LogwrtResult contended spinlock

2022-04-05 Thread Alvaro Herrera
Apologies -- I selected the wrong commit to extract the commit message from. Here it is again. I also removed an obsolete /* XXX */ comment. -- Álvaro HerreraBreisgau, Deutschland — https://www.EnterpriseDB.com/ >From 0b26f90f1b95f8e9b932eb34bbf9c2a50729cf60 Mon Sep 17 00:00:00 2001

Re: LogwrtResult contended spinlock

2022-04-05 Thread Alvaro Herrera
Here's a v8, where per my previous comment I removed some code that I believe is no longer necessary. I've omitted the patch that renames LogwrtResult subvariables into LogWriteResult/LogWriteFlush; I still think the end result is better after that one, but it's a pretty trivial change that can

Re: LogwrtResult contended spinlock

2022-03-22 Thread Alvaro Herrera
So I've been wondering about this block at the bottom of XLogWrite: /* * Make sure that the shared 'request' values do not fall behind the * 'result' values. This is not absolutely essential, but it saves some * code in a couple of places. */ {

Re: LogwrtResult contended spinlock

2022-03-22 Thread Alvaro Herrera
On 2022-Mar-22, Tom Lane wrote: > I looked briefly at 0001, and I've got to say that I disagree with > your decision to rearrange the representation of the local LogwrtResult > copy. It clutters the patch tremendously and makes it hard to > understand what the actual functional change is.

Re: LogwrtResult contended spinlock

2022-03-22 Thread Tom Lane
Alvaro Herrera writes: > On 2022-Mar-21, Andres Freund wrote: >> Are you aiming this for v15? Otherwise I'd like to move the entry to the next >> CF. Marked as waiting-on-author. > I'd like to get 0001 pushed to pg15, yes. I'll let 0002 sit here for > discussion, but I haven't seen any evidence

Re: LogwrtResult contended spinlock

2022-03-22 Thread Alvaro Herrera
On 2022-Mar-21, Andres Freund wrote: > This patch currently doesn't apply: http://cfbot.cputube.org/patch_37_2716.log Updated. > Are you aiming this for v15? Otherwise I'd like to move the entry to the next > CF. Marked as waiting-on-author. I'd like to get 0001 pushed to pg15, yes. I'll let

Re: LogwrtResult contended spinlock

2022-03-21 Thread Andres Freund
Hi, On 2021-11-22 18:56:43 -0300, Alvaro Herrera wrote: > There was an earlier comment by Andres that asyncXactLSN should also be > atomic, to avoid an ugly spinlock interaction with the new atomic-based > logwrtResult. The 0002 here is an attempt at doing that; I found that > it also needed to

Re: LogwrtResult contended spinlock

2021-11-22 Thread Alvaro Herrera
There was an earlier comment by Andres that asyncXactLSN should also be atomic, to avoid an ugly spinlock interaction with the new atomic-based logwrtResult. The 0002 here is an attempt at doing that; I found that it also needed to change WalWriterSleeping to use atomics, to avoid

Re: LogwrtResult contended spinlock

2021-11-18 Thread Alvaro Herrera
Here's a further attempt at this. Sorry it took so long. In this version, I replaced the coupled-in-a-struct representation of Write with two separate global variables. The reason to do this is to cater to Andres' idea to keep them up-to-date separately. Of course, I could kept them together,

Re: LogwrtResult contended spinlock

2021-11-04 Thread Daniel Gustafsson
> On 8 Sep 2021, at 17:14, Jaime Casanova wrote: > > On Tue, Feb 02, 2021 at 08:19:19PM -0300, Alvaro Herrera wrote: >> Hello, >> >> So I addressed about half of your comments in this version merely by >> fixing silly bugs. The problem I had which I described as >> "synchronization fails" was

Re: LogwrtResult contended spinlock

2021-09-08 Thread Jaime Casanova
On Tue, Feb 02, 2021 at 08:19:19PM -0300, Alvaro Herrera wrote: > Hello, > > So I addressed about half of your comments in this version merely by > fixing silly bugs. The problem I had which I described as > "synchronization fails" was one of those silly bugs. > Hi Álvaro, Are we waiting for

Re: LogwrtResult contended spinlock

2021-02-02 Thread Andres Freund
Hi, On 2021-02-02 20:19:19 -0300, Alvaro Herrera wrote: > > > @@ -1169,6 +1169,8 @@ XLogInsertRecord(XLogRecData *rdata, > > > { > > > /* advance global request to include new block(s) > > > */ > > > pg_atomic_monotonic_advance_u64(>LogwrtRqst.Write, > > >

Re: LogwrtResult contended spinlock

2021-02-02 Thread Alvaro Herrera
Hello, So I addressed about half of your comments in this version merely by fixing silly bugs. The problem I had which I described as "synchronization fails" was one of those silly bugs. So in further thinking, it seems simpler to make only LogwrtResult atomic, and leave LogwrtRqst as

Re: LogwrtResult contended spinlock

2021-01-29 Thread Andres Freund
Hi, On 2021-01-29 22:47:56 -0300, Alvaro Herrera wrote: > So I tried this, but -- perhaps not suprisingly -- I can't get it to > work properly; the synchronization fails. What do you mean by "synchronization fails"? > Strangely, all tests work for me, but the pg_upgrade one in particular >

Re: LogwrtResult contended spinlock

2021-01-29 Thread Andres Freund
Hi, On 2021-01-29 12:40:18 -0300, Alvaro Herrera wrote: > On 2020-Aug-31, Andres Freund wrote: > > > Wouldn't the better fix here be to allow reading of individual members > > without a lock? E.g. by wrapping each in a 64bit atomic. > > So I've been playing with this and I'm annoyed about

Re: LogwrtResult contended spinlock

2021-01-29 Thread Alvaro Herrera
So I tried this, but -- perhaps not suprisingly -- I can't get it to work properly; the synchronization fails. I suspect I need some barriers, but I tried adding a few (probably some that are not really necessary) and that didn't have the expected effect. Strangely, all tests work for me, but

Re: LogwrtResult contended spinlock

2021-01-29 Thread Alvaro Herrera
On 2020-Aug-31, Andres Freund wrote: > Wouldn't the better fix here be to allow reading of individual members > without a lock? E.g. by wrapping each in a 64bit atomic. So I've been playing with this and I'm annoyed about having two datatypes to represent Write/Flush positions: typedef struct

Re: LogwrtResult contended spinlock

2021-01-22 Thread Masahiko Sawada
On Fri, Jan 22, 2021 at 10:39 PM Alvaro Herrera wrote: > > On 2021-Jan-22, Masahiko Sawada wrote: > > > On Wed, Nov 25, 2020 at 12:02 AM Alvaro Herrera > > wrote: > > > > Yes, please move it forward. I'll post an update sometime before the > > > next CF. > > > > Anything update on this? > >

Re: LogwrtResult contended spinlock

2021-01-22 Thread Alvaro Herrera
On 2021-Jan-22, Masahiko Sawada wrote: > On Wed, Nov 25, 2020 at 12:02 AM Alvaro Herrera > wrote: > > Yes, please move it forward. I'll post an update sometime before the > > next CF. > > Anything update on this? I'll update this one early next week. Thanks! -- Álvaro Herrera

Re: LogwrtResult contended spinlock

2021-01-22 Thread Masahiko Sawada
Hi Alvaro, On Wed, Nov 25, 2020 at 12:02 AM Alvaro Herrera wrote: > > On 2020-Nov-24, Anastasia Lubennikova wrote: > > > On 04.09.2020 20:13, Andres Freund wrote: > > > > Re general routine: On second thought, it might actually be worth having > > > it. Even just for LSNs - there's plenty places

Re: LogwrtResult contended spinlock

2020-11-24 Thread Alvaro Herrera
On 2020-Nov-24, Anastasia Lubennikova wrote: > On 04.09.2020 20:13, Andres Freund wrote: > > Re general routine: On second thought, it might actually be worth having > > it. Even just for LSNs - there's plenty places where it's useful to > > ensure a variable is at least a certain size. I think

Re: LogwrtResult contended spinlock

2020-11-24 Thread Anastasia Lubennikova
On 04.09.2020 20:13, Andres Freund wrote: Hi, On 2020-09-04 10:05:45 -0700, Andres Freund wrote: On 2020-09-03 14:34:52 -0400, Alvaro Herrera wrote: Looking at patterns like this if (XLogCtl->LogwrtRqst.Write < EndPos) XLogCtl->LogwrtRqst.Write = EndPos; It seems

Re: LogwrtResult contended spinlock

2020-09-04 Thread Andres Freund
Hi, On 2020-09-04 10:05:45 -0700, Andres Freund wrote: > On 2020-09-03 14:34:52 -0400, Alvaro Herrera wrote: > > Looking at patterns like this > > > > if (XLogCtl->LogwrtRqst.Write < EndPos) > > XLogCtl->LogwrtRqst.Write = EndPos; > > > > It seems possible to implement with > >

Re: LogwrtResult contended spinlock

2020-09-04 Thread Andres Freund
Hi, On 2020-09-03 14:34:52 -0400, Alvaro Herrera wrote: > Looking at patterns like this > > if (XLogCtl->LogwrtRqst.Write < EndPos) > XLogCtl->LogwrtRqst.Write = EndPos; > > It seems possible to implement with > > do { > XLogRecPtr currwrite; > >

Re: LogwrtResult contended spinlock

2020-09-03 Thread Alvaro Herrera
Looking at patterns like this if (XLogCtl->LogwrtRqst.Write < EndPos) XLogCtl->LogwrtRqst.Write = EndPos; It seems possible to implement with do { XLogRecPtr currwrite; currwrite = pg_atomic_read_u64(LogwrtRqst.Write); if (currwrite >

Re: LogwrtResult contended spinlock

2020-08-31 Thread Andres Freund
Hi, On August 31, 2020 11:34:45 AM PDT, Alvaro Herrera wrote: >On 2020-Aug-31, Andres Freund wrote: > >> Hi, >> >> On August 31, 2020 11:21:56 AM PDT, Alvaro Herrera > wrote: > >> >At first I wanted to make the new LWLock cover only LogwrtResult >> >proper, >> >and leave LogwrtRqst alone.

Re: LogwrtResult contended spinlock

2020-08-31 Thread Alvaro Herrera
On 2020-Aug-31, Andres Freund wrote: > Hi, > > On August 31, 2020 11:21:56 AM PDT, Alvaro Herrera > wrote: > >At first I wanted to make the new LWLock cover only LogwrtResult > >proper, > >and leave LogwrtRqst alone. However on doing it, it seemed that that > >might change the locking

Re: LogwrtResult contended spinlock

2020-08-31 Thread Andres Freund
Hi, On August 31, 2020 11:21:56 AM PDT, Alvaro Herrera wrote: >Jaime Casanova recently reported a situation where pglogical >replicating >from 64 POS sites to a single central (64-core) node, each with two >replication sets, causes XLog's info_lck to become highly contended >because of

LogwrtResult contended spinlock

2020-08-31 Thread Alvaro Herrera
Jaime Casanova recently reported a situation where pglogical replicating from 64 POS sites to a single central (64-core) node, each with two replication sets, causes XLog's info_lck to become highly contended because of frequently reading LogwrtResult. We tested the simple fix of adding a new