On 2018-04-06 11:25:59 +0200, Magnus Hagander wrote:
> Since you know a lot more about that type of interlocks than I do :) We
> already wait for all running transactions to finish before we start doing
> anything. Obviously transactions != buffer writes (and we have things like
> the checkpointer/bgwriter to consider). Is there something else that we
> could safely just *wait* for? I have no problem whatsoever if this is a
> long wait (given the total time). I mean to the point of "what if we just
> stick a sleep(10) in there" level waiting.
I don't think anything just related to "time" is reasonable in any sort of
way. On an overloaded system you can see very long stalls of processes that
have done a lot of work. Locking protocols should be correct, and that's that.

> Or can that somehow be cleanly solved using some of the new atomic
> operators? Or is that likely to cause the same kind of overhead as throwing
> a barrier in there?

Worse.

I wonder if we could introduce a "MegaExpensiveRareMemoryBarrier()" that goes
through pgproc and signals every process with a signal that requires the
other side to do an operation implying a memory barrier. That's actually not
hard to do (e.g. every latch operation qualifies); the problem is that signal
delivery isn't synchronous, so you need some acknowledgement protocol.

I think you could introduce a procsignal message that does a memory barrier
and then sets PGPROC->barrierGeneration to ProcArrayStruct->barrierGeneration.
MegaExpensiveRareMemoryBarrier() increments ProcArrayStruct->barrierGeneration,
signals everyone, and then waits until every PGPROC->barrierGeneration has
caught up to ProcArrayStruct->barrierGeneration. A standalone toy model of
that protocol is sketched below.

Greetings,

Andres Freund
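To make the acknowledgement part concrete, here is a minimal toy model of the
generation-counter protocol. It uses plain C11 threads and atomics standing in
for backends, PGPROC and procsignal; every identifier in it is made up for this
sketch, and the workers poll only because the sketch has no signal delivery.

/*
 * Toy model of the barrier-generation acknowledgement protocol described
 * above.  C11 threads and atomics stand in for backends, PGPROC and
 * procsignal; every identifier here is invented for illustration.
 */
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

#define NBACKENDS 4

/* plays the role of ProcArrayStruct->barrierGeneration */
static atomic_uint_fast64_t shared_barrier_gen;
/* plays the role of each PGPROC->barrierGeneration */
static atomic_uint_fast64_t backend_barrier_gen[NBACKENDS];
static atomic_int shutting_down;

/*
 * What a backend would do on receipt of the new procsignal message:
 * execute a full memory barrier, then acknowledge the generation.
 */
static void absorb_barrier(int myidx)
{
    uint_fast64_t gen = atomic_load(&shared_barrier_gen);

    atomic_thread_fence(memory_order_seq_cst);  /* the barrier itself */
    atomic_store(&backend_barrier_gen[myidx], gen);
}

/*
 * Stand-in for a backend main loop.  A real backend would only absorb the
 * barrier when the (asynchronous) signal arrives, e.g. from its latch loop.
 */
static void *backend_main(void *arg)
{
    int myidx = (int) (intptr_t) arg;

    while (!atomic_load(&shutting_down))
    {
        if (atomic_load(&backend_barrier_gen[myidx]) <
            atomic_load(&shared_barrier_gen))
            absorb_barrier(myidx);
        sched_yield();          /* pretend to do unrelated work */
    }
    return NULL;
}

/*
 * The "MegaExpensiveRareMemoryBarrier": bump the shared generation, (in the
 * real thing) signal every backend, then wait until every backend's
 * generation has caught up.
 */
static void mega_expensive_rare_memory_barrier(void)
{
    uint_fast64_t gen = atomic_fetch_add(&shared_barrier_gen, 1) + 1;

    /* the real version would SendProcSignal() to everyone in the proc array */

    for (int i = 0; i < NBACKENDS; i++)
        while (atomic_load(&backend_barrier_gen[i]) < gen)
            sched_yield();
}

int main(void)
{
    pthread_t threads[NBACKENDS];

    for (int i = 0; i < NBACKENDS; i++)
        pthread_create(&threads[i], NULL, backend_main, (void *) (intptr_t) i);

    mega_expensive_rare_memory_barrier();
    printf("every backend has acknowledged the barrier\n");

    atomic_store(&shutting_down, 1);
    for (int i = 0; i < NBACKENDS; i++)
        pthread_join(threads[i], NULL);
    return 0;
}

A real implementation would absorb the barrier from the signal handler or the
latch loop instead of polling, and the waiter would presumably also need to
handle backends that exit before they ever acknowledge.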