Hi, On 2019-04-16 17:05:36 -0700, Andres Freund wrote: > On 2019-04-16 18:59:37 -0400, Robert Haas wrote: > > On Tue, Apr 16, 2019 at 6:45 PM Tom Lane <t...@sss.pgh.pa.us> wrote: > > > Do we need to think harder about establishing rules for multiplexed > > > use of the process latch? I'm imagining some rule like "if you are > > > not the outermost event loop of a process, you do not get to > > > summarily clear MyLatch. Make sure to leave it set after waiting, > > > if there was any possibility that it was set by something other than > > > the specific event you're concerned with". > > > > Hmm, yeah. If the latch is left set, then the outer loop will just go > > through an extra and unnecessary iteration, which seems fine. If the > > latch is left clear, then the outer loop might miss a wakeup intended > > for it and hang forever. > > Arguably that's a sign that the latch using code in the outer loop(s) isn't > written correctly? If you do it as: > > while (true) > { > CHECK_FOR_INTERRUPTS(); > > ResetLatch(MyLatch); > > if (work_needed) > { > Plenty(); > Code(); > Using(MyLatch); > } > else > { > WaitLatch(MyLatch); > } > } > > I think that's not a danger? I think the problem really is that we > suggest doing that WaitLatch() unconditionally: > > * The correct pattern to wait for event(s) is: > * > * for (;;) > * { > * ResetLatch(); > * if (work to do) > * Do Stuff(); > * WaitLatch(); > * } > * > * It's important to reset the latch *before* checking if there's work to > * do. Otherwise, if someone sets the latch between the check and the > * ResetLatch call, you will miss it and Wait will incorrectly block. > * > * Another valid coding pattern looks like: > * > * for (;;) > * { > * if (work to do) > * Do Stuff(); // in particular, exit loop if some condition > satisfied > * WaitLatch(); > * ResetLatch(); > * } > > Obviously there's the issue that a lot of latch using code isn't written > that way - but I also don't think there's that many latch using code > that then also uses latch. Seems like we could fix that. While it has > obviously dangers of not being followed, so does the > 'always-set-latch-unless-outermost-loop' approach. > > I'm not sure I like the idea of incurring another unnecessary SetLatch() > call for most latch using places. > > I guess there's a bit bigger danger of taking longer to notice > postmaster-death. But I'm not sure I can quite see that being > problematic - seems like all we should incur is another cycle through > the loop, as the latch shouldn't be set anymore.
I think we should thus change our latch documentation to say: something like: diff --git a/src/include/storage/latch.h b/src/include/storage/latch.h index fc995819d35..dc46dd94c5b 100644 --- a/src/include/storage/latch.h +++ b/src/include/storage/latch.h @@ -44,22 +44,31 @@ * { * ResetLatch(); * if (work to do) - * Do Stuff(); - * WaitLatch(); + * DoStuff(); + * else + * WaitLatch(); * } * * It's important to reset the latch *before* checking if there's work to * do. Otherwise, if someone sets the latch between the check and the * ResetLatch call, you will miss it and Wait will incorrectly block. * + * The reason to only wait on the latch in case there is nothing to do is that + * code inside DoStuff() might use the same latch, and leave it reset, even + * though a SetLatch() aimed for the outer loop arrived. Which again could + * lead to incorrectly blocking in Wait. + * * Another valid coding pattern looks like: * * for (;;) * { * if (work to do) - * Do Stuff(); // in particular, exit loop if some condition satisfied - * WaitLatch(); - * ResetLatch(); + * DoStuff(); // in particular, exit loop if some condition satisfied + * else + * { + * WaitLatch(); + * ResetLatch(); + * } * } * * This is useful to reduce latch traffic if it's expected that the loop's and adapt code to match (at least in the outer loops). Greetings, Andres Freund