Hi,

On 2018-04-05 23:32:19 +0200, Magnus Hagander wrote:
> On Thu, Apr 5, 2018 at 11:23 PM, Andres Freund <and...@anarazel.de> wrote:
> > Is there any sort of locking that guarantees that worker processes see
> > an up2date value of
> > DataChecksumsNeedWrite()/ControlFile->data_checksum_version? Afaict
> > there's not. So you can afaict end up with checksums being computed by
> > the worker, but concurrent writes missing them. The window is going to
> > be at most one missed checksum per process (as the unlocking of the page
> > is a barrier) and is probably not easy to hit, but that's dangerous
> > enough.
> >
>
> So just to be clear of the case you're worried about. It's basically:
> Session #1 - sets checksums to inprogress
> Session #1 - starts dynamic background worker ("launcher")
> Launcher reads and enumerates pg_database
> Launcher starts worker in first database
> Worker processes first block of data in database
> And at this point, Session #2 has still not seen the "checksums inprogress"
> flag and continues to write without checksums?

Yes. I think there are some variations of that, but yes, that's pretty
much it.

> That seems like quite a long time to me -- is that really a problem?

We don't generally build locking models that are only correct based on
likelihood. Especially not without a lengthy comment explaining that
analysis.

> I'm guessing you're seeing a shorter path between the two that I can't
> see right now (I'll blame the late evening...)?

I don't think it matters terribly much how long that path is.

Greetings,

Andres Freund
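
For illustration only, the interleaving discussed in this thread can be
replayed in a small, single-threaded toy model. This is a sketch under
stated assumptions, not PostgreSQL code: the types Cluster and Backend and
the functions backend_sync(), backend_write() and run_scenario() are all
hypothetical, and "seen_version" merely stands in for a backend's possibly
stale view of ControlFile->data_checksum_version. The point it tries to
show is that, without some synchronization step, the outcome depends on
whether session #2 happens to have noticed the new state, however long the
path between the two events is.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; every name here is hypothetical, not PostgreSQL API. */
enum { CHECKSUMS_OFF = 0, CHECKSUMS_INPROGRESS = 1 };

#define NBLOCKS 4

typedef struct
{
    int  shared_version;          /* stands in for data_checksum_version */
    bool has_checksum[NBLOCKS];   /* does the stored block carry a checksum? */
} Cluster;

typedef struct
{
    int seen_version;             /* this backend's possibly stale view */
} Backend;

/* Models a synchronization point: the backend re-reads the shared state. */
void backend_sync(Backend *b, const Cluster *c)
{
    b->seen_version = c->shared_version;
}

/* A write stamps a checksum only if the backend's own view says to. */
void backend_write(const Backend *b, Cluster *c, int blkno)
{
    assert(blkno >= 0 && blkno < NBLOCKS);
    c->has_checksum[blkno] = (b->seen_version != CHECKSUMS_OFF);
}

/*
 * Replays the sequence from the thread. Returns true if block 0 still
 * carries a checksum after session #2 has rewritten it.
 */
bool run_scenario(bool writer_syncs_first)
{
    Cluster c = { CHECKSUMS_OFF, { false } };
    Backend session2 = { CHECKSUMS_OFF };  /* view cached before the switch */
    Backend worker = { CHECKSUMS_OFF };

    c.shared_version = CHECKSUMS_INPROGRESS; /* session #1 sets inprogress */
    backend_sync(&worker, &c);               /* worker starts later, sees it */
    backend_write(&worker, &c, 0);           /* worker checksums block 0 */

    if (writer_syncs_first)
        backend_sync(&session2, &c);         /* the guarantee being asked for */
    backend_write(&session2, &c, 0);         /* session #2 rewrites block 0 */

    return c.has_checksum[0];
}
```

In this model run_scenario(false) leaves block 0 without a checksum even
though the worker already processed it, while run_scenario(true) keeps it.
What a real synchronization step would look like (a lock, a barrier, or
something else) is left open here, as it is in the thread.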