On Mon, Mar 4, 2024 at 9:15 PM Bharath Rupireddy
<bharath.rupireddyforpostg...@gmail.com> wrote:
>
> > 0003:
> >
> > * We need to maintain the invariant that Copy >= Write >= Flush. I
> > believe that's always satisfied, because the
> > XLogWaitInsertionsToFinish() is always called before XLogWrite(). But
> > we should add an assert or runtime check of this invariant somewhere.
>
> Yes, that invariant is already maintained by the server. Although, I'm
> not fully agree, I added an assertion to WaitXLogInsertionsToFinish
> after updating XLogCtl->LogwrtResult.Copy. CF bot is happy with it -
> https://github.com/BRupireddy2/postgres/tree/atomic_LogwrtResult_v13.

I've now separated these invariants out into the 0004 patch.

With the assertions placed in WaitXLogInsertionsToFinish after updating
the Copy ptr, I observed the assertion failing on one of the CF bot
machines - https://cirrus-ci.com/build/6202112288227328. I could
reproduce it locally with [1]. I guess the reason is that the Write and
Flush ptrs are now updated independently and atomically, without a lock,
so a process that reads them concurrently in WaitXLogInsertionsToFinish
might see them out of order for a while. So, I guess the right place to
verify the invariant Copy >= Write >= Flush is in XLogWrite, once the
Write and Flush ptrs in shared memory are updated (note that only one
process at a time can do this). Accordingly, I've moved the assertions
to XLogWrite in the attached v14-0004 patch.

> Please see the attached v13 patch set for further review.

Earlier versions of the patches removed a piece of code ensuring that the
shared WAL 'request' values did not fall behind the 'result' values.
There's a good reason for us to have it. So, I restored it.

-    /*
-     * Update shared-memory status
-     *
-     * We make sure that the shared 'request' values do not fall behind the
-     * 'result' values.  This is not absolutely essential, but it saves some
-     * code in a couple of places.
-     */

Please see the attached v14 patch set.

[1]
for i in {1..100}; do
    make check PROVE_TESTS="t/027_stream_regress.pl"
    if [ $? -ne 0 ]; then
        echo "The command failed on iteration $i"
        break
    fi
done

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

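
To illustrate the reasoning about where the invariant check belongs, here is a minimal, self-contained sketch in plain C11 atomics. It is not the code from the patches: SharedLogwrtResult, monotonic_advance, writer_update and torn_read_demo are hypothetical names, XLogRecPtr is a stand-in typedef, and the interleaving shown is only one way a concurrent reader could see Write and Flush out of order. The writer path is assumed to be executed by one process at a time, as noted above for XLogWrite.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's XLogRecPtr; an assumption for this sketch. */
typedef uint64_t XLogRecPtr;

/* Hypothetical miniature of the shared LogwrtResult members. */
typedef struct
{
    _Atomic XLogRecPtr Copy;
    _Atomic XLogRecPtr Write;
    _Atomic XLogRecPtr Flush;
} SharedLogwrtResult;

/*
 * Move *ptr forward to target, never backward.  Returns the value that
 * is in place afterwards.  Hypothetical helper, not the patch's API.
 */
static XLogRecPtr
monotonic_advance(_Atomic XLogRecPtr *ptr, XLogRecPtr target)
{
    XLogRecPtr cur = atomic_load(ptr);

    while (cur < target)
    {
        /* On failure, cur is refreshed with the value currently stored. */
        if (atomic_compare_exchange_weak(ptr, &cur, target))
            return target;
    }
    return cur;
}

/*
 * Writer-side update, assumed to run in only one process at a time.
 * Write and Flush cannot move underneath us here, so the invariant
 * check is stable at this point.
 */
static void
writer_update(SharedLogwrtResult *s, XLogRecPtr write, XLogRecPtr flush)
{
    XLogRecPtr w = monotonic_advance(&s->Write, write);
    XLogRecPtr f = monotonic_advance(&s->Flush, flush);
    XLogRecPtr c = atomic_load(&s->Copy);

    assert(c >= w && w >= f);
}

/*
 * Simulates, in a single thread, a reader whose two loads straddle a
 * writer update.  Returns true if the reader's snapshot looks inverted
 * (Flush > Write) even though the shared state was consistent throughout.
 */
static bool
torn_read_demo(void)
{
    SharedLogwrtResult s;

    atomic_init(&s.Copy, 300);
    atomic_init(&s.Write, 100);
    atomic_init(&s.Flush, 100);

    XLogRecPtr seen_write = atomic_load(&s.Write);  /* reader loads Write */
    writer_update(&s, 200, 200);                    /* writer runs in between */
    XLogRecPtr seen_flush = atomic_load(&s.Flush);  /* reader loads Flush */

    return seen_flush > seen_write;
}
```

In this sketch the assertion inside writer_update holds, while the reader in torn_read_demo ends up with Write = 100 and Flush = 200. That matches the guess above: a check made from a concurrent reader can fail spuriously, whereas a check made in the single-writer path cannot be disturbed by another update to Write or Flush.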
v14-0001-Add-monotonic-advancement-functions-for-atomics.patch
v14-0002-Make-XLogCtl-LogwrtResult-accessible-with-atomic.patch
v14-0003-Add-Copy-pointer-to-track-data-copied-to-WAL-buf.patch
v14-0004-Add-invariants-for-shared-LogwrtResult-members.patch