On Sun, Jun 14, 2026 at 02:23:54PM +0500, Andrey Borodin wrote: > The same revision also closes an unrelated init race: when two processes > create the backing file at once, the loser could map it before the winner > sized it (SIG-something on first touch). The attach path now waits for the > file to reach full size before mapping. > > CI is green. Same two patches on top of your atomics commit. Let me know > if this prototype goes radically wrong way.
You may find my reply surprising (or not), but using a client tool to bypass the SQL protocol is super invasive in terms of the in-core changes, and I'm -1 on that. Test tooling should rely on simple facilities, and you are re-implementing quite a few things here that come with a new class of bugs and what look like design issues to me due to the invasiveness: - Reimplementation of the registry protocol for file mapping. That may be useful for other things, and there may be use for a refactored in-core API, but I don't see why we should use it here: - Mirroring of internal structs for shmem manipulation. - Core data updates with zero locking. I am wondering if we are not overcomplicating things here.. How about this idea instead of 0002 and 0003, for paths that cannot rely on some SQL: - Let's use a file-based markup, located in a injection_points/ in the data folder. - When attaching a wait point, write a marker, say wait_$POINT. - On wakeup, write a wakeup_$POINT. - The wait routine does periodic checks of the wakeup_$POINT file, using a stat(). It depends on how much we want to achieve, but forcing a postmaster to wait at an early startup sequence would then be something like: - Add the a wait markup. - At startup, shared_preload_libraries scans the pg_injection_points/ repo, fills in its shmem state by calling InjectionPointAttach(). - postmaster hits the injection point, calls injection_wait() - Test does what it wants while the postmaster waits, manipulates states. - Test script drops a wakeup file. - Postmaster sees the file, continues its startup sequence. This means one cannot set an injection point until shared_preload_libraries is loaded, of course. Cleanup logic feels kind of nice: test cleans his stuff, or just remove pg_injection_points/ at shutdown with an exit callback. No platform specific logic, only FS checks. This means a bit higher latency, but it eliminates a full class of bugs due to the fact that it is simpler. This would need to live alongside patch 0001 so as a _PG_init() can be loaded when the injection_points lib is loaded; we're still going to need it to fill the shmem info with the wait slot being occupied. -- Michael
signature.asc
Description: PGP signature
