On Tue, 2 Dec 2025 at 03:41, Andres Freund <[email protected]> wrote: > > Hi,
Hi! > I.e. we modify the buffer without even holding a share lock on the page. That seems ... not ok. I was recently confused by these lines too doing some related hacking on my work. > > Now, this normally won't happen, was the FSM isn't WAL > logged, but still. And I think there may be special circumstances where the > page is included in a WAL record, e.g. as part of an CREATE DATABASE. And > there's FreeSpaceMapPrepareTruncateRel() - which hopefully can't run > concurrently with fsm_vacuum_page(), but would seem to court WAL corruption, > if it ever did. Yep, also extension may want to run log_newpage_range for FSM fork for whatever reason. Extension then should rely on that when running log_newpage_range, other activity will not cause any hazards or corruptions. Is it? > Besides modifying the page while not even share locked, there are a few other > oddities: > > > There seem to be some other oddities: > - GetRecordedFreeSpace() does fsm_get_avail() without locking > - fsm_vacuum_page() does fsm_get_avail(), fsm_get_max_avail() without locking IIRC, Given these all are single-byte reads will there be no actual troubles? Like, we already can read corrupted info from FSM fork, because this is just a "hint", and fsm callers are ready to not-trust the results. Just my 2c > > ISTM we clearly should take a lock in fsm_vacuum_page() to reset fp_next_slot, > that just seems like a nasty hard to find bug waiting to happen. Changing it > to not look at the page without a lock seems a bit more challenging. > I agree on the grounds of enforcing good examples of buf lock/pin management along the internals. Like, maybe all of this actually works on HEAD as expected, but I'm not sure it is worth its additional complexity. -- Best regards, Kirill Reshke
