On Mon, Aug 7, 2023 at 1:06 PM Julien Rouhaud <rjuju...@gmail.com> wrote:
>
> On Mon, Aug 07, 2023 at 12:42:33PM +0530, Amit Kapila wrote:
> > On Mon, Aug 7, 2023 at 11:29 AM Julien Rouhaud <rjuju...@gmail.com> wrote:
> > >
> > > Unless I'm missing something I don't see what prevents something to connect
> > > using the replication protocol and issue any query or even create new
> > > replication slots?
> > >
> >
> > I think the point is that if we have any slots where we have not
> > consumed the pending WAL (other than the expected like
> > SHUTDOWN_CHECKPOINT) or if there are invalid slots then the upgrade
> > won't proceed and we will request user to remove such slots or ensure
> > that WAL is consumed by slots. So, I think in the case you mentioned,
> > the upgrade won't succeed.
>
> What if new slots are added while the old instance is started in the middle of
> pg_upgrade, *after* the various checks are done?
>

They won't be copied but I think that won't be any different than
other objects like tables. Anyway, I have another idea which is to not
allow creating slots during binary upgrade unless one specifically
requests it by having an API like binary_upgrade_allow_slot_create()
similar to existing APIs binary_upgrade_*.

> > > Note also that as complained a few years ago nothing prevents a bgworker from
> > > spawning up during pg_upgrade and possibly corrupt the upgraded cluster if
> > > multixid are assigned. If publications are preserved wouldn't it mean that
> > > such bgworkers could also lead to data loss?
> > >
> >
> > Is it because such workers would write some WAL which slots may not
> > process? If so, I think it is equally dangerous as other problems that
> > can arise due to such a worker. Do you think of any special handling
> > here?
>
> Yes, and there were already multiple reports of multixact corruption due to
> bgworker activity during pg_upgrade (see
> https://www.postgresql.org/message-id/20210121152357.s6eflhqyh4g5e...@dalibo.com
> for instance). I think we should once and for all fix this whole class of
> problem one way or another.
>

I don't object to doing something like we discussed in the thread you
linked but don't see the link with this work. Surely, the extra
WAL/XIDs generated during the upgrade will cause data inconsistency
which is no different after this patch.

--
With Regards,
Amit Kapila.