> On 1 Oct 2024, at 02:35, Thomas Krennwallner <t...@postsubmeta.net> wrote:
>
> On 30/09/2024 17.29, Daniel Gustafsson wrote:
>>> On 30 Sep 2024, at 16:55, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>> TBH I'm not finding anything very much wrong with the current
>>> behavior... this has to be a rare situation, do we need to add
>>> debatable behavior to make it easier?
>> One argument would be to make the checks consistent, pg_upgrade generally
>> tries to report all the offending entries to help the user when fixing the
>> source database. Not sure if it's a strong enough argument for carrying
>> code which really shouldn't see much use though.
> In general, I agree that this situation should be rare for deliberate DROP
> DATABASE interrupted in interactive sessions.
>
> Unfortunately, for (popular) tools that perform automatic "temporary
> database" cleanup, we could recently see an increase in invalid databases.
>
> The additional check for pg_upgrade was made necessary due to several
> unrelated customers having invalid databases that stem from left-over Prisma
> Migrate "shadow databases" [1]. We could not reproduce this Prisma Migrate
> issue yet, as those migrations happened some time ago. Maybe this bug really
> stems from a much older Prisma Migrate version and we only see the fallout
> now. This is still a TODO item.
>
> But it appears that this tool can get interrupted "at the wrong time" while
> it is deleting temporary databases (probably a manual Ctrl-C), and clients
> are unaware that this can then leave behind invalid databases.
>
> Those temporary databases do not cause any harm as they are not used anymore.
> But eventually, PG installations will be upgraded to the next major version,
> and it is only then when those invalid databases resurface after pg_upgrade
> fails to run the checks.

Databases containing transient data that is no longer needed, left behind by
buggy tools, are one thing, but pg_upgrade won't be able to differentiate
between those and invalid databases of legitimate interest. Allowing
pg_upgrade to skip invalid databases exposes the user to the risk of
(potentially) valuable data being dropped during the upgrade because they had
not realized that a rarely-used production database was invalid.

> Long story short: interactive DROP DATABASE interrupts are rare (they do
> exist, but customers are usually aware). Automation tools on the other hand
> may run DROP DATABASE and when they get interrupted at the wrong time they
> will then produce several left-over invalid databases. pg_upgrade will then
> fail to run the checks.

Checking and reporting all invalid databases during the check phase seems like
a user-friendly option here. I can agree that the current behaviour isn't
great for users experiencing this issue.

--
Daniel Gustafsson