On Mon, May 13, 2024 at 9:44 AM Alvaro Herrera <alvhe...@alvh.no-ip.org> wrote: > The problematic point is the need to add NOT NULL constraints during > table creation that don't exist in the table being dumped, for > performance of primary key creation -- I called this a throwaway > constraint. We needed to be able to drop those constraints after the PK > was created. These were marked NO INHERIT to allow them to be dropped, > which is easier if the children don't have them. This all worked fine.
This seems really weird to me. Why is it necessary? I mean, in existing releases, if you declare a column as PRIMARY KEY, the columns included in the key are forced to be NOT NULL, and you can't change that for so long as they are included in the PRIMARY KEY. So I would have thought that after this patch, you'd end up with the same thing. One way of doing that would be to make the PRIMARY KEY depend on the now-catalogued NOT NULL constraints, and the other way would be to keep it as an ad-hoc prohibition, same as now. In PostgreSQL 16, I get a dump like this: CREATE TABLE public.foo ( a integer NOT NULL, b text ); COPY public.foo (a, b) FROM stdin; \. ALTER TABLE ONLY public.foo ADD CONSTRAINT foo_pkey PRIMARY KEY (a); If I'm dumping from an existing release, I don't see why any of that needs to change. The NOT NULL decoration should lead to a system-generated constraint name. If I'm dumping from a new release, the NOT NULL decoration needs to be replaced with CONSTRAINT existing_constraint_name NOT NULL. But I don't see why I need to end up with what the patch generates, which seems to be something like CONSTRAINT pgdump_throwaway_notnull_0 NOT NULL NO INHERIT. That kind of thing suggests that we're changing around the order of operations in pg_dump, probably by adding the NOT NULL constraints at a later stage than currently, and I think the proper solution is most likely to be to avoid doing that in the first place. > However, at some point we realized that we needed to add NOT NULL > constraints in child tables for the columns in which the parent had a > primary key. Then things become messy because we had the throwaway > constraints on one hand and the not-nulls that descend from the PK on > the other hand, where one was NO INHERIT and the other wasn't; worse if > the child also has a primary key. This seems like another problem that is created by changing the order of operations in pg_dump. > > The other possibility that occurs to me is that I think the motivation > > for cataloging NOT NULL constraints was that we wanted to be able to > > track dependencies on them, or something like that, which seems like > > it might be able to create issues of the type that you're facing, but > > the details aren't clear to me. > > NOT VALID constraints would be extremely useful, for one thing (because > then you don't need to exclusively-lock the table during a long scan in > order to add a constraint), and it's just one step away from having > these constraints be catalogued. It was also fixing some inconsistent > handling of inheritance cases. I agree that NOT VALID constraints would be very useful. I'm a little scared by the idea of fixing inconsistent handling of inheritance cases, just for fear that there may be more things relying on the inconsistent behavior than we realize. I feel like this is an area where it's easy for changes to be scarier than they at first seem. I still have memories of discovering some of the current behavior back in the mid-2000s when I was learning PostgreSQL (and databases generally). It struck me as fiddly back then, and it still does. I feel like there are probably some behaviors that look like arbitrary decisions but are actually very important for some undocumented reason. That's not to say that we shouldn't try to make improvements, just that it may be hard to get right. -- Robert Haas EDB: http://www.enterprisedb.com