Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Jan Wieck wrote: I think this is a larger argument than the one that was being discussed above. Given a dump of objects I own, can I restore them without requiring the fk check to be done if I alter table add constraint a foreign key? If the answer to that is no, then the option can be put in as a superuser only option and it's relatively easy. If the answer to that is yes, then there are additional issues that need to be resolved. Well, with the CREATE CONSTRAINT TRIGGER you _can_, but we already have a consensus that we don't _want_ that. Probably we should declare it deprecated and remove it in 7.5. And the option currently under discussion is exactly what will cause ALTER TABLE to let you, but IMHO that _should_ be restricted. Added to TODO: * Remove CREATE CONSTRAINT TRIGGER -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Well, with the CREATE CONSTRAINT TRIGGER you _can_, but we already have a consensus that we don't _want_ that. Probably we should declare it deprecated and remove it in 7.5. And the option currently under discussion is exactly what will cause ALTER TABLE to let you, but IMHO that _should_ be restricted. How can we ever remove it - what about people upgrading from 7.0, 7.1, 7.2? Also, people upgrading from 7.3 who've never heard of adddepend... Chris ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Christopher Kings-Lynne wrote: Well, with the CREATE CONSTRAINT TRIGGER you _can_, but we already have a consensus that we don't _want_ that. Probably we should declare it deprecated and remove it in 7.5. And the option currently under discussion is exactly what will cause ALTER TABLE to let you, but IMHO that _should_ be restricted. How can we ever remove it - what about people upgrading from 7.0, 7.1, 7.2? Also, people upgrading from 7.3 who've never heard of adddepend... Not sure. We can remove documentation about it, at least. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Bruce Momjian wrote: Christopher Kings-Lynne wrote: Well, with the CREATE CONSTRAINT TRIGGER you _can_, but we already have a consensus that we don't _want_ that. Probably we should declare it deprecated and remove it in 7.5. And the option currently under discussion is exactly what will cause ALTER TABLE to let you, but IMHO that _should_ be restricted. How can we ever remove it - what about people upgrading from 7.0, 7.1, 7.2? Also, people upgrading from 7.3 who've never heard of adddepend... Not sure. We can remove documentation about it, at least. If the idea is to support any 7.n - 7.m (where n m) upgrade directly, then it's IMHO time for 8.0 and clearly stating that 7.x - 8.y only is supported as 7.x - 7.3 - 8.0 - 8.y and you're on your own with any other attempt. Don't get this wrong, I am a big friend of easy upgrades. But I am not a big friend of making improvements impossible by backward compatibility forever. It was the backward compatibility to CP/M-80 (v2.2) that caused MS-DOS 7.0 to have a maximum commandline length of 127 characters ... that was taking compatibility too far. Well, M$ took it too far the other way from there and is compatible to nothing any more, not even to themself ... but at least they learned from that mistake. Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Stephan Szabo wrote: On Tue, 30 Sep 2003, Bruce Momjian wrote: Stephan Szabo wrote: If we go that direction, why don't we just make a GUC variable to disable constraint checking. Is that what this will do, or is it more limited. I know it breaks referential integrity, but we have had many folks as for it, it is on the TODO list, and there are tons of server functions flying around that do just this by fiddling with pg_class. I would rather just have it be a GUC for that particular backend. People are going to need to turn it off anyway, so why not give them a clean way to do it. But such a GUC wouldn't affect just one backend. It'd potentially affect all backends that were doing concurrent modifications that would be involved since the locks aren't taken. In addition, who would be allowed to set this value and what constraints would it affect? If it's only superusers, then it doesn't help for non-superuser restores. If it's settable by anyone and affects only constraints on tables that user owns and that refer to tables that user owns it might be okay. If it's settable by anyone and affects all tables it renders the constraints meaningless since anyone could break them. I assume it would be only setable by the super-user. They are mucking around with pg_class anyway (and have permission to do so), so let them do it cleanly at least. Allowing non-supers to do it for tables they own would be OK, I guess. Is there a problem if some of the primary table is owned by someone else? Not sure. The problem I have with a super-user only solution is that it doesn't solve the problem for restores in general. I think we need a mechanism that works for any user that wants to restore a table (or tables) from dump(s), so for the dump/restore mechanism I think we should be looking in that direction. OK. Let's explore that. What does ownership mean? If I grant all permissions on an object I own to you, what can you not do? I think GRANT/REVOKE and ALTER TABLE are the only two ones, right? So, if I own it, I am the only one who can ALTER the table to add/remove the foreign key constraint. So, if I already have a foreign key constraint on a table, I can easily remove it if I am the owner and do whatever I want with the table. Now, the big question is, is there harm in my saying in the system catalogs that I have a foreign key constraint on a table, when I might have turned off the constraint via GUC and modified the table so the foreign key constraint isn't valid? I think that is the big question --- is there harm to others in saying something I own has a foreign key, when it might not? -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: How many folks are going to remember to do this? Why make it hard for them? Someone is going to forget too easily. Why is this restore taking so long? Oh, I forgot that switch. Or they put it in a login file and forget it is set. Seems safer for it to be in the dump file. I disagree. The how many folks are going to remember to do this argument applies just as well to magic pg_dump switches; that's not a tenable argument against doing it at restore time. The difference between controlling it at pg_dump time and pg_restore time is that if you change your mind after having made the dump, it's too late, if the decision was nailed down in the dump file. In an upgrade situation it's very likely that you no longer have the option to re-do your dump, because you already blew away your old installation. Since there's no performance difference at pg_dump time, I can't see any advantage to freezing your decision then. I understand, and if everyone used pg_restore, then adding a flag to pg_restore to do this would make sense. However, everyone is used to treating that dump file as a simple dump and throwing it into psql. Psql doesn't have any special dump flags, so you have to do the environment variable trick, which you must admit is pretty ugly looking and prone to typing errors, and forgetting, because they are used to invoking psql all the time. Maybe we need a psql dump reload flag? Would we be able to do any other optimizations, like increasing sort_mem or something? That would be a clean solution, and perhaps allow additional optimizations. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: I assume it would be only setable by the super-user. That might be a good restriction too (on top of my speculation about not allowing it in postgresql.conf). Only allow it to be SET per-session, We don't have a way to make something unsetable in postgresql.conf right now, do we? and only by a superuser. See my recent email on this about ownership. I personally am happy with super-user only (or db-owner and super-user only). As I said, it is a question of what documenting a foreign key in the system catalogs means to folks who don't own the table. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Bruce Momjian wrote: Fact is, folks are doing it anyway by modifying pg_class. I know one guy who did it in a transaction so he was the only one to see the triggers disabled! The PostgreSQL cookbook page has an example too. People are always asking how to do this. Why not just make it setable only by the super-user. FYI, TODO has: * Allow triggers to be disabled [trigger] * With disabled triggers, allow pg_dump to use ALTER TABLE ADD FOREIGN KEY For practical reasons, I'd prefer the disable trigger not to influence fk triggers, or at least to have such a default flavor. When restoring a database, you might consider the data as consistent and complete, so no triggers and ref checks are needed at all. But in the cases of some kind of application data import, you might like the data to have fk ref checked, but don't want to trigger all user triggers. The implementation of fk checking by triggers should normally be hidden to the user. Regards, Andreas ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: Since there's no performance difference at pg_dump time, I can't see any advantage to freezing your decision then. I understand, and if everyone used pg_restore, then adding a flag to pg_restore to do this would make sense. However, everyone is used to treating that dump file as a simple dump and throwing it into psql. So? A GUC variable could be set equally easily either way. In fact more so. Psql doesn't have any special dump flags, so you have to do the environment variable trick, You forgot SET ... regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
On Tue, 30 Sep 2003, Jan Wieck wrote: Stephan Szabo wrote: On Tue, 30 Sep 2003, Tom Lane wrote: I see where Stephan is coming from, but in my mind disabling consistency checks ought to be a feature reserved to the DBA (ie superuser), who presumably has some clue about the tradeoffs involved. I don't think ordinary users should be able to do it. If we can get the cost of performing the initial check down to something reasonable (and I don't mean near zero, I mean something that's small in comparison to the other costs of loading data and creating indexes), then I think we've done as much as we should do for ordinary users. Limiting the cases under which constraint ignoring works is certainly fine by me, but I was assuming that we were trying to make it accessable to any restore. If that's not true, then we don't need to worry about that part of the issue. It is not true. Fact is that restoring can require more rights than creating the dump. That is already the case if you want to restore anything that contains objects owned by different users. Trying to enable everyone who can take a dump also to restore it, by whatever mechanism, gives someone the right to revert things in time and create a situation (consistent or not) that he could not (re)create without doing dump/restore. This is wrong and should not be possible. I think this is a larger argument than the one that was being discussed above. Given a dump of objects I own, can I restore them without requiring the fk check to be done if I alter table add constraint a foreign key? If the answer to that is no, then the option can be put in as a superuser only option and it's relatively easy. If the answer to that is yes, then there are additional issues that need to be resolved. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: That might be a good restriction too (on top of my speculation about not allowing it in postgresql.conf). Only allow it to be SET per-session, We don't have a way to make something unsetable in postgresql.conf right now, do we? Yeah, we do --- see GUC_DISALLOW_IN_FILE. The existing variable zero_damaged_pages has the same restrictions we're talking about here, and for largely the same reasons: you can shoot yourself in the foot with it. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Stephan Szabo wrote: On Tue, 30 Sep 2003, Jan Wieck wrote: Stephan Szabo wrote: On Tue, 30 Sep 2003, Tom Lane wrote: I see where Stephan is coming from, but in my mind disabling consistency checks ought to be a feature reserved to the DBA (ie superuser), who presumably has some clue about the tradeoffs involved. I don't think ordinary users should be able to do it. If we can get the cost of performing the initial check down to something reasonable (and I don't mean near zero, I mean something that's small in comparison to the other costs of loading data and creating indexes), then I think we've done as much as we should do for ordinary users. Limiting the cases under which constraint ignoring works is certainly fine by me, but I was assuming that we were trying to make it accessable to any restore. If that's not true, then we don't need to worry about that part of the issue. It is not true. Fact is that restoring can require more rights than creating the dump. That is already the case if you want to restore anything that contains objects owned by different users. Trying to enable everyone who can take a dump also to restore it, by whatever mechanism, gives someone the right to revert things in time and create a situation (consistent or not) that he could not (re)create without doing dump/restore. This is wrong and should not be possible. I think this is a larger argument than the one that was being discussed above. Given a dump of objects I own, can I restore them without requiring the fk check to be done if I alter table add constraint a foreign key? If the answer to that is no, then the option can be put in as a superuser only option and it's relatively easy. If the answer to that is yes, then there are additional issues that need to be resolved. Well, with the CREATE CONSTRAINT TRIGGER you _can_, but we already have a consensus that we don't _want_ that. Probably we should declare it deprecated and remove it in 7.5. And the option currently under discussion is exactly what will cause ALTER TABLE to let you, but IMHO that _should_ be restricted. Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
On Tue, 30 Sep 2003, Tom Lane wrote: I see where Stephan is coming from, but in my mind disabling consistency checks ought to be a feature reserved to the DBA (ie superuser), who presumably has some clue about the tradeoffs involved. I don't think ordinary users should be able to do it. If we can get the cost of performing the initial check down to something reasonable (and I don't mean near zero, I mean something that's small in comparison to the other costs of loading data and creating indexes), then I think we've done as much as we should do for ordinary users. Limiting the cases under which constraint ignoring works is certainly fine by me, but I was assuming that we were trying to make it accessable to any restore. If that's not true, then we don't need to worry about that part of the issue. As a side note, in the partial implementation I'd already done, I noticed a potential problem if the person doing the alter table didn't have read permissions on the pktable. I'd written it to bail and do the slow check in that case (well actually in most error cases that didn't themselves cause an elog), does anyone have a better idea? ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Stephan Szabo [EMAIL PROTECTED] writes: As a side note, in the partial implementation I'd already done, I noticed a potential problem if the person doing the alter table didn't have read permissions on the pktable. I'd written it to bail and do the slow check in that case (well actually in most error cases that didn't themselves cause an elog), does anyone have a better idea? Wouldn't all the subsequent triggers fail also in such a case? (For that matter, wouldn't the existing implementation of the initial check fail?) I can't see a reason to expend code to avoid failing here. It's not very sensible to be able to create an FK on a table you don't have read permission for. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
On Tue, 30 Sep 2003, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: As a side note, in the partial implementation I'd already done, I noticed a potential problem if the person doing the alter table didn't have read permissions on the pktable. I'd written it to bail and do the slow check in that case (well actually in most error cases that didn't themselves cause an elog), does anyone have a better idea? Wouldn't all the subsequent triggers fail also in such a case? (For that matter, wouldn't the existing implementation of the initial check fail?) I can't see a reason to expend code to avoid failing here. It's No, because the triggers change permissions to the owner of the appropriate (either fk or pk) table before running the query, so the old method works as well as the final constraint would. However, if the two owners are not the same, you can't set to both during the single query. not very sensible to be able to create an FK on a table you don't have read permission for. IIRC, you only need references permissions to make an fk constraint, not select. ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Bruce Momjian [EMAIL PROTECTED] writes: Stephan Szabo wrote: The problem I have with a super-user only solution is that it doesn't solve the problem for restores in general. OK. Let's explore that. What does ownership mean? It does not normally mean the ability to bypass consistency checks; for example, if you put a CHECK constraint on a table, you don't get to violate it because you own the table. (Of course superuserness doesn't let you do so either...) I see where Stephan is coming from, but in my mind disabling consistency checks ought to be a feature reserved to the DBA (ie superuser), who presumably has some clue about the tradeoffs involved. I don't think ordinary users should be able to do it. If we can get the cost of performing the initial check down to something reasonable (and I don't mean near zero, I mean something that's small in comparison to the other costs of loading data and creating indexes), then I think we've done as much as we should do for ordinary users. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Stephan Szabo wrote: On Tue, 30 Sep 2003, Tom Lane wrote: I see where Stephan is coming from, but in my mind disabling consistency checks ought to be a feature reserved to the DBA (ie superuser), who presumably has some clue about the tradeoffs involved. I don't think ordinary users should be able to do it. If we can get the cost of performing the initial check down to something reasonable (and I don't mean near zero, I mean something that's small in comparison to the other costs of loading data and creating indexes), then I think we've done as much as we should do for ordinary users. Limiting the cases under which constraint ignoring works is certainly fine by me, but I was assuming that we were trying to make it accessable to any restore. If that's not true, then we don't need to worry about that part of the issue. It is not true. Fact is that restoring can require more rights than creating the dump. That is already the case if you want to restore anything that contains objects owned by different users. Trying to enable everyone who can take a dump also to restore it, by whatever mechanism, gives someone the right to revert things in time and create a situation (consistent or not) that he could not (re)create without doing dump/restore. This is wrong and should not be possible. Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Christopher Kings-Lynne wrote: You could just as easily argue that the lack of integrity testing at data load time was equally a bug. I think we need someway of telling postgres to suppress a foreign key check. The main problem is that the foreign key column is often not indexed. So a db designer made a bloody mistake. The problem is there's no easy way to find out what's missing. I'd really like EXPLAIN to display all subsequent triggered queries also, to see the full scans caused by missing indexes. Regards, Andreas ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
So a db designer made a bloody mistake. The problem is there's no easy way to find out what's missing. I'd really like EXPLAIN to display all subsequent triggered queries also, to see the full scans caused by missing indexes. I'd sure second that! Chris ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Andreas Pflug kirjutas E, 29.09.2003 kell 12:04: Christopher Kings-Lynne wrote: You could just as easily argue that the lack of integrity testing at data load time was equally a bug. I think we need someway of telling postgres to suppress a foreign key check. The main problem is that the foreign key column is often not indexed. So a db designer made a bloody mistake. The problem is there's no easy way to find out what's missing. I'd really like EXPLAIN to display all subsequent triggered queries also, to see the full scans caused by missing indexes. It could probably be doable for EXPLAIN ANALYZE (by actually tracing execution), but then you will see really _all_ queries, i.e. for a 1000 row update you would see 1 UPDATE query and 1000 fk checks ... OTOH, you probably can get that already from logs with right logging parameters. - Hannu ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
So a db designer made a bloody mistake. The problem is there's no easy way to find out what's missing. I'd really like EXPLAIN to display all subsequent triggered queries also, to see the full scans caused by missing indexes. It could probably be doable for EXPLAIN ANALYZE (by actually tracing execution), but then you will see really _all_ queries, i.e. for a 1000 row update you would see 1 UPDATE query and 1000 fk checks ... OTOH, you probably can get that already from logs with right logging parameters. Actually - it shouldn't be too hard to write a query that returns all unindexed foreign keys, surely? Chris ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
On Monday 29 September 2003 15:58, Christopher Kings-Lynne wrote: So a db designer made a bloody mistake. The problem is there's no easy way to find out what's missing. I'd really like EXPLAIN to display all subsequent triggered queries also, to see the full scans caused by missing indexes. It could probably be doable for EXPLAIN ANALYZE (by actually tracing execution), but then you will see really _all_ queries, i.e. for a 1000 row update you would see 1 UPDATE query and 1000 fk checks ... OTOH, you probably can get that already from logs with right logging parameters. Actually - it shouldn't be too hard to write a query that returns all unindexed foreign keys, surely? Correct me if I am wrong but I remember postgresql throwing error that foreign key field was not unique in foreign table. Obviously it can not detect that without an index. Either primary key or unique constraint would need an index. What am I missing here? IOW, how do I exactly create foreign keys without an index? Shridhar ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
On Mon, 29 Sep 2003, Christopher Kings-Lynne wrote: So a db designer made a bloody mistake. The problem is there's no easy way to find out what's missing. I'd really like EXPLAIN to display all subsequent triggered queries also, to see the full scans caused by missing indexes. I'd sure second that! That's only partially determinable though. The trigger code could branch and run two different queries depending on the values supplied in the 'input' tuple of the trigger. Nigel ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Shridhar Daithankar kirjutas E, 29.09.2003 kell 13:34: On Monday 29 September 2003 15:58, Christopher Kings-Lynne wrote: So a db designer made a bloody mistake. The problem is there's no easy way to find out what's missing. I'd really like EXPLAIN to display all subsequent triggered queries also, to see the full scans caused by missing indexes. It could probably be doable for EXPLAIN ANALYZE (by actually tracing execution), but then you will see really _all_ queries, i.e. for a 1000 row update you would see 1 UPDATE query and 1000 fk checks ... OTOH, you probably can get that already from logs with right logging parameters. Actually - it shouldn't be too hard to write a query that returns all unindexed foreign keys, surely? Correct me if I am wrong but I remember postgresql throwing error that foreign key field was not unique in foreign table. Obviously it can not detect that without an index. Either primary key or unique constraint would need an index. What am I missing here? IOW, how do I exactly create foreign keys without an index? hannu=# create table pkt(i int primary key); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index 'pkt_pkey' for table 'pkt' CREATE TABLE hannu=# create table fkt(j int references pkt); NOTICE: CREATE TABLE will create implicit trigger(s) for FOREIGN KEY check(s) CREATE TABLE hannu=# now the *foreygn key* column (fkt.j) is without index. As foreign keys are enforced both ways, this can be a problem when changing table pkt or bulk creating FK's on big tables. Hannu ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Nigel J. Andrews wrote: On Mon, 29 Sep 2003, Christopher Kings-Lynne wrote: So a db designer made a bloody mistake. The problem is there's no easy way to find out what's missing. I'd really like EXPLAIN to display all subsequent triggered queries also, to see the full scans caused by missing indexes. I'd sure second that! That's only partially determinable though. The trigger code could branch and run two different queries depending on the values supplied in the 'input' tuple of the trigger. That would be ok; if I got a problem with a certain query, I don't expect to find problems I might get with other queries. Though this would be nice, how about a general pg_gimme_all_problems() function :-) Regards, Andreas ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
So a db designer made a bloody mistake. Not necessarily. If I'm never going to update or delete from the parent table the index would be useless. I find very few of my foreign key relationships actually need indexes on the child table. I usually only have the unique index on the parent table. And often the child table is the big table. The index would be very large and have awful selectivity. The last thing I want is a 5-million record table with half a dozen indexes each with 10-20 unique values. The problem is there's no easy way to find out what's missing. I'd really like EXPLAIN to display all subsequent triggered queries also, to see the full scans caused by missing indexes. I'd sure second that! I think the root of problem here is the same as the root of the problem with foreign key checks being slow for large batch updates and inserts. Namely that foreign key constraint checks are being handled as a million small queries. To handle foreign key constraints optimally they would really have to be merged into the plan in a kind of join. For most inserts/updates something like a nested-loop join that is effectively the same as the current triggers would be used. But for large batch updates/inserts it's quite possible that it would look more like a hash join or even a merge join. To do that would probably mean throwing out the whole trigger-based implementation though, which seems like an awfully big project. And being able to disable and reenable constraints would still be nice. They're never going to be instantaneous. And besides, speed isn't the only reason to want to disable constraints temporarily. The database is a tool, it should be there to do the DBA's bidding, not the other way around :) -- greg ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
On Sun, 28 Sep 2003, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: On Sat, 27 Sep 2003, Tom Lane wrote: I thought of what seems to be a better design for the check query: use a LEFT JOIN and check for NULL in the righthand joined column. Hmm, my initial testing showed that it really was a little slower than a more complicated one with NOT EXISTS so I'd abandoned it. How does it fare for you compared to: select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1 and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null; Were you testing against 7.3 or 7.4? On what kinds of tables? In 7.4 I think that the JOIN would yield as good or better a plan. The best possible plan for the NOT EXISTS query is effectively a nestloop with inner indexscan, which is great if the FK table is small and the PK table is large, but it sucks otherwise. The planner should choose a plan of this form for the LEFT JOIN given that combination of table sizes, and so there shouldn't be any great difference in runtime in that case. But in other combinations, such as large FK and small PK, other plan types will beat the pants off nestloop. As an update, so far I still am getting better results with NOT EXISTS than the left join. For a 50m row fk, 10k row pk where the rows are just the keys, I'm getting a plan like Merge Join Index scan on pktable Sort Seqscan on fktable which is taking about 2-4 times longer for me than the not exists depending on sort_mem (at 4096,64000,128000). When I lowered random_page_cost to 1, I got an indexscan on fktable, but that hadn't seemed to finish after about 2 hours (as opposed to about 30-35 minutes for the not exists and about 30 min - 1 1/2 hours for the sort+seqscan version. I want to do some more tests where there's extraneous data in both tables and see what that does to the results. ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
On Mon, 29 Sep 2003, Stephan Szabo wrote: When I lowered random_page_cost to 1, I got an indexscan on fktable, but that hadn't seemed to finish after about 2 hours (as opposed to about 30-35 minutes for the not exists and about 30 min - 1 1/2 hours for the Small correction, I'd meant to type 20-35 minutes above, not 30-35. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Stephan Szabo [EMAIL PROTECTED] writes: As an update, so far I still am getting better results with NOT EXISTS than the left join. Hm. Don't suppose you were using EXPLAIN ANALYZE so we could see what's happening? This is clearly a planner failure, although I'm unsure if we can expect the planner to get the right answer with no pg_statistic entries. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
On Mon, 29 Sep 2003, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: As an update, so far I still am getting better results with NOT EXISTS than the left join. Hm. Don't suppose you were using EXPLAIN ANALYZE so we could see what's happening? This is clearly a planner failure, although I'm unsure if we can expect the planner to get the right answer with no pg_statistic entries. For the sort+seq one and the not exists, I had, but I'll re-run it (it's on my home desktop that I won't be able to access). The other when I forced it to use an index scan I haven't let complete yet, and I don't know how long that will take. I was also planning to run a set after running analyze, so I'll include those too. It'll probably be a few hours before the results are in. :) Are there any other options (enable_mergejoin, etc) that you want me to try with? ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Greg Stark wrote: So a db designer made a bloody mistake. Not necessarily. If I'm never going to update or delete from the parent table the index would be useless. I find very few of my foreign key relationships actually need indexes on the child table. I usually only have the unique index on the parent table. And often the child table is the big table. The index would be very large and have awful selectivity. The last thing I want is a 5-million record table with half a dozen indexes each with 10-20 unique values. The problem is there's no easy way to find out what's missing. I'd really like EXPLAIN to display all subsequent triggered queries also, to see the full scans caused by missing indexes. I'd sure second that! I think the root of problem here is the same as the root of the problem with foreign key checks being slow for large batch updates and inserts. Namely that foreign key constraint checks are being handled as a million small queries. Apart from missing indices, this is certainly a problem. Statement level triggers will solve this, as soon as they are fully implemented and support OLD and NEW record sets. Regards, Andreas ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Stephan Szabo [EMAIL PROTECTED] writes: Hm. Don't suppose you were using EXPLAIN ANALYZE so we could see what's happening? This is clearly a planner failure, although I'm unsure if we can expect the planner to get the right answer with no pg_statistic entries. The left join one seems to give me values like the following: There are some fishy row estimates in here: - Index Scan using pktest_a_key on pktest (cost=0.00..52.00 rows=1000 width=8) (actual time=17.82..1609.97 rows=1 loops=1) The system definitely should be expected to have the accurate row count for the PK table, since an index should have been created on it (and we do do that after loading the data, no?). It is possible that it'd have the default 1000 estimate for the FK table, if there are no indexes at all on the FK table; otherwise it should have the right number. It's not real clear to me what conditions you're testing under, but the estimates in the plans you're quoting aren't consistent ... regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
On Mon, 29 Sep 2003, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: Hm. Don't suppose you were using EXPLAIN ANALYZE so we could see what's happening? This is clearly a planner failure, although I'm unsure if we can expect the planner to get the right answer with no pg_statistic entries. The left join one seems to give me values like the following: There are some fishy row estimates in here: - Index Scan using pktest_a_key on pktest (cost=0.00..52.00 rows=1000 width=8) (actual time=17.82..1609.97 rows=1 loops=1) The system definitely should be expected to have the accurate row count for the PK table, since an index should have been created on it (and we do do that after loading the data, no?). It is possible that it'd have the default 1000 estimate for the FK table, if there are no indexes at all on the FK table; otherwise it should have the right number. It's not real clear to me what conditions you're testing under, but the estimates in the plans you're quoting aren't consistent ... Well, they're all from the same load of the same data with only stopping and starting in between, but I did make the index on the pk table first loaded the data and then built the fk table index ( because I'd wanted to try without the index as well), which meant that it wouldn't match the behavior of a dump. Ugh, I'd forgotten that the primary key didn't get created until later too. Okay, that's much better: Hash Left Join (cost=203.00..1487869.29 rows=49501250 width=4) (actual time=611632.67..611632.67 rows=0 loops=1) Hash Cond: ((outer.b = inner.a) AND (outer.c = inner.b)) Filter: (inner.a IS NULL) - Seq Scan on fktest (cost=0.00..745099.00 rows=49501250 width=8) (actual time=0.01..169642.48 rows=5000 loops=1) Filter: ((b IS NOT NULL) AND (c IS NOT NULL)) - Hash (cost=152.00..152.00 rows=1 width=8) (actual time=46.04..46.04 rows=0 loops=1) - Seq Scan on pktest (cost=0.00..152.00 rows=1 width=8) (actual time=0.02..21.38 rows=1 loops=1) Total runtime: 611632.95 msec (8 rows) That's much better. :) As long as the row estimates are reasonable it seems to be okay, but I do wonder why it chose the merge join for the case when it thought there was only 1000 rows though. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
(I think my previous attempt got aborted by a lost connection, so a message like this may arrive twice) On Mon, 29 Sep 2003, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: Hm. Don't suppose you were using EXPLAIN ANALYZE so we could see what's happening? This is clearly a planner failure, although I'm unsure if we can expect the planner to get the right answer with no pg_statistic entries. The left join one seems to give me values like the following: There are some fishy row estimates in here: - Index Scan using pktest_a_key on pktest (cost=0.00..52.00 rows=1000 width=8) (actual time=17.82..1609.97 rows=1 loops=1) The system definitely should be expected to have the accurate row count for the PK table, since an index should have been created on it (and we do do that after loading the data, no?). It is possible that it'd have the default 1000 estimate for the FK table, if there are no indexes at all on the FK table; otherwise it should have the right number. It's not real clear to me what conditions you're testing under, but the estimates in the plans you're quoting aren't consistent ... Also, the sequence was basically: CREATE TABLE pktest(a int, b int, unique(a,b)); CREATE TABLE fktest(b int, c int); COPY pktest FROM STDIN; ... COPY fktest FROM STDIN; ... run some tests I didn't mention here CREATE INDEX fki on fktest(b,c); run the above test With stopping and restarting the server involved and running the tests multiple times. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Tom Lane wrote: Christopher Kings-Lynne [EMAIL PROTECTED] writes: I think we need someway of telling postgres to suppress a foreign key check. Well, the subtext argument here is do we fix it by providing a way to suppress the check, or do we fix it by making the check fast enough to be tolerable? I think the advantages of choice (b) are obvious --- it doesn't allow bogus data to be loaded accidentally, and it doesn't create a problem with loading existing 7.3 dump files that don't know how to suppress the check. If we find there is no way to do (b) acceptably well, then and only then would I want to consider (a). I think I can accept it to be the choice of the DBA what to do. Pg_dump has that kind of options already, one can choose between COPY and INSERT for example. Why not adding the choice of dumping FKeys as ALTER TABLE or CREATE CONSTRAINT TRIGGER? The whole original idea (way back a few years ago) of doing it with the CREATE CONSTRAINT TRIGGER command was, that your backup ought to be consistent anyway. Finding out that your tape contains inconsistent garbage _after_ your harddisk made that suspicious noise ... is a bit late, isn't it? That ALTER TABLE ... ADD CONSTRAINT needs to be improved, because at the moment it is normally used we cannot make any assumptions about data consistency, no question. But just because we have such a nice and allways asked for ALTER TABLE command does not mean we have to force every DBA of every well maintained and stable system to perform hourly long nonsense-tests on known-to-be-good data. Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Jan Wieck [EMAIL PROTECTED] writes: I think I can accept it to be the choice of the DBA what to do. Pg_dump has that kind of options already, one can choose between COPY and INSERT for example. Why not adding the choice of dumping FKeys as ALTER TABLE or CREATE CONSTRAINT TRIGGER? We don't want it dumping as CREATE CONSTRAINT TRIGGER, because (a) that loses pg_depend information and (b) it's too low-level a representation; we couldn't ever change the implementation of foreign keys as long as dumps look like that. Also, I don't see why you'd want to make such a choice at pg_dump time. Probably better to control it at restore time. Accordingly, my proposal if we were to go that route would be a boolean GUC variable that simply prevents ALTER TABLE ADD FOREIGN KEY from doing the validity checks. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Tom Lane wrote: Jan Wieck [EMAIL PROTECTED] writes: I think I can accept it to be the choice of the DBA what to do. Pg_dump has that kind of options already, one can choose between COPY and INSERT for example. Why not adding the choice of dumping FKeys as ALTER TABLE or CREATE CONSTRAINT TRIGGER? We don't want it dumping as CREATE CONSTRAINT TRIGGER, because (a) that loses pg_depend information and (b) it's too low-level a representation; we couldn't ever change the implementation of foreign keys as long as dumps look like that. That's finally 2 points, okay. Also, I don't see why you'd want to make such a choice at pg_dump time. Probably better to control it at restore time. Accordingly, my proposal if we were to go that route would be a boolean GUC variable that simply prevents ALTER TABLE ADD FOREIGN KEY from doing the validity checks. Okay too. And this would be simple and safe enough to add it at the time being. Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Correct me if I am wrong but I remember postgresql throwing error that foreign key field was not unique in foreign table. Obviously it can not detect that without an index. Either primary key or unique constraint would need an index. What am I missing here? IOW, how do I exactly create foreign keys without an index? You are taling about the primary key or the referenced key, not the foreign key. (eg. the source column) Chris ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Jan Wieck wrote: Tom Lane wrote: Jan Wieck [EMAIL PROTECTED] writes: I think I can accept it to be the choice of the DBA what to do. Pg_dump has that kind of options already, one can choose between COPY and INSERT for example. Why not adding the choice of dumping FKeys as ALTER TABLE or CREATE CONSTRAINT TRIGGER? We don't want it dumping as CREATE CONSTRAINT TRIGGER, because (a) that loses pg_depend information and (b) it's too low-level a representation; we couldn't ever change the implementation of foreign keys as long as dumps look like that. That's finally 2 points, okay. Also, I don't see why you'd want to make such a choice at pg_dump time. Probably better to control it at restore time. Accordingly, my proposal if we were to go that route would be a boolean GUC variable that simply prevents ALTER TABLE ADD FOREIGN KEY from doing the validity checks. Okay too. And this would be simple and safe enough to add it at the time being. If we go that direction, why don't we just make a GUC variable to disable constraint checking. Is that what this will do, or is it more limited. I know it breaks referential integrity, but we have had many folks as for it, it is on the TODO list, and there are tons of server functions flying around that do just this by fiddling with pg_class. I would rather just have it be a GUC for that particular backend. People are going to need to turn it off anyway, so why not give them a clean way to do it. Also, how does someone turn it on at restore time if they are piping into psql? -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Bruce Momjian [EMAIL PROTECTED] writes: Jan Wieck wrote: Tom Lane wrote: if we were to go that route would be a boolean GUC variable that simply prevents ALTER TABLE ADD FOREIGN KEY from doing the validity checks. Okay too. And this would be simple and safe enough to add it at the time being. If we go that direction, why don't we just make a GUC variable to disable constraint checking. You mean in general, even for plain old insert/update/delete changes? Yipes. What happened to ACID compliance? What I actually expected to ensue was a discussion about how we could narrow down the effects of a disable-foreign-key-verification switch to reduce the odds of shooting oneself in the foot. (For example, maybe disallow it from being set in postgresql.conf.) I wasn't expecting proposals to enlarge the gauge of the foot-gun ... Also, how does someone turn it on at restore time if they are piping into psql? Something like export PGOPTIONS=-c disable-fk-verification=true then run psql or pg_restore. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
On Tue, 30 Sep 2003, Bruce Momjian wrote: Jan Wieck wrote: Tom Lane wrote: Jan Wieck [EMAIL PROTECTED] writes: I think I can accept it to be the choice of the DBA what to do. Pg_dump has that kind of options already, one can choose between COPY and INSERT for example. Why not adding the choice of dumping FKeys as ALTER TABLE or CREATE CONSTRAINT TRIGGER? We don't want it dumping as CREATE CONSTRAINT TRIGGER, because (a) that loses pg_depend information and (b) it's too low-level a representation; we couldn't ever change the implementation of foreign keys as long as dumps look like that. That's finally 2 points, okay. Also, I don't see why you'd want to make such a choice at pg_dump time. Probably better to control it at restore time. Accordingly, my proposal if we were to go that route would be a boolean GUC variable that simply prevents ALTER TABLE ADD FOREIGN KEY from doing the validity checks. Okay too. And this would be simple and safe enough to add it at the time being. If we go that direction, why don't we just make a GUC variable to disable constraint checking. Is that what this will do, or is it more limited. I know it breaks referential integrity, but we have had many folks as for it, it is on the TODO list, and there are tons of server functions flying around that do just this by fiddling with pg_class. I would rather just have it be a GUC for that particular backend. People are going to need to turn it off anyway, so why not give them a clean way to do it. But such a GUC wouldn't affect just one backend. It'd potentially affect all backends that were doing concurrent modifications that would be involved since the locks aren't taken. In addition, who would be allowed to set this value and what constraints would it affect? If it's only superusers, then it doesn't help for non-superuser restores. If it's settable by anyone and affects only constraints on tables that user owns and that refer to tables that user owns it might be okay. If it's settable by anyone and affects all tables it renders the constraints meaningless since anyone could break them. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Tom Lane wrote: If we go that direction, why don't we just make a GUC variable to disable constraint checking. You mean in general, even for plain old insert/update/delete changes? Yipes. What happened to ACID compliance? What I actually expected to ensue was a discussion about how we could narrow down the effects of a disable-foreign-key-verification switch to reduce the odds of shooting oneself in the foot. (For example, maybe disallow it from being set in postgresql.conf.) I wasn't expecting proposals to enlarge the gauge of the foot-gun ... Fact is, folks are doing it anyway by modifying pg_class. I know one guy who did it in a transaction so he was the only one to see the triggers disabled! The PostgreSQL cookbook page has an example too. People are always asking how to do this. Why not just make it setable only by the super-user. FYI, TODO has: * Allow triggers to be disabled [trigger] * With disabled triggers, allow pg_dump to use ALTER TABLE ADD FOREIGN KEY The second one is the one we are discussing. If we never want to do it, I will remove it from the TODO list. However, I think we might be making things too controlled by not allowing administrators to do this. Also, how does someone turn it on at restore time if they are piping into psql? Something like export PGOPTIONS=-c disable-fk-verification=true then run psql or pg_restore. How many folks are going to remember to do this? Why make it hard for them? Someone is going to forget too easily. Why is this restore taking so long? Oh, I forgot that switch. Or they put it in a login file and forget it is set. Seems safer for it to be in the dump file. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Stephan Szabo wrote: If we go that direction, why don't we just make a GUC variable to disable constraint checking. Is that what this will do, or is it more limited. I know it breaks referential integrity, but we have had many folks as for it, it is on the TODO list, and there are tons of server functions flying around that do just this by fiddling with pg_class. I would rather just have it be a GUC for that particular backend. People are going to need to turn it off anyway, so why not give them a clean way to do it. But such a GUC wouldn't affect just one backend. It'd potentially affect all backends that were doing concurrent modifications that would be involved since the locks aren't taken. In addition, who would be allowed to set this value and what constraints would it affect? If it's only superusers, then it doesn't help for non-superuser restores. If it's settable by anyone and affects only constraints on tables that user owns and that refer to tables that user owns it might be okay. If it's settable by anyone and affects all tables it renders the constraints meaningless since anyone could break them. I assume it would be only setable by the super-user. They are mucking around with pg_class anyway (and have permission to do so), so let them do it cleanly at least. Allowing non-supers to do it for tables they own would be OK, I guess. Is there a problem if some of the primary table is owned by someone else? Not sure. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Bruce Momjian [EMAIL PROTECTED] writes: How many folks are going to remember to do this? Why make it hard for them? Someone is going to forget too easily. Why is this restore taking so long? Oh, I forgot that switch. Or they put it in a login file and forget it is set. Seems safer for it to be in the dump file. I disagree. The how many folks are going to remember to do this argument applies just as well to magic pg_dump switches; that's not a tenable argument against doing it at restore time. The difference between controlling it at pg_dump time and pg_restore time is that if you change your mind after having made the dump, it's too late, if the decision was nailed down in the dump file. In an upgrade situation it's very likely that you no longer have the option to re-do your dump, because you already blew away your old installation. Since there's no performance difference at pg_dump time, I can't see any advantage to freezing your decision then. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Bruce Momjian [EMAIL PROTECTED] writes: I assume it would be only setable by the super-user. That might be a good restriction too (on top of my speculation about not allowing it in postgresql.conf). Only allow it to be SET per-session, and only by a superuser. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
On Tue, 30 Sep 2003, Bruce Momjian wrote: Stephan Szabo wrote: If we go that direction, why don't we just make a GUC variable to disable constraint checking. Is that what this will do, or is it more limited. I know it breaks referential integrity, but we have had many folks as for it, it is on the TODO list, and there are tons of server functions flying around that do just this by fiddling with pg_class. I would rather just have it be a GUC for that particular backend. People are going to need to turn it off anyway, so why not give them a clean way to do it. But such a GUC wouldn't affect just one backend. It'd potentially affect all backends that were doing concurrent modifications that would be involved since the locks aren't taken. In addition, who would be allowed to set this value and what constraints would it affect? If it's only superusers, then it doesn't help for non-superuser restores. If it's settable by anyone and affects only constraints on tables that user owns and that refer to tables that user owns it might be okay. If it's settable by anyone and affects all tables it renders the constraints meaningless since anyone could break them. I assume it would be only setable by the super-user. They are mucking around with pg_class anyway (and have permission to do so), so let them do it cleanly at least. Allowing non-supers to do it for tables they own would be OK, I guess. Is there a problem if some of the primary table is owned by someone else? Not sure. The problem I have with a super-user only solution is that it doesn't solve the problem for restores in general. I think we need a mechanism that works for any user that wants to restore a table (or tables) from dump(s), so for the dump/restore mechanism I think we should be looking in that direction. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Stephan Szabo [EMAIL PROTECTED] writes: On Sat, 27 Sep 2003, Tom Lane wrote: I thought of what seems to be a better design for the check query: use a LEFT JOIN and check for NULL in the righthand joined column. Hmm, my initial testing showed that it really was a little slower than a more complicated one with NOT EXISTS so I'd abandoned it. How does it fare for you compared to: select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1 and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null; Were you testing against 7.3 or 7.4? On what kinds of tables? In 7.4 I think that the JOIN would yield as good or better a plan. The best possible plan for the NOT EXISTS query is effectively a nestloop with inner indexscan, which is great if the FK table is small and the PK table is large, but it sucks otherwise. The planner should choose a plan of this form for the LEFT JOIN given that combination of table sizes, and so there shouldn't be any great difference in runtime in that case. But in other combinations, such as large FK and small PK, other plan types will beat the pants off nestloop. I've actually got code (that no longer cleanly applies, but...) that uses the single query version with NOT EXISTS (which could be easily changed to either of the other forms) and was planning to put it together for a patch when 7.5 devel started because I figured it wasn't precisely a bug and wouldn't get accepted for 7.4. Well, Bruce has this on his open-items list, so I figure we have a green light to do something for 7.4 if we can work out what to do. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
On Sun, 28 Sep 2003, Bruce Momjian wrote: Stephan Szabo wrote: Hmm, my initial testing showed that it really was a little slower than a more complicated one with NOT EXISTS so I'd abandoned it. How does it fare for you compared to: select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1 and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null; I believe the above is the appropriate not exists form for match unspecified. I've actually got code (that no longer cleanly applies, but...) that uses the single query version with NOT EXISTS (which could be easily changed to either of the other forms) and was planning to put it together for a patch when 7.5 devel started because I figured it wasn't precisely a bug and wouldn't get accepted for 7.4. I am a little lost on this point myself --- are we talking 7.4 or 7.5 for this change? I'd thought 7.5, but I now see that it's on the 7.4 open items list. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: Well, we haven't even *got* a proposed patch yet, but yeah we should tread carefully. OK. What releases had this slow restore problem? We introduced it in 7.3 --- before that, FKs were simply dumped as create trigger commands, and there was no check overhead. So arguably it is a bug; a performance bug maybe, but that's still a bug. No one has yet gone through a dump/reload cycle in which they had to face this penalty. regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Bruce Momjian [EMAIL PROTECTED] writes: Let's have multiple people eyeball the patch and give it an OK and we can add it for 7.4 if people want it. Well, we haven't even *got* a proposed patch yet, but yeah we should tread carefully. I do think it'd be okay to apply a patch if we can come up with one that Stephan and Jan and I all like. As you say, dump/reload speed normally doesn't get thought about except at this stage of the release cycle, so ... regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Let's have multiple people eyeball the patch and give it an OK and we can add it for 7.4 if people want it. Well, we haven't even *got* a proposed patch yet, but yeah we should tread carefully. I do think it'd be okay to apply a patch if we can come up with one that Stephan and Jan and I all like. As you say, dump/reload speed normally doesn't get thought about except at this stage of the release cycle, so ... OK. What releases had this slow restore problem? -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: Well, we haven't even *got* a proposed patch yet, but yeah we should tread carefully. OK. What releases had this slow restore problem? We introduced it in 7.3 --- before that, FKs were simply dumped as create trigger commands, and there was no check overhead. So arguably it is a bug; a performance bug maybe, but that's still a bug. No one has yet gone through a dump/reload cycle in which they had to face this penalty. Now that is a strong argument. I knew you would find one. :-) -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Stephan Szabo wrote: Hmm, my initial testing showed that it really was a little slower than a more complicated one with NOT EXISTS so I'd abandoned it. How does it fare for you compared to: select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1 and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null; I believe the above is the appropriate not exists form for match unspecified. I've actually got code (that no longer cleanly applies, but...) that uses the single query version with NOT EXISTS (which could be easily changed to either of the other forms) and was planning to put it together for a patch when 7.5 devel started because I figured it wasn't precisely a bug and wouldn't get accepted for 7.4. I am a little lost on this point myself --- are we talking 7.4 or 7.5 for this change? -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Tom Lane wrote: I've actually got code (that no longer cleanly applies, but...) that uses the single query version with NOT EXISTS (which could be easily changed to either of the other forms) and was planning to put it together for a patch when 7.5 devel started because I figured it wasn't precisely a bug and wouldn't get accepted for 7.4. Well, Bruce has this on his open-items list, so I figure we have a green light to do something for 7.4 if we can work out what to do. I put it on because I wasn't clear exactly what was happening in the discussion. There also was discussion that we want to improve this now because everyone will be using for upgrading to 7.4, and with a ~50% db reload speed improvement, it is hard to ignore. I am not against the idea of adding it to 7.4 if we can do it cleanly, and in fact we are sort of waiting for more serious bug reports at this time, so doing something else to improve the code isn't out of the question if we can do it without stumbling --- seems dump/reload gets full attention only during beta, which makes sense. However, I think we have to be honest that this is a performance _improvement_, not a fix. Yea, you can say it was a bug that we did it the way we did in the past, but you have to look real hard to see it that way. :-) Let's have multiple people eyeball the patch and give it an OK and we can add it for 7.4 if people want it. If you look really hard, you can say it is a fix for a missing pg_upgrade! -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
On Sun, 28 Sep 2003, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: On Sat, 27 Sep 2003, Tom Lane wrote: I thought of what seems to be a better design for the check query: use a LEFT JOIN and check for NULL in the righthand joined column. Hmm, my initial testing showed that it really was a little slower than a more complicated one with NOT EXISTS so I'd abandoned it. How does it fare for you compared to: select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1 and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null; Were you testing against 7.3 or 7.4? On what kinds of tables? 7.4 with tables of 1-2 integer key columns with 10k-1m pk rows of sequential data (in the 2 key case it was value and #rows-value iirc) and 1m-20m fk rows of randomly generated valid data. But it wasn't any sort of amazingly detailed test and those aren't huge tables, but I don't exactly have a huge machine. I can go back through, do more tests and report back. In 7.4 I think that the JOIN would yield as good or better a plan. The best possible plan for the NOT EXISTS query is effectively a nestloop with inner indexscan, which is great if the FK table is small and the PK table is large, but it sucks otherwise. The planner should choose a plan of this form for the LEFT JOIN given that combination of table sizes, and so there shouldn't be any great difference in runtime in that case. But in other combinations, such as large FK and small PK, other plan types will beat the pants off nestloop. That's what I was expecting too. I expected it to basically go, NOT IN, LEFT JOIN, NOT EXISTS in speed (at least when the hashing stuff happened for in given the not in enhancements), but didn't actually see that. I've actually got code (that no longer cleanly applies, but...) that uses the single query version with NOT EXISTS (which could be easily changed to either of the other forms) and was planning to put it together for a patch when 7.5 devel started because I figured it wasn't precisely a bug and wouldn't get accepted for 7.4. Well, Bruce has this on his open-items list, so I figure we have a green light to do something for 7.4 if we can work out what to do. I must have missed that. I'd have mentioned it earlier then. ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
On Sat, 27 Sep 2003, Tom Lane wrote: [ continuing a discussion from mid-August ] Stephan Szabo [EMAIL PROTECTED] writes: I assume what you have in mind is to replace validateForeignKeyConstraint() with something that does a join of the two tables via an SPI command. It'd probably be: MATCH unspecified: SELECT keycolumns FROM referencing_table WHERE (keycolumns) NOT IN (SELECT refcols FROM referenced_table WHERE refcol1 IS NOT NULL AND ... ) AND keycolumn1 IS NOT NULL AND ...; MATCH FULL: (something like, I haven't tried it) SELECT keycolumns FROM referencing_table WHERE ((keycolumns) NOT IN (SELECT refcols FROM referenced_table WHERE refcol1 IS NOT NULL AND ...) AND (keycolumn1 IS NOT NULL AND ...) ) OR ((keycolumn1 IS NOT NULL)!=(keycolumn2 IS NOT NULL) OR ...) I thought of what seems to be a better design for the check query: use a LEFT JOIN and check for NULL in the righthand joined column. For example, I think a MATCH UNSPECIFIED on two columns could be tested like this: select f1,f2 from fk left join pk on (fk.f1=pk.f1 and fk.f2=pk.f2) where pk.f1 is null and (fk.f1 is not null and fk.f2 is not null); and MATCH FULL is the same except where pk.f1 is null and (fk.f1 is not null OR fk.f2 is not null); MATCH PARTIAL would be harder; I think you'd need to generate a separate query for each subset of the columns, in which you would probe for unmatched rows having exactly that subset non-null. But it could be done. Do you see any logical error here? In some preliminary tests, the planner seems to be able to choose reasonable plans for this type of query even without pg_statistic data, as long as it knows the table sizes (which it would do after CREATE INDEX). So it would work reasonably well during a pg_dump script, I think. Hmm, my initial testing showed that it really was a little slower than a more complicated one with NOT EXISTS so I'd abandoned it. How does it fare for you compared to: select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1 and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null; I believe the above is the appropriate not exists form for match unspecified. I've actually got code (that no longer cleanly applies, but...) that uses the single query version with NOT EXISTS (which could be easily changed to either of the other forms) and was planning to put it together for a patch when 7.5 devel started because I figured it wasn't precisely a bug and wouldn't get accepted for 7.4. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
You could just as easily argue that the lack of integrity testing at data load time was equally a bug. I think we need someway of telling postgres to suppress a foreign key check. The main problem is that the foreign key column is often not indexed. Chris Bruce Momjian wrote: Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: Well, we haven't even *got* a proposed patch yet, but yeah we should tread carefully. OK. What releases had this slow restore problem? We introduced it in 7.3 --- before that, FKs were simply dumped as create trigger commands, and there was no check overhead. So arguably it is a bug; a performance bug maybe, but that's still a bug. No one has yet gone through a dump/reload cycle in which they had to face this penalty. Now that is a strong argument. I knew you would find one. :-) ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Christopher Kings-Lynne wrote: You could just as easily argue that the lack of integrity testing at data load time was equally a bug. I think we need someway of telling postgres to suppress a foreign key check. The main problem is that the foreign key column is often not indexed. As I remember, the new code is showing full table checks of a few seconds, rather than minutes, but I agree we do need a way to turn off checks some times. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Christopher Kings-Lynne [EMAIL PROTECTED] writes: I think we need someway of telling postgres to suppress a foreign key check. Well, the subtext argument here is do we fix it by providing a way to suppress the check, or do we fix it by making the check fast enough to be tolerable? I think the advantages of choice (b) are obvious --- it doesn't allow bogus data to be loaded accidentally, and it doesn't create a problem with loading existing 7.3 dump files that don't know how to suppress the check. If we find there is no way to do (b) acceptably well, then and only then would I want to consider (a). regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
I think the advantages of choice (b) are obvious --- it doesn't allow bogus data to be loaded accidentally, and it doesn't create a problem with loading existing 7.3 dump files that don't know how to suppress the check. OK, I didn't realise there was a (b). I volunteer to do speed tests on data reloading on real data for our site. Chris ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
[ continuing a discussion from mid-August ] Stephan Szabo [EMAIL PROTECTED] writes: I assume what you have in mind is to replace validateForeignKeyConstraint() with something that does a join of the two tables via an SPI command. It'd probably be: MATCH unspecified: SELECT keycolumns FROM referencing_table WHERE (keycolumns) NOT IN (SELECT refcols FROM referenced_table WHERE refcol1 IS NOT NULL AND ... ) AND keycolumn1 IS NOT NULL AND ...; MATCH FULL: (something like, I haven't tried it) SELECT keycolumns FROM referencing_table WHERE ((keycolumns) NOT IN (SELECT refcols FROM referenced_table WHERE refcol1 IS NOT NULL AND ...) AND (keycolumn1 IS NOT NULL AND ...) ) OR ((keycolumn1 IS NOT NULL)!=(keycolumn2 IS NOT NULL) OR ...) I thought of what seems to be a better design for the check query: use a LEFT JOIN and check for NULL in the righthand joined column. For example, I think a MATCH UNSPECIFIED on two columns could be tested like this: select f1,f2 from fk left join pk on (fk.f1=pk.f1 and fk.f2=pk.f2) where pk.f1 is null and (fk.f1 is not null and fk.f2 is not null); and MATCH FULL is the same except where pk.f1 is null and (fk.f1 is not null OR fk.f2 is not null); MATCH PARTIAL would be harder; I think you'd need to generate a separate query for each subset of the columns, in which you would probe for unmatched rows having exactly that subset non-null. But it could be done. Do you see any logical error here? In some preliminary tests, the planner seems to be able to choose reasonable plans for this type of query even without pg_statistic data, as long as it knows the table sizes (which it would do after CREATE INDEX). So it would work reasonably well during a pg_dump script, I think. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
Stephan Szabo [EMAIL PROTECTED] writes: On Fri, 15 Aug 2003, Christopher Kings-Lynne wrote: I can also attest to the horrendously long time it takes to restore the ADD FOREIGN KEY section... That really needs to be rewritten to do a single check over the table rather than running the constraint for every row. I keep meaning to get around to it and never actually do. :( I'm not sure that in practice you'll get a better plan at restore time depending on what the default statistics give you. In simple cases I think that the creation of indexes would be enough to get you a passable plan --- CREATE INDEX does update pg_class.reltuples, so the planner will know how big the tables are, and for single-column primary keys the existence of a unique index is enough to cue the planner that the column is unique, even without any ANALYZE stats. Those are the biggest levers on the plan choice. This assumes that pg_dump always dumps CREATE INDEX before ADD FOREIGN KEY; I'm not certain if there's anything to enforce that at the moment... I assume what you have in mind is to replace validateForeignKeyConstraint() with something that does a join of the two tables via an SPI command. But supposing that we want to keep the present ability to report (one of) the failing key values, it seems like the query has to look like SELECT keycolumns FROM referencing_table WHERE keycolumns NOT IN (SELECT refcols FROM referenced_table); which is only gonna do the right thing for one of the MATCH styles (not sure which, offhand ... actually it may not do the right thing for any match style if there are nulls in referenced_table ...). How would you make it work for all the MATCH styles? And will it really be all that efficient? (NOT IN is a lot more circumscribed than IN.) regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)
On Fri, 15 Aug 2003, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: On Fri, 15 Aug 2003, Christopher Kings-Lynne wrote: I can also attest to the horrendously long time it takes to restore the ADD FOREIGN KEY section... That really needs to be rewritten to do a single check over the table rather than running the constraint for every row. I keep meaning to get around to it and never actually do. :( I'm not sure that in practice you'll get a better plan at restore time depending on what the default statistics give you. In simple cases I think that the creation of indexes would be enough to get you a passable plan --- CREATE INDEX does update pg_class.reltuples, so the planner will know how big the tables are, and for single-column primary keys the existence of a unique index is enough to cue the planner that the column is unique, even without any ANALYZE stats. Those are the biggest levers on the plan choice. This assumes that pg_dump always dumps CREATE INDEX before ADD FOREIGN KEY; I'm not certain if there's anything to enforce that at the moment... I assume what you have in mind is to replace validateForeignKeyConstraint() with something that does a join of the two tables via an SPI command. But supposing that we want to keep the present ability to report (one of) the failing key values, it seems like the query has to look like SELECT keycolumns FROM referencing_table WHERE keycolumns NOT IN (SELECT refcols FROM referenced_table); which is only gonna do the right thing for one of the MATCH styles (not sure which, offhand ... actually it may not do the right thing for any match style if there are nulls in referenced_table ...). Yes, in practice, you'd have to put IS NOT NULL checks in the subselect, which is fine for the two match types we support since a referenced row with a NULL isn't a choice for a referenced row for those. I think MATCH PARTIAL might have to fall back to the repeated check unless we can make the query work which would be harder because you only want to compare the columns for a particular row where the keycolumn case is not null and I can't think of a query for that that'd be particularly clean and likely to be fast, then again I don't think the constraint would be either. :( It'd probably be: MATCH unspecified: SELECT keycolumns FROM referencing_table WHERE (keycolumns) NOT IN (SELECT refcols FROM referenced_table WHERE refcol1 IS NOT NULL AND ... ) AND keycolumn1 IS NOT NULL AND ...; MATCH FULL: (something like, I haven't tried it) SELECT keycolumns FROM referencing_table WHERE ((keycolumns) NOT IN (SELECT refcols FROM referenced_table WHERE refcol1 IS NOT NULL AND ...) AND (keycolumn1 IS NOT NULL AND ...) ) OR ((keycolumn1 IS NOT NULL)!=(keycolumn2 IS NOT NULL) OR ...) How would you make it work for all the MATCH styles? And will it really be all that efficient? (NOT IN is a lot more circumscribed than IN.) I'm not really sure yet. Limited tests seem to show that it'll probably be as fast if not faster for all reasonable cases, but I'd want to generate a much larger random data set and actually put it in to make a fair comparison (maybe temporarily with a set to allow people to try both cases on real world data). One other advantage here is that we don't need to get row locks while checking this if we've already gotten the exclusive table locks on both tables involved. I'm not sure if we do that currently though. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly