Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-10-08 Thread Bruce Momjian
Jan Wieck wrote:
  I think this is a larger argument than the one that was being discussed
  above. Given a dump of objects I own, can I restore them without requiring
  the fk check to be done if I alter table add constraint a foreign key? If
  the answer to that is no, then the option can be put in as a superuser
  only option and it's relatively easy. If the answer to that is yes, then
  there are additional issues that need to be resolved.
 
 Well, with the CREATE CONSTRAINT TRIGGER you _can_, but we already have 
 a consensus that we don't _want_ that. Probably we should declare it 
 deprecated and remove it in 7.5. And the option currently under 
 discussion is exactly what will cause ALTER TABLE to let you, but IMHO 
 that _should_ be restricted.

Added to TODO:

* Remove CREATE CONSTRAINT TRIGGER

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-10-08 Thread Christopher Kings-Lynne

Well, with the CREATE CONSTRAINT TRIGGER you _can_, but we already have 
a consensus that we don't _want_ that. Probably we should declare it 
deprecated and remove it in 7.5. And the option currently under 
discussion is exactly what will cause ALTER TABLE to let you, but IMHO 
that _should_ be restricted.
How can we ever remove it - what about people upgrading from 7.0, 7.1, 
7.2?  Also, people upgrading from 7.3 who've never heard of adddepend...

Chris



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-10-08 Thread Bruce Momjian
Christopher Kings-Lynne wrote:
 
 Well, with the CREATE CONSTRAINT TRIGGER you _can_, but we already have 
 a consensus that we don't _want_ that. Probably we should declare it 
 deprecated and remove it in 7.5. And the option currently under 
 discussion is exactly what will cause ALTER TABLE to let you, but IMHO 
 that _should_ be restricted.
 
 How can we ever remove it - what about people upgrading from 7.0, 7.1, 
 7.2?  Also, people upgrading from 7.3 who've never heard of adddepend...

Not sure.  We can remove documentation about it, at least.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-10-08 Thread Jan Wieck
Bruce Momjian wrote:

Christopher Kings-Lynne wrote:
Well, with the CREATE CONSTRAINT TRIGGER you _can_, but we already have 
a consensus that we don't _want_ that. Probably we should declare it 
deprecated and remove it in 7.5. And the option currently under 
discussion is exactly what will cause ALTER TABLE to let you, but IMHO 
that _should_ be restricted.

How can we ever remove it - what about people upgrading from 7.0, 7.1, 
7.2?  Also, people upgrading from 7.3 who've never heard of adddepend...
Not sure.  We can remove documentation about it, at least.

If the idea is to support any 7.n - 7.m (where n  m) upgrade directly, 
then it's IMHO time for 8.0 and clearly stating that 7.x - 8.y only is 
supported as 7.x - 7.3 - 8.0 - 8.y and you're on your own with any 
other attempt.

Don't get this wrong, I am a big friend of easy upgrades. But I am not a 
big friend of making improvements impossible by backward compatibility 
forever. It was the backward compatibility to CP/M-80 (v2.2) that 
caused MS-DOS 7.0 to have a maximum commandline length of 127 characters 
... that was taking compatibility too far. Well, M$ took it too far the 
other way from there and is compatible to nothing any more, not even to 
themself ... but at least they learned from that mistake.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-30 Thread Bruce Momjian
Stephan Szabo wrote:
 On Tue, 30 Sep 2003, Bruce Momjian wrote:
 
  Stephan Szabo wrote:
If we go that direction, why don't we just make a GUC variable to
disable constraint checking.  Is that what this will do, or is it more
limited.  I know it breaks referential integrity, but we have had many
folks as for it, it is on the TODO list, and there are tons of server
functions flying around that do just this by fiddling with pg_class.  I
would rather just have it be a GUC for that particular backend.  People
are going to need to turn it off anyway, so why not give them a clean
way to do it.
  
   But such a GUC wouldn't affect just one backend.  It'd potentially affect
   all backends that were doing concurrent modifications that would be
   involved since the locks aren't taken.  In addition, who would be allowed
   to set this value and what constraints would it affect? If it's only
   superusers, then it doesn't help for non-superuser restores.  If it's
   settable by anyone and affects only constraints on tables that user owns
   and that refer to tables that user owns it might be okay.  If it's
   settable by anyone and affects all tables it renders the constraints
   meaningless since anyone could break them.
 
  I assume it would be only setable by the super-user.  They are mucking
  around with pg_class anyway (and have permission to do so), so let them
  do it cleanly at least.  Allowing non-supers to do it for tables they
  own would be OK, I guess.  Is there a problem if some of the primary
  table is owned by someone else?  Not sure.
 
 The problem I have with a super-user only solution is that it doesn't
 solve the problem for restores in general.  I think we need a mechanism
 that works for any user that wants to restore a table (or tables) from
 dump(s), so for the dump/restore mechanism I think we should be looking in
 that direction.

OK.  Let's explore that.  What does ownership mean?  If I grant all
permissions on an object I own to you, what can you not do?  I think
GRANT/REVOKE and ALTER TABLE are the only two ones, right?

So, if I own it, I am the only one who can ALTER the table to add/remove
the foreign key constraint.  So, if I already have a foreign key
constraint on a table, I can easily remove it if I am the owner and do
whatever I want with the table.

Now, the big question is, is there harm in my saying in the system
catalogs that I have a foreign key constraint on a table, when I might
have turned off the constraint via GUC and modified the table so the
foreign key constraint isn't valid?  I think that is the big question
--- is there harm to others in saying something I own has a foreign key,
when it might not?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-30 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  How many folks are going to remember to do this?  Why make it hard for
  them?  Someone is going to forget too easily.  Why is this restore
  taking so long?  Oh, I forgot that switch.  Or they put it in a login
  file and forget it is set.  Seems safer for it to be in the dump file.
 
 I disagree.  The how many folks are going to remember to do this
 argument applies just as well to magic pg_dump switches; that's not
 a tenable argument against doing it at restore time.
 
 The difference between controlling it at pg_dump time and pg_restore
 time is that if you change your mind after having made the dump, it's
 too late, if the decision was nailed down in the dump file.  In an
 upgrade situation it's very likely that you no longer have the option
 to re-do your dump, because you already blew away your old installation.
 
 Since there's no performance difference at pg_dump time, I can't see any
 advantage to freezing your decision then.

I understand, and if everyone used pg_restore, then adding a flag to
pg_restore to do this would make sense.  However, everyone is used to
treating that dump file as a simple dump and throwing it into psql. 
Psql doesn't have any special dump flags, so you have to do the
environment variable trick, which you must admit is pretty ugly looking
and prone to typing errors, and forgetting, because they are used to
invoking psql all the time.

Maybe we need a psql dump reload flag?  Would we be able to do any other
optimizations, like increasing sort_mem or something?  That would be a
clean solution, and perhaps allow additional optimizations.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-30 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  I assume it would be only setable by the super-user.
 
 That might be a good restriction too (on top of my speculation about not
 allowing it in postgresql.conf).  Only allow it to be SET per-session,

We don't have a way to make something unsetable in postgresql.conf right
now, do we?

 and only by a superuser.

See my recent email on this about ownership.  I personally am happy
with super-user only (or db-owner and super-user only).  As I said, it
is a question of what documenting a foreign key in the system catalogs
means to folks who don't own the table.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-30 Thread Andreas Pflug
Bruce Momjian wrote:

Fact is, folks are doing it anyway by modifying pg_class.  I know one
guy who did it in a transaction so he was the only one to see the
triggers disabled!  The PostgreSQL cookbook page has an example too. 
People are always asking how to do this.  Why not just make it setable
only by the super-user.

FYI, TODO has:

* Allow triggers to be disabled [trigger]
* With disabled triggers, allow pg_dump to use ALTER TABLE ADD FOREIGN
  KEY
For practical reasons, I'd prefer the disable trigger not to influence 
fk triggers, or at least to have such a default flavor. When restoring a 
database, you might consider the data as consistent and complete, so no 
triggers and ref checks are needed at all. But in the cases of some kind 
of application data import, you might like the data to have fk ref 
checked, but don't want to trigger all user triggers.
The implementation of fk checking by triggers should normally be hidden 
to the user.

Regards,
Andreas
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-30 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Since there's no performance difference at pg_dump time, I can't see any
 advantage to freezing your decision then.

 I understand, and if everyone used pg_restore, then adding a flag to
 pg_restore to do this would make sense.  However, everyone is used to
 treating that dump file as a simple dump and throwing it into psql. 

So?  A GUC variable could be set equally easily either way.  In fact
more so.

 Psql doesn't have any special dump flags, so you have to do the
 environment variable trick,

You forgot SET ...

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-30 Thread Stephan Szabo
On Tue, 30 Sep 2003, Jan Wieck wrote:

 Stephan Szabo wrote:
  On Tue, 30 Sep 2003, Tom Lane wrote:
 
  I see where Stephan is coming from, but in my mind disabling consistency
  checks ought to be a feature reserved to the DBA (ie superuser), who
  presumably has some clue about the tradeoffs involved.  I don't think
  ordinary users should be able to do it.  If we can get the cost of
  performing the initial check down to something reasonable (and I don't
  mean near zero, I mean something that's small in comparison to the
  other costs of loading data and creating indexes), then I think we've
  done as much as we should do for ordinary users.
 
  Limiting the cases under which constraint ignoring works is certainly
  fine by me, but I was assuming that we were trying to make it accessable
  to any restore. If that's not true, then we don't need to worry about that
  part of the issue.

 It is not true.

 Fact is that restoring can require more rights than creating the dump.
 That is already the case if you want to restore anything that contains
 objects owned by different users. Trying to enable everyone who can take
 a dump also to restore it, by whatever mechanism, gives someone the
 right to revert things in time and create a situation (consistent or
 not) that he could not (re)create without doing dump/restore. This is
 wrong and should not be possible.

I think this is a larger argument than the one that was being discussed
above. Given a dump of objects I own, can I restore them without requiring
the fk check to be done if I alter table add constraint a foreign key? If
the answer to that is no, then the option can be put in as a superuser
only option and it's relatively easy. If the answer to that is yes, then
there are additional issues that need to be resolved.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-30 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 That might be a good restriction too (on top of my speculation about not
 allowing it in postgresql.conf).  Only allow it to be SET per-session,

 We don't have a way to make something unsetable in postgresql.conf right
 now, do we?

Yeah, we do --- see GUC_DISALLOW_IN_FILE.  The existing variable
zero_damaged_pages has the same restrictions we're talking about here,
and for largely the same reasons: you can shoot yourself in the foot
with it.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-30 Thread Jan Wieck
Stephan Szabo wrote:

On Tue, 30 Sep 2003, Jan Wieck wrote:

Stephan Szabo wrote:
 On Tue, 30 Sep 2003, Tom Lane wrote:

 I see where Stephan is coming from, but in my mind disabling consistency
 checks ought to be a feature reserved to the DBA (ie superuser), who
 presumably has some clue about the tradeoffs involved.  I don't think
 ordinary users should be able to do it.  If we can get the cost of
 performing the initial check down to something reasonable (and I don't
 mean near zero, I mean something that's small in comparison to the
 other costs of loading data and creating indexes), then I think we've
 done as much as we should do for ordinary users.

 Limiting the cases under which constraint ignoring works is certainly
 fine by me, but I was assuming that we were trying to make it accessable
 to any restore. If that's not true, then we don't need to worry about that
 part of the issue.
It is not true.

Fact is that restoring can require more rights than creating the dump.
That is already the case if you want to restore anything that contains
objects owned by different users. Trying to enable everyone who can take
a dump also to restore it, by whatever mechanism, gives someone the
right to revert things in time and create a situation (consistent or
not) that he could not (re)create without doing dump/restore. This is
wrong and should not be possible.
I think this is a larger argument than the one that was being discussed
above. Given a dump of objects I own, can I restore them without requiring
the fk check to be done if I alter table add constraint a foreign key? If
the answer to that is no, then the option can be put in as a superuser
only option and it's relatively easy. If the answer to that is yes, then
there are additional issues that need to be resolved.
Well, with the CREATE CONSTRAINT TRIGGER you _can_, but we already have 
a consensus that we don't _want_ that. Probably we should declare it 
deprecated and remove it in 7.5. And the option currently under 
discussion is exactly what will cause ALTER TABLE to let you, but IMHO 
that _should_ be restricted.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-30 Thread Stephan Szabo

On Tue, 30 Sep 2003, Tom Lane wrote:

 I see where Stephan is coming from, but in my mind disabling consistency
 checks ought to be a feature reserved to the DBA (ie superuser), who
 presumably has some clue about the tradeoffs involved.  I don't think
 ordinary users should be able to do it.  If we can get the cost of
 performing the initial check down to something reasonable (and I don't
 mean near zero, I mean something that's small in comparison to the
 other costs of loading data and creating indexes), then I think we've
 done as much as we should do for ordinary users.

Limiting the cases under which constraint ignoring works is certainly
fine by me, but I was assuming that we were trying to make it accessable
to any restore. If that's not true, then we don't need to worry about that
part of the issue.

As a side note, in the partial implementation I'd already done, I noticed
a potential problem if the person doing the alter table didn't have read
permissions on the pktable. I'd written it to bail and do the slow check
in that case (well actually in most error cases that didn't themselves
cause an elog), does anyone have a better idea?

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-30 Thread Tom Lane
Stephan Szabo [EMAIL PROTECTED] writes:
 As a side note, in the partial implementation I'd already done, I noticed
 a potential problem if the person doing the alter table didn't have read
 permissions on the pktable. I'd written it to bail and do the slow check
 in that case (well actually in most error cases that didn't themselves
 cause an elog), does anyone have a better idea?

Wouldn't all the subsequent triggers fail also in such a case?  (For
that matter, wouldn't the existing implementation of the initial check
fail?)  I can't see a reason to expend code to avoid failing here.  It's
not very sensible to be able to create an FK on a table you don't have
read permission for.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-30 Thread Stephan Szabo

On Tue, 30 Sep 2003, Tom Lane wrote:

 Stephan Szabo [EMAIL PROTECTED] writes:
  As a side note, in the partial implementation I'd already done, I noticed
  a potential problem if the person doing the alter table didn't have read
  permissions on the pktable. I'd written it to bail and do the slow check
  in that case (well actually in most error cases that didn't themselves
  cause an elog), does anyone have a better idea?

 Wouldn't all the subsequent triggers fail also in such a case?  (For
 that matter, wouldn't the existing implementation of the initial check
 fail?)  I can't see a reason to expend code to avoid failing here.  It's

No, because the triggers change permissions to the owner of the
appropriate (either fk or pk) table before running the query, so the old
method works as well as the final constraint would. However, if the two
owners are not the same, you can't set to both during the single query.

 not very sensible to be able to create an FK on a table you don't have
 read permission for.

IIRC, you only need references permissions to make an fk constraint, not
select.

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-30 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Stephan Szabo wrote:
 The problem I have with a super-user only solution is that it doesn't
 solve the problem for restores in general.

 OK.  Let's explore that.  What does ownership mean?

It does not normally mean the ability to bypass consistency checks;
for example, if you put a CHECK constraint on a table, you don't get
to violate it because you own the table.  (Of course superuserness
doesn't let you do so either...)

I see where Stephan is coming from, but in my mind disabling consistency
checks ought to be a feature reserved to the DBA (ie superuser), who
presumably has some clue about the tradeoffs involved.  I don't think
ordinary users should be able to do it.  If we can get the cost of
performing the initial check down to something reasonable (and I don't
mean near zero, I mean something that's small in comparison to the
other costs of loading data and creating indexes), then I think we've
done as much as we should do for ordinary users.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-30 Thread Jan Wieck
Stephan Szabo wrote:
On Tue, 30 Sep 2003, Tom Lane wrote:

I see where Stephan is coming from, but in my mind disabling consistency
checks ought to be a feature reserved to the DBA (ie superuser), who
presumably has some clue about the tradeoffs involved.  I don't think
ordinary users should be able to do it.  If we can get the cost of
performing the initial check down to something reasonable (and I don't
mean near zero, I mean something that's small in comparison to the
other costs of loading data and creating indexes), then I think we've
done as much as we should do for ordinary users.
Limiting the cases under which constraint ignoring works is certainly
fine by me, but I was assuming that we were trying to make it accessable
to any restore. If that's not true, then we don't need to worry about that
part of the issue.
It is not true.

Fact is that restoring can require more rights than creating the dump. 
That is already the case if you want to restore anything that contains 
objects owned by different users. Trying to enable everyone who can take 
a dump also to restore it, by whatever mechanism, gives someone the 
right to revert things in time and create a situation (consistent or 
not) that he could not (re)create without doing dump/restore. This is 
wrong and should not be possible.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Andreas Pflug
Christopher Kings-Lynne wrote:

You could just as easily argue that the lack of integrity testing at 
data load time was equally a bug.

I think we need someway of telling postgres to suppress a foreign key 
check.

The main problem is that the foreign key column is often not indexed. 
So a db designer made a bloody mistake.
The problem is there's no easy way to find out what's missing.
I'd really like EXPLAIN to display all subsequent triggered queries 
also, to see the full scans caused by missing indexes.

Regards,
Andreas


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Christopher Kings-Lynne
So a db designer made a bloody mistake.
The problem is there's no easy way to find out what's missing.
I'd really like EXPLAIN to display all subsequent triggered queries 
also, to see the full scans caused by missing indexes.
I'd sure second that!

Chris



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Hannu Krosing
Andreas Pflug kirjutas E, 29.09.2003 kell 12:04:
 Christopher Kings-Lynne wrote:
 
  You could just as easily argue that the lack of integrity testing at 
  data load time was equally a bug.
 
  I think we need someway of telling postgres to suppress a foreign key 
  check.
 
  The main problem is that the foreign key column is often not indexed. 
 
 So a db designer made a bloody mistake.
 The problem is there's no easy way to find out what's missing.
 I'd really like EXPLAIN to display all subsequent triggered queries 
 also, to see the full scans caused by missing indexes.

It could probably be doable for EXPLAIN ANALYZE (by actually tracing
execution), but then you will see really _all_ queries, i.e. for a 1000
row update you would see 1 UPDATE query and 1000 fk checks ...

OTOH, you probably can get that already from logs with right logging
parameters.

-
Hannu


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Christopher Kings-Lynne

So a db designer made a bloody mistake.
The problem is there's no easy way to find out what's missing.
I'd really like EXPLAIN to display all subsequent triggered queries 
also, to see the full scans caused by missing indexes.


It could probably be doable for EXPLAIN ANALYZE (by actually tracing
execution), but then you will see really _all_ queries, i.e. for a 1000
row update you would see 1 UPDATE query and 1000 fk checks ...
OTOH, you probably can get that already from logs with right logging
parameters.
Actually - it shouldn't be too hard to write a query that returns all 
unindexed foreign keys, surely?

Chris



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Shridhar Daithankar
On Monday 29 September 2003 15:58, Christopher Kings-Lynne wrote:
 So a db designer made a bloody mistake.
 The problem is there's no easy way to find out what's missing.
 I'd really like EXPLAIN to display all subsequent triggered queries
 also, to see the full scans caused by missing indexes.
 
  It could probably be doable for EXPLAIN ANALYZE (by actually tracing
  execution), but then you will see really _all_ queries, i.e. for a 1000
  row update you would see 1 UPDATE query and 1000 fk checks ...
 
  OTOH, you probably can get that already from logs with right logging
  parameters.

 Actually - it shouldn't be too hard to write a query that returns all
 unindexed foreign keys, surely?

Correct me if I am wrong but I remember postgresql throwing error that foreign 
key field was not unique in foreign table. Obviously it can not detect that 
without an index. Either primary key or unique constraint would need an 
index.

What am I missing here?


IOW, how do I exactly create foreign keys without an index?

 Shridhar


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Nigel J. Andrews
On Mon, 29 Sep 2003, Christopher Kings-Lynne wrote:

  So a db designer made a bloody mistake.
  The problem is there's no easy way to find out what's missing.
  I'd really like EXPLAIN to display all subsequent triggered queries 
  also, to see the full scans caused by missing indexes.
 
 I'd sure second that!

That's only partially determinable though. The trigger code could branch and
run two different queries depending on the values supplied in the 'input' tuple
of the trigger.

Nigel



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Hannu Krosing
Shridhar Daithankar kirjutas E, 29.09.2003 kell 13:34:
 On Monday 29 September 2003 15:58, Christopher Kings-Lynne wrote:
  So a db designer made a bloody mistake.
  The problem is there's no easy way to find out what's missing.
  I'd really like EXPLAIN to display all subsequent triggered queries
  also, to see the full scans caused by missing indexes.
  
   It could probably be doable for EXPLAIN ANALYZE (by actually tracing
   execution), but then you will see really _all_ queries, i.e. for a 1000
   row update you would see 1 UPDATE query and 1000 fk checks ...
  
   OTOH, you probably can get that already from logs with right logging
   parameters.
 
  Actually - it shouldn't be too hard to write a query that returns all
  unindexed foreign keys, surely?
 
 Correct me if I am wrong but I remember postgresql throwing error that foreign 
 key field was not unique in foreign table. Obviously it can not detect that 
 without an index. Either primary key or unique constraint would need an 
 index.
 
 What am I missing here?
 
 
 IOW, how do I exactly create foreign keys without an index?

hannu=# create table pkt(i int primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
'pkt_pkey' for table 'pkt'
CREATE TABLE
hannu=# create table fkt(j int references pkt);
NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY
check(s)
CREATE TABLE
hannu=#

now the *foreygn key* column (fkt.j) is without index. As foreign keys
are enforced both ways, this can be a problem when changing table pkt or
bulk creating FK's on big tables.


Hannu


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Andreas Pflug
Nigel J. Andrews wrote:

On Mon, 29 Sep 2003, Christopher Kings-Lynne wrote:

 

So a db designer made a bloody mistake.
The problem is there's no easy way to find out what's missing.
I'd really like EXPLAIN to display all subsequent triggered queries 
also, to see the full scans caused by missing indexes.
 

I'd sure second that!
   

That's only partially determinable though. The trigger code could branch and
run two different queries depending on the values supplied in the 'input' tuple
of the trigger.
That would be ok; if I got a problem with a certain query, I don't 
expect to find problems I might get with other queries. Though this 
would be nice, how about a general pg_gimme_all_problems() function :-)

Regards,
Andreas


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Greg Stark


   So a db designer made a bloody mistake.

Not necessarily. If I'm never going to update or delete from the parent table
the index would be useless. I find very few of my foreign key relationships
actually need indexes on the child table. I usually only have the unique index
on the parent table.

And often the child table is the big table. The index would be very large and
have awful selectivity. The last thing I want is a 5-million record table with
half a dozen indexes each with 10-20 unique values.

   The problem is there's no easy way to find out what's missing.
   I'd really like EXPLAIN to display all subsequent triggered queries 
   also, to see the full scans caused by missing indexes.
  
  I'd sure second that!

I think the root of problem here is the same as the root of the problem with
foreign key checks being slow for large batch updates and inserts. Namely that
foreign key constraint checks are being handled as a million small queries.

To handle foreign key constraints optimally they would really have to be
merged into the plan in a kind of join. For most inserts/updates something
like a nested-loop join that is effectively the same as the current triggers
would be used. But for large batch updates/inserts it's quite possible that it
would look more like a hash join or even a merge join.

To do that would probably mean throwing out the whole trigger-based
implementation though, which seems like an awfully big project.

And being able to disable and reenable constraints would still be nice.
They're never going to be instantaneous. And besides, speed isn't the only
reason to want to disable constraints temporarily. The database is a tool, it
should be there to do the DBA's bidding, not the other way around :)

-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Stephan Szabo
On Sun, 28 Sep 2003, Tom Lane wrote:

 Stephan Szabo [EMAIL PROTECTED] writes:
  On Sat, 27 Sep 2003, Tom Lane wrote:
  I thought of what seems to be a better design for the check query: use
  a LEFT JOIN and check for NULL in the righthand joined column.

  Hmm, my initial testing showed that it really was a little slower
  than a more complicated one with NOT EXISTS so I'd abandoned it. How does
  it fare for you compared to:
  select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1
  and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null;

 Were you testing against 7.3 or 7.4?  On what kinds of tables?

 In 7.4 I think that the JOIN would yield as good or better a plan.  The
 best possible plan for the NOT EXISTS query is effectively a nestloop
 with inner indexscan, which is great if the FK table is small and the
 PK table is large, but it sucks otherwise.  The planner should choose a
 plan of this form for the LEFT JOIN given that combination of table
 sizes, and so there shouldn't be any great difference in runtime in that
 case.  But in other combinations, such as large FK and small PK, other
 plan types will beat the pants off nestloop.

As an update, so far I still am getting better results with NOT EXISTS
than the left join.  For a 50m row fk, 10k row pk where the rows are just
the keys, I'm getting a plan like
Merge Join
 Index scan on pktable
 Sort
  Seqscan on fktable

which is taking about 2-4 times longer for me than the not exists
depending on sort_mem (at 4096,64000,128000).

When I lowered random_page_cost to 1, I got an indexscan on fktable, but
that hadn't seemed to finish after about 2 hours (as opposed to about
30-35 minutes for the not exists and about 30 min - 1 1/2 hours for the
sort+seqscan version.

I want to do some more tests where there's extraneous data in both tables
and see what that does to the results.


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Stephan Szabo

On Mon, 29 Sep 2003, Stephan Szabo wrote:

 When I lowered random_page_cost to 1, I got an indexscan on fktable, but
 that hadn't seemed to finish after about 2 hours (as opposed to about
 30-35 minutes for the not exists and about 30 min - 1 1/2 hours for the

Small correction, I'd meant to type 20-35 minutes above, not 30-35.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Tom Lane
Stephan Szabo [EMAIL PROTECTED] writes:
 As an update, so far I still am getting better results with NOT EXISTS
 than the left join.

Hm.  Don't suppose you were using EXPLAIN ANALYZE so we could see what's
happening?  This is clearly a planner failure, although I'm unsure if we
can expect the planner to get the right answer with no pg_statistic entries.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Stephan Szabo
On Mon, 29 Sep 2003, Tom Lane wrote:

 Stephan Szabo [EMAIL PROTECTED] writes:
  As an update, so far I still am getting better results with NOT EXISTS
  than the left join.

 Hm.  Don't suppose you were using EXPLAIN ANALYZE so we could see what's
 happening?  This is clearly a planner failure, although I'm unsure if we
 can expect the planner to get the right answer with no pg_statistic entries.

For the sort+seq one and the not exists, I had, but I'll re-run it (it's
on my home desktop that I won't be able to access).  The other when I
forced it to use an index scan I haven't let complete yet, and I don't
know how long that will take.

I was also planning to run a set after running analyze, so I'll include
those too.  It'll probably be a few hours before the results are in. :)

Are there any other options (enable_mergejoin, etc) that you want me to
try with?



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Andreas Pflug
Greg Stark wrote:

 

So a db designer made a bloody mistake.
   

Not necessarily. If I'm never going to update or delete from the parent table
the index would be useless. I find very few of my foreign key relationships
actually need indexes on the child table. I usually only have the unique index
on the parent table.
And often the child table is the big table. The index would be very large and
have awful selectivity. The last thing I want is a 5-million record table with
half a dozen indexes each with 10-20 unique values.
 

The problem is there's no easy way to find out what's missing.
I'd really like EXPLAIN to display all subsequent triggered queries 
also, to see the full scans caused by missing indexes.
   

I'd sure second that!
 

I think the root of problem here is the same as the root of the problem with
foreign key checks being slow for large batch updates and inserts. Namely that
foreign key constraint checks are being handled as a million small queries.
Apart from missing indices, this is certainly a problem. Statement level 
triggers will solve this, as soon as they are fully implemented and 
support OLD and NEW record sets.

Regards,
Andreas
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Tom Lane
Stephan Szabo [EMAIL PROTECTED] writes:
 Hm.  Don't suppose you were using EXPLAIN ANALYZE so we could see what's
 happening?  This is clearly a planner failure, although I'm unsure if we
 can expect the planner to get the right answer with no pg_statistic entries.

 The left join one seems to give me values like the following:

There are some fishy row estimates in here:

-  Index Scan using pktest_a_key on pktest  (cost=0.00..52.00
 rows=1000 width=8) (actual time=17.82..1609.97 rows=1 loops=1)

The system definitely should be expected to have the accurate row count
for the PK table, since an index should have been created on it (and we
do do that after loading the data, no?).  It is possible that it'd have
the default 1000 estimate for the FK table, if there are no indexes at
all on the FK table; otherwise it should have the right number.  It's
not real clear to me what conditions you're testing under, but the
estimates in the plans you're quoting aren't consistent ...

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Stephan Szabo
On Mon, 29 Sep 2003, Tom Lane wrote:

 Stephan Szabo [EMAIL PROTECTED] writes:
  Hm.  Don't suppose you were using EXPLAIN ANALYZE so we could see what's
  happening?  This is clearly a planner failure, although I'm unsure if we
  can expect the planner to get the right answer with no pg_statistic entries.

  The left join one seems to give me values like the following:

 There are some fishy row estimates in here:

 -  Index Scan using pktest_a_key on pktest  (cost=0.00..52.00
  rows=1000 width=8) (actual time=17.82..1609.97 rows=1 loops=1)

 The system definitely should be expected to have the accurate row count
 for the PK table, since an index should have been created on it (and we
 do do that after loading the data, no?).  It is possible that it'd have
 the default 1000 estimate for the FK table, if there are no indexes at
 all on the FK table; otherwise it should have the right number.  It's
 not real clear to me what conditions you're testing under, but the
 estimates in the plans you're quoting aren't consistent ...

Well, they're all from the same load of the same data with only stopping
and starting in between, but I did make the index on the pk table first
loaded the data and then built the fk table index ( because I'd wanted to
try without the index as well), which meant that it wouldn't match the
behavior of a dump. Ugh, I'd forgotten that the primary key didn't get
created until later too.

Okay, that's much better:
 Hash Left Join  (cost=203.00..1487869.29 rows=49501250 width=4) (actual
time=611632.67..611632.67 rows=0 loops=1)
   Hash Cond: ((outer.b = inner.a) AND (outer.c = inner.b))
   Filter: (inner.a IS NULL)
   -  Seq Scan on fktest  (cost=0.00..745099.00 rows=49501250 width=8)
(actual time=0.01..169642.48 rows=5000 loops=1)
 Filter: ((b IS NOT NULL) AND (c IS NOT NULL))
   -  Hash  (cost=152.00..152.00 rows=1 width=8) (actual
time=46.04..46.04 rows=0 loops=1)
 -  Seq Scan on pktest  (cost=0.00..152.00 rows=1 width=8)
(actual time=0.02..21.38 rows=1 loops=1)
 Total runtime: 611632.95 msec
(8 rows)

That's much better. :)  As long as the row estimates are reasonable it
seems to be okay, but I do wonder why it chose the merge join for the case
when it thought there was only 1000 rows though.

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Stephan Szabo

(I think my previous attempt got aborted by a lost connection, so a
message like this may arrive twice)

On Mon, 29 Sep 2003, Tom Lane wrote:

 Stephan Szabo [EMAIL PROTECTED] writes:
  Hm.  Don't suppose you were using EXPLAIN ANALYZE so we could see what's
  happening?  This is clearly a planner failure, although I'm unsure if we
  can expect the planner to get the right answer with no pg_statistic entries.

  The left join one seems to give me values like the following:

 There are some fishy row estimates in here:

 -  Index Scan using pktest_a_key on pktest  (cost=0.00..52.00
  rows=1000 width=8) (actual time=17.82..1609.97 rows=1 loops=1)

 The system definitely should be expected to have the accurate row count
 for the PK table, since an index should have been created on it (and we
 do do that after loading the data, no?).  It is possible that it'd have
 the default 1000 estimate for the FK table, if there are no indexes at
 all on the FK table; otherwise it should have the right number.  It's
 not real clear to me what conditions you're testing under, but the
 estimates in the plans you're quoting aren't consistent ...

Also, the sequence was basically:
CREATE TABLE pktest(a int, b int, unique(a,b));
CREATE TABLE fktest(b int, c int);
COPY pktest FROM STDIN;
...
COPY fktest FROM STDIN;
...
run some tests I didn't mention here
CREATE INDEX fki on fktest(b,c);
run the above test

With stopping and restarting the server involved and running the tests
multiple times.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Jan Wieck
Tom Lane wrote:
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
I think we need someway of telling postgres to suppress a foreign key check.
Well, the subtext argument here is do we fix it by providing a way to
suppress the check, or do we fix it by making the check fast enough to
be tolerable?
I think the advantages of choice (b) are obvious --- it doesn't allow
bogus data to be loaded accidentally, and it doesn't create a problem
with loading existing 7.3 dump files that don't know how to suppress the
check.
If we find there is no way to do (b) acceptably well, then and only then
would I want to consider (a).
I think I can accept it to be the choice of the DBA what to do. Pg_dump 
has that kind of options already, one can choose between COPY and INSERT 
for example. Why not adding the choice of dumping FKeys as ALTER TABLE 
or CREATE CONSTRAINT TRIGGER?

The whole original idea (way back a few years ago) of doing it with 
the CREATE CONSTRAINT TRIGGER command was, that your backup ought to 
be consistent anyway. Finding out that your tape contains inconsistent 
garbage _after_ your harddisk made that suspicious noise ... is a bit 
late, isn't it?

That ALTER TABLE ... ADD CONSTRAINT needs to be improved, because at the 
moment it is normally used we cannot make any assumptions about data 
consistency, no question. But just because we have such a nice and 
allways asked for ALTER TABLE command does not mean we have to force 
every DBA of every well maintained and stable system to perform hourly 
long nonsense-tests on known-to-be-good data.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Tom Lane
Jan Wieck [EMAIL PROTECTED] writes:
 I think I can accept it to be the choice of the DBA what to do. Pg_dump 
 has that kind of options already, one can choose between COPY and INSERT 
 for example. Why not adding the choice of dumping FKeys as ALTER TABLE 
 or CREATE CONSTRAINT TRIGGER?

We don't want it dumping as CREATE CONSTRAINT TRIGGER, because (a) that
loses pg_depend information and (b) it's too low-level a representation;
we couldn't ever change the implementation of foreign keys as long as
dumps look like that.

Also, I don't see why you'd want to make such a choice at pg_dump time.
Probably better to control it at restore time.  Accordingly, my proposal
if we were to go that route would be a boolean GUC variable that simply
prevents ALTER TABLE ADD FOREIGN KEY from doing the validity checks.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Jan Wieck


Tom Lane wrote:

Jan Wieck [EMAIL PROTECTED] writes:
I think I can accept it to be the choice of the DBA what to do. Pg_dump 
has that kind of options already, one can choose between COPY and INSERT 
for example. Why not adding the choice of dumping FKeys as ALTER TABLE 
or CREATE CONSTRAINT TRIGGER?
We don't want it dumping as CREATE CONSTRAINT TRIGGER, because (a) that
loses pg_depend information and (b) it's too low-level a representation;
we couldn't ever change the implementation of foreign keys as long as
dumps look like that.
That's finally 2 points, okay.

Also, I don't see why you'd want to make such a choice at pg_dump time.
Probably better to control it at restore time.  Accordingly, my proposal
if we were to go that route would be a boolean GUC variable that simply
prevents ALTER TABLE ADD FOREIGN KEY from doing the validity checks.
Okay too. And this would be simple and safe enough to add it at the time 
being.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Christopher Kings-Lynne

Correct me if I am wrong but I remember postgresql throwing error that foreign 
key field was not unique in foreign table. Obviously it can not detect that 
without an index. Either primary key or unique constraint would need an 
index.

What am I missing here?

IOW, how do I exactly create foreign keys without an index?
You are taling about the primary key or the referenced key, not the 
foreign key.  (eg. the source column)

Chris



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Bruce Momjian
Jan Wieck wrote:
 
 
 Tom Lane wrote:
 
  Jan Wieck [EMAIL PROTECTED] writes:
  I think I can accept it to be the choice of the DBA what to do. Pg_dump 
  has that kind of options already, one can choose between COPY and INSERT 
  for example. Why not adding the choice of dumping FKeys as ALTER TABLE 
  or CREATE CONSTRAINT TRIGGER?
  
  We don't want it dumping as CREATE CONSTRAINT TRIGGER, because (a) that
  loses pg_depend information and (b) it's too low-level a representation;
  we couldn't ever change the implementation of foreign keys as long as
  dumps look like that.
 
 That's finally 2 points, okay.
 
  
  Also, I don't see why you'd want to make such a choice at pg_dump time.
  Probably better to control it at restore time.  Accordingly, my proposal
  if we were to go that route would be a boolean GUC variable that simply
  prevents ALTER TABLE ADD FOREIGN KEY from doing the validity checks.
 
 Okay too. And this would be simple and safe enough to add it at the time 
 being.

If we go that direction, why don't we just make a GUC variable to
disable constraint checking.  Is that what this will do, or is it more
limited.  I know it breaks referential integrity, but we have had many
folks as for it, it is on the TODO list, and there are tons of server
functions flying around that do just this by fiddling with pg_class.  I
would rather just have it be a GUC for that particular backend.  People
are going to need to turn it off anyway, so why not give them a clean
way to do it.

Also, how does someone turn it on at restore time if they are piping
into psql?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Jan Wieck wrote:
 Tom Lane wrote:
 if we were to go that route would be a boolean GUC variable that simply
 prevents ALTER TABLE ADD FOREIGN KEY from doing the validity checks.
 
 Okay too. And this would be simple and safe enough to add it at the time 
 being.

 If we go that direction, why don't we just make a GUC variable to
 disable constraint checking.

You mean in general, even for plain old insert/update/delete changes?
Yipes.  What happened to ACID compliance?

What I actually expected to ensue was a discussion about how we could
narrow down the effects of a disable-foreign-key-verification switch to
reduce the odds of shooting oneself in the foot.  (For example, maybe
disallow it from being set in postgresql.conf.)  I wasn't expecting
proposals to enlarge the gauge of the foot-gun ...

 Also, how does someone turn it on at restore time if they are piping
 into psql?

Something like
export PGOPTIONS=-c disable-fk-verification=true
then run psql or pg_restore.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Stephan Szabo
On Tue, 30 Sep 2003, Bruce Momjian wrote:

 Jan Wieck wrote:
 
 
  Tom Lane wrote:
 
   Jan Wieck [EMAIL PROTECTED] writes:
   I think I can accept it to be the choice of the DBA what to do. Pg_dump
   has that kind of options already, one can choose between COPY and INSERT
   for example. Why not adding the choice of dumping FKeys as ALTER TABLE
   or CREATE CONSTRAINT TRIGGER?
  
   We don't want it dumping as CREATE CONSTRAINT TRIGGER, because (a) that
   loses pg_depend information and (b) it's too low-level a representation;
   we couldn't ever change the implementation of foreign keys as long as
   dumps look like that.
 
  That's finally 2 points, okay.
 
  
   Also, I don't see why you'd want to make such a choice at pg_dump time.
   Probably better to control it at restore time.  Accordingly, my proposal
   if we were to go that route would be a boolean GUC variable that simply
   prevents ALTER TABLE ADD FOREIGN KEY from doing the validity checks.
 
  Okay too. And this would be simple and safe enough to add it at the time
  being.

 If we go that direction, why don't we just make a GUC variable to
 disable constraint checking.  Is that what this will do, or is it more
 limited.  I know it breaks referential integrity, but we have had many
 folks as for it, it is on the TODO list, and there are tons of server
 functions flying around that do just this by fiddling with pg_class.  I
 would rather just have it be a GUC for that particular backend.  People
 are going to need to turn it off anyway, so why not give them a clean
 way to do it.

But such a GUC wouldn't affect just one backend.  It'd potentially affect
all backends that were doing concurrent modifications that would be
involved since the locks aren't taken.  In addition, who would be allowed
to set this value and what constraints would it affect? If it's only
superusers, then it doesn't help for non-superuser restores.  If it's
settable by anyone and affects only constraints on tables that user owns
and that refer to tables that user owns it might be okay.  If it's
settable by anyone and affects all tables it renders the constraints
meaningless since anyone could break them.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Bruce Momjian
Tom Lane wrote:
  If we go that direction, why don't we just make a GUC variable to
  disable constraint checking.
 
 You mean in general, even for plain old insert/update/delete changes?
 Yipes.  What happened to ACID compliance?
 
 What I actually expected to ensue was a discussion about how we could
 narrow down the effects of a disable-foreign-key-verification switch to
 reduce the odds of shooting oneself in the foot.  (For example, maybe
 disallow it from being set in postgresql.conf.)  I wasn't expecting
 proposals to enlarge the gauge of the foot-gun ...

Fact is, folks are doing it anyway by modifying pg_class.  I know one
guy who did it in a transaction so he was the only one to see the
triggers disabled!  The PostgreSQL cookbook page has an example too. 
People are always asking how to do this.  Why not just make it setable
only by the super-user.

FYI, TODO has:

* Allow triggers to be disabled [trigger]
* With disabled triggers, allow pg_dump to use ALTER TABLE ADD FOREIGN
  KEY

The second one is the one we are discussing.  If we never want to do it,
I will remove it from the TODO list.  However, I think we might be
making things too controlled by not allowing administrators to do this.


  Also, how does someone turn it on at restore time if they are piping
  into psql?
 
 Something like
 export PGOPTIONS=-c disable-fk-verification=true
 then run psql or pg_restore.

How many folks are going to remember to do this?  Why make it hard for
them?  Someone is going to forget too easily.  Why is this restore
taking so long?  Oh, I forgot that switch.  Or they put it in a login
file and forget it is set.  Seems safer for it to be in the dump file.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Bruce Momjian
Stephan Szabo wrote:
  If we go that direction, why don't we just make a GUC variable to
  disable constraint checking.  Is that what this will do, or is it more
  limited.  I know it breaks referential integrity, but we have had many
  folks as for it, it is on the TODO list, and there are tons of server
  functions flying around that do just this by fiddling with pg_class.  I
  would rather just have it be a GUC for that particular backend.  People
  are going to need to turn it off anyway, so why not give them a clean
  way to do it.
 
 But such a GUC wouldn't affect just one backend.  It'd potentially affect
 all backends that were doing concurrent modifications that would be
 involved since the locks aren't taken.  In addition, who would be allowed
 to set this value and what constraints would it affect? If it's only
 superusers, then it doesn't help for non-superuser restores.  If it's
 settable by anyone and affects only constraints on tables that user owns
 and that refer to tables that user owns it might be okay.  If it's
 settable by anyone and affects all tables it renders the constraints
 meaningless since anyone could break them.

I assume it would be only setable by the super-user.  They are mucking
around with pg_class anyway (and have permission to do so), so let them
do it cleanly at least.  Allowing non-supers to do it for tables they
own would be OK, I guess.  Is there a problem if some of the primary
table is owned by someone else?  Not sure.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 How many folks are going to remember to do this?  Why make it hard for
 them?  Someone is going to forget too easily.  Why is this restore
 taking so long?  Oh, I forgot that switch.  Or they put it in a login
 file and forget it is set.  Seems safer for it to be in the dump file.

I disagree.  The how many folks are going to remember to do this
argument applies just as well to magic pg_dump switches; that's not
a tenable argument against doing it at restore time.

The difference between controlling it at pg_dump time and pg_restore
time is that if you change your mind after having made the dump, it's
too late, if the decision was nailed down in the dump file.  In an
upgrade situation it's very likely that you no longer have the option
to re-do your dump, because you already blew away your old installation.

Since there's no performance difference at pg_dump time, I can't see any
advantage to freezing your decision then.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 I assume it would be only setable by the super-user.

That might be a good restriction too (on top of my speculation about not
allowing it in postgresql.conf).  Only allow it to be SET per-session,
and only by a superuser.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Stephan Szabo
On Tue, 30 Sep 2003, Bruce Momjian wrote:

 Stephan Szabo wrote:
   If we go that direction, why don't we just make a GUC variable to
   disable constraint checking.  Is that what this will do, or is it more
   limited.  I know it breaks referential integrity, but we have had many
   folks as for it, it is on the TODO list, and there are tons of server
   functions flying around that do just this by fiddling with pg_class.  I
   would rather just have it be a GUC for that particular backend.  People
   are going to need to turn it off anyway, so why not give them a clean
   way to do it.
 
  But such a GUC wouldn't affect just one backend.  It'd potentially affect
  all backends that were doing concurrent modifications that would be
  involved since the locks aren't taken.  In addition, who would be allowed
  to set this value and what constraints would it affect? If it's only
  superusers, then it doesn't help for non-superuser restores.  If it's
  settable by anyone and affects only constraints on tables that user owns
  and that refer to tables that user owns it might be okay.  If it's
  settable by anyone and affects all tables it renders the constraints
  meaningless since anyone could break them.

 I assume it would be only setable by the super-user.  They are mucking
 around with pg_class anyway (and have permission to do so), so let them
 do it cleanly at least.  Allowing non-supers to do it for tables they
 own would be OK, I guess.  Is there a problem if some of the primary
 table is owned by someone else?  Not sure.

The problem I have with a super-user only solution is that it doesn't
solve the problem for restores in general.  I think we need a mechanism
that works for any user that wants to restore a table (or tables) from
dump(s), so for the dump/restore mechanism I think we should be looking in
that direction.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-28 Thread Tom Lane
Stephan Szabo [EMAIL PROTECTED] writes:
 On Sat, 27 Sep 2003, Tom Lane wrote:
 I thought of what seems to be a better design for the check query: use
 a LEFT JOIN and check for NULL in the righthand joined column.

 Hmm, my initial testing showed that it really was a little slower
 than a more complicated one with NOT EXISTS so I'd abandoned it. How does
 it fare for you compared to:
 select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1
 and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null;

Were you testing against 7.3 or 7.4?  On what kinds of tables?

In 7.4 I think that the JOIN would yield as good or better a plan.  The
best possible plan for the NOT EXISTS query is effectively a nestloop
with inner indexscan, which is great if the FK table is small and the
PK table is large, but it sucks otherwise.  The planner should choose a
plan of this form for the LEFT JOIN given that combination of table
sizes, and so there shouldn't be any great difference in runtime in that
case.  But in other combinations, such as large FK and small PK, other
plan types will beat the pants off nestloop.

 I've actually got code (that no longer cleanly applies, but...) that uses
 the single query version with NOT EXISTS (which could be easily changed to
 either of the other forms) and was planning to put it together for a patch
 when 7.5 devel started because I figured it wasn't precisely a bug and
 wouldn't get accepted for 7.4.

Well, Bruce has this on his open-items list, so I figure we have a green
light to do something for 7.4 if we can work out what to do.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-28 Thread Stephan Szabo

On Sun, 28 Sep 2003, Bruce Momjian wrote:

 Stephan Szabo wrote:
  Hmm, my initial testing showed that it really was a little slower
  than a more complicated one with NOT EXISTS so I'd abandoned it. How does
  it fare for you compared to:
  select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1
  and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null;
 
  I believe the above is the appropriate not exists form for match
  unspecified.
 
  I've actually got code (that no longer cleanly applies, but...) that uses
  the single query version with NOT EXISTS (which could be easily changed to
  either of the other forms) and was planning to put it together for a patch
  when 7.5 devel started because I figured it wasn't precisely a bug and
  wouldn't get accepted for 7.4.

 I am a little lost on this point myself --- are we talking 7.4 or 7.5
 for this change?

I'd thought 7.5, but I now see that it's on the 7.4 open items list.

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-28 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Well, we haven't even *got* a proposed patch yet, but yeah we should
 tread carefully.

 OK.  What releases had this slow restore problem?

We introduced it in 7.3 --- before that, FKs were simply dumped as
create trigger commands, and there was no check overhead.  So arguably
it is a bug; a performance bug maybe, but that's still a bug.  No one
has yet gone through a dump/reload cycle in which they had to face this
penalty.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-28 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Let's have multiple people eyeball the patch and give it an OK and we
 can add it for 7.4 if people want it.

Well, we haven't even *got* a proposed patch yet, but yeah we should
tread carefully.  I do think it'd be okay to apply a patch if we can
come up with one that Stephan and Jan and I all like.  As you say,
dump/reload speed normally doesn't get thought about except at this
stage of the release cycle, so ...

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-28 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Let's have multiple people eyeball the patch and give it an OK and we
  can add it for 7.4 if people want it.
 
 Well, we haven't even *got* a proposed patch yet, but yeah we should
 tread carefully.  I do think it'd be okay to apply a patch if we can
 come up with one that Stephan and Jan and I all like.  As you say,
 dump/reload speed normally doesn't get thought about except at this
 stage of the release cycle, so ...

OK.  What releases had this slow restore problem?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-28 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  Well, we haven't even *got* a proposed patch yet, but yeah we should
  tread carefully.
 
  OK.  What releases had this slow restore problem?
 
 We introduced it in 7.3 --- before that, FKs were simply dumped as
 create trigger commands, and there was no check overhead.  So arguably
 it is a bug; a performance bug maybe, but that's still a bug.  No one
 has yet gone through a dump/reload cycle in which they had to face this
 penalty.

Now that is a strong argument. I knew you would find one.  :-)

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-28 Thread Bruce Momjian
Stephan Szabo wrote:
 Hmm, my initial testing showed that it really was a little slower
 than a more complicated one with NOT EXISTS so I'd abandoned it. How does
 it fare for you compared to:
 select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1
 and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null;
 
 I believe the above is the appropriate not exists form for match
 unspecified.
 
 I've actually got code (that no longer cleanly applies, but...) that uses
 the single query version with NOT EXISTS (which could be easily changed to
 either of the other forms) and was planning to put it together for a patch
 when 7.5 devel started because I figured it wasn't precisely a bug and
 wouldn't get accepted for 7.4.

I am a little lost on this point myself --- are we talking 7.4 or 7.5
for this change?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-28 Thread Bruce Momjian
Tom Lane wrote:
  I've actually got code (that no longer cleanly applies, but...) that uses
  the single query version with NOT EXISTS (which could be easily changed to
  either of the other forms) and was planning to put it together for a patch
  when 7.5 devel started because I figured it wasn't precisely a bug and
  wouldn't get accepted for 7.4.
 
 Well, Bruce has this on his open-items list, so I figure we have a green
 light to do something for 7.4 if we can work out what to do.

I put it on because I wasn't clear exactly what was happening in the
discussion.  There also was discussion that we want to improve this now
because everyone will be using for upgrading to 7.4, and with a ~50% db
reload speed improvement, it is hard to ignore.

I am not against the idea of adding it to 7.4 if we can do it cleanly,
and in fact we are sort of waiting for more serious bug reports at this
time, so doing something else to improve the code isn't out of the
question if we can do it without stumbling --- seems dump/reload gets
full attention only during beta, which makes sense.

However, I think we have to be honest that this is a performance
_improvement_, not a fix.  Yea, you can say it was a bug that we did it
the way we did in the past, but you have to look real hard to see it
that way.  :-)

Let's have multiple people eyeball the patch and give it an OK and we
can add it for 7.4 if people want it.  If you look really hard, you can
say it is a fix for a missing pg_upgrade!

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-28 Thread Stephan Szabo
On Sun, 28 Sep 2003, Tom Lane wrote:

 Stephan Szabo [EMAIL PROTECTED] writes:
  On Sat, 27 Sep 2003, Tom Lane wrote:
  I thought of what seems to be a better design for the check query: use
  a LEFT JOIN and check for NULL in the righthand joined column.

  Hmm, my initial testing showed that it really was a little slower
  than a more complicated one with NOT EXISTS so I'd abandoned it. How does
  it fare for you compared to:
  select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1
  and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null;

 Were you testing against 7.3 or 7.4?  On what kinds of tables?

7.4 with tables of 1-2 integer key columns with 10k-1m pk rows of
sequential data (in the 2 key case it was value and #rows-value iirc) and
1m-20m fk rows of randomly generated valid data.  But it wasn't any sort
of amazingly detailed test and those aren't huge tables, but I don't
exactly have a huge machine.  I can go back through, do more tests and
report back.

 In 7.4 I think that the JOIN would yield as good or better a plan.  The
 best possible plan for the NOT EXISTS query is effectively a nestloop
 with inner indexscan, which is great if the FK table is small and the
 PK table is large, but it sucks otherwise.  The planner should choose a
 plan of this form for the LEFT JOIN given that combination of table
 sizes, and so there shouldn't be any great difference in runtime in that
 case.  But in other combinations, such as large FK and small PK, other
 plan types will beat the pants off nestloop.

That's what I was expecting too.  I expected it to basically go, NOT IN,
LEFT JOIN, NOT EXISTS in speed (at least when the hashing stuff
happened for in given the not in enhancements), but didn't actually see
that.

  I've actually got code (that no longer cleanly applies, but...) that uses
  the single query version with NOT EXISTS (which could be easily changed to
  either of the other forms) and was planning to put it together for a patch
  when 7.5 devel started because I figured it wasn't precisely a bug and
  wouldn't get accepted for 7.4.

 Well, Bruce has this on his open-items list, so I figure we have a green
 light to do something for 7.4 if we can work out what to do.

I must have missed that.  I'd have mentioned it earlier then.

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-28 Thread Stephan Szabo

On Sat, 27 Sep 2003, Tom Lane wrote:

 [ continuing a discussion from mid-August ]

 Stephan Szabo [EMAIL PROTECTED] writes:
  I assume what you have in mind is to replace
  validateForeignKeyConstraint() with something that does a join of the
  two tables via an SPI command.

  It'd probably be:
  MATCH unspecified:
   SELECT keycolumns FROM referencing_table WHERE
(keycolumns) NOT IN (SELECT refcols FROM referenced_table
  WHERE refcol1 IS NOT NULL AND ... )
   AND keycolumn1 IS NOT NULL AND ...;

  MATCH FULL: (something like, I haven't tried it)
   SELECT keycolumns FROM referencing_table WHERE
((keycolumns) NOT IN (SELECT refcols FROM referenced_table
 WHERE refcol1 IS NOT NULL AND ...)
AND
 (keycolumn1 IS NOT NULL AND ...)
)
OR ((keycolumn1 IS NOT NULL)!=(keycolumn2 IS NOT NULL) OR ...)

 I thought of what seems to be a better design for the check query: use
 a LEFT JOIN and check for NULL in the righthand joined column.  For
 example, I think a MATCH UNSPECIFIED on two columns could be tested like
 this:

 select f1,f2
   from fk left join pk on (fk.f1=pk.f1 and fk.f2=pk.f2)
   where pk.f1 is null and (fk.f1 is not null and fk.f2 is not null);

 and MATCH FULL is the same except

   where pk.f1 is null and (fk.f1 is not null OR fk.f2 is not null);

 MATCH PARTIAL would be harder; I think you'd need to generate a separate
 query for each subset of the columns, in which you would probe for
 unmatched rows having exactly that subset non-null.  But it could be
 done.

 Do you see any logical error here?

 In some preliminary tests, the planner seems to be able to choose
 reasonable plans for this type of query even without pg_statistic data,
 as long as it knows the table sizes (which it would do after CREATE INDEX).
 So it would work reasonably well during a pg_dump script, I think.

Hmm, my initial testing showed that it really was a little slower
than a more complicated one with NOT EXISTS so I'd abandoned it. How does
it fare for you compared to:
select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1
and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null;

I believe the above is the appropriate not exists form for match
unspecified.

I've actually got code (that no longer cleanly applies, but...) that uses
the single query version with NOT EXISTS (which could be easily changed to
either of the other forms) and was planning to put it together for a patch
when 7.5 devel started because I figured it wasn't precisely a bug and
wouldn't get accepted for 7.4.

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-28 Thread Christopher Kings-Lynne
You could just as easily argue that the lack of integrity testing at 
data load time was equally a bug.

I think we need someway of telling postgres to suppress a foreign key check.

The main problem is that the foreign key column is often not indexed.

Chris

Bruce Momjian wrote:
Tom Lane wrote:

Bruce Momjian [EMAIL PROTECTED] writes:

Tom Lane wrote:

Well, we haven't even *got* a proposed patch yet, but yeah we should
tread carefully.

OK.  What releases had this slow restore problem?
We introduced it in 7.3 --- before that, FKs were simply dumped as
create trigger commands, and there was no check overhead.  So arguably
it is a bug; a performance bug maybe, but that's still a bug.  No one
has yet gone through a dump/reload cycle in which they had to face this
penalty.


Now that is a strong argument. I knew you would find one.  :-)



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-28 Thread Bruce Momjian
Christopher Kings-Lynne wrote:
 You could just as easily argue that the lack of integrity testing at 
 data load time was equally a bug.
 
 I think we need someway of telling postgres to suppress a foreign key check.
 
 The main problem is that the foreign key column is often not indexed.

As I remember, the new code is showing full table checks of a few
seconds, rather than minutes, but I agree we do need a way to turn off
checks some times.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-28 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 I think we need someway of telling postgres to suppress a foreign key check.

Well, the subtext argument here is do we fix it by providing a way to
suppress the check, or do we fix it by making the check fast enough to
be tolerable?

I think the advantages of choice (b) are obvious --- it doesn't allow
bogus data to be loaded accidentally, and it doesn't create a problem
with loading existing 7.3 dump files that don't know how to suppress the
check.

If we find there is no way to do (b) acceptably well, then and only then
would I want to consider (a).

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-28 Thread Christopher Kings-Lynne

I think the advantages of choice (b) are obvious --- it doesn't allow
bogus data to be loaded accidentally, and it doesn't create a problem
with loading existing 7.3 dump files that don't know how to suppress the
check.
OK, I didn't realise there was a (b).  I volunteer to do speed tests on 
data reloading on real data for our site.

Chris



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-27 Thread Tom Lane
[ continuing a discussion from mid-August ]

Stephan Szabo [EMAIL PROTECTED] writes:
 I assume what you have in mind is to replace
 validateForeignKeyConstraint() with something that does a join of the
 two tables via an SPI command.

 It'd probably be:
 MATCH unspecified:
  SELECT keycolumns FROM referencing_table WHERE
   (keycolumns) NOT IN (SELECT refcols FROM referenced_table
 WHERE refcol1 IS NOT NULL AND ... )
  AND keycolumn1 IS NOT NULL AND ...;

 MATCH FULL: (something like, I haven't tried it)
  SELECT keycolumns FROM referencing_table WHERE
   ((keycolumns) NOT IN (SELECT refcols FROM referenced_table
WHERE refcol1 IS NOT NULL AND ...)
   AND
(keycolumn1 IS NOT NULL AND ...)
   )
   OR ((keycolumn1 IS NOT NULL)!=(keycolumn2 IS NOT NULL) OR ...)

I thought of what seems to be a better design for the check query: use
a LEFT JOIN and check for NULL in the righthand joined column.  For
example, I think a MATCH UNSPECIFIED on two columns could be tested like
this:

select f1,f2
  from fk left join pk on (fk.f1=pk.f1 and fk.f2=pk.f2)
  where pk.f1 is null and (fk.f1 is not null and fk.f2 is not null);

and MATCH FULL is the same except

  where pk.f1 is null and (fk.f1 is not null OR fk.f2 is not null);

MATCH PARTIAL would be harder; I think you'd need to generate a separate
query for each subset of the columns, in which you would probe for
unmatched rows having exactly that subset non-null.  But it could be
done.

Do you see any logical error here?

In some preliminary tests, the planner seems to be able to choose
reasonable plans for this type of query even without pg_statistic data,
as long as it knows the table sizes (which it would do after CREATE INDEX).
So it would work reasonably well during a pg_dump script, I think.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-08-15 Thread Tom Lane
Stephan Szabo [EMAIL PROTECTED] writes:
 On Fri, 15 Aug 2003, Christopher Kings-Lynne wrote:
 I can also attest to the horrendously long time it takes to restore the ADD
 FOREIGN KEY section...

 That really needs to be rewritten to do a single check over the table
 rather than running the constraint for every row.  I keep meaning to get
 around to it and never actually do. :(  I'm not sure that in practice
 you'll get a better plan at restore time depending on what the default
 statistics give you.

In simple cases I think that the creation of indexes would be enough to
get you a passable plan --- CREATE INDEX does update pg_class.reltuples,
so the planner will know how big the tables are, and for single-column
primary keys the existence of a unique index is enough to cue the
planner that the column is unique, even without any ANALYZE stats.
Those are the biggest levers on the plan choice.

This assumes that pg_dump always dumps CREATE INDEX before ADD FOREIGN
KEY; I'm not certain if there's anything to enforce that at the
moment...

I assume what you have in mind is to replace
validateForeignKeyConstraint() with something that does a join of the
two tables via an SPI command.  But supposing that we want to keep the
present ability to report (one of) the failing key values, it seems
like the query has to look like
SELECT keycolumns FROM referencing_table WHERE
keycolumns NOT IN (SELECT refcols FROM referenced_table);
which is only gonna do the right thing for one of the MATCH styles
(not sure which, offhand ... actually it may not do the right thing
for any match style if there are nulls in referenced_table ...).
How would you make it work for all the MATCH styles?  And will it
really be all that efficient?  (NOT IN is a lot more circumscribed
than IN.)

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-08-15 Thread Stephan Szabo

On Fri, 15 Aug 2003, Tom Lane wrote:

 Stephan Szabo [EMAIL PROTECTED] writes:
  On Fri, 15 Aug 2003, Christopher Kings-Lynne wrote:
  I can also attest to the horrendously long time it takes to restore the ADD
  FOREIGN KEY section...

  That really needs to be rewritten to do a single check over the table
  rather than running the constraint for every row.  I keep meaning to get
  around to it and never actually do. :(  I'm not sure that in practice
  you'll get a better plan at restore time depending on what the default
  statistics give you.

 In simple cases I think that the creation of indexes would be enough to
 get you a passable plan --- CREATE INDEX does update pg_class.reltuples,
 so the planner will know how big the tables are, and for single-column
 primary keys the existence of a unique index is enough to cue the
 planner that the column is unique, even without any ANALYZE stats.
 Those are the biggest levers on the plan choice.

 This assumes that pg_dump always dumps CREATE INDEX before ADD FOREIGN
 KEY; I'm not certain if there's anything to enforce that at the
 moment...

 I assume what you have in mind is to replace
 validateForeignKeyConstraint() with something that does a join of the
 two tables via an SPI command.  But supposing that we want to keep the
 present ability to report (one of) the failing key values, it seems
 like the query has to look like
   SELECT keycolumns FROM referencing_table WHERE
   keycolumns NOT IN (SELECT refcols FROM referenced_table);
 which is only gonna do the right thing for one of the MATCH styles
 (not sure which, offhand ... actually it may not do the right thing
 for any match style if there are nulls in referenced_table ...).

Yes, in practice, you'd have to put IS NOT NULL checks in the subselect,
which is fine for the two match types we support since a referenced row
with a NULL isn't a choice for a referenced row for those.  I think MATCH
PARTIAL might have to fall back to the repeated check unless we can make
the query work which would be harder because you only want to compare the
columns for a particular row where the keycolumn case is not null and I
can't think of a query for that that'd be particularly clean and likely to
be fast, then again I don't think the constraint would be either. :(

It'd probably be:
MATCH unspecified:
 SELECT keycolumns FROM referencing_table WHERE
  (keycolumns) NOT IN (SELECT refcols FROM referenced_table
WHERE refcol1 IS NOT NULL AND ... )
 AND keycolumn1 IS NOT NULL AND ...;

MATCH FULL: (something like, I haven't tried it)
 SELECT keycolumns FROM referencing_table WHERE
  ((keycolumns) NOT IN (SELECT refcols FROM referenced_table
   WHERE refcol1 IS NOT NULL AND ...)
  AND
   (keycolumn1 IS NOT NULL AND ...)
  )
  OR ((keycolumn1 IS NOT NULL)!=(keycolumn2 IS NOT NULL) OR ...)

 How would you make it work for all the MATCH styles?  And will it
 really be all that efficient?  (NOT IN is a lot more circumscribed
 than IN.)

I'm not really sure yet.  Limited tests seem to show that it'll probably
be as fast if not faster for all reasonable cases, but I'd want to
generate a much larger random data set and actually put it in to make a
fair comparison (maybe temporarily with a set to allow people to try both
cases on real world data).  One other advantage here is that we don't need
to get row locks while checking this if we've already gotten the exclusive
table locks on both tables involved.  I'm not sure if we do that currently
though.



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly