Re: [HACKERS] Tweaking Foreign Keys for larger tables

2014-11-07 Thread Andreas Karlsson

On 11/07/2014 08:15 AM, Simon Riggs wrote:

How about we set lock level on each Foreign Key like this

[USING LOCK [lock level]]

level is one of
KEY - [FOR KEY SHARE] - default
ROW -  [FOR SHARE]
TABLE SHARE - [ ]
TABLE EXCLUSIVE - [FOR TABLE EXCLUSIVE]


I like the idea and thinks it solves the problem in a pretty neat way, 
but I do not see any practical need for other levels than the highest 
and the lowest of those.


Andreas



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tweaking Foreign Keys for larger tables

2014-11-06 Thread David G Johnston
Peter Eisentraut-2 wrote
 On 10/31/14 6:19 AM, Simon Riggs wrote:
 Various ways of tweaking Foreign Keys are suggested that are helpful
 for larger databases.
 
 *INITIALLY NOT ENFORCED
 FK created, but is not enforced during DML.
 Will be/Must be marked NOT VALID when first created.
 We can run a VALIDATE on the constraint at any time; if it passes the
 check it is marked VALID and presumed to stay that way until the next
 VALIDATE run.
 
 Does that mean the FK would become invalid after every DML operation,
 until you expicitly revalidate it?  Is that practical?

My read is that it means that you can insert invalid data but the system
will pretend it is valid unless someone asks it for confirmation.  Upon
validation the FK will become invalid until the discrepancy is fixed and
another validation is performed.


 ON DELETE IGNORE
 ON UPDATE IGNORE
 If we allow this specification then the FK is one way - we check the
 existence of a row in the referenced table, but there is no need for a
 trigger on the referenced table to enforce an action on delete or
 update, so no need to lock the referenced table when adding FKs.
 
 Are you worried about locking the table at all, or about having to lock
 many rows?

Wouldn't you at least need some kind of trigger to make the constraint
invalid as soon as any record is updated or removed from the referenced
table since in all likelihood the FK relationship has just been broken?


How expensive is validation going to be?  Especially, can validation occur
incrementally or does every record need to be validated each time?

Is this useful for master-detail setups, record-category, or both (others?)?

Will optimizations over invalid data give incorrect answers and in what
specific scenarios can that be expected?

I get the idea of having a system that let's you skip constant data
validation since in all likelihood once in production some scenarios would
be extremely resistant to the introduction of errors and can be dealt with
on-the-fly.  Trust only since the verify is expensive - but keep the option
open and the model faithfully represented.

I don't know that I would ever think to use this in my world since the
additional admin effort is obvious but the cost of the thing I'd be avoiding
is vague.  As it is now someone could simply drop their FK constraints and
run a validation query periodically to see if the data being inserted is
correct.  That doesn't allow for optimizations to take place though and so
this is an improvement; but the documentation and support aspects for a
keep/drop decision can be fleshed out first as that would be valuable in its
own right.  Then go about figuring out how to make a hybrid implementation
work.

Put another way: at what point does the cost of the FK constraint outweigh
the optimization savings?  While size is obvious both schema and read/write
patterns likely have a significant influence.

David J.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Tweaking-Foreign-Keys-for-larger-tables-tp5825162p5825891.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tweaking Foreign Keys for larger tables

2014-11-06 Thread Simon Riggs
On 5 November 2014 21:15, Peter Eisentraut pete...@gmx.net wrote:

 ON DELETE IGNORE
 ON UPDATE IGNORE
 If we allow this specification then the FK is one way - we check the
 existence of a row in the referenced table, but there is no need for a
 trigger on the referenced table to enforce an action on delete or
 update, so no need to lock the referenced table when adding FKs.

 Are you worried about locking the table at all, or about having to lock
 many rows?

This is useful for smaller, highly referenced tables that don't change
much, if ever.

In that case the need for correctness thru locking is minimal. If we
do lock it will cause very high multixact traffic, so that is worth
avoiding alone.

The main issue is referencing a table many times. Getting a full table
lock can halt all FK checks, so skipping adding the trigger altogether
avoids freezing up everything just for a trigger that doesn't actually
do much.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tweaking Foreign Keys for larger tables

2014-11-06 Thread Simon Riggs
On 5 November 2014 21:15, Peter Eisentraut pete...@gmx.net wrote:
 On 10/31/14 6:19 AM, Simon Riggs wrote:
 Various ways of tweaking Foreign Keys are suggested that are helpful
 for larger databases.

 *INITIALLY NOT ENFORCED
 FK created, but is not enforced during DML.
 Will be/Must be marked NOT VALID when first created.
 We can run a VALIDATE on the constraint at any time; if it passes the
 check it is marked VALID and presumed to stay that way until the next
 VALIDATE run.

 Does that mean the FK would become invalid after every DML operation,
 until you expicitly revalidate it?  Is that practical?

I think so.

We store the validity on the relcache entry.

Constraint would add a statement-level after trigger for insert,
update, delete and trigger, which issues a relcache invalidation if
the state was marked valid. Marked as deferrable initially deferred.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tweaking Foreign Keys for larger tables

2014-11-06 Thread Jim Nasby

On 11/6/14, 2:58 AM, Simon Riggs wrote:

On 5 November 2014 21:15, Peter Eisentraut pete...@gmx.net wrote:

On 10/31/14 6:19 AM, Simon Riggs wrote:

Various ways of tweaking Foreign Keys are suggested that are helpful
for larger databases.



*INITIALLY NOT ENFORCED
FK created, but is not enforced during DML.
Will be/Must be marked NOT VALID when first created.
We can run a VALIDATE on the constraint at any time; if it passes the
check it is marked VALID and presumed to stay that way until the next
VALIDATE run.


Does that mean the FK would become invalid after every DML operation,
until you expicitly revalidate it?  Is that practical?


I think so.

We store the validity on the relcache entry.

Constraint would add a statement-level after trigger for insert,
update, delete and trigger, which issues a relcache invalidation if
the state was marked valid. Marked as deferrable initially deferred.


I don't think you'd need to invalidate on insert, or on an update that didn't 
touch a referenced key.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tweaking Foreign Keys for larger tables

2014-11-06 Thread David G Johnston
On Thu, Nov 6, 2014 at 10:29 AM, Jim Nasby-5 [via PostgreSQL] 
ml-node+s1045698n582596...@n5.nabble.com wrote:

 On 11/6/14, 2:58 AM, Simon Riggs wrote:

  On 5 November 2014 21:15, Peter Eisentraut [hidden email]
 http://user/SendEmail.jtp?type=nodenode=5825967i=0 wrote:
  On 10/31/14 6:19 AM, Simon Riggs wrote:
  Various ways of tweaking Foreign Keys are suggested that are helpful
  for larger databases.
 
  *INITIALLY NOT ENFORCED
  FK created, but is not enforced during DML.
  Will be/Must be marked NOT VALID when first created.
  We can run a VALIDATE on the constraint at any time; if it passes the
  check it is marked VALID and presumed to stay that way until the next
  VALIDATE run.
 
  Does that mean the FK would become invalid after every DML operation,
  until you expicitly revalidate it?  Is that practical?
 
  I think so.
 
  We store the validity on the relcache entry.
 
  Constraint would add a statement-level after trigger for insert,
  update, delete and trigger, which issues a relcache invalidation if
  the state was marked valid. Marked as deferrable initially deferred.

 I don't think you'd need to invalidate on insert,


​Why?  Since the FK is not enforced there is no guarantee that what you
just inserted is valid
​

 or on an update that didn't touch a referenced key.


​OK​ - but you would still need the trigger on the FK columns

DELETE is OK as well since you cannot invalidate the constraint by simply
removing the referencing row.

David J.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Tweaking-Foreign-Keys-for-larger-tables-tp5825162p5825970.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

Re: [HACKERS] Tweaking Foreign Keys for larger tables

2014-11-06 Thread Alvaro Herrera
Simon Riggs wrote:
 On 5 November 2014 21:15, Peter Eisentraut pete...@gmx.net wrote:
 
  ON DELETE IGNORE
  ON UPDATE IGNORE
  If we allow this specification then the FK is one way - we check the
  existence of a row in the referenced table, but there is no need for a
  trigger on the referenced table to enforce an action on delete or
  update, so no need to lock the referenced table when adding FKs.
 
  Are you worried about locking the table at all, or about having to lock
  many rows?
 
 This is useful for smaller, highly referenced tables that don't change
 much, if ever.
 
 In that case the need for correctness thru locking is minimal. If we
 do lock it will cause very high multixact traffic, so that is worth
 avoiding alone.

This seems like a can of worms to me.  How about the ability to mark a
table READ ONLY, so that insert/update/delete operations on it raise an
error?  For such tables, you can just assume that tuples never go away,
which can help optimize some ri_triggers.c queries by doing plain
SELECT, not SELECT FOR KEY SHARE.

If you later need to add rows to the table, you set it READ WRITE, and
then ri_triggers.c automatically start using FOR KEY SHARE; add/modify
to your liking, then set READ ONLY again.  So you incur the cost of
tuple locking only while you have the table open for writes.

This way we don't get into the mess of reasoning about foreign keys that
might be violated some of the time.

There's a side effect of tables being READ ONLY which is that tuple
freezing can be optimized as well.  I vaguely recall we have discussed
this.  It's something like SET READ ONLY, then freeze it, which sets its
relfrozenxid to 0 or maybe FrozenXid; vacuum knows it can ignore the
table for freezing purposes.  When SET READ WRITE, relfrozenxid jumps to
RecentXmin.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tweaking Foreign Keys for larger tables

2014-11-06 Thread Kevin Grittner
Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Simon Riggs wrote:
 On 5 November 2014 21:15, Peter Eisentraut pete...@gmx.net wrote:

 ON DELETE IGNORE
 ON UPDATE IGNORE
 If we allow this specification then the FK is one way - we check the
 existence of a row in the referenced table, but there is no need for a
 trigger on the referenced table to enforce an action on delete or
 update, so no need to lock the referenced table when adding FKs.

 Are you worried about locking the table at all, or about having to lock
 many rows?

 This is useful for smaller, highly referenced tables that don't change
 much, if ever.

 In that case the need for correctness thru locking is minimal. If we
 do lock it will cause very high multixact traffic, so that is worth
 avoiding alone.

 This seems like a can of worms to me.  How about the ability to mark a
 table READ ONLY, so that insert/update/delete operations on it raise an
 error?  For such tables, you can just assume that tuples never go away,
 which can help optimize some ri_triggers.c queries by doing plain
 SELECT, not SELECT FOR KEY SHARE.

 If you later need to add rows to the table, you set it READ WRITE, and
 then ri_triggers.c automatically start using FOR KEY SHARE; add/modify
 to your liking, then set READ ONLY again.  So you incur the cost of
 tuple locking only while you have the table open for writes.

 This way we don't get into the mess of reasoning about foreign keys that
 might be violated some of the time.

On its face, that sounds more promising to me.

 There's a side effect of tables being READ ONLY which is that tuple
 freezing can be optimized as well.  I vaguely recall we have discussed
 this.  It's something like SET READ ONLY, then freeze it, which sets its
 relfrozenxid to 0 or maybe FrozenXid; vacuum knows it can ignore the
 table for freezing purposes.  When SET READ WRITE, relfrozenxid jumps to
 RecentXmin.

It could also allow a (potentially large) optimization to 
serializable transactions -- there is no need to take any predicate 
locks on a table or its indexes if it is read only.  To safely 
transition a table from read only to read write you would need at 
least two flags (similar in some ways to indisvalid and indisready) 
-- one to say whether any of these read only optimizations are 
allowed, and another flag that would only be set after all 
transactions which might have seen the read only state have 
completed which actually allows writes.  Or that could be done with 
a char column with three states.  So on transition to read only 
you would flag it as non-writable, and after all transactions which 
might have seen it in a writable state complete you flag it as 
allowing read only optimizations.  To transition to read write you 
disable the optimizations first and wait before actually flagging 
it as read write.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tweaking Foreign Keys for larger tables

2014-11-06 Thread Jim Nasby

On 11/6/14, 11:49 AM, David G Johnston wrote:

 Constraint would add a statement-level after trigger for insert,
 update, delete and trigger, which issues a relcache invalidation if
 the state was marked valid. Marked as deferrable initially deferred.
I don't think you'd need to invalidate on insert,


​Why?  Since the FK is not enforced there is no guarantee that what you just 
inserted is valid


I'm talking about the referenced (aka 'parent') table, not the referring table.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tweaking Foreign Keys for larger tables

2014-11-06 Thread Simon Riggs
On 6 November 2014 20:47, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Simon Riggs wrote:
...
 In that case the need for correctness thru locking is minimal. If we
 do lock it will cause very high multixact traffic, so that is worth
 avoiding alone.

 This seems like a can of worms to me.  How about the ability to mark a
 table READ ONLY, so that insert/update/delete operations on it raise an
 error?  For such tables, you can just assume that tuples never go away,
 which can help optimize some ri_triggers.c queries by doing plain
 SELECT, not SELECT FOR KEY SHARE.

 If you later need to add rows to the table, you set it READ WRITE, and
 then ri_triggers.c automatically start using FOR KEY SHARE; add/modify
 to your liking, then set READ ONLY again.  So you incur the cost of
 tuple locking only while you have the table open for writes.

How about we set lock level on each Foreign Key like this

[USING LOCK [lock level]]

level is one of
KEY - [FOR KEY SHARE] - default
ROW -  [FOR SHARE]
TABLE SHARE - [ ]
TABLE EXCLUSIVE - [FOR TABLE EXCLUSIVE]

which introduces these new level descriptions
TABLE SHARE - is default behavior of SELECT
TABLE EXCLUSIVE - we lock the referenced table against all writes -
this allows the table to be fully cached for use in speeding up checks
 [FOR TABLE EXCLUSIVE] - uses ShareRowExclusiveLock

The last level is like Read Only tables apart from the fact that
they can be written to when needed, but we optimize things on the
assumption that such writes are very rare.

We could also add Read Only tables as well, but I don't see as much
use for them. Sounds like you'd spend a lot of time with ALTER TABLE
as you turn it on and off. I'd like to be able to do that
automatically as needed.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tweaking Foreign Keys for larger tables

2014-11-05 Thread Peter Eisentraut
On 10/31/14 6:19 AM, Simon Riggs wrote:
 Various ways of tweaking Foreign Keys are suggested that are helpful
 for larger databases.

 *INITIALLY NOT ENFORCED
 FK created, but is not enforced during DML.
 Will be/Must be marked NOT VALID when first created.
 We can run a VALIDATE on the constraint at any time; if it passes the
 check it is marked VALID and presumed to stay that way until the next
 VALIDATE run.

Does that mean the FK would become invalid after every DML operation,
until you expicitly revalidate it?  Is that practical?

 ON DELETE IGNORE
 ON UPDATE IGNORE
 If we allow this specification then the FK is one way - we check the
 existence of a row in the referenced table, but there is no need for a
 trigger on the referenced table to enforce an action on delete or
 update, so no need to lock the referenced table when adding FKs.

Are you worried about locking the table at all, or about having to lock
many rows?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Tweaking Foreign Keys for larger tables

2014-10-31 Thread Simon Riggs
Various ways of tweaking Foreign Keys are suggested that are helpful
for larger databases.

* Deferrable Enforcement Timing Clause

* NOT DEFERRABLE - immediate execution
* DEFERRABLE
*INITIALLY IMMEDIATE - existing
*INITIALLY DEFERRED - existing
*INITIALLY NOT ENFORCED
FK created, but is not enforced during DML.
Will be/Must be marked NOT VALID when first created.
We can run a VALIDATE on the constraint at any time; if it passes the
check it is marked VALID and presumed to stay that way until the next
VALIDATE run. If it fails that check the FK would be marked as NOT
VALID, causing it to be no longer useful for optimization.
This allows FKs to be checked in bulk, rather than executing during
front-end code path, but yet still be there for optimization and
documentation (or visibility by tools etc).

There is no corresponding SET CONSTRAINTs call for the NOT ENFORCED
case, since that would require us to mark the constraint as not valid.

* Referenced Table actions

ON DELETE IGNORE
ON UPDATE IGNORE
If we allow this specification then the FK is one way - we check the
existence of a row in the referenced table, but there is no need for a
trigger on the referenced table to enforce an action on delete or
update, so no need to lock the referenced table when adding FKs.
This is very useful for very highly referenced tables.
Or for larger tables where we aren't planning on deleting or updating
the referenced table without also deleting or updating the referencing
table.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers