Re: [HACKERS] Tweaking Foreign Keys for larger tables
On 11/07/2014 08:15 AM, Simon Riggs wrote: How about we set lock level on each Foreign Key like this [USING LOCK [lock level]] level is one of KEY - [FOR KEY SHARE] - default ROW - [FOR SHARE] TABLE SHARE - [ ] TABLE EXCLUSIVE - [FOR TABLE EXCLUSIVE] I like the idea and thinks it solves the problem in a pretty neat way, but I do not see any practical need for other levels than the highest and the lowest of those. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Tweaking Foreign Keys for larger tables
Peter Eisentraut-2 wrote On 10/31/14 6:19 AM, Simon Riggs wrote: Various ways of tweaking Foreign Keys are suggested that are helpful for larger databases. *INITIALLY NOT ENFORCED FK created, but is not enforced during DML. Will be/Must be marked NOT VALID when first created. We can run a VALIDATE on the constraint at any time; if it passes the check it is marked VALID and presumed to stay that way until the next VALIDATE run. Does that mean the FK would become invalid after every DML operation, until you expicitly revalidate it? Is that practical? My read is that it means that you can insert invalid data but the system will pretend it is valid unless someone asks it for confirmation. Upon validation the FK will become invalid until the discrepancy is fixed and another validation is performed. ON DELETE IGNORE ON UPDATE IGNORE If we allow this specification then the FK is one way - we check the existence of a row in the referenced table, but there is no need for a trigger on the referenced table to enforce an action on delete or update, so no need to lock the referenced table when adding FKs. Are you worried about locking the table at all, or about having to lock many rows? Wouldn't you at least need some kind of trigger to make the constraint invalid as soon as any record is updated or removed from the referenced table since in all likelihood the FK relationship has just been broken? How expensive is validation going to be? Especially, can validation occur incrementally or does every record need to be validated each time? Is this useful for master-detail setups, record-category, or both (others?)? Will optimizations over invalid data give incorrect answers and in what specific scenarios can that be expected? I get the idea of having a system that let's you skip constant data validation since in all likelihood once in production some scenarios would be extremely resistant to the introduction of errors and can be dealt with on-the-fly. Trust only since the verify is expensive - but keep the option open and the model faithfully represented. I don't know that I would ever think to use this in my world since the additional admin effort is obvious but the cost of the thing I'd be avoiding is vague. As it is now someone could simply drop their FK constraints and run a validation query periodically to see if the data being inserted is correct. That doesn't allow for optimizations to take place though and so this is an improvement; but the documentation and support aspects for a keep/drop decision can be fleshed out first as that would be valuable in its own right. Then go about figuring out how to make a hybrid implementation work. Put another way: at what point does the cost of the FK constraint outweigh the optimization savings? While size is obvious both schema and read/write patterns likely have a significant influence. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Tweaking-Foreign-Keys-for-larger-tables-tp5825162p5825891.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Tweaking Foreign Keys for larger tables
On 5 November 2014 21:15, Peter Eisentraut pete...@gmx.net wrote: ON DELETE IGNORE ON UPDATE IGNORE If we allow this specification then the FK is one way - we check the existence of a row in the referenced table, but there is no need for a trigger on the referenced table to enforce an action on delete or update, so no need to lock the referenced table when adding FKs. Are you worried about locking the table at all, or about having to lock many rows? This is useful for smaller, highly referenced tables that don't change much, if ever. In that case the need for correctness thru locking is minimal. If we do lock it will cause very high multixact traffic, so that is worth avoiding alone. The main issue is referencing a table many times. Getting a full table lock can halt all FK checks, so skipping adding the trigger altogether avoids freezing up everything just for a trigger that doesn't actually do much. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Tweaking Foreign Keys for larger tables
On 5 November 2014 21:15, Peter Eisentraut pete...@gmx.net wrote: On 10/31/14 6:19 AM, Simon Riggs wrote: Various ways of tweaking Foreign Keys are suggested that are helpful for larger databases. *INITIALLY NOT ENFORCED FK created, but is not enforced during DML. Will be/Must be marked NOT VALID when first created. We can run a VALIDATE on the constraint at any time; if it passes the check it is marked VALID and presumed to stay that way until the next VALIDATE run. Does that mean the FK would become invalid after every DML operation, until you expicitly revalidate it? Is that practical? I think so. We store the validity on the relcache entry. Constraint would add a statement-level after trigger for insert, update, delete and trigger, which issues a relcache invalidation if the state was marked valid. Marked as deferrable initially deferred. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Tweaking Foreign Keys for larger tables
On 11/6/14, 2:58 AM, Simon Riggs wrote: On 5 November 2014 21:15, Peter Eisentraut pete...@gmx.net wrote: On 10/31/14 6:19 AM, Simon Riggs wrote: Various ways of tweaking Foreign Keys are suggested that are helpful for larger databases. *INITIALLY NOT ENFORCED FK created, but is not enforced during DML. Will be/Must be marked NOT VALID when first created. We can run a VALIDATE on the constraint at any time; if it passes the check it is marked VALID and presumed to stay that way until the next VALIDATE run. Does that mean the FK would become invalid after every DML operation, until you expicitly revalidate it? Is that practical? I think so. We store the validity on the relcache entry. Constraint would add a statement-level after trigger for insert, update, delete and trigger, which issues a relcache invalidation if the state was marked valid. Marked as deferrable initially deferred. I don't think you'd need to invalidate on insert, or on an update that didn't touch a referenced key. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Tweaking Foreign Keys for larger tables
On Thu, Nov 6, 2014 at 10:29 AM, Jim Nasby-5 [via PostgreSQL] ml-node+s1045698n582596...@n5.nabble.com wrote: On 11/6/14, 2:58 AM, Simon Riggs wrote: On 5 November 2014 21:15, Peter Eisentraut [hidden email] http://user/SendEmail.jtp?type=nodenode=5825967i=0 wrote: On 10/31/14 6:19 AM, Simon Riggs wrote: Various ways of tweaking Foreign Keys are suggested that are helpful for larger databases. *INITIALLY NOT ENFORCED FK created, but is not enforced during DML. Will be/Must be marked NOT VALID when first created. We can run a VALIDATE on the constraint at any time; if it passes the check it is marked VALID and presumed to stay that way until the next VALIDATE run. Does that mean the FK would become invalid after every DML operation, until you expicitly revalidate it? Is that practical? I think so. We store the validity on the relcache entry. Constraint would add a statement-level after trigger for insert, update, delete and trigger, which issues a relcache invalidation if the state was marked valid. Marked as deferrable initially deferred. I don't think you'd need to invalidate on insert, Why? Since the FK is not enforced there is no guarantee that what you just inserted is valid or on an update that didn't touch a referenced key. OK - but you would still need the trigger on the FK columns DELETE is OK as well since you cannot invalidate the constraint by simply removing the referencing row. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Tweaking-Foreign-Keys-for-larger-tables-tp5825162p5825970.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Re: [HACKERS] Tweaking Foreign Keys for larger tables
Simon Riggs wrote: On 5 November 2014 21:15, Peter Eisentraut pete...@gmx.net wrote: ON DELETE IGNORE ON UPDATE IGNORE If we allow this specification then the FK is one way - we check the existence of a row in the referenced table, but there is no need for a trigger on the referenced table to enforce an action on delete or update, so no need to lock the referenced table when adding FKs. Are you worried about locking the table at all, or about having to lock many rows? This is useful for smaller, highly referenced tables that don't change much, if ever. In that case the need for correctness thru locking is minimal. If we do lock it will cause very high multixact traffic, so that is worth avoiding alone. This seems like a can of worms to me. How about the ability to mark a table READ ONLY, so that insert/update/delete operations on it raise an error? For such tables, you can just assume that tuples never go away, which can help optimize some ri_triggers.c queries by doing plain SELECT, not SELECT FOR KEY SHARE. If you later need to add rows to the table, you set it READ WRITE, and then ri_triggers.c automatically start using FOR KEY SHARE; add/modify to your liking, then set READ ONLY again. So you incur the cost of tuple locking only while you have the table open for writes. This way we don't get into the mess of reasoning about foreign keys that might be violated some of the time. There's a side effect of tables being READ ONLY which is that tuple freezing can be optimized as well. I vaguely recall we have discussed this. It's something like SET READ ONLY, then freeze it, which sets its relfrozenxid to 0 or maybe FrozenXid; vacuum knows it can ignore the table for freezing purposes. When SET READ WRITE, relfrozenxid jumps to RecentXmin. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Tweaking Foreign Keys for larger tables
Alvaro Herrera alvhe...@2ndquadrant.com wrote: Simon Riggs wrote: On 5 November 2014 21:15, Peter Eisentraut pete...@gmx.net wrote: ON DELETE IGNORE ON UPDATE IGNORE If we allow this specification then the FK is one way - we check the existence of a row in the referenced table, but there is no need for a trigger on the referenced table to enforce an action on delete or update, so no need to lock the referenced table when adding FKs. Are you worried about locking the table at all, or about having to lock many rows? This is useful for smaller, highly referenced tables that don't change much, if ever. In that case the need for correctness thru locking is minimal. If we do lock it will cause very high multixact traffic, so that is worth avoiding alone. This seems like a can of worms to me. How about the ability to mark a table READ ONLY, so that insert/update/delete operations on it raise an error? For such tables, you can just assume that tuples never go away, which can help optimize some ri_triggers.c queries by doing plain SELECT, not SELECT FOR KEY SHARE. If you later need to add rows to the table, you set it READ WRITE, and then ri_triggers.c automatically start using FOR KEY SHARE; add/modify to your liking, then set READ ONLY again. So you incur the cost of tuple locking only while you have the table open for writes. This way we don't get into the mess of reasoning about foreign keys that might be violated some of the time. On its face, that sounds more promising to me. There's a side effect of tables being READ ONLY which is that tuple freezing can be optimized as well. I vaguely recall we have discussed this. It's something like SET READ ONLY, then freeze it, which sets its relfrozenxid to 0 or maybe FrozenXid; vacuum knows it can ignore the table for freezing purposes. When SET READ WRITE, relfrozenxid jumps to RecentXmin. It could also allow a (potentially large) optimization to serializable transactions -- there is no need to take any predicate locks on a table or its indexes if it is read only. To safely transition a table from read only to read write you would need at least two flags (similar in some ways to indisvalid and indisready) -- one to say whether any of these read only optimizations are allowed, and another flag that would only be set after all transactions which might have seen the read only state have completed which actually allows writes. Or that could be done with a char column with three states. So on transition to read only you would flag it as non-writable, and after all transactions which might have seen it in a writable state complete you flag it as allowing read only optimizations. To transition to read write you disable the optimizations first and wait before actually flagging it as read write. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Tweaking Foreign Keys for larger tables
On 11/6/14, 11:49 AM, David G Johnston wrote: Constraint would add a statement-level after trigger for insert, update, delete and trigger, which issues a relcache invalidation if the state was marked valid. Marked as deferrable initially deferred. I don't think you'd need to invalidate on insert, Why? Since the FK is not enforced there is no guarantee that what you just inserted is valid I'm talking about the referenced (aka 'parent') table, not the referring table. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Tweaking Foreign Keys for larger tables
On 6 November 2014 20:47, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Simon Riggs wrote: ... In that case the need for correctness thru locking is minimal. If we do lock it will cause very high multixact traffic, so that is worth avoiding alone. This seems like a can of worms to me. How about the ability to mark a table READ ONLY, so that insert/update/delete operations on it raise an error? For such tables, you can just assume that tuples never go away, which can help optimize some ri_triggers.c queries by doing plain SELECT, not SELECT FOR KEY SHARE. If you later need to add rows to the table, you set it READ WRITE, and then ri_triggers.c automatically start using FOR KEY SHARE; add/modify to your liking, then set READ ONLY again. So you incur the cost of tuple locking only while you have the table open for writes. How about we set lock level on each Foreign Key like this [USING LOCK [lock level]] level is one of KEY - [FOR KEY SHARE] - default ROW - [FOR SHARE] TABLE SHARE - [ ] TABLE EXCLUSIVE - [FOR TABLE EXCLUSIVE] which introduces these new level descriptions TABLE SHARE - is default behavior of SELECT TABLE EXCLUSIVE - we lock the referenced table against all writes - this allows the table to be fully cached for use in speeding up checks [FOR TABLE EXCLUSIVE] - uses ShareRowExclusiveLock The last level is like Read Only tables apart from the fact that they can be written to when needed, but we optimize things on the assumption that such writes are very rare. We could also add Read Only tables as well, but I don't see as much use for them. Sounds like you'd spend a lot of time with ALTER TABLE as you turn it on and off. I'd like to be able to do that automatically as needed. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Tweaking Foreign Keys for larger tables
On 10/31/14 6:19 AM, Simon Riggs wrote: Various ways of tweaking Foreign Keys are suggested that are helpful for larger databases. *INITIALLY NOT ENFORCED FK created, but is not enforced during DML. Will be/Must be marked NOT VALID when first created. We can run a VALIDATE on the constraint at any time; if it passes the check it is marked VALID and presumed to stay that way until the next VALIDATE run. Does that mean the FK would become invalid after every DML operation, until you expicitly revalidate it? Is that practical? ON DELETE IGNORE ON UPDATE IGNORE If we allow this specification then the FK is one way - we check the existence of a row in the referenced table, but there is no need for a trigger on the referenced table to enforce an action on delete or update, so no need to lock the referenced table when adding FKs. Are you worried about locking the table at all, or about having to lock many rows? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Tweaking Foreign Keys for larger tables
Various ways of tweaking Foreign Keys are suggested that are helpful for larger databases. * Deferrable Enforcement Timing Clause * NOT DEFERRABLE - immediate execution * DEFERRABLE *INITIALLY IMMEDIATE - existing *INITIALLY DEFERRED - existing *INITIALLY NOT ENFORCED FK created, but is not enforced during DML. Will be/Must be marked NOT VALID when first created. We can run a VALIDATE on the constraint at any time; if it passes the check it is marked VALID and presumed to stay that way until the next VALIDATE run. If it fails that check the FK would be marked as NOT VALID, causing it to be no longer useful for optimization. This allows FKs to be checked in bulk, rather than executing during front-end code path, but yet still be there for optimization and documentation (or visibility by tools etc). There is no corresponding SET CONSTRAINTs call for the NOT ENFORCED case, since that would require us to mark the constraint as not valid. * Referenced Table actions ON DELETE IGNORE ON UPDATE IGNORE If we allow this specification then the FK is one way - we check the existence of a row in the referenced table, but there is no need for a trigger on the referenced table to enforce an action on delete or update, so no need to lock the referenced table when adding FKs. This is very useful for very highly referenced tables. Or for larger tables where we aren't planning on deleting or updating the referenced table without also deleting or updating the referencing table. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers