Re: simplifying foreign key/RI checks
On Mon, Apr 11, 2022 at 4:47 PM Amit Langote wrote: > This one has been marked Returned with Feedback in the CF app, which > makes sense given the discussion on -committers [1]. > > Agree with the feedback given that it would be better to address *all* > RI trigger check/action functions in the project of sidestepping SPI > when doing those checks/actions, not only RI_FKey_check_ins / upd() as > the current patch does. I guess that will require thinking a little > bit harder about how to modularize the new implementation so that the > various trigger functions don't end up with their own bespoke > check/action implementations. > > I'll think about that, also consider what Corey proposed in [2], and > try to reformulate this for v16. I've been thinking about this and wondering if the SPI overhead is too big in the other cases (cases where it is the FK table that is to be scanned) that it makes sense to replace the actual planner (invoked via SPI) by a hard-coded mini-planner for the task of figuring out the best way to scan the FK table for a given PK row affected by the main query. Planner's involvement seems necessary in those cases, because the choice of how to scan the FK table is not as clear cut as how to scan the PK table. ISTM, the SPI overhead consists mainly of performing GetCachedPlan() and executor setup/shutdown, which can seem substantial when compared to the core task of scanning the PK/FK table, and does add up over many rows affected by the main query, as seen by the over 2x speedup for the PK table case gained by shaving it off with the proposed patch [1]. In the other cases, the mini-planner will need some cycles of its own, even though maybe not as many as by the use of SPI, so the speedup might be less impressive. Other than coming up with an acceptable implementation for the mini-planner (maybe we have an example in plan_cluster_use_sort() to ape), one more challenge is to figure out a way to implement the CASCADE/SET trigger routines. For those, we might need to introduce restricted forms of ExecUpdate(), ExecDelete() that can be called directly, that is, without a full-fledged plan. Not having to worry about those things does seem like a benefit of just continuing to use the SPI in those cases. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com [1] drop table pk, fk; create table pk (a int primary key); create table fk (a int references pk); insert into pk select generate_series(1, 100); insert into fk select i%100+1 from generate_series(1, 1000) i; Time for the last statement: HEAD: 67566.845 ms (01:07.567) Patched: 26759.627 ms (00:26.760)
Re: simplifying foreign key/RI checks
On Thu, Apr 7, 2022 at 10:05 AM Amit Langote wrote: > There were rebase conflicts with the recently committed > execPartition.c/h changes. While fixing them, I thought maybe > find_leaf_part_for_key() doesn't quite match in style with its > neighbors in execPartition.h, so changed it to > ExecGetLeafPartitionForKey(). This one has been marked Returned with Feedback in the CF app, which makes sense given the discussion on -committers [1]. Agree with the feedback given that it would be better to address *all* RI trigger check/action functions in the project of sidestepping SPI when doing those checks/actions, not only RI_FKey_check_ins / upd() as the current patch does. I guess that will require thinking a little bit harder about how to modularize the new implementation so that the various trigger functions don't end up with their own bespoke check/action implementations. I'll think about that, also consider what Corey proposed in [2], and try to reformulate this for v16. -- Amit Langote EDB: http://www.enterprisedb.com [1] https://www.postgresql.org/message-id/flat/E1ncXX2-000mFt-Pe%40gemulon.postgresql.org [2] https://www.postgresql.org/message-id/flat/CADkLM%3DeZJddpx6RDop-oCrQ%2BJ9R-wfbf6MoLxUUGjbpwTkoUXQ%40mail.gmail.com
Re: simplifying foreign key/RI checks
There were rebase conflicts with the recently committed execPartition.c/h changes. While fixing them, I thought maybe find_leaf_part_for_key() doesn't quite match in style with its neighbors in execPartition.h, so changed it to ExecGetLeafPartitionForKey(). -- Amit Langote EDB: http://www.enterprisedb.com v15-0001-Add-isolation-tests-for-snapshot-behavior-in-ri_.patch Description: Binary data v15-0002-Avoid-using-SPI-for-some-RI-checks.patch Description: Binary data
Re: simplifying foreign key/RI checks
On Mon, Mar 14, 2022 at 6:28 PM Zhihong Yu wrote: > On Mon, Mar 14, 2022 at 1:33 AM Amit Langote wrote: >> On Tue, Jan 18, 2022 at 3:30 PM Amit Langote wrote: >> > v13 is attached. >> >> I noticed that the recent 641f3dffcdf's changes to >> get_constraint_index() made it basically unusable for this patch's >> purposes. >> >> Reading in the thread that led to 641f3dffcdf why >> get_constraint_index() was changed the way it was, I invented in the >> attached updated patch a get_fkey_constraint_index() that is local to >> ri_triggers.c for use by the new ri_ReferencedKeyExists(), replacing >> get_constraint_index() that no longer gives it the index it's looking >> for. >> > > Hi, > + partkey_isnull[j] = (key_nulls[k] == 'n' ? true : false); > > The above can be shortened as: > > partkey_isnull[j] = key_nulls[k] == 'n'; > > +* May neeed to cast each of the individual values of the foreign key > > neeed -> need Both fixed, thanks. -- Amit Langote EDB: http://www.enterprisedb.com v15-0001-Add-isolation-tests-for-snapshot-behavior-in-ri_.patch Description: Binary data v15-0002-Avoid-using-SPI-for-some-RI-checks.patch Description: Binary data
Re: simplifying foreign key/RI checks
On Mon, Mar 14, 2022 at 1:33 AM Amit Langote wrote: > On Tue, Jan 18, 2022 at 3:30 PM Amit Langote > wrote: > > v13 is attached. > > I noticed that the recent 641f3dffcdf's changes to > get_constraint_index() made it basically unusable for this patch's > purposes. > > Reading in the thread that led to 641f3dffcdf why > get_constraint_index() was changed the way it was, I invented in the > attached updated patch a get_fkey_constraint_index() that is local to > ri_triggers.c for use by the new ri_ReferencedKeyExists(), replacing > get_constraint_index() that no longer gives it the index it's looking > for. > > -- > Amit Langote > EDB: http://www.enterprisedb.com Hi, + partkey_isnull[j] = (key_nulls[k] == 'n' ? true : false); The above can be shortened as: partkey_isnull[j] = key_nulls[k] == 'n'; +* May neeed to cast each of the individual values of the foreign key neeed -> need Cheers
Re: simplifying foreign key/RI checks
On Tue, Jan 18, 2022 at 3:30 PM Amit Langote wrote: > v13 is attached. I noticed that the recent 641f3dffcdf's changes to get_constraint_index() made it basically unusable for this patch's purposes. Reading in the thread that led to 641f3dffcdf why get_constraint_index() was changed the way it was, I invented in the attached updated patch a get_fkey_constraint_index() that is local to ri_triggers.c for use by the new ri_ReferencedKeyExists(), replacing get_constraint_index() that no longer gives it the index it's looking for. -- Amit Langote EDB: http://www.enterprisedb.com v14-0001-Add-isolation-tests-for-snapshot-behavior-in-ri_.patch Description: Binary data v14-0002-Avoid-using-SPI-for-some-RI-checks.patch Description: Binary data
Re: simplifying foreign key/RI checks
Thanks for the review. On Tue, Dec 21, 2021 at 5:54 PM Zhihong Yu wrote: > Hi, > > + int lockflags = 0; > + TM_Result test; > + > + lockflags = TUPLE_LOCK_FLAG_LOCK_UPDATE_IN_PROGRESS; > > The above assignment can be meged with the line where variable lockflags is > declared. Sure. > + GetUserIdAndSecContext(_userid, _sec_context); > > save_userid -> saved_userid > save_sec_context -> saved_sec_context I agree that's better though I guess I had kept the names as they were in other functions. Fixed nevertheless. > +* the transaction-snapshot mode. If we didn't push one already, do > > didn't push -> haven't pushed Done. > For ri_PerformCheck(): > > + boolsource_is_pk = true; > > It seems the value of source_is_pk doesn't change - the value true can be > plugged into ri_ExtractValues() calls directly. OK, done. v13 is attached. -- Amit Langote EDB: http://www.enterprisedb.com v13-0002-Avoid-using-SPI-for-some-RI-checks.patch Description: Binary data v13-0001-Add-isolation-tests-for-snapshot-behavior-in-ri_.patch Description: Binary data
Re: simplifying foreign key/RI checks
On Mon, Dec 20, 2021 at 5:17 AM Amit Langote wrote: > On Mon, Dec 20, 2021 at 6:19 PM Zhihong Yu wrote: > > On Sun, Dec 19, 2021 at 10:20 PM Amit Langote > wrote: > >> > >> On Mon, Dec 20, 2021 at 2:00 PM Corey Huinker > wrote: > >> > Sorry for the delay. This patch no longer applies, it has some > conflict with d6f96ed94e73052f99a2e545ed17a8b2fdc1fb8a > >> > >> Thanks Corey for the heads up. Rebased with some cosmetic adjustments. > >> > > Hi, > > > > + Assert(partidx < 0 || partidx < partdesc->nparts); > > + partoid = partdesc->oids[partidx]; > > > > If partidx < 0, do we still need to fill out partoid and is_leaf ? It > seems we can return early based on (should call table_close(rel) first): > > > > + /* No partition found. */ > > + if (partidx < 0) > > + return NULL; > > Good catch, thanks. Patch updated. > > Hi, + int lockflags = 0; + TM_Result test; + + lockflags = TUPLE_LOCK_FLAG_LOCK_UPDATE_IN_PROGRESS; The above assignment can be meged with the line where variable lockflags is declared. + GetUserIdAndSecContext(_userid, _sec_context); save_userid -> saved_userid save_sec_context -> saved_sec_context +* the transaction-snapshot mode. If we didn't push one already, do didn't push -> haven't pushed For ri_PerformCheck(): + boolsource_is_pk = true; It seems the value of source_is_pk doesn't change - the value true can be plugged into ri_ExtractValues() calls directly. Cheers
Re: simplifying foreign key/RI checks
> > > > Good catch, thanks. Patch updated. > > > Applies clean. Passes check-world.
Re: simplifying foreign key/RI checks
On Mon, Dec 20, 2021 at 6:19 PM Zhihong Yu wrote: > On Sun, Dec 19, 2021 at 10:20 PM Amit Langote wrote: >> >> On Mon, Dec 20, 2021 at 2:00 PM Corey Huinker >> wrote: >> > Sorry for the delay. This patch no longer applies, it has some conflict >> > with d6f96ed94e73052f99a2e545ed17a8b2fdc1fb8a >> >> Thanks Corey for the heads up. Rebased with some cosmetic adjustments. >> > Hi, > > + Assert(partidx < 0 || partidx < partdesc->nparts); > + partoid = partdesc->oids[partidx]; > > If partidx < 0, do we still need to fill out partoid and is_leaf ? It seems > we can return early based on (should call table_close(rel) first): > > + /* No partition found. */ > + if (partidx < 0) > + return NULL; Good catch, thanks. Patch updated. -- Amit Langote EDB: http://www.enterprisedb.com v12-0001-Add-isolation-tests-for-snapshot-behavior-in-ri_.patch Description: Binary data v12-0002-Avoid-using-SPI-for-some-RI-checks.patch Description: Binary data
Re: simplifying foreign key/RI checks
On Sun, Dec 19, 2021 at 10:20 PM Amit Langote wrote: > On Mon, Dec 20, 2021 at 2:00 PM Corey Huinker > wrote: > > Sorry for the delay. This patch no longer applies, it has some conflict > with d6f96ed94e73052f99a2e545ed17a8b2fdc1fb8a > > Thanks Corey for the heads up. Rebased with some cosmetic adjustments. > > Hi, + Assert(partidx < 0 || partidx < partdesc->nparts); + partoid = partdesc->oids[partidx]; If partidx < 0, do we still need to fill out partoid and is_leaf ? It seems we can return early based on (should call table_close(rel) first): + /* No partition found. */ + if (partidx < 0) + return NULL; Cheers
Re: simplifying foreign key/RI checks
On Mon, Dec 20, 2021 at 2:00 PM Corey Huinker wrote: > Sorry for the delay. This patch no longer applies, it has some conflict with > d6f96ed94e73052f99a2e545ed17a8b2fdc1fb8a Thanks Corey for the heads up. Rebased with some cosmetic adjustments. -- Amit Langote EDB: http://www.enterprisedb.com v11-0001-Add-isolation-tests-for-snapshot-behavior-in-ri_.patch Description: Binary data v11-0002-Avoid-using-SPI-for-some-RI-checks.patch Description: Binary data
Re: simplifying foreign key/RI checks
> > > > I wasn't able to make much inroads into how we might be able to get > rid of the DETACH-related partition descriptor hacks, the item (3), > though I made some progress on items (1) and (2). > > For (1), the attached 0001 patch adds a new isolation suite > fk-snapshot.spec to exercise snapshot behaviors in the cases where we > no longer go through SPI. It helped find some problems with the > snapshot handling in the earlier versions of the patch, mainly with > partitioned PK tables. It also contains a test along the lines of the > example you showed upthread, which shows that the partition descriptor > hack requiring ActiveSnapshot to be set results in wrong results. > Patch includes the buggy output for that test case and marked as such > in a comment above the test. > > In updated 0002, I fixed things such that the snapshot-setting > required by the partition descriptor hack is independent of > snapshot-setting of the RI query such that it no longer causes the PK > index scan to return rows that the RI query mustn't see. That fixes > the visibility bug illustrated in your example, and as mentioned, also > exercised in the new test suite. > > I also moved find_leaf_pk_rel() into execPartition.c with a new name > and a new set of parameters. > > -- > Amit Langote > EDB: http://www.enterprisedb.com Sorry for the delay. This patch no longer applies, it has some conflict with d6f96ed94e73052f99a2e545ed17a8b2fdc1fb8a
Re: simplifying foreign key/RI checks
On Sat, Nov 13, 2021 at 5:43 AM Tom Lane wrote: > Amit Langote writes: > > On Fri, Nov 12, 2021 at 8:19 AM Tom Lane wrote: > >> Anyway, I think that (1) we should write some more test cases around > >> this behavior, (2) you need to establish the snapshot to use in two > >> different ways for the RI_FKey_check and ri_Check_Pk_Match cases, > >> and (3) something's going to have to be done to repair the behavior > >> in v14 (unless we want to back-patch this into v14, which seems a > >> bit scary). > > > Okay, I'll look into getting 1 and 2 done for this patch and I guess > > work with Alvaro on 3. > > Actually, it seems that DETACH PARTITION is broken for concurrent > serializable/repeatable-read transactions quite independently of > whether they attempt to make any FK checks [1]. If we do what > I speculated about there, namely wait out all such xacts before > detaching, it might be possible to fix (3) just by reverting the > problematic change in ri_triggers.c. I'm thinking the wait would > render it unnecessary to get FK checks to do anything weird about > partition lookup. But I might well be missing something. I wasn't able to make much inroads into how we might be able to get rid of the DETACH-related partition descriptor hacks, the item (3), though I made some progress on items (1) and (2). For (1), the attached 0001 patch adds a new isolation suite fk-snapshot.spec to exercise snapshot behaviors in the cases where we no longer go through SPI. It helped find some problems with the snapshot handling in the earlier versions of the patch, mainly with partitioned PK tables. It also contains a test along the lines of the example you showed upthread, which shows that the partition descriptor hack requiring ActiveSnapshot to be set results in wrong results. Patch includes the buggy output for that test case and marked as such in a comment above the test. In updated 0002, I fixed things such that the snapshot-setting required by the partition descriptor hack is independent of snapshot-setting of the RI query such that it no longer causes the PK index scan to return rows that the RI query mustn't see. That fixes the visibility bug illustrated in your example, and as mentioned, also exercised in the new test suite. I also moved find_leaf_pk_rel() into execPartition.c with a new name and a new set of parameters. -- Amit Langote EDB: http://www.enterprisedb.com v10-0001-Add-isolation-tests-for-snapshot-behavior-in-ri_.patch Description: Binary data v10-0002-Avoid-using-SPI-for-some-RI-checks.patch Description: Binary data
Re: simplifying foreign key/RI checks
Amit Langote writes: > On Fri, Nov 12, 2021 at 8:19 AM Tom Lane wrote: >> Anyway, I think that (1) we should write some more test cases around >> this behavior, (2) you need to establish the snapshot to use in two >> different ways for the RI_FKey_check and ri_Check_Pk_Match cases, >> and (3) something's going to have to be done to repair the behavior >> in v14 (unless we want to back-patch this into v14, which seems a >> bit scary). > Okay, I'll look into getting 1 and 2 done for this patch and I guess > work with Alvaro on 3. Actually, it seems that DETACH PARTITION is broken for concurrent serializable/repeatable-read transactions quite independently of whether they attempt to make any FK checks [1]. If we do what I speculated about there, namely wait out all such xacts before detaching, it might be possible to fix (3) just by reverting the problematic change in ri_triggers.c. I'm thinking the wait would render it unnecessary to get FK checks to do anything weird about partition lookup. But I might well be missing something. regards, tom lane [1] https://www.postgresql.org/message-id/1849918.1636748862%40sss.pgh.pa.us
Re: simplifying foreign key/RI checks
Amit Langote writes: > Whatever mechanism we will use would still need to involve setting a > global Snapshot variable though, right? In v14 we'll certainly still be passing the snapshot(s) to SPI, which will eventually make the snapshot active. With your patch, since we're just handing the snapshot to the scan mechanism, it seems at least theoretically possible that we'd not have to do PushActiveSnapshot on it. Not doing so might be a bad idea however; if there is any user-defined code getting called, it might have expectations about ActiveSnapshot being relevant. On the whole I'd be inclined to say that we still want the RI test_snapshot to be the ActiveSnapshot while performing the test. regards, tom lane
Re: simplifying foreign key/RI checks
On Fri, Nov 12, 2021 at 10:58 AM Tom Lane wrote: > I wrote: > > Anyway, I think that (1) we should write some more test cases around > > this behavior, (2) you need to establish the snapshot to use in two > > different ways for the RI_FKey_check and ri_Check_Pk_Match cases, > > and (3) something's going to have to be done to repair the behavior > > in v14 (unless we want to back-patch this into v14, which seems a > > bit scary). > > I wrote that thinking that point (2), ie fix the choice of snapshots for > these RI queries, would solve the brokenness in partitioned tables, > so that (3) would potentially only require hacking up v14. > > However after thinking more I realize that (2) will break the desired > behavior for concurrent partition detaches, because that's being driven > off ActiveSnapshot. So we really need a solution that decouples the > partition detachment logic from ActiveSnapshot, in both branches. ISTM that the latest snapshot would still have to be passed to the find_inheritance_children_extended() *somehow* by ri_trigger.c. IIUC the problem with using the ActiveSnapshot mechanism to do that is that it causes the SPI query to see even user table rows that it shouldn't be able to, so that is why you say it is too global a mechanism for this hack. Whatever mechanism we will use would still need to involve setting a global Snapshot variable though, right? -- Amit Langote EDB: http://www.enterprisedb.com
Re: simplifying foreign key/RI checks
On Fri, Nov 12, 2021 at 8:19 AM Tom Lane wrote: > Amit Langote writes: > > Rebased patches attached. > > I've spent some more time digging into the snapshot-management angle. Thanks for looking at this. > I think you are right that the crosscheck_snapshot isn't really an > issue because the executor pays no attention to it for SELECT, but > that doesn't mean that there's no problem, because the test_snapshot > behavior is different too. By my reading of it, the intention of the > existing code is to insist that when IsolationUsesXactSnapshot() > is true and we *weren't* saying detectNewRows, the query should be > restricted to only see rows visible to the transaction snapshot. > Which I think is proper: an RR transaction shouldn't be allowed to > insert referencing rows that depend on a referenced row it can't see. > On the other hand, it's reasonable for ri_Check_Pk_Match to use > detectNewRows=true, because in that case what we're doing is allowing > an RR transaction to depend on the continued existence of a PK value > that was deleted and replaced since the start of its transaction. > > It appears to me that commit 71f4c8c6f (DETACH PARTITION CONCURRENTLY) > broke the semantics here, because now things work differently with a > partitioned PK table than with a plain table, thanks to not bothering > to distinguish questions of how to handle partition detachment from > questions of visibility of individual data tuples. We evidently > haven't got test coverage for this :-(, which is perhaps not so > surprising because all this behavior long predates the isolationtester > infrastructure that would've allowed us to test it mechanically. > > Anyway, I think that (1) we should write some more test cases around > this behavior, (2) you need to establish the snapshot to use in two > different ways for the RI_FKey_check and ri_Check_Pk_Match cases, > and (3) something's going to have to be done to repair the behavior > in v14 (unless we want to back-patch this into v14, which seems a > bit scary). Okay, I'll look into getting 1 and 2 done for this patch and I guess work with Alvaro on 3. > It looks like you've addressed the other complaints I raised back in > March, so that's forward progress anyway. I do still find myself a > bit dissatisfied with the code factorization, because it seems like > find_leaf_pk_rel() doesn't belong here but rather in some partitioning > module. OTOH, if that means exposing RI_ConstraintInfo to the world, > that wouldn't be nice either. Hm yeah, fair point about the undesirability of putting partitioning details into ri_triggers.c, so will look into refactoring to avoid that. -- Amit Langote EDB: http://www.enterprisedb.com
Re: simplifying foreign key/RI checks
I wrote: > Anyway, I think that (1) we should write some more test cases around > this behavior, (2) you need to establish the snapshot to use in two > different ways for the RI_FKey_check and ri_Check_Pk_Match cases, > and (3) something's going to have to be done to repair the behavior > in v14 (unless we want to back-patch this into v14, which seems a > bit scary). I wrote that thinking that point (2), ie fix the choice of snapshots for these RI queries, would solve the brokenness in partitioned tables, so that (3) would potentially only require hacking up v14. However after thinking more I realize that (2) will break the desired behavior for concurrent partition detaches, because that's being driven off ActiveSnapshot. So we really need a solution that decouples the partition detachment logic from ActiveSnapshot, in both branches. regards, tom lane
Re: simplifying foreign key/RI checks
Alvaro Herrera writes: > I think we (I) should definitely pursue fixing whatever was broken by > DETACH CONCURRENTLY, back to pg14, independently of this patch ... but > I would appreciate some insight into what the problem is. Here's what I'm on about: regression=# create table pk (f1 int primary key); CREATE TABLE regression=# insert into pk values(1); INSERT 0 1 regression=# create table fk (f1 int references pk); CREATE TABLE regression=# begin isolation level repeatable read ; BEGIN regression=*# select * from pk; -- to establish xact snapshot f1 1 (1 row) now, in another session, do: regression=# insert into pk values(2); INSERT 0 1 back at the RR transaction, we can't see that: regression=*# select * from pk; -- still no row 2 f1 1 (1 row) so we get: regression=*# insert into fk values(1); INSERT 0 1 regression=*# insert into fk values(2); ERROR: insert or update on table "fk" violates foreign key constraint "fk_f1_fkey" DETAIL: Key (f1)=(2) is not present in table "pk". IMO that behavior is correct. If you use READ COMMITTED, then SELECT can see row 2 as soon as it's committed, and so can the FK check, and again that's correct. In v13, the behavior is the same if "pk" is a partitioned table instead of a plain one. In HEAD, it's not: regression=# drop table pk, fk; DROP TABLE regression=# create table pk (f1 int primary key) partition by list(f1); CREATE TABLE regression=# create table pk1 partition of pk for values in (1,2); CREATE TABLE regression=# insert into pk values(1); INSERT 0 1 regression=# create table fk (f1 int references pk); CREATE TABLE regression=# begin isolation level repeatable read ; BEGIN regression=*# select * from pk; -- to establish xact snapshot f1 1 (1 row) --- now insert row 2 in another session regression=*# select * from pk; -- still no row 2 f1 1 (1 row) regression=*# insert into fk values(1); INSERT 0 1 regression=*# insert into fk values(2); INSERT 0 1 regression=*# So I say that's busted, and the cause is this hunk from 71f4c8c6f: @@ -392,11 +392,15 @@ RI_FKey_check(TriggerData *trigdata) /* * Now check that foreign key exists in PK table + * + * XXX detectNewRows must be true when a partitioned table is on the + * referenced side. The reason is that our snapshot must be fresh + * in order for the hack in find_inheritance_children() to work. */ ri_PerformCheck(riinfo, , qplan, fk_rel, pk_rel, NULL, newslot, -false, +pk_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE, SPI_OK_SELECT); if (SPI_finish() != SPI_OK_FINISH) I think you need some signalling mechanism that's less global than ActiveSnapshot to tell the partition-lookup machinery what to do in this context. regards, tom lane
Re: simplifying foreign key/RI checks
On 2021-Nov-11, Tom Lane wrote: > It appears to me that commit 71f4c8c6f (DETACH PARTITION CONCURRENTLY) > broke the semantics here, because now things work differently with a > partitioned PK table than with a plain table, thanks to not bothering > to distinguish questions of how to handle partition detachment from > questions of visibility of individual data tuples. We evidently > haven't got test coverage for this :-(, which is perhaps not so > surprising because all this behavior long predates the isolationtester > infrastructure that would've allowed us to test it mechanically. > > Anyway, I think that (1) we should write some more test cases around > this behavior, (2) you need to establish the snapshot to use in two > different ways for the RI_FKey_check and ri_Check_Pk_Match cases, > and (3) something's going to have to be done to repair the behavior > in v14 (unless we want to back-patch this into v14, which seems a > bit scary). I think we (I) should definitely pursue fixing whatever was broken by DETACH CONCURRENTLY, back to pg14, independently of this patch ... but I would appreciate some insight into what the problem is. -- Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/ "Find a bug in a program, and fix it, and the program will work today. Show the program how to find and fix a bug, and the program will work forever" (Oliver Silfridge)
Re: simplifying foreign key/RI checks
Amit Langote writes: > Rebased patches attached. I've spent some more time digging into the snapshot-management angle. I think you are right that the crosscheck_snapshot isn't really an issue because the executor pays no attention to it for SELECT, but that doesn't mean that there's no problem, because the test_snapshot behavior is different too. By my reading of it, the intention of the existing code is to insist that when IsolationUsesXactSnapshot() is true and we *weren't* saying detectNewRows, the query should be restricted to only see rows visible to the transaction snapshot. Which I think is proper: an RR transaction shouldn't be allowed to insert referencing rows that depend on a referenced row it can't see. On the other hand, it's reasonable for ri_Check_Pk_Match to use detectNewRows=true, because in that case what we're doing is allowing an RR transaction to depend on the continued existence of a PK value that was deleted and replaced since the start of its transaction. It appears to me that commit 71f4c8c6f (DETACH PARTITION CONCURRENTLY) broke the semantics here, because now things work differently with a partitioned PK table than with a plain table, thanks to not bothering to distinguish questions of how to handle partition detachment from questions of visibility of individual data tuples. We evidently haven't got test coverage for this :-(, which is perhaps not so surprising because all this behavior long predates the isolationtester infrastructure that would've allowed us to test it mechanically. Anyway, I think that (1) we should write some more test cases around this behavior, (2) you need to establish the snapshot to use in two different ways for the RI_FKey_check and ri_Check_Pk_Match cases, and (3) something's going to have to be done to repair the behavior in v14 (unless we want to back-patch this into v14, which seems a bit scary). It looks like you've addressed the other complaints I raised back in March, so that's forward progress anyway. I do still find myself a bit dissatisfied with the code factorization, because it seems like find_leaf_pk_rel() doesn't belong here but rather in some partitioning module. OTOH, if that means exposing RI_ConstraintInfo to the world, that wouldn't be nice either. regards, tom lane
Re: simplifying foreign key/RI checks
> > Rebased patches attached. I'm reviewing the changes since v6, which was my last review. Making ExecLockTableTuple() it's own function makes sense. Snapshots are now accounted for. The changes that account for n-level partitioning makes sense as well. Passes make check-world. Not user facing, so no user documentation required. Marking as ready for committer again.
Re: simplifying foreign key/RI checks
On Tue, Jul 6, 2021 at 1:56 AM vignesh C wrote: > The 2nd patch does not apply on Head, please post a rebased version: > error: patch failed: src/backend/utils/adt/ri_triggers.c:337 > error: src/backend/utils/adt/ri_triggers.c: patch does not apply Thanks for the heads up. Rebased patches attached. -- Amit Langote EDB: http://www.enterprisedb.com v9-0001-Export-get_partition_for_tuple.patch Description: Binary data v9-0002-Avoid-using-SPI-for-some-RI-checks.patch Description: Binary data
Re: simplifying foreign key/RI checks
On Sun, Apr 4, 2021 at 1:51 PM Amit Langote wrote: > > On Fri, Apr 2, 2021 at 11:55 PM Zhihong Yu wrote: > > > > Hi, > > > > + skip = !ExecLockTableTuple(erm->relation, , markSlot, > > + estate->es_snapshot, > > estate->es_output_cid, > > + lockmode, erm->waitPolicy, _needed); > > + if (skip) > > > > It seems the variable skip is only used above. The variable is not needed - > > if statement can directly check the return value. > > > > + * Locks tuple with given TID with given lockmode following given > > wait > > > > given appears three times in the above sentence. Maybe the following is bit > > easier to read: > > > > Locks tuple with the specified TID, lockmode following given wait policy > > > > + * Checks whether a tuple containing the same unique key as extracted from > > the > > + * tuple provided in 'slot' exists in 'pk_rel'. > > > > I think 'same' is not needed here since the remaining part of the sentence > > has adequately identified the key. > > > > + if (leaf_pk_rel == NULL) > > + goto done; > > > > It would be better to avoid goto by including the cleanup statements in the > > if block and return. > > > > + if (index_getnext_slot(scan, ForwardScanDirection, outslot)) > > + found = true; > > + > > + /* Found tuple, try to lock it in key share mode. */ > > + if (found) > > > > Since found is only assigned in one place, the two if statements can be > > combined into one. > > Thanks for taking a look. I agree with most of your suggestions and > have incorporated them in the v8 just posted. The 2nd patch does not apply on Head, please post a rebased version: error: patch failed: src/backend/utils/adt/ri_triggers.c:337 error: src/backend/utils/adt/ri_triggers.c: patch does not apply Regards, Vignesh
Re: simplifying foreign key/RI checks
On Fri, Apr 2, 2021 at 11:55 PM Zhihong Yu wrote: > > Hi, > > + skip = !ExecLockTableTuple(erm->relation, , markSlot, > + estate->es_snapshot, estate->es_output_cid, > + lockmode, erm->waitPolicy, _needed); > + if (skip) > > It seems the variable skip is only used above. The variable is not needed - > if statement can directly check the return value. > > + * Locks tuple with given TID with given lockmode following given > wait > > given appears three times in the above sentence. Maybe the following is bit > easier to read: > > Locks tuple with the specified TID, lockmode following given wait policy > > + * Checks whether a tuple containing the same unique key as extracted from > the > + * tuple provided in 'slot' exists in 'pk_rel'. > > I think 'same' is not needed here since the remaining part of the sentence > has adequately identified the key. > > + if (leaf_pk_rel == NULL) > + goto done; > > It would be better to avoid goto by including the cleanup statements in the > if block and return. > > + if (index_getnext_slot(scan, ForwardScanDirection, outslot)) > + found = true; > + > + /* Found tuple, try to lock it in key share mode. */ > + if (found) > > Since found is only assigned in one place, the two if statements can be > combined into one. Thanks for taking a look. I agree with most of your suggestions and have incorporated them in the v8 just posted. -- Amit Langote EDB: http://www.enterprisedb.com
Re: simplifying foreign key/RI checks
Hi Alvaro, On Sat, Apr 3, 2021 at 12:01 AM Alvaro Herrera wrote: > On 2021-Apr-02, Amit Langote wrote: > > > On Sat, Mar 20, 2021 at 10:21 PM Amit Langote > > wrote: > > > Updated patches attached. Sorry about the delay. > > > > Rebased over the recent DETACH PARTITION CONCURRENTLY work. > > Apparently, ri_ReferencedKeyExists() was using the wrong snapshot. > > Hmm, I wonder if that stuff should be using a PartitionDirectory? (I > didn't actually understand what your code is doing, so please forgive if > this is a silly question.) No problem, I wondered about that too when rebasing. My instinct *was* that maybe there's no need for it, because find_leaf_pk_rel()'s use of a PartitionDesc is pretty limited in duration and scope of the kind of things it calls that there's no need to worry about it getting invalidated while in use. But I may be wrong about that, because get_partition_for_tuple() can call arbitrary user-defined functions, which may result in invalidation messages being processed and an unguarded PartitionDesc getting wiped out under us. So, I've added PartitionDirectory protection in find_leaf_pk_rel() in the attached updated version. -- Amit Langote EDB: http://www.enterprisedb.com v8-0001-Export-get_partition_for_tuple.patch Description: Binary data v8-0002-Avoid-using-SPI-for-some-RI-checks.patch Description: Binary data
Re: simplifying foreign key/RI checks
On 2021-Apr-02, Amit Langote wrote: > On Sat, Mar 20, 2021 at 10:21 PM Amit Langote wrote: > > Updated patches attached. Sorry about the delay. > > Rebased over the recent DETACH PARTITION CONCURRENTLY work. > Apparently, ri_ReferencedKeyExists() was using the wrong snapshot. Hmm, I wonder if that stuff should be using a PartitionDirectory? (I didn't actually understand what your code is doing, so please forgive if this is a silly question.) -- Álvaro Herrera39°49'30"S 73°17'W "After a quick R of TFM, all I can say is HOLY CR** THAT IS COOL! PostgreSQL was amazing when I first started using it at 7.2, and I'm continually astounded by learning new features and techniques made available by the continuing work of the development team." Berend Tober, http://archives.postgresql.org/pgsql-hackers/2007-08/msg01009.php
Re: simplifying foreign key/RI checks
Hi, + skip = !ExecLockTableTuple(erm->relation, , markSlot, + estate->es_snapshot, estate->es_output_cid, + lockmode, erm->waitPolicy, _needed); + if (skip) It seems the variable skip is only used above. The variable is not needed - if statement can directly check the return value. + * Locks tuple with given TID with given lockmode following given wait given appears three times in the above sentence. Maybe the following is bit easier to read: Locks tuple with the specified TID, lockmode following given wait policy + * Checks whether a tuple containing the same unique key as extracted from the + * tuple provided in 'slot' exists in 'pk_rel'. I think 'same' is not needed here since the remaining part of the sentence has adequately identified the key. + if (leaf_pk_rel == NULL) + goto done; It would be better to avoid goto by including the cleanup statements in the if block and return. + if (index_getnext_slot(scan, ForwardScanDirection, outslot)) + found = true; + + /* Found tuple, try to lock it in key share mode. */ + if (found) Since found is only assigned in one place, the two if statements can be combined into one. Cheers On Fri, Apr 2, 2021 at 5:46 AM Amit Langote wrote: > On Sat, Mar 20, 2021 at 10:21 PM Amit Langote > wrote: > > Updated patches attached. Sorry about the delay. > > Rebased over the recent DETACH PARTITION CONCURRENTLY work. > Apparently, ri_ReferencedKeyExists() was using the wrong snapshot. > > -- > Amit Langote > EDB: http://www.enterprisedb.com >
Re: simplifying foreign key/RI checks
On Sat, Mar 20, 2021 at 10:21 PM Amit Langote wrote: > Updated patches attached. Sorry about the delay. Rebased over the recent DETACH PARTITION CONCURRENTLY work. Apparently, ri_ReferencedKeyExists() was using the wrong snapshot. -- Amit Langote EDB: http://www.enterprisedb.com v8-0001-Export-get_partition_for_tuple.patch Description: Binary data v8-0002-Avoid-using-SPI-for-some-RI-checks.patch Description: Binary data
Re: simplifying foreign key/RI checks
On Mon, Mar 8, 2021 at 11:41 PM Amit Langote wrote: > On Thu, Mar 4, 2021 at 5:15 AM Tom Lane wrote: > > I guess I'm disturbed by the idea > > that we'd totally replace the implementation technology for only one > > variant of foreign key checks. That means that there'll be a lot > > of minor details that don't act the same depending on context. One > > point I was just reminded of by [1] is that the SPI approach enforces > > permissions checks on the table access, which I do not see being done > > anywhere in your patch. Now, maybe it's fine not to have such checks, > > on the grounds that the existence of the RI constraint is sufficient > > permission (the creator had to have REFERENCES permission to make it). > > But I'm not sure about that. Should we add SELECT permissions checks > > to this code path to make it less different? > > > > In the same vein, the existing code actually runs the query as the > > table owner (cf. SetUserIdAndSecContext in ri_PerformCheck), another > > nicety you haven't bothered with. Maybe that is invisible for a > > pure SELECT query but I'm not sure I would bet on it. At the very > > least you're betting that the index-related operators you invoke > > aren't going to care, and that nobody is going to try to use this > > difference to create a security exploit via a trojan-horse index. > > How about we do at the top of ri_ReferencedKeyExists() what > ri_PerformCheck() always does before executing a query, which is this: > >/* Switch to proper UID to perform check as */ >GetUserIdAndSecContext(_userid, _sec_context); >SetUserIdAndSecContext(RelationGetForm(query_rel)->relowner, > save_sec_context | SECURITY_LOCAL_USERID_CHANGE | > SECURITY_NOFORCE_RLS); > > And then also check the permissions of the switched user on the scan > target relation's schema (ACL_USAGE) and the relation itself > (ACL_SELECT). > > IOW, this: > > + Oid save_userid; > + int save_sec_context; > + AclResult aclresult; > + > + /* Switch to proper UID to perform check as */ > + GetUserIdAndSecContext(_userid, _sec_context); > + SetUserIdAndSecContext(RelationGetForm(pk_rel)->relowner, > + save_sec_context | SECURITY_LOCAL_USERID_CHANGE | > + SECURITY_NOFORCE_RLS); > + > + /* Check namespace permissions. */ > + aclresult = pg_namespace_aclcheck(RelationGetNamespace(pk_rel), > + GetUserId(), ACL_USAGE); > + if (aclresult != ACLCHECK_OK) > + aclcheck_error(aclresult, OBJECT_SCHEMA, > + get_namespace_name(RelationGetNamespace(pk_rel))); > + /* Check the user has SELECT permissions on the referenced relation. */ > + aclresult = pg_class_aclcheck(RelationGetRelid(pk_rel), GetUserId(), > + ACL_SELECT); > + if (aclresult != ACLCHECK_OK) > + aclcheck_error(aclresult, OBJECT_TABLE, > + RelationGetRelationName(pk_rel)); > > /* > * Extract the unique key from the provided slot and choose the equality > @@ -414,6 +436,9 @@ ri_ReferencedKeyExists(Relation pk_rel, Relation fk_rel, > index_endscan(scan); > ExecDropSingleTupleTableSlot(outslot); > > + /* Restore UID and security context */ > + SetUserIdAndSecContext(save_userid, save_sec_context); > + > /* Don't release lock until commit. */ > index_close(idxrel, NoLock); I've included these changes in the updated patch. > > Shall we mention RLS restrictions? If we don't worry about that, > > I think REFERENCES privilege becomes a full bypass of RLS, at > > least for unique-key columns. > > Seeing what check_enable_rls() does when running under the security > context set by ri_PerformCheck(), it indeed seems that RLS is bypassed > when executing these RI queries. The following comment in > check_enable_rls() seems to say so: > > * InNoForceRLSOperation indicates that we should not apply RLS even > * if the table has FORCE RLS set - IF the current user is the owner. > * This is specifically to ensure that referential integrity checks > * are able to still run correctly. I've added a comment to note that the new way of "selecting" the referenced tuple effectively bypasses RLS, as is the case when selecting via SPI. > > I wonder also what happens if the referenced table isn't a plain > > heap with a plain btree index. Maybe you're accessing it at the > > right level of abstraction so things will just work with some > > other access methods, but I'm not sure about that. > > I believe that I've made ri_ReferencedKeyExists() use the appropriate > APIs to scan the index, lock the returned table tuple, etc., but do > you think we might be better served by introducing a new set of APIs > for this use case? I concur that by using the interfaces defined in genam.h and tableam.h, patch accounts for cases involving other access
Re: simplifying foreign key/RI checks
On Mon, Mar 8, 2021 at 11:41 PM Amit Langote wrote: > On Thu, Mar 4, 2021 at 5:15 AM Tom Lane wrote: > > Lastly, ri_PerformCheck is pretty careful about not only which > > snapshot it uses, but which *pair* of snapshots it uses, because > > sometimes it needs to worry about data changes since the start > > of the transaction. You've ignored all of that complexity AFAICS. > > That's okay (I think) for RI_FKey_check which was passing > > detectNewRows = false, but for sure it's not okay for > > ri_Check_Pk_Match. (I kind of thought we had isolation tests > > that would catch that, but apparently not.) > > Okay, let me closely check the ri_Check_Pk_Match() case and see if > there's any live bug. I checked, and AFAICS, the query invoked by ri_Check_Pk_Match() (that is, without the patch) does not use the "crosscheck" snapshot at any point during its execution. That snapshot is only used in the table_update() and table_delete() routines, which are not involved in the execution of ri_Check_Pk_Match()'s query. I dug through git history and -hackers archives to understand the origins of RI code's use of a crosscheck snapshot and came across this discussion: https://www.postgresql.org/message-id/20031001150510.U45145%40megazone.bigpanda.com If I am reading the discussion and the details in subsequent commit 55d85f42a891a correctly, the crosscheck snapshot is only to be used to ensure, under serializable isolation, that any attempts by the RI query of updating/deleting rows that are not visible to the transaction snapshot cause a serialization error. Use of the same facilities in ri_Check_Pk_Match() was merely done as future-proofing, with no particular use case to address, then and perhaps even now. If that is indeed the case, it does not seem particularly incorrect for ri_ReferencedKeyExists() added by the patch to not bother with setting up a crosscheck snapshot, even when called from ri_Check_Pk_Match(). Am I missing something? -- Amit Langote EDB: http://www.enterprisedb.com
Re: simplifying foreign key/RI checks
On Thu, Mar 4, 2021 at 5:15 AM Tom Lane wrote: > I took a quick look at this. Thanks a lot for the review. > I guess I'm disturbed by the idea > that we'd totally replace the implementation technology for only one > variant of foreign key checks. That means that there'll be a lot > of minor details that don't act the same depending on context. One > point I was just reminded of by [1] is that the SPI approach enforces > permissions checks on the table access, which I do not see being done > anywhere in your patch. Now, maybe it's fine not to have such checks, > on the grounds that the existence of the RI constraint is sufficient > permission (the creator had to have REFERENCES permission to make it). > But I'm not sure about that. Should we add SELECT permissions checks > to this code path to make it less different? > > In the same vein, the existing code actually runs the query as the > table owner (cf. SetUserIdAndSecContext in ri_PerformCheck), another > nicety you haven't bothered with. Maybe that is invisible for a > pure SELECT query but I'm not sure I would bet on it. At the very > least you're betting that the index-related operators you invoke > aren't going to care, and that nobody is going to try to use this > difference to create a security exploit via a trojan-horse index. How about we do at the top of ri_ReferencedKeyExists() what ri_PerformCheck() always does before executing a query, which is this: /* Switch to proper UID to perform check as */ GetUserIdAndSecContext(_userid, _sec_context); SetUserIdAndSecContext(RelationGetForm(query_rel)->relowner, save_sec_context | SECURITY_LOCAL_USERID_CHANGE | SECURITY_NOFORCE_RLS); And then also check the permissions of the switched user on the scan target relation's schema (ACL_USAGE) and the relation itself (ACL_SELECT). IOW, this: + Oid save_userid; + int save_sec_context; + AclResult aclresult; + + /* Switch to proper UID to perform check as */ + GetUserIdAndSecContext(_userid, _sec_context); + SetUserIdAndSecContext(RelationGetForm(pk_rel)->relowner, + save_sec_context | SECURITY_LOCAL_USERID_CHANGE | + SECURITY_NOFORCE_RLS); + + /* Check namespace permissions. */ + aclresult = pg_namespace_aclcheck(RelationGetNamespace(pk_rel), + GetUserId(), ACL_USAGE); + if (aclresult != ACLCHECK_OK) + aclcheck_error(aclresult, OBJECT_SCHEMA, + get_namespace_name(RelationGetNamespace(pk_rel))); + /* Check the user has SELECT permissions on the referenced relation. */ + aclresult = pg_class_aclcheck(RelationGetRelid(pk_rel), GetUserId(), + ACL_SELECT); + if (aclresult != ACLCHECK_OK) + aclcheck_error(aclresult, OBJECT_TABLE, + RelationGetRelationName(pk_rel)); /* * Extract the unique key from the provided slot and choose the equality @@ -414,6 +436,9 @@ ri_ReferencedKeyExists(Relation pk_rel, Relation fk_rel, index_endscan(scan); ExecDropSingleTupleTableSlot(outslot); + /* Restore UID and security context */ + SetUserIdAndSecContext(save_userid, save_sec_context); + /* Don't release lock until commit. */ index_close(idxrel, NoLock); > Shall we mention RLS restrictions? If we don't worry about that, > I think REFERENCES privilege becomes a full bypass of RLS, at > least for unique-key columns. Seeing what check_enable_rls() does when running under the security context set by ri_PerformCheck(), it indeed seems that RLS is bypassed when executing these RI queries. The following comment in check_enable_rls() seems to say so: * InNoForceRLSOperation indicates that we should not apply RLS even * if the table has FORCE RLS set - IF the current user is the owner. * This is specifically to ensure that referential integrity checks * are able to still run correctly. > I wonder also what happens if the referenced table isn't a plain > heap with a plain btree index. Maybe you're accessing it at the > right level of abstraction so things will just work with some > other access methods, but I'm not sure about that. I believe that I've made ri_ReferencedKeyExists() use the appropriate APIs to scan the index, lock the returned table tuple, etc., but do you think we might be better served by introducing a new set of APIs for this use case? > (Anybody > want to try this with a partitioned table some of whose partitions > are foreign tables?) Partitioned tables with foreign table partitions cannot be referenced in a foreign key, so cannot appear in this function. That's because unique constraints are not allowed when there are foreign table partitions. > Lastly, ri_PerformCheck is pretty careful about not only which > snapshot it uses, but which *pair* of snapshots it uses, because > sometimes it needs to worry about
Re: simplifying foreign key/RI checks
I took a quick look at this. I guess I'm disturbed by the idea that we'd totally replace the implementation technology for only one variant of foreign key checks. That means that there'll be a lot of minor details that don't act the same depending on context. One point I was just reminded of by [1] is that the SPI approach enforces permissions checks on the table access, which I do not see being done anywhere in your patch. Now, maybe it's fine not to have such checks, on the grounds that the existence of the RI constraint is sufficient permission (the creator had to have REFERENCES permission to make it). But I'm not sure about that. Should we add SELECT permissions checks to this code path to make it less different? In the same vein, the existing code actually runs the query as the table owner (cf. SetUserIdAndSecContext in ri_PerformCheck), another nicety you haven't bothered with. Maybe that is invisible for a pure SELECT query but I'm not sure I would bet on it. At the very least you're betting that the index-related operators you invoke aren't going to care, and that nobody is going to try to use this difference to create a security exploit via a trojan-horse index. Shall we mention RLS restrictions? If we don't worry about that, I think REFERENCES privilege becomes a full bypass of RLS, at least for unique-key columns. I wonder also what happens if the referenced table isn't a plain heap with a plain btree index. Maybe you're accessing it at the right level of abstraction so things will just work with some other access methods, but I'm not sure about that. (Anybody want to try this with a partitioned table some of whose partitions are foreign tables?) Lastly, ri_PerformCheck is pretty careful about not only which snapshot it uses, but which *pair* of snapshots it uses, because sometimes it needs to worry about data changes since the start of the transaction. You've ignored all of that complexity AFAICS. That's okay (I think) for RI_FKey_check which was passing detectNewRows = false, but for sure it's not okay for ri_Check_Pk_Match. (I kind of thought we had isolation tests that would catch that, but apparently not.) So, this is a cute idea, and the speedup is pretty impressive, but I don't think it's anywhere near committable. I also wonder whether we really want ri_triggers.c having its own copy of low-level stuff like the tuple-locking code you copied. Seems like a likely maintenance hazard, so maybe some more refactoring is needed. regards, tom lane [1] https://www.postgresql.org/message-id/flat/16911-ca792f6bbe244754%40postgresql.org
RE: simplifying foreign key/RI checks
Hi amit, (sorry about not cc the hacker list) I have an issue about command id here. It's probably not directly related to your patch, so I am sorry if it bothers you. + /* +* Start the scan. To make the changes of the current command visible to +* the scan and for subsequent locking of the tuple (if any) found, +* increment the command counter. +*/ + CommandCounterIncrement(); For insert on fk relation, is it necessary to create new command id every time ? I think it is only necessary when it modifies the referenced table. for example: 1) has modifyingcte 2) has modifying function(trigger/domain...) All of the above seems not supported in parallel mode(parallel unsafe). So I was wondering if we can avoid the CommandCounterIncrement in parallel mode. Best regards, houzj
Re: simplifying foreign key/RI checks
On Mon, Mar 1, 2021 at 3:14 PM Corey Huinker wrote: >> > It seems to me 1 (RI_PLAN_CHECK_LOOKUPPK) is still alive. (Yeah, I >> > know that doesn't mean the usefulness of the macro but the mechanism >> > the macro suggests, but it is confusing.) On the other hand, >> > RI_PLAN_CHECK_LOOKUPPK_FROM_PK and RI_PLAN_LAST_ON_PK seem to be no >> > longer used. (Couldn't we remove them?) >> >> Yeah, better to just remove those _PK macros and say this module no >> longer runs any queries on the PK table. >> >> How about the attached? >> > > Sorry for the delay. > I see that the changes were made as described. > Passes make check and make check-world yet again. > I'm marking this Ready For Committer unless someone objects. Thank you Corey for the review. -- Amit Langote EDB: http://www.enterprisedb.com
Re: simplifying foreign key/RI checks
> > > It seems to me 1 (RI_PLAN_CHECK_LOOKUPPK) is still alive. (Yeah, I > > know that doesn't mean the usefulness of the macro but the mechanism > > the macro suggests, but it is confusing.) On the other hand, > > RI_PLAN_CHECK_LOOKUPPK_FROM_PK and RI_PLAN_LAST_ON_PK seem to be no > > longer used. (Couldn't we remove them?) > > Yeah, better to just remove those _PK macros and say this module no > longer runs any queries on the PK table. > > How about the attached? > > Sorry for the delay. I see that the changes were made as described. Passes make check and make check-world yet again. I'm marking this Ready For Committer unless someone objects.
Re: simplifying foreign key/RI checks
On Wed, Jan 27, 2021 at 5:32 PM Kyotaro Horiguchi wrote: > At Sun, 24 Jan 2021 20:51:39 +0900, Amit Langote > wrote in > > Here's v5. > > At Mon, 25 Jan 2021 18:19:56 +0900, Amit Langote > wrote in > > > Anybody else want to look this patch over before I mark it Ready For > > > Committer? > > > > Would be nice to have others look it over. Thanks. > > This nice improvement. > > 0001 just looks fine. > > 0002: > > /* RI query type codes */ > -/* these queries are executed against the PK (referenced) table: */ > +/* > + * 1 and 2 are no longer used, because PK (referenced) table is looked up > + * directly using ri_ReferencedKeyExists(). > #define RI_PLAN_CHECK_LOOKUPPK 1 > #define RI_PLAN_CHECK_LOOKUPPK_FROM_PK 2 > #define RI_PLAN_LAST_ON_PK > RI_PLAN_CHECK_LOOKUPPK_FROM_PK > > However, this patch does. > > + if (!ri_ReferencedKeyExists(pk_rel, fk_rel, newslot, riinfo)) > + ri_ReportViolation(riinfo, > + pk_rel, fk_rel, > + newslot, > + NULL, > + RI_PLAN_CHECK_LOOKUPPK, > false); > > It seems to me 1 (RI_PLAN_CHECK_LOOKUPPK) is still alive. (Yeah, I > know that doesn't mean the usefulness of the macro but the mechanism > the macro suggests, but it is confusing.) On the other hand, > RI_PLAN_CHECK_LOOKUPPK_FROM_PK and RI_PLAN_LAST_ON_PK seem to be no > longer used. (Couldn't we remove them?) Yeah, better to just remove those _PK macros and say this module no longer runs any queries on the PK table. How about the attached? -- Amit Langote EDB: http://www.enterprisedb.com v6-0002-Avoid-using-SPI-for-some-RI-checks.patch Description: Binary data v6-0001-Export-get_partition_for_tuple.patch Description: Binary data
Re: simplifying foreign key/RI checks
At Sun, 24 Jan 2021 20:51:39 +0900, Amit Langote wrote in > Here's v5. At Mon, 25 Jan 2021 18:19:56 +0900, Amit Langote wrote in > > Anybody else want to look this patch over before I mark it Ready For > > Committer? > > Would be nice to have others look it over. Thanks. This nice improvement. 0001 just looks fine. 0002: /* RI query type codes */ -/* these queries are executed against the PK (referenced) table: */ +/* + * 1 and 2 are no longer used, because PK (referenced) table is looked up + * directly using ri_ReferencedKeyExists(). #define RI_PLAN_CHECK_LOOKUPPK 1 #define RI_PLAN_CHECK_LOOKUPPK_FROM_PK 2 #define RI_PLAN_LAST_ON_PK RI_PLAN_CHECK_LOOKUPPK_FROM_PK However, this patch does. + if (!ri_ReferencedKeyExists(pk_rel, fk_rel, newslot, riinfo)) + ri_ReportViolation(riinfo, + pk_rel, fk_rel, + newslot, + NULL, + RI_PLAN_CHECK_LOOKUPPK, false); It seems to me 1 (RI_PLAN_CHECK_LOOKUPPK) is still alive. (Yeah, I know that doesn't mean the usefulness of the macro but the mechanism the macro suggests, but it is confusing.) On the other hand, RI_PLAN_CHECK_LOOKUPPK_FROM_PK and RI_PLAN_LAST_ON_PK seem to be no longer used. (Couldn't we remove them?) (about the latter, we can rewrite the only use of it "if (qkey->constr_queryno <= RI_PLAN_LAST_ON_PK)" not to use the macro.) regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: simplifying foreign key/RI checks
Hi Amit-san, Thanks for the answer! > If you only tested insert/update on the referencing table, I would've > expected to see nothing in the result of that query, because the patch > eliminates all use of SPI in that case. I suspect the CachedPlan* > memory contexts you are seeing belong to some early activity in the > session. So if you try the insert/update in a freshly started > session, you would see 0 rows in the result of that query. That's right. CREATE PARTITION TABLE included in the test script(rep.sql) was using SPI. In a new session, I confirmed that CachedPlan is not generated when only execute INSERT. # only execute INSERT postgres=# INSERT INTO ps SELECT generate_series(1,4999); INSERT 0 4999 postgres=# postgres=# INSERT INTO pr SELECT i, i from generate_series(1,4999)i; INSERT 0 4999 postgres=# SELECT name, sum(used_bytes) as bytes, pg_size_pretty(sum(used_bytes)) FROM pg_backend_memory_contexts WHERE name LIKE 'Cached%' GROUP BY name; name | bytes | pg_size_pretty --+---+ (0 rows) ★ No CachedPlan > Hmm, the patch tries to solve a general problem that SPI plans are not > being shared among partitions whereas they should be. So I don't > think that it's necessarily specific to DELETE. Until we have a > solution like the patch on this thread for DELETE, it seems fine to > consider the other patch as a stopgap solution. I see. So this is a solution to the problem of using SPI plans in partitions, not just DELETE. I agree with you, I think this is a solution to the current problem. Best Regards, -- Keisuke Kuroda NTT Software Innovation Center keisuke.kuroda.3...@gmail.com
Re: simplifying foreign key/RI checks
Hi Amit-san, + case TM_Invisible: + elog(ERROR, "attempted to lock invisible tuple"); + break; + + case TM_SelfModified: + case TM_BeingModified: + case TM_WouldBlock: + elog(ERROR, "unexpected table_tuple_lock status: %u", res); + break; + default: + elog(ERROR, "unrecognized table_tuple_lock status: %u", res); All of these are meant as debugging elog()s for cases that won't normally occur. IIUC, the discussion at the linked thread excludes those from consideration. Thanks for your explanation. Ah, I reread the thread, and I now realized that user visible log messages are the target to replace. I understood that that elog() for the cases won't normally occur. Sorry for the noise. Regards, Tatsuro Yamada
Re: simplifying foreign key/RI checks
Yamada-san, On Wed, Jan 27, 2021 at 8:51 AM Tatsuro Yamada wrote: > On 2021/01/25 18:19, Amit Langote wrote: > > On Mon, Jan 25, 2021 at 9:24 AM Corey Huinker > > wrote: > >> Anybody else want to look this patch over before I mark it Ready For > >> Committer? > > > > Would be nice to have others look it over. Thanks. > > Thanks for creating the patch! > > I tried to review the patch. Here is my comment. Thanks for the comment. > * According to this thread [1], it might be better to replace elog() with >ereport() in the patch. > > [1]: > https://www.postgresql.org/message-id/flat/92d6f545-5102-65d8-3c87-489f71ea0a37%40enterprisedb.com Could you please tell which elog() of the following added by the patch you are concerned about? + case TM_Invisible: + elog(ERROR, "attempted to lock invisible tuple"); + break; + + case TM_SelfModified: + case TM_BeingModified: + case TM_WouldBlock: + elog(ERROR, "unexpected table_tuple_lock status: %u", res); + break; + default: + elog(ERROR, "unrecognized table_tuple_lock status: %u", res); All of these are meant as debugging elog()s for cases that won't normally occur. IIUC, the discussion at the linked thread excludes those from consideration. -- Amit Langote EDB: http://www.enterprisedb.com
Re: simplifying foreign key/RI checks
Hi Amit-san, On 2021/01/25 18:19, Amit Langote wrote: On Mon, Jan 25, 2021 at 9:24 AM Corey Huinker wrote: On Sun, Jan 24, 2021 at 6:51 AM Amit Langote wrote: Here's v5. v5 patches apply to master. Suggested If/then optimization is implemented. Suggested case merging is implemented. Passes make check and make check-world yet again. Just to confirm, we don't free the RI_CompareHashEntry because it points to an entry in a hash table which is TopMemoryContext aka lifetime of the session, correct? Right. Anybody else want to look this patch over before I mark it Ready For Committer? Would be nice to have others look it over. Thanks. Thanks for creating the patch! I tried to review the patch. Here is my comment. * According to this thread [1], it might be better to replace elog() with ereport() in the patch. [1]: https://www.postgresql.org/message-id/flat/92d6f545-5102-65d8-3c87-489f71ea0a37%40enterprisedb.com Thanks, Tatsuro Yamada
Re: simplifying foreign key/RI checks
On Mon, Jan 25, 2021 at 7:01 PM Amit Langote wrote: > On Mon, Jan 25, 2021 at 6:06 PM Keisuke Kuroda > wrote: > > However, as already mentioned, the problem of memory bloat on DELETE > > remains. > > This can be solved by the patch in [1], but I think it is too much to apply > > this patch only for DELETE. What do you think? > > > > [1] > > https://www.postgresql.org/message-id/flat/cab4b85d-9292-967d-adf2-be0d803c3e23%40nttcom.co.jp_1 > > Hmm, the patch tries to solve a general problem that SPI plans are not > being shared among partitions whereas they should be. So I don't > think that it's necessarily specific to DELETE. Until we have a > solution like the patch on this thread for DELETE, it seems fine to > consider the other patch as a stopgap solution. Forgot to mention one thing. Alvaro, in his last email on that thread, characterized that patch as fixing a bug, although I may have misread that. -- Amit Langote EDB: http://www.enterprisedb.com
Re: simplifying foreign key/RI checks
Kuroda-san, On Mon, Jan 25, 2021 at 6:06 PM Keisuke Kuroda wrote: > Hi, Amit-san, > > Nice patch. I have confirmed that this solves the problem in [1] with > INSERT/UPDATE. Thanks for testing. > HEAD + patch >name | bytes | pg_size_pretty > --+---+ > CachedPlanQuery | 10280 | 10 kB > CachedPlanSource | 14616 | 14 kB > CachedPlan | 13168 | 13 kB ★ 710MB -> 13kB > (3 rows) If you only tested insert/update on the referencing table, I would've expected to see nothing in the result of that query, because the patch eliminates all use of SPI in that case. I suspect the CachedPlan* memory contexts you are seeing belong to some early activity in the session. So if you try the insert/update in a freshly started session, you would see 0 rows in the result of that query. > > > This patch completely sidesteps the DELETE case, which has more insidious > > > performance implications, but is also far less common, and whose solution > > > will likely be very different. > > > > Yeah, we should continue looking into the ways to make referenced-side > > RI checks be less bloated. > > However, as already mentioned, the problem of memory bloat on DELETE remains. > This can be solved by the patch in [1], but I think it is too much to apply > this patch only for DELETE. What do you think? > > [1] > https://www.postgresql.org/message-id/flat/cab4b85d-9292-967d-adf2-be0d803c3e23%40nttcom.co.jp_1 Hmm, the patch tries to solve a general problem that SPI plans are not being shared among partitions whereas they should be. So I don't think that it's necessarily specific to DELETE. Until we have a solution like the patch on this thread for DELETE, it seems fine to consider the other patch as a stopgap solution. -- Amit Langote EDB: http://www.enterprisedb.com
Re: simplifying foreign key/RI checks
On Mon, Jan 25, 2021 at 9:24 AM Corey Huinker wrote: > On Sun, Jan 24, 2021 at 6:51 AM Amit Langote wrote: >> Here's v5. > > v5 patches apply to master. > Suggested If/then optimization is implemented. > Suggested case merging is implemented. > Passes make check and make check-world yet again. > Just to confirm, we don't free the RI_CompareHashEntry because it points to > an entry in a hash table which is TopMemoryContext aka lifetime of the > session, correct? Right. > Anybody else want to look this patch over before I mark it Ready For > Committer? Would be nice to have others look it over. Thanks. -- Amit Langote EDB: http://www.enterprisedb.com
Re: simplifying foreign key/RI checks
Hi, Amit-san, Nice patch. I have confirmed that this solves the problem in [1] with INSERT/UPDATE. HEAD + patch name | bytes | pg_size_pretty --+---+ CachedPlanQuery | 10280 | 10 kB CachedPlanSource | 14616 | 14 kB CachedPlan | 13168 | 13 kB ★ 710MB -> 13kB (3 rows) > > This patch completely sidesteps the DELETE case, which has more insidious > > performance implications, but is also far less common, and whose solution > > will likely be very different. > > Yeah, we should continue looking into the ways to make referenced-side > RI checks be less bloated. However, as already mentioned, the problem of memory bloat on DELETE remains. This can be solved by the patch in [1], but I think it is too much to apply this patch only for DELETE. What do you think? [1] https://www.postgresql.org/message-id/flat/cab4b85d-9292-967d-adf2-be0d803c3e23%40nttcom.co.jp_1 -- Keisuke Kuroda NTT Software Innovation Center keisuke.kuroda.3...@gmail.com
Re: simplifying foreign key/RI checks
On Sun, Jan 24, 2021 at 6:51 AM Amit Langote wrote: > On Sun, Jan 24, 2021 at 11:26 AM Corey Huinker > wrote: > > On Sat, Jan 23, 2021 at 12:52 PM Zhihong Yu wrote: > >> > >> Hi, > > Thanks for the review. > > >> + for (i = 0; i < riinfo->nkeys; i++) > >> + { > >> + Oid eq_opr = eq_oprs[i]; > >> + Oid typeid = RIAttType(fk_rel, riinfo->fk_attnums[i]); > >> + RI_CompareHashEntry *entry = ri_HashCompareOp(eq_opr, > typeid); > >> + > >> + if (pk_nulls[i] != 'n' && > OidIsValid(entry->cast_func_finfo.fn_oid)) > >> > >> It seems the pk_nulls[i] != 'n' check can be lifted ahead of the > assignment to the three local variables. That way, ri_HashCompareOp > wouldn't be called when pk_nulls[i] == 'n'. > > Good idea, so done. Although, there can't be nulls right now. > > >> + case TM_Updated: > >> + if (IsolationUsesXactSnapshot()) > >> ... > >> + case TM_Deleted: > >> + if (IsolationUsesXactSnapshot()) > >> > >> It seems the handling for TM_Updated and TM_Deleted is the same. The > cases for these two values can be put next to each other (saving one block > of code). > > Ah, yes. The TM_Updated case used to be handled a bit differently in > earlier unposted versions of the patch, though at some point I > concluded that the special handling was unnecessary, but didn't > realize what you just pointed out. Fixed. > > > I'll pause on reviewing v4 until you've addressed the suggestions above. > > Here's v5. > v5 patches apply to master. Suggested If/then optimization is implemented. Suggested case merging is implemented. Passes make check and make check-world yet again. Just to confirm, we *don't* free the RI_CompareHashEntry because it points to an entry in a hash table which is TopMemoryContext aka lifetime of the session, correct? Anybody else want to look this patch over before I mark it Ready For Committer?
Re: simplifying foreign key/RI checks
On Sun, Jan 24, 2021 at 11:26 AM Corey Huinker wrote: > On Sat, Jan 23, 2021 at 12:52 PM Zhihong Yu wrote: >> >> Hi, Thanks for the review. >> + for (i = 0; i < riinfo->nkeys; i++) >> + { >> + Oid eq_opr = eq_oprs[i]; >> + Oid typeid = RIAttType(fk_rel, riinfo->fk_attnums[i]); >> + RI_CompareHashEntry *entry = ri_HashCompareOp(eq_opr, typeid); >> + >> + if (pk_nulls[i] != 'n' && >> OidIsValid(entry->cast_func_finfo.fn_oid)) >> >> It seems the pk_nulls[i] != 'n' check can be lifted ahead of the assignment >> to the three local variables. That way, ri_HashCompareOp wouldn't be called >> when pk_nulls[i] == 'n'. Good idea, so done. Although, there can't be nulls right now. >> + case TM_Updated: >> + if (IsolationUsesXactSnapshot()) >> ... >> + case TM_Deleted: >> + if (IsolationUsesXactSnapshot()) >> >> It seems the handling for TM_Updated and TM_Deleted is the same. The cases >> for these two values can be put next to each other (saving one block of >> code). Ah, yes. The TM_Updated case used to be handled a bit differently in earlier unposted versions of the patch, though at some point I concluded that the special handling was unnecessary, but didn't realize what you just pointed out. Fixed. > I'll pause on reviewing v4 until you've addressed the suggestions above. Here's v5. -- Amit Langote EDB: http://www.enterprisedb.com v5-0002-Avoid-using-SPI-for-some-RI-checks.patch Description: Binary data v5-0001-Export-get_partition_for_tuple.patch Description: Binary data
Re: simplifying foreign key/RI checks
On Sat, Jan 23, 2021 at 12:52 PM Zhihong Yu wrote: > Hi, > > + for (i = 0; i < riinfo->nkeys; i++) > + { > + Oid eq_opr = eq_oprs[i]; > + Oid typeid = RIAttType(fk_rel, riinfo->fk_attnums[i]); > + RI_CompareHashEntry *entry = ri_HashCompareOp(eq_opr, typeid); > + > + if (pk_nulls[i] != 'n' && > OidIsValid(entry->cast_func_finfo.fn_oid)) > > It seems the pk_nulls[i] != 'n' check can be lifted ahead of the > assignment to the three local variables. That way, ri_HashCompareOp > wouldn't be called when pk_nulls[i] == 'n'. > > + case TM_Updated: > + if (IsolationUsesXactSnapshot()) > ... > + case TM_Deleted: > + if (IsolationUsesXactSnapshot()) > > It seems the handling for TM_Updated and TM_Deleted is the same. The cases > for these two values can be put next to each other (saving one block of > code). > > Cheers > I'll pause on reviewing v4 until you've addressed the suggestions above.
Re: simplifying foreign key/RI checks
Hi, + for (i = 0; i < riinfo->nkeys; i++) + { + Oid eq_opr = eq_oprs[i]; + Oid typeid = RIAttType(fk_rel, riinfo->fk_attnums[i]); + RI_CompareHashEntry *entry = ri_HashCompareOp(eq_opr, typeid); + + if (pk_nulls[i] != 'n' && OidIsValid(entry->cast_func_finfo.fn_oid)) It seems the pk_nulls[i] != 'n' check can be lifted ahead of the assignment to the three local variables. That way, ri_HashCompareOp wouldn't be called when pk_nulls[i] == 'n'. + case TM_Updated: + if (IsolationUsesXactSnapshot()) ... + case TM_Deleted: + if (IsolationUsesXactSnapshot()) It seems the handling for TM_Updated and TM_Deleted is the same. The cases for these two values can be put next to each other (saving one block of code). Cheers On Fri, Jan 22, 2021 at 11:10 PM Amit Langote wrote: > On Fri, Jan 22, 2021 at 3:22 PM Corey Huinker > wrote: > >> I decided not to deviate from pk_ terminology so that the new code > >> doesn't look too different from the other code in the file. Although, > >> I guess we can at least call the main function > >> ri_ReferencedKeyExists() instead of ri_PrimaryKeyExists(), so I've > >> changed that. > > > > I think that's a nice compromise, it makes the reader aware of the > concept. > >> > >> I've attached the updated patch. > > > > Missing "break" added. Check. > > Comment updated. Check. > > Function renamed. Check. > > Attribute mapping matching test (and assertion) added. Check. > > Patch applies to an as-of-today master, passes make check and check > world. > > No additional regression tests required, as no new functionality is > introduced. > > No docs required, as there is nothing user-facing. > > Thanks for the review. > > > Questions: > > 1. There's a palloc for mapped_partkey_attnums, which is never freed, is > the prevailing memory context short lived enough that we don't care? > > 2. Same question for the AtrrMap map, should there be a free_attrmap(). > > I hadn't checked, but yes, the prevailing context is > AfterTriggerTupleContext, a short-lived one that is reset for every > trigger event tuple. I'm still inclined to explicitly free those > objects, so changed like that. While at it, I also changed > mapped_partkey_attnums to root_partattrs for readability. > > Attached v4. > -- > Amit Langote > EDB: http://www.enterprisedb.com >
Re: simplifying foreign key/RI checks
On Fri, Jan 22, 2021 at 3:22 PM Corey Huinker wrote: >> I decided not to deviate from pk_ terminology so that the new code >> doesn't look too different from the other code in the file. Although, >> I guess we can at least call the main function >> ri_ReferencedKeyExists() instead of ri_PrimaryKeyExists(), so I've >> changed that. > > I think that's a nice compromise, it makes the reader aware of the concept. >> >> I've attached the updated patch. > > Missing "break" added. Check. > Comment updated. Check. > Function renamed. Check. > Attribute mapping matching test (and assertion) added. Check. > Patch applies to an as-of-today master, passes make check and check world. > No additional regression tests required, as no new functionality is > introduced. > No docs required, as there is nothing user-facing. Thanks for the review. > Questions: > 1. There's a palloc for mapped_partkey_attnums, which is never freed, is the > prevailing memory context short lived enough that we don't care? > 2. Same question for the AtrrMap map, should there be a free_attrmap(). I hadn't checked, but yes, the prevailing context is AfterTriggerTupleContext, a short-lived one that is reset for every trigger event tuple. I'm still inclined to explicitly free those objects, so changed like that. While at it, I also changed mapped_partkey_attnums to root_partattrs for readability. Attached v4. -- Amit Langote EDB: http://www.enterprisedb.com v4-0001-Export-get_partition_for_tuple.patch Description: Binary data v4-0002-Avoid-using-SPI-for-some-RI-checks.patch Description: Binary data
Re: simplifying foreign key/RI checks
> > > > I decided not to deviate from pk_ terminology so that the new code > doesn't look too different from the other code in the file. Although, > I guess we can at least call the main function > ri_ReferencedKeyExists() instead of ri_PrimaryKeyExists(), so I've > changed that. > I think that's a nice compromise, it makes the reader aware of the concept. > > I've attached the updated patch. > Missing "break" added. Check. Comment updated. Check. Function renamed. Check. Attribute mapping matching test (and assertion) added. Check. Patch applies to an as-of-today master, passes make check and check world. No additional regression tests required, as no new functionality is introduced. No docs required, as there is nothing user-facing. Questions: 1. There's a palloc for mapped_partkey_attnums, which is never freed, is the prevailing memory context short lived enough that we don't care? 2. Same question for the AtrrMap map, should there be a free_attrmap().
Re: simplifying foreign key/RI checks
> > I decided not to deviate from pk_ terminology so that the new code > doesn't look too different from the other code in the file. Although, > I guess we can at least call the main function > ri_ReferencedKeyExists() instead of ri_PrimaryKeyExists(), so I've > changed that. > I agree with leaving the existing terminology where it is for this patch. Changing the function name is probably enough to alert the reader that the things that are called pks may not be precisely that.
Re: simplifying foreign key/RI checks
On Tue, Jan 19, 2021 at 12:00 PM Zhihong Yu wrote: > + if (mapped_partkey_attnums[i] == pk_attnums[j]) > + { > + partkey_vals[i] = pk_vals[j]; > + partkey_isnull[i] = pk_nulls[j] == 'n' ? true : false; > + i++; > + break; > > The way counter (i) is incremented is out of my expectation. > In the rare case, where some i doesn't have corresponding pk_attnums[j], > wouldn't there be a dead loop ? > > I think the goal of adding the assertion should be not loop infinitely even > if the invariant is not satisfied. > > I guess a counter other than i would be better for this purpose. I have done that in v3. Thanks. -- Amit Langote EDB: http://www.enterprisedb.com
Re: simplifying foreign key/RI checks
On Tue, Jan 19, 2021 at 3:46 PM Corey Huinker wrote: > v2 patch applies and passes make check and make check-world. Perhaps, given > the missing break at line 418 without any tests failing, we could add another > regression test if we're into 100% code path coverage. As it is, I think the > compiler warning was a sufficient alert. Thanks for the review. I will look into checking the coverage. > The code is easy to read, and the comments touch on the major points of what > complexities arise from partitioned tables. > > A somewhat pedantic complaint I have brought up off-list is that this patch > continues the pattern of the variable and function names making the > assumption that the foreign key is referencing the primary key of the > referenced table. Foreign key constraints need only reference a unique index, > it doesn't have to be the primary key. Granted, that unique index is behaving > exactly as a primary key would, so conceptually it is very similar, but > keeping with the existing naming (pk_rel, pk_type, etc) can lead a developer > to think that it would be just as correct to find the referenced relation and > get the primary key index from there, which would not always be correct. This > patch correctly grabs the index from the constraint itself, so no problem > there. I decided not to deviate from pk_ terminology so that the new code doesn't look too different from the other code in the file. Although, I guess we can at least call the main function ri_ReferencedKeyExists() instead of ri_PrimaryKeyExists(), so I've changed that. > I like that this patch changes the absolute minimum of the code in order to > get a very significant performance benefit. It does so in a way that should > reduce resource pressure found in other places [1]. This will in turn reduce > the performance penalty of "doing the right thing" in terms of defining > enforced foreign keys. It seems to get a clearer performance boost than was > achieved with previous efforts at statement level triggers. > > [1] > https://www.postgresql.org/message-id/cakkq508z6r5e3jdqhfpwszsajlpho3oyyoamfesaupto5vg...@mail.gmail.com Thanks. I hadn't noticed [1] before today, but after looking it over, it seems that what is being proposed there can still be of use. As long as SPI is used in ri_trigger.c, it makes sense to consider any tweaks addressing its negative impact, especially if they are not very invasive. There's this patch too from the last month: https://commitfest.postgresql.org/32/2930/ > This patch completely sidesteps the DELETE case, which has more insidious > performance implications, but is also far less common, and whose solution > will likely be very different. Yeah, we should continue looking into the ways to make referenced-side RI checks be less bloated. I've attached the updated patch. -- Amit Langote EDB: http://www.enterprisedb.com v3-0002-Avoid-using-SPI-for-some-RI-checks.patch Description: Binary data v3-0001-Export-get_partition_for_tuple.patch Description: Binary data
Re: simplifying foreign key/RI checks
On Mon, Jan 18, 2021 at 9:45 PM Amit Langote wrote: > On Tue, Jan 19, 2021 at 2:47 AM Zhihong Yu wrote: > > > > Hi, > > I was looking at this statement: > > > > insert into f select generate_series(1, 200, 2); > > > > Since certain generated values (the second half) are not in table p, > wouldn't insertion for those values fail ? > > I tried a scaled down version (1000th) of your example: > > > > yugabyte=# insert into f select generate_series(1, 2000, 2); > > ERROR: insert or update on table "f" violates foreign key constraint > "f_a_fkey" > > DETAIL: Key (a)=(1001) is not present in table "p". > > Sorry, a wrong copy-paste by me. Try this: > > create table p (a numeric primary key); > insert into p select generate_series(1, 200); > create table f (a bigint references p); > > -- Unpatched > insert into f select generate_series(1, 200, 2); > INSERT 0 100 > Time: 6527.652 ms (00:06.528) > > update f set a = a + 1; > UPDATE 100 > Time: 8108.310 ms (00:08.108) > > -- Patched: > insert into f select generate_series(1, 200, 2); > INSERT 0 100 > Time: 3312.193 ms (00:03.312) > > update f set a = a + 1; > UPDATE 100 > Time: 4292.807 ms (00:04.293) > > > For v1-0002-Avoid-using-SPI-for-some-RI-checks.patch : > > > > +* Collect partition key values from the unique key. > > > > At the end of the nested loop, should there be an assertion that > partkey->partnatts partition key values have been found ? > > This can be done by using a counter (initialized to 0) which is > incremented when a match is found by the inner loop. > > I've updated the patch to add the Assert. Thanks for taking a look. > > -- > Amit Langote > EDB: http://www.enterprisedb.com v2 patch applies and passes make check and make check-world. Perhaps, given the missing break at line 418 without any tests failing, we could add another regression test if we're into 100% code path coverage. As it is, I think the compiler warning was a sufficient alert. The code is easy to read, and the comments touch on the major points of what complexities arise from partitioned tables. A somewhat pedantic complaint I have brought up off-list is that this patch continues the pattern of the variable and function names making the assumption that the foreign key is referencing the primary key of the referenced table. Foreign key constraints need only reference a unique index, it doesn't have to be the primary key. Granted, that unique index is behaving exactly as a primary key would, so conceptually it is very similar, but keeping with the existing naming (pk_rel, pk_type, etc) can lead a developer to think that it would be just as correct to find the referenced relation and get the primary key index from there, which would not always be correct. This patch correctly grabs the index from the constraint itself, so no problem there. I like that this patch changes the absolute minimum of the code in order to get a very significant performance benefit. It does so in a way that should reduce resource pressure found in other places [1]. This will in turn reduce the performance penalty of "doing the right thing" in terms of defining enforced foreign keys. It seems to get a clearer performance boost than was achieved with previous efforts at statement level triggers. This patch completely sidesteps the DELETE case, which has more insidious performance implications, but is also far less common, and whose solution will likely be very different. [1] https://www.postgresql.org/message-id/cakkq508z6r5e3jdqhfpwszsajlpho3oyyoamfesaupto5vg...@mail.gmail.com
Re: simplifying foreign key/RI checks
On Tue, Jan 19, 2021 at 2:56 PM Corey Huinker wrote: >> In file included from >> /home/japin/Codes/postgresql/Debug/../src/include/postgres.h:47:0, >> from >> /home/japin/Codes/postgresql/Debug/../src/backend/utils/adt/ri_triggers.c:24: >> /home/japin/Codes/postgresql/Debug/../src/backend/utils/adt/ri_triggers.c: >> In function ‘ri_PrimaryKeyExists’: >> /home/japin/Codes/postgresql/Debug/../src/include/utils/elog.h:134:5: >> warning: this statement may fall through [-Wimplicit-fallthrough=] >> do { \ >> ^ >> /home/japin/Codes/postgresql/Debug/../src/include/utils/elog.h:156:2: note: >> in expansion of macro ‘ereport_domain’ >> ereport_domain(elevel, TEXTDOMAIN, __VA_ARGS__) >> ^~ >> /home/japin/Codes/postgresql/Debug/../src/include/utils/elog.h:229:2: note: >> in expansion of macro ‘ereport’ >> ereport(elevel, errmsg_internal(__VA_ARGS__)) >> ^~~ >> /home/japin/Codes/postgresql/Debug/../src/backend/utils/adt/ri_triggers.c:417:5: >> note: in expansion of macro ‘elog’ >> elog(ERROR, "unexpected table_tuple_lock status: %u", res); >> ^~~~ >> /home/japin/Codes/postgresql/Debug/../src/backend/utils/adt/ri_triggers.c:419:4: >> note: here >> default: >> ^~~ Thanks, will fix it. -- Amit Langote EDB: http://www.enterprisedb.com
Re: simplifying foreign key/RI checks
> > > In file included from > /home/japin/Codes/postgresql/Debug/../src/include/postgres.h:47:0, > from > /home/japin/Codes/postgresql/Debug/../src/backend/utils/adt/ri_triggers.c:24: > /home/japin/Codes/postgresql/Debug/../src/backend/utils/adt/ri_triggers.c: > In function ‘ri_PrimaryKeyExists’: > /home/japin/Codes/postgresql/Debug/../src/include/utils/elog.h:134:5: > warning: this statement may fall through [-Wimplicit-fallthrough=] > do { \ > ^ > /home/japin/Codes/postgresql/Debug/../src/include/utils/elog.h:156:2: > note: in expansion of macro ‘ereport_domain’ > ereport_domain(elevel, TEXTDOMAIN, __VA_ARGS__) > ^~ > /home/japin/Codes/postgresql/Debug/../src/include/utils/elog.h:229:2: > note: in expansion of macro ‘ereport’ > ereport(elevel, errmsg_internal(__VA_ARGS__)) > ^~~ > /home/japin/Codes/postgresql/Debug/../src/backend/utils/adt/ri_triggers.c:417:5: > note: in expansion of macro ‘elog’ > elog(ERROR, "unexpected table_tuple_lock status: %u", res); > ^~~~ > /home/japin/Codes/postgresql/Debug/../src/backend/utils/adt/ri_triggers.c:419:4: > note: here > default: > ^~~ > > -- > Regrads, > Japin Li. > ChengDu WenWu Information Technology Co.,Ltd. > I also get this warning. Adding a "break;" at line 418 resolves the warning.
Re: simplifying foreign key/RI checks
út 19. 1. 2021 v 3:08 odesílatel Amit Langote napsal: > On Tue, Jan 19, 2021 at 3:01 AM Pavel Stehule > wrote: > > po 18. 1. 2021 v 13:40 odesílatel Amit Langote > napsal: > >> I started with the check that's performed when inserting into or > >> updating the referencing table to confirm that the new row points to a > >> valid row in the referenced relation. The corresponding SQL is this: > >> > >> SELECT 1 FROM pk_rel x WHERE x.pkey = $1 FOR KEY SHARE OF x > >> > >> $1 is the value of the foreign key of the new row. If the query > >> returns a row, all good. Thanks to SPI, or its use of plan caching, > >> the query is re-planned only a handful of times before making a > >> generic plan that is then saved and reused, which looks like this: > >> > >> QUERY PLAN > >> -- > >> LockRows > >>-> Index Scan using pk_pkey on pk x > >> Index Cond: (a = $1) > >> (3 rows) > > > > > > What is performance when the referenced table is small? - a lot of > codebooks are small between 1000 to 10K rows. > > I see the same ~2x improvement. > > create table p (a numeric primary key); > insert into p select generate_series(1, 1000); > create table f (a bigint references p); > > Unpatched: > > insert into f select i%1000+1 from generate_series(1, 100) i; > INSERT 0 100 > Time: 5461.377 ms (00:05.461) > > > Patched: > > insert into f select i%1000+1 from generate_series(1, 100) i; > INSERT 0 100 > Time: 2357.440 ms (00:02.357) > > That's expected because the overhead of using SPI to check the PK > table, which the patch gets rid of, is the same no matter the size of > the index to be scanned. > It looks very well. Regards Pavel > -- > Amit Langote > EDB: http://www.enterprisedb.com >
Re: simplifying foreign key/RI checks
On Tue, 19 Jan 2021 at 10:45, Amit Langote wrote: > On Tue, Jan 19, 2021 at 2:47 AM Zhihong Yu wrote: >> >> Hi, >> I was looking at this statement: >> >> insert into f select generate_series(1, 200, 2); >> >> Since certain generated values (the second half) are not in table p, >> wouldn't insertion for those values fail ? >> I tried a scaled down version (1000th) of your example: >> >> yugabyte=# insert into f select generate_series(1, 2000, 2); >> ERROR: insert or update on table "f" violates foreign key constraint >> "f_a_fkey" >> DETAIL: Key (a)=(1001) is not present in table "p". > > Sorry, a wrong copy-paste by me. Try this: > > create table p (a numeric primary key); > insert into p select generate_series(1, 200); > create table f (a bigint references p); > > -- Unpatched > insert into f select generate_series(1, 200, 2); > INSERT 0 100 > Time: 6527.652 ms (00:06.528) > > update f set a = a + 1; > UPDATE 100 > Time: 8108.310 ms (00:08.108) > > -- Patched: > insert into f select generate_series(1, 200, 2); > INSERT 0 100 > Time: 3312.193 ms (00:03.312) > > update f set a = a + 1; > UPDATE 100 > Time: 4292.807 ms (00:04.293) > >> For v1-0002-Avoid-using-SPI-for-some-RI-checks.patch : >> >> +* Collect partition key values from the unique key. >> >> At the end of the nested loop, should there be an assertion that >> partkey->partnatts partition key values have been found ? >> This can be done by using a counter (initialized to 0) which is incremented >> when a match is found by the inner loop. > > I've updated the patch to add the Assert. Thanks for taking a look. After apply the v2 patches, here are some warnings: In file included from /home/japin/Codes/postgresql/Debug/../src/include/postgres.h:47:0, from /home/japin/Codes/postgresql/Debug/../src/backend/utils/adt/ri_triggers.c:24: /home/japin/Codes/postgresql/Debug/../src/backend/utils/adt/ri_triggers.c: In function ‘ri_PrimaryKeyExists’: /home/japin/Codes/postgresql/Debug/../src/include/utils/elog.h:134:5: warning: this statement may fall through [-Wimplicit-fallthrough=] do { \ ^ /home/japin/Codes/postgresql/Debug/../src/include/utils/elog.h:156:2: note: in expansion of macro ‘ereport_domain’ ereport_domain(elevel, TEXTDOMAIN, __VA_ARGS__) ^~ /home/japin/Codes/postgresql/Debug/../src/include/utils/elog.h:229:2: note: in expansion of macro ‘ereport’ ereport(elevel, errmsg_internal(__VA_ARGS__)) ^~~ /home/japin/Codes/postgresql/Debug/../src/backend/utils/adt/ri_triggers.c:417:5: note: in expansion of macro ‘elog’ elog(ERROR, "unexpected table_tuple_lock status: %u", res); ^~~~ /home/japin/Codes/postgresql/Debug/../src/backend/utils/adt/ri_triggers.c:419:4: note: here default: ^~~ -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
Re: simplifying foreign key/RI checks
Thanks for the quick response. + if (mapped_partkey_attnums[i] == pk_attnums[j]) + { + partkey_vals[i] = pk_vals[j]; + partkey_isnull[i] = pk_nulls[j] == 'n' ? true : false; + i++; + break; The way counter (i) is incremented is out of my expectation. In the rare case, where some i doesn't have corresponding pk_attnums[j], wouldn't there be a dead loop ? I think the goal of adding the assertion should be not loop infinitely even if the invariant is not satisfied. I guess a counter other than i would be better for this purpose. Cheers On Mon, Jan 18, 2021 at 6:45 PM Amit Langote wrote: > On Tue, Jan 19, 2021 at 2:47 AM Zhihong Yu wrote: > > > > Hi, > > I was looking at this statement: > > > > insert into f select generate_series(1, 200, 2); > > > > Since certain generated values (the second half) are not in table p, > wouldn't insertion for those values fail ? > > I tried a scaled down version (1000th) of your example: > > > > yugabyte=# insert into f select generate_series(1, 2000, 2); > > ERROR: insert or update on table "f" violates foreign key constraint > "f_a_fkey" > > DETAIL: Key (a)=(1001) is not present in table "p". > > Sorry, a wrong copy-paste by me. Try this: > > create table p (a numeric primary key); > insert into p select generate_series(1, 200); > create table f (a bigint references p); > > -- Unpatched > insert into f select generate_series(1, 200, 2); > INSERT 0 100 > Time: 6527.652 ms (00:06.528) > > update f set a = a + 1; > UPDATE 100 > Time: 8108.310 ms (00:08.108) > > -- Patched: > insert into f select generate_series(1, 200, 2); > INSERT 0 100 > Time: 3312.193 ms (00:03.312) > > update f set a = a + 1; > UPDATE 100 > Time: 4292.807 ms (00:04.293) > > > For v1-0002-Avoid-using-SPI-for-some-RI-checks.patch : > > > > +* Collect partition key values from the unique key. > > > > At the end of the nested loop, should there be an assertion that > partkey->partnatts partition key values have been found ? > > This can be done by using a counter (initialized to 0) which is > incremented when a match is found by the inner loop. > > I've updated the patch to add the Assert. Thanks for taking a look. > > -- > Amit Langote > EDB: http://www.enterprisedb.com >
Re: simplifying foreign key/RI checks
On Tue, Jan 19, 2021 at 2:47 AM Zhihong Yu wrote: > > Hi, > I was looking at this statement: > > insert into f select generate_series(1, 200, 2); > > Since certain generated values (the second half) are not in table p, wouldn't > insertion for those values fail ? > I tried a scaled down version (1000th) of your example: > > yugabyte=# insert into f select generate_series(1, 2000, 2); > ERROR: insert or update on table "f" violates foreign key constraint > "f_a_fkey" > DETAIL: Key (a)=(1001) is not present in table "p". Sorry, a wrong copy-paste by me. Try this: create table p (a numeric primary key); insert into p select generate_series(1, 200); create table f (a bigint references p); -- Unpatched insert into f select generate_series(1, 200, 2); INSERT 0 100 Time: 6527.652 ms (00:06.528) update f set a = a + 1; UPDATE 100 Time: 8108.310 ms (00:08.108) -- Patched: insert into f select generate_series(1, 200, 2); INSERT 0 100 Time: 3312.193 ms (00:03.312) update f set a = a + 1; UPDATE 100 Time: 4292.807 ms (00:04.293) > For v1-0002-Avoid-using-SPI-for-some-RI-checks.patch : > > +* Collect partition key values from the unique key. > > At the end of the nested loop, should there be an assertion that > partkey->partnatts partition key values have been found ? > This can be done by using a counter (initialized to 0) which is incremented > when a match is found by the inner loop. I've updated the patch to add the Assert. Thanks for taking a look. -- Amit Langote EDB: http://www.enterprisedb.com v2-0001-Export-get_partition_for_tuple.patch Description: Binary data v2-0002-Avoid-using-SPI-for-some-RI-checks.patch Description: Binary data
Re: simplifying foreign key/RI checks
On Tue, Jan 19, 2021 at 3:01 AM Pavel Stehule wrote: > po 18. 1. 2021 v 13:40 odesílatel Amit Langote > napsal: >> I started with the check that's performed when inserting into or >> updating the referencing table to confirm that the new row points to a >> valid row in the referenced relation. The corresponding SQL is this: >> >> SELECT 1 FROM pk_rel x WHERE x.pkey = $1 FOR KEY SHARE OF x >> >> $1 is the value of the foreign key of the new row. If the query >> returns a row, all good. Thanks to SPI, or its use of plan caching, >> the query is re-planned only a handful of times before making a >> generic plan that is then saved and reused, which looks like this: >> >> QUERY PLAN >> -- >> LockRows >>-> Index Scan using pk_pkey on pk x >> Index Cond: (a = $1) >> (3 rows) > > > What is performance when the referenced table is small? - a lot of codebooks > are small between 1000 to 10K rows. I see the same ~2x improvement. create table p (a numeric primary key); insert into p select generate_series(1, 1000); create table f (a bigint references p); Unpatched: insert into f select i%1000+1 from generate_series(1, 100) i; INSERT 0 100 Time: 5461.377 ms (00:05.461) Patched: insert into f select i%1000+1 from generate_series(1, 100) i; INSERT 0 100 Time: 2357.440 ms (00:02.357) That's expected because the overhead of using SPI to check the PK table, which the patch gets rid of, is the same no matter the size of the index to be scanned. -- Amit Langote EDB: http://www.enterprisedb.com
Re: simplifying foreign key/RI checks
po 18. 1. 2021 v 13:40 odesílatel Amit Langote napsal: > While discussing the topic of foreign key performance off-list with > Robert and Corey (also came up briefly on the list recently [1], [2]), > a few ideas were thrown around to simplify our current system of RI > checks to enforce foreign keys with the aim of reducing some of its > overheads. The two main aspects of how we do these checks that > seemingly cause the most overhead are: > > * Using row-level triggers that are fired during the modification of > the referencing and the referenced relations to perform them > > * Using plain SQL queries issued over SPI > > There is a discussion nearby titled "More efficient RI checks - take > 2" [2] to address this problem from the viewpoint that it is using > row-level triggers that causes the most overhead, although there are > some posts mentioning that SQL-over-SPI is not without blame here. I > decided to focus on the latter aspect and tried reimplementing some > checks such that SPI can be skipped altogether. > > I started with the check that's performed when inserting into or > updating the referencing table to confirm that the new row points to a > valid row in the referenced relation. The corresponding SQL is this: > > SELECT 1 FROM pk_rel x WHERE x.pkey = $1 FOR KEY SHARE OF x > > $1 is the value of the foreign key of the new row. If the query > returns a row, all good. Thanks to SPI, or its use of plan caching, > the query is re-planned only a handful of times before making a > generic plan that is then saved and reused, which looks like this: > > QUERY PLAN > -- > LockRows >-> Index Scan using pk_pkey on pk x > Index Cond: (a = $1) > (3 rows) > > > What is performance when the referenced table is small? - a lot of codebooks are small between 1000 to 10K rows.
Re: simplifying foreign key/RI checks
Hi, I was looking at this statement: insert into f select generate_series(1, 200, 2); Since certain generated values (the second half) are not in table p, wouldn't insertion for those values fail ? I tried a scaled down version (1000th) of your example: yugabyte=# insert into f select generate_series(1, 2000, 2); ERROR: insert or update on table "f" violates foreign key constraint "f_a_fkey" DETAIL: Key (a)=(1001) is not present in table "p". For v1-0002-Avoid-using-SPI-for-some-RI-checks.patch : +* Collect partition key values from the unique key. At the end of the nested loop, should there be an assertion that partkey->partnatts partition key values have been found ? This can be done by using a counter (initialized to 0) which is incremented when a match is found by the inner loop. Cheers On Mon, Jan 18, 2021 at 4:40 AM Amit Langote wrote: > While discussing the topic of foreign key performance off-list with > Robert and Corey (also came up briefly on the list recently [1], [2]), > a few ideas were thrown around to simplify our current system of RI > checks to enforce foreign keys with the aim of reducing some of its > overheads. The two main aspects of how we do these checks that > seemingly cause the most overhead are: > > * Using row-level triggers that are fired during the modification of > the referencing and the referenced relations to perform them > > * Using plain SQL queries issued over SPI > > There is a discussion nearby titled "More efficient RI checks - take > 2" [2] to address this problem from the viewpoint that it is using > row-level triggers that causes the most overhead, although there are > some posts mentioning that SQL-over-SPI is not without blame here. I > decided to focus on the latter aspect and tried reimplementing some > checks such that SPI can be skipped altogether. > > I started with the check that's performed when inserting into or > updating the referencing table to confirm that the new row points to a > valid row in the referenced relation. The corresponding SQL is this: > > SELECT 1 FROM pk_rel x WHERE x.pkey = $1 FOR KEY SHARE OF x > > $1 is the value of the foreign key of the new row. If the query > returns a row, all good. Thanks to SPI, or its use of plan caching, > the query is re-planned only a handful of times before making a > generic plan that is then saved and reused, which looks like this: > > QUERY PLAN > -- > LockRows >-> Index Scan using pk_pkey on pk x > Index Cond: (a = $1) > (3 rows) > > So in most cases, the trigger's function would only execute the plan > that's already there, at least in a given session. That's good but > what we realized would be even better is if we didn't have to > "execute" a full-fledged "plan" for this, that is, to simply find out > whether a row containing the key we're looking for exists in the > referenced relation and if found lock it. Directly scanning the index > and locking it directly with table_tuple_lock() like ExecLockRows() > does gives us exactly that behavior, which seems simple enough to be > done in a not-so-long local function in ri_trigger.c. I gave that a > try and came up with the attached. It also takes care of the case > where the referenced relation is partitioned in which case its > appropriate leaf partition's index is scanned. > > The patch results in ~2x improvement in the performance of inserts and > updates on referencing tables: > > create table p (a numeric primary key); > insert into p select generate_series(1, 100); > create table f (a bigint references p); > > -- unpatched > insert into f select generate_series(1, 200, 2); > INSERT 0 100 > Time: 6340.733 ms (00:06.341) > > update f set a = a + 1; > UPDATE 100 > Time: 7490.906 ms (00:07.491) > > -- patched > insert into f select generate_series(1, 200, 2); > INSERT 0 100 > Time: 3340.808 ms (00:03.341) > > update f set a = a + 1; > UPDATE 100 > Time: 4178.171 ms (00:04.178) > > The improvement is even more dramatic when the referenced table (that > we're no longer querying over SPI) is partitioned. Here are the > numbers when the PK relation has 1000 hash partitions. > > Unpatched: > > insert into f select generate_series(1, 200, 2); > INSERT 0 100 > Time: 35898.783 ms (00:35.899) > > update f set a = a + 1; > UPDATE 100 > Time: 37736.294 ms (00:37.736) > > Patched: > > insert into f select generate_series(1, 200, 2); > INSERT 0 100 > Time: 5633.377 ms (00:05.633) > > update f set a = a + 1; > UPDATE 100 > Time: 6345.029 ms (00:06.345) > > That's over ~5x improvement! > > While the above case seemed straightforward enough for skipping SPI, > it seems a bit hard to do the same for other cases where we query the > *referencing* relation during an operation on the referenced table > (for example, checking if the row being deleted is still referenced), > because the plan in those cases is not