Re: Huge memory consumption on partitioned table with FKs
On 2021/03/12 2:44, Tom Lane wrote: I wrote: Now, maybe it's a coincidence that husky failed on a partitioned-foreign-key test right after this patch went in, but I bet not. Since husky runs CLOBBER_CACHE_ALWAYS, it looks to me like we've overlooked some cache-reset scenario or other. After reproducing it here, that *is* a coincidence. I shall now go beat up on the correct blame-ee, instead. I did "make installcheck-parallel" on 7bb97211a, just in case. It was successful. :-D Regards, Tatsuro Yamada
Re: Huge memory consumption on partitioned table with FKs
I wrote: > Now, maybe it's a coincidence that husky failed on a > partitioned-foreign-key test right after this patch went in, but I bet > not. Since husky runs CLOBBER_CACHE_ALWAYS, it looks to me like we've > overlooked some cache-reset scenario or other. After reproducing it here, that *is* a coincidence. I shall now go beat up on the correct blame-ee, instead. regards, tom lane
Re: Huge memory consumption on partitioned table with FKs
Tatsuro Yamada writes: > Thanks for fixing the problem! :-D Hmm, I'm not sure we're done with this patch: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=husky=2021-03-10%2021%3A09%3A32 The critical log extract is 2021-03-11 05:10:13.012 CET [21574:1082] pg_regress/foreign_key LOG: statement: insert into fk_notpartitioned_pk (a, b) select 2048, x from generate_series(1,10) x; 2021-03-11 05:10:13.104 CET [11830:368] LOG: server process (PID 21574) was terminated by signal 11: Segmentation fault 2021-03-11 05:10:13.104 CET [11830:369] DETAIL: Failed process was running: insert into fk_notpartitioned_pk (a, b) select 2048, x from generate_series(1,10) x; Now, maybe it's a coincidence that husky failed on a partitioned-foreign-key test right after this patch went in, but I bet not. Since husky runs CLOBBER_CACHE_ALWAYS, it looks to me like we've overlooked some cache-reset scenario or other. regards, tom lane
Re: Huge memory consumption on partitioned table with FKs
On 2021/03/11 9:39, Amit Langote wrote: On Thu, Mar 11, 2021 at 4:25 AM Tom Lane wrote: Amit Langote writes: On Wed, Mar 10, 2021 at 8:37 AM Tom Lane wrote: Hmm. So, the key point is that the values coming from the partitioned child table are injected into the test query as parameters, not as column references, thus it doesn't matter *to the test query* what numbers the referencing columns have in that child. We just have to be sure we pass the right parameter values. Right. I did some cosmetic fooling with this (mostly, rewriting the comments YA time) and pushed it. Perfect. Thanks for your time on this. Thanks for fixing the problem! :-D Regards, Tatsuro Yamada
Re: Huge memory consumption on partitioned table with FKs
Hi hackers, > > > > I did some cosmetic fooling with this (mostly, rewriting the comments > > YA time) and pushed it. > > Perfect. Thanks for your time on this. > Thank you for your help! I'm glad to solve it. -- Keisuke Kuroda NTT Software Innovation Center keisuke.kuroda.3...@gmail.com
Re: Huge memory consumption on partitioned table with FKs
On Thu, Mar 11, 2021 at 4:25 AM Tom Lane wrote: > Amit Langote writes: > > On Wed, Mar 10, 2021 at 8:37 AM Tom Lane wrote: > >> Hmm. So, the key point is that the values coming from the partitioned > >> child table are injected into the test query as parameters, not as > >> column references, thus it doesn't matter *to the test query* what > >> numbers the referencing columns have in that child. We just have to > >> be sure we pass the right parameter values. > > > Right. > > I did some cosmetic fooling with this (mostly, rewriting the comments > YA time) and pushed it. Perfect. Thanks for your time on this. -- Amit Langote EDB: http://www.enterprisedb.com
Re: Huge memory consumption on partitioned table with FKs
Amit Langote writes: > On Wed, Mar 10, 2021 at 8:37 AM Tom Lane wrote: >> Hmm. So, the key point is that the values coming from the partitioned >> child table are injected into the test query as parameters, not as >> column references, thus it doesn't matter *to the test query* what >> numbers the referencing columns have in that child. We just have to >> be sure we pass the right parameter values. > Right. I did some cosmetic fooling with this (mostly, rewriting the comments YA time) and pushed it. regards, tom lane
Re: Huge memory consumption on partitioned table with FKs
On Wed, Mar 10, 2021 at 8:37 AM Tom Lane wrote: > Amit Langote writes: > > On Fri, Mar 5, 2021 at 6:00 AM Tom Lane wrote: > >> This claim seems false on its face: > >>> All child constraints of a given foreign key constraint can use the > >>> same RI query and the resulting plan, that is, no need to create as > >>> many copies of the query and the plan as there are partitions, as > >>> happens now due to the child constraint OID being used in the key > >>> for ri_query_cache. > > > The quoted comment could have been written to be clearer about this, > > but it's not talking about the table that is to be queried, but the > > table whose RI trigger is being executed. In all the cases except one > > (mentioned below), the table that is queried is the same irrespective > > of which partition's trigger is being executed, so we're basically > > creating the same plan separately for each partition. > > Hmm. So, the key point is that the values coming from the partitioned > child table are injected into the test query as parameters, not as > column references, thus it doesn't matter *to the test query* what > numbers the referencing columns have in that child. We just have to > be sure we pass the right parameter values. Right. > But ... doesn't the code > use riinfo->fk_attnums[] to pull out the values to be passed? Yes, from a slot that belongs to the child table. > IOW, I now get the point about being able to share the SPI plans, > but I'm still dubious about sharing the RI_ConstraintInfo cache entries. There was actually a proposal upthread about sharing the RI_ConstraintInfo too, but we decided to not pursue that for now. > It looks to me like the v4 patch is now actually not sharing the > cache entries, ie their hash key is just the child constraint OID > same as before; Yeah, you may see that we're only changing ri_BuildQueryKey() in the patch affecting only ri_query_cache, but not ri_LoadConstraintInfo() which leaves ri_constraint_cache unaffected. > but the comments are pretty confused about this. I've tried improving the comment in ri_BuildQueryKey() a bit to make clear what is and what is not being shared between partitions. > It might be simpler if you add just one new field which is the OID of > the constraint that we're building the SPI query from, which might be > either equal to constraint_id, or the OID of some parent constraint. > In particular it's not clear to me why we need both constraint_parent > and constraint_root_id. Yeah, I think constraint_parent is a leftover from some earlier hacking. I have removed it. Attached updated patch. -- Amit Langote EDB: http://www.enterprisedb.com v5-0001-Share-SPI-plan-between-partitions-in-some-RI-trig.patch Description: Binary data
Re: Huge memory consumption on partitioned table with FKs
Amit Langote writes: > On Fri, Mar 5, 2021 at 6:00 AM Tom Lane wrote: >> This claim seems false on its face: >>> All child constraints of a given foreign key constraint can use the >>> same RI query and the resulting plan, that is, no need to create as >>> many copies of the query and the plan as there are partitions, as >>> happens now due to the child constraint OID being used in the key >>> for ri_query_cache. > The quoted comment could have been written to be clearer about this, > but it's not talking about the table that is to be queried, but the > table whose RI trigger is being executed. In all the cases except one > (mentioned below), the table that is queried is the same irrespective > of which partition's trigger is being executed, so we're basically > creating the same plan separately for each partition. Hmm. So, the key point is that the values coming from the partitioned child table are injected into the test query as parameters, not as column references, thus it doesn't matter *to the test query* what numbers the referencing columns have in that child. We just have to be sure we pass the right parameter values. But ... doesn't the code use riinfo->fk_attnums[] to pull out the values to be passed? IOW, I now get the point about being able to share the SPI plans, but I'm still dubious about sharing the RI_ConstraintInfo cache entries. It looks to me like the v4 patch is now actually not sharing the cache entries, ie their hash key is just the child constraint OID same as before; but the comments are pretty confused about this. It might be simpler if you add just one new field which is the OID of the constraint that we're building the SPI query from, which might be either equal to constraint_id, or the OID of some parent constraint. In particular it's not clear to me why we need both constraint_parent and constraint_root_id. regards, tom lane
Re: Huge memory consumption on partitioned table with FKs
On Mon, Mar 8, 2021 at 9:53 PM Andy Fan wrote: > On Mon, Mar 8, 2021 at 8:42 PM Amit Langote wrote: >> On Mon, Mar 8, 2021 at 8:39 PM Andy Fan wrote: >> > On Mon, Mar 8, 2021 at 3:43 PM Andy Fan wrote: >> >> My point below is a bit off-topic, but I want to share it here. Since >> >> we implement a partitioned table in PG with the inherited class, it has >> >> much >> >> more flexibility than other databases. Like in PG, we allow different >> >> partitions >> >> have different physical order, different indexes, maybe different index >> >> states. >> >> that would cause our development work hard in many places and cause some >> >> runtime issues as well (like catalog memory usage), have we discussed >> >> limiting some flexibility so that we can have better coding/running >> >> experience? >> >> I want to do some research in this direction, but it would be better that >> >> I can >> >> listen to any advice from others. More specifically, I want to reduce >> >> the memory >> >> usage of Partitioned table/index as the first step. In my testing, each >> >> IndexOptInfo >> >> will use 2kB memory in each backend. >> > >> > >> > As for the compatible issue, will it be ok to introduce a new concept >> > like " >> > CREATE TABLE p (a int) partitioned by list(a) RESTRICTED". We can add >> > these >> > limitation to restricted partitioned relation only. >> >> I think you'd agree that the topics you want to discuss deserve a >> separate discussion thread. You may refer to this discussion in that >> new thread if you think that your proposals can solve the problem >> being discussed here more generally, which would of course be great. > > Sure, I can prepare more data and start a new thread for this. Great, thanks. -- Amit Langote EDB: http://www.enterprisedb.com
Re: Huge memory consumption on partitioned table with FKs
On Mon, Mar 8, 2021 at 8:42 PM Amit Langote wrote: > Hi Andy, > > On Mon, Mar 8, 2021 at 8:39 PM Andy Fan wrote: > > On Mon, Mar 8, 2021 at 3:43 PM Andy Fan > wrote: > >> My point below is a bit off-topic, but I want to share it here. Since > >> we implement a partitioned table in PG with the inherited class, it has > much > >> more flexibility than other databases. Like in PG, we allow different > partitions > >> have different physical order, different indexes, maybe different > index states. > >> that would cause our development work hard in many places and cause some > >> runtime issues as well (like catalog memory usage), have we discussed > >> limiting some flexibility so that we can have better coding/running > experience? > >> I want to do some research in this direction, but it would be better > that I can > >> listen to any advice from others. More specifically, I want to reduce > the memory > >> usage of Partitioned table/index as the first step. In my testing, > each IndexOptInfo > >> will use 2kB memory in each backend. > > > > > > As for the compatible issue, will it be ok to introduce a new concept > like " > > CREATE TABLE p (a int) partitioned by list(a) RESTRICTED". We can add > these > > limitation to restricted partitioned relation only. > > I think you'd agree that the topics you want to discuss deserve a > separate discussion thread. You may refer to this discussion in that > new thread if you think that your proposals can solve the problem > being discussed here more generally, which would of course be great. > > Sure, I can prepare more data and start a new thread for this. -- Best Regards Andy Fan (https://www.aliyun.com/)
Re: Huge memory consumption on partitioned table with FKs
Hi Andy, On Mon, Mar 8, 2021 at 8:39 PM Andy Fan wrote: > On Mon, Mar 8, 2021 at 3:43 PM Andy Fan wrote: >> My point below is a bit off-topic, but I want to share it here. Since >> we implement a partitioned table in PG with the inherited class, it has much >> more flexibility than other databases. Like in PG, we allow different >> partitions >> have different physical order, different indexes, maybe different index >> states. >> that would cause our development work hard in many places and cause some >> runtime issues as well (like catalog memory usage), have we discussed >> limiting some flexibility so that we can have better coding/running >> experience? >> I want to do some research in this direction, but it would be better that I >> can >> listen to any advice from others. More specifically, I want to reduce the >> memory >> usage of Partitioned table/index as the first step. In my testing, each >> IndexOptInfo >> will use 2kB memory in each backend. > > > As for the compatible issue, will it be ok to introduce a new concept like " > CREATE TABLE p (a int) partitioned by list(a) RESTRICTED". We can add these > limitation to restricted partitioned relation only. I think you'd agree that the topics you want to discuss deserve a separate discussion thread. You may refer to this discussion in that new thread if you think that your proposals can solve the problem being discussed here more generally, which would of course be great. -- Amit Langote EDB: http://www.enterprisedb.com
Re: Huge memory consumption on partitioned table with FKs
On Mon, Mar 8, 2021 at 3:43 PM Andy Fan wrote: > > > On Fri, Mar 5, 2021 at 5:00 AM Tom Lane wrote: > >> Amit Langote writes: >> > Updated patch attached. >> >> This claim seems false on its face: >> >> > All child constraints of a given foreign key constraint can use the >> > same RI query and the resulting plan, that is, no need to create as >> > many copies of the query and the plan as there are partitions, as >> > happens now due to the child constraint OID being used in the key >> > for ri_query_cache. >> >> What if the child tables don't have the same physical column numbers >> as the parent? > > > My point below is a bit off-topic, but I want to share it here. Since > we implement a partitioned table in PG with the inherited class, it has > much > more flexibility than other databases. Like in PG, we allow different > partitions > have different physical order, different indexes, maybe different index > states. > that would cause our development work hard in many places and cause some > runtime issues as well (like catalog memory usage), have we discussed > limiting some flexibility so that we can have better coding/running > experience? > I want to do some research in this direction, but it would be better that > I can > listen to any advice from others. More specifically, I want to reduce the > memory > usage of Partitioned table/index as the first step. In my testing, each > IndexOptInfo > will use 2kB memory in each backend. > As for the compatible issue, will it be ok to introduce a new concept like " CREATE TABLE p (a int) partitioned by list(a) RESTRICTED". We can add these limitation to restricted partitioned relation only. > The comment claiming that it's okay if riinfo->fk_attnums >> doesn't match seems quite off point, because the query plan is still >> going to need to use the correct column numbers. Even if column numbers >> are the same, the plan would also contain table and index OIDs that could >> only be right for one partition. >> >> I could imagine translating a parent plan to apply to a child instead of >> building it from scratch, but that would take a lot of code we don't have >> (there's no rewriteHandler infrastructure for plan nodes). >> >> Maybe I'm missing something, because I see that the cfbot claims >> this is passing, and I'd sure like to think that our test coverage >> is not so thin that it'd fail to detect probing the wrong partition >> for foreign key matches. But that's what it looks like this patch >> will do. >> >> regards, tom lane >> >> >> > > -- > Best Regards > Andy Fan (https://www.aliyun.com/) > -- Best Regards Andy Fan (https://www.aliyun.com/)
Re: Huge memory consumption on partitioned table with FKs
On Fri, Mar 5, 2021 at 5:00 AM Tom Lane wrote: > Amit Langote writes: > > Updated patch attached. > > This claim seems false on its face: > > > All child constraints of a given foreign key constraint can use the > > same RI query and the resulting plan, that is, no need to create as > > many copies of the query and the plan as there are partitions, as > > happens now due to the child constraint OID being used in the key > > for ri_query_cache. > > What if the child tables don't have the same physical column numbers > as the parent? My point below is a bit off-topic, but I want to share it here. Since we implement a partitioned table in PG with the inherited class, it has much more flexibility than other databases. Like in PG, we allow different partitions have different physical order, different indexes, maybe different index states. that would cause our development work hard in many places and cause some runtime issues as well (like catalog memory usage), have we discussed limiting some flexibility so that we can have better coding/running experience? I want to do some research in this direction, but it would be better that I can listen to any advice from others. More specifically, I want to reduce the memory usage of Partitioned table/index as the first step. In my testing, each IndexOptInfo will use 2kB memory in each backend. > The comment claiming that it's okay if riinfo->fk_attnums > doesn't match seems quite off point, because the query plan is still > going to need to use the correct column numbers. Even if column numbers > are the same, the plan would also contain table and index OIDs that could > only be right for one partition. > > I could imagine translating a parent plan to apply to a child instead of > building it from scratch, but that would take a lot of code we don't have > (there's no rewriteHandler infrastructure for plan nodes). > > Maybe I'm missing something, because I see that the cfbot claims > this is passing, and I'd sure like to think that our test coverage > is not so thin that it'd fail to detect probing the wrong partition > for foreign key matches. But that's what it looks like this patch > will do. > > regards, tom lane > > > -- Best Regards Andy Fan (https://www.aliyun.com/)
Re: Huge memory consumption on partitioned table with FKs
On Fri, Mar 5, 2021 at 6:00 AM Tom Lane wrote: > Amit Langote writes: > > Updated patch attached. > > This claim seems false on its face: > > > All child constraints of a given foreign key constraint can use the > > same RI query and the resulting plan, that is, no need to create as > > many copies of the query and the plan as there are partitions, as > > happens now due to the child constraint OID being used in the key > > for ri_query_cache. > > What if the child tables don't have the same physical column numbers > as the parent? The comment claiming that it's okay if riinfo->fk_attnums > doesn't match seems quite off point, because the query plan is still > going to need to use the correct column numbers. Even if column numbers > are the same, the plan would also contain table and index OIDs that could > only be right for one partition. > > I could imagine translating a parent plan to apply to a child instead of > building it from scratch, but that would take a lot of code we don't have > (there's no rewriteHandler infrastructure for plan nodes). > > Maybe I'm missing something, because I see that the cfbot claims > this is passing, and I'd sure like to think that our test coverage > is not so thin that it'd fail to detect probing the wrong partition > for foreign key matches. But that's what it looks like this patch > will do. The quoted comment could have been written to be clearer about this, but it's not talking about the table that is to be queried, but the table whose RI trigger is being executed. In all the cases except one (mentioned below), the table that is queried is the same irrespective of which partition's trigger is being executed, so we're basically creating the same plan separately for each partition. create table foo (a int primary key) partition by range (a); create table foo1 partition of foo for values from (minvalue) to (1); create table foo2 partition of foo for values from (1) to (maxvalue); create table bar (a int references foo) partition by range (a); create table bar1 partition of bar for values from (minvalue) to (1); create table bar2 partition of bar for values from (1) to (maxvalue); insert into foo values (0), (1); insert into bar values (0), (1); For the 2nd insert statement, RI_FKey_check() issues the following query irrespective of whether it is called from bar1's or bar2's insert check RI trigger: SELECT 1 FROM foo WHERE foo.a = $1; Likewise for: delete from foo; ri_restrict() issues the following query irrespective of whether called from foo1's or foo2's delete action trigger: SELECT 1 FROM bar WHERE bar.a = $1; The one case I found in which the queried table is the same as the table whose trigger has been invoked is ri_Check_Pk_Match(), in which case, it's actually wrong for partitions to share the plan, so the patch was wrong in that case. Apparently, we didn't test that case with partitioning, so I added one as follows: +-- test that ri_Check_Pk_Match() scans the correct partition for a deferred +-- ON DELETE/UPDATE NO ACTION constraint +CREATE SCHEMA fkpart10 + CREATE TABLE tbl1(f1 int PRIMARY KEY) PARTITION BY RANGE(f1) + CREATE TABLE tbl1_p1 PARTITION OF tbl1 FOR VALUES FROM (minvalue) TO (1) + CREATE TABLE tbl1_p2 PARTITION OF tbl1 FOR VALUES FROM (1) TO (maxvalue) + CREATE TABLE tbl2(f1 int REFERENCES tbl1 DEFERRABLE INITIALLY DEFERRED); +INSERT INTO fkpart10.tbl1 VALUES (0), (1); +INSERT INTO fkpart10.tbl2 VALUES (0), (1); +BEGIN; +DELETE FROM fkpart10.tbl1 WHERE f1 = 0; +UPDATE fkpart10.tbl1 SET f1 = 2 WHERE f1 = 1; +INSERT INTO fkpart10.tbl1 VALUES (0), (1); +COMMIT; +DROP SCHEMA fkpart10 CASCADE; +NOTICE: drop cascades to 2 other objects +DETAIL: drop cascades to table fkpart10.tbl1 +drop cascades to table fkpart10.tbl2 With the v3 patch, the test case fails with the following error on COMMIT: +ERROR: update or delete on table "tbl1_p2" violates foreign key constraint "tbl2_f1_fkey2" on table "tbl2" +DETAIL: Key (f1)=(1) is still referenced from table "tbl2". That's because in ri_Check_Pk_Match(), partition tbl2 ends up using the same cached plan as tbl1 and so scans tbl1 to check whether the row deleted from tbl2 has reappeared, which is of course wrong. I have fixed that by continuing to use the individual partition's constraint's OID as the query key for this particular query. I have also removed the somewhat confusing comment about fk_attnums. The point it was trying to make is that fk_attnums being possibly different among partitions does not make a difference to what any given shareable query's text ultimately looks like. Those attribute numbers are only used by the macros RIAttName(), RIAttType(), RIAttCollation() when generating the query text and they'd all return the same values no matter which partition's attribute numbers are used. Updated patch attached. -- Amit Langote EDB: http://www.enterprisedb.com From b433aabbab5f74a4284fabdb078200c6c1e89e57 Mon Sep 17 00:00:00 2001 From: amitlan Date: Thu, 23 Apr
Re: Huge memory consumption on partitioned table with FKs
Amit Langote writes: > Updated patch attached. This claim seems false on its face: > All child constraints of a given foreign key constraint can use the > same RI query and the resulting plan, that is, no need to create as > many copies of the query and the plan as there are partitions, as > happens now due to the child constraint OID being used in the key > for ri_query_cache. What if the child tables don't have the same physical column numbers as the parent? The comment claiming that it's okay if riinfo->fk_attnums doesn't match seems quite off point, because the query plan is still going to need to use the correct column numbers. Even if column numbers are the same, the plan would also contain table and index OIDs that could only be right for one partition. I could imagine translating a parent plan to apply to a child instead of building it from scratch, but that would take a lot of code we don't have (there's no rewriteHandler infrastructure for plan nodes). Maybe I'm missing something, because I see that the cfbot claims this is passing, and I'd sure like to think that our test coverage is not so thin that it'd fail to detect probing the wrong partition for foreign key matches. But that's what it looks like this patch will do. regards, tom lane
Re: Huge memory consumption on partitioned table with FKs
On 2021-Mar-03, Amit Langote wrote: > I don't know of any unaddressed comments on the patch, so I've marked > the entry Ready for Committer. Thanks, I'll look at it later this week. -- Álvaro Herrera39°49'30"S 73°17'W #error "Operator lives in the wrong universe" ("Use of cookies in real-time system development", M. Gleixner, M. Mc Guire)
Re: Huge memory consumption on partitioned table with FKs
Hi David, On Wed, Mar 3, 2021 at 10:21 PM David Steele wrote: > On 12/7/20 10:59 PM, Amit Langote wrote: > > On Tue, Dec 8, 2020 at 12:04 PM Kyotaro Horiguchi > > wrote: > >> At Tue, 8 Dec 2020 01:16:00 +0900, Amit Langote > >> wrote in > >>> On Mon, Dec 7, 2020 at 23:48 Alvaro Herrera > >>> wrote: > > I think this bit about splitting the struct is a distraction. Let's get > a patch that solves the bug first, and then we can discuss what further > refinements we want to do. I think we should get your patch in > CA+HiwqEOrfN9b=f3sdmyspgc4go-l_vmfhxllxvmmdp34e6...@mail.gmail.com > committed (which I have not read yet.) Do you agree with this plan? > >>> > >>> > >>> Yeah, I agree. > >> > >> Or > >> https://www.postgresql.org/message-id/ca+hiwqgrr2yoo6vobm6m_oac9w-wmxe1gouq-uydpin6zjt...@mail.gmail.com > >> ? > >> > >> +1 from me to either one. > > > > Oh, I hadn't actually checked the actual message that Alvaro > > mentioned, but yeah I too am fine with either that one or the latest > > one. > > Any progress on the decision of how to handle this bug? Not sure if that > was handled elsewhere but it appears to be key to making progress on > this patch. The v3 patch posted on Dec 4 is still what is being proposed to fix this. I don't know of any unaddressed comments on the patch, so I've marked the entry Ready for Committer. -- Amit Langote EDB: http://www.enterprisedb.com
Re: Huge memory consumption on partitioned table with FKs
On 12/7/20 10:59 PM, Amit Langote wrote: On Tue, Dec 8, 2020 at 12:04 PM Kyotaro Horiguchi wrote: At Tue, 8 Dec 2020 01:16:00 +0900, Amit Langote wrote in On Mon, Dec 7, 2020 at 23:48 Alvaro Herrera wrote: I think this bit about splitting the struct is a distraction. Let's get a patch that solves the bug first, and then we can discuss what further refinements we want to do. I think we should get your patch in CA+HiwqEOrfN9b=f3sdmyspgc4go-l_vmfhxllxvmmdp34e6...@mail.gmail.com committed (which I have not read yet.) Do you agree with this plan? Yeah, I agree. Or https://www.postgresql.org/message-id/ca+hiwqgrr2yoo6vobm6m_oac9w-wmxe1gouq-uydpin6zjt...@mail.gmail.com ? +1 from me to either one. Oh, I hadn't actually checked the actual message that Alvaro mentioned, but yeah I too am fine with either that one or the latest one. Any progress on the decision of how to handle this bug? Not sure if that was handled elsewhere but it appears to be key to making progress on this patch. Regards, -- -David da...@pgmasters.net
Re: Huge memory consumption on partitioned table with FKs
Hi Amit-san, > Thanks for checking. Indeed, it should have been added to the January > commit-fest. I've added it to the March one: > > https://commitfest.postgresql.org/32/2930/ Thank you for your quick response! -- Keisuke Kuroda NTT Software Innovation Center keisuke.kuroda.3...@gmail.com
Re: Huge memory consumption on partitioned table with FKs
Kuroda-san, On Fri, Jan 8, 2021 at 1:02 PM Keisuke Kuroda wrote: > Hi Amit-san, > > I noticed that this patch > (v3-0001-ri_triggers.c-Use-root-constraint-OID-as-key-to-r.patch) > is not registered in the commitfest. I think it needs to be registered for > commit, is that correct? > > I have confirmed that this patch can be applied to HEAD (master), > and that check-world PASS. Thanks for checking. Indeed, it should have been added to the January commit-fest. I've added it to the March one: https://commitfest.postgresql.org/32/2930/ -- Amit Langote EDB: http://www.enterprisedb.com
Re: Huge memory consumption on partitioned table with FKs
Hi Amit-san, I noticed that this patch (v3-0001-ri_triggers.c-Use-root-constraint-OID-as-key-to-r.patch) is not registered in the commitfest. I think it needs to be registered for commit, is that correct? I have confirmed that this patch can be applied to HEAD (master), and that check-world PASS. Best Regards, -- Keisuke Kuroda NTT Software Innovation Center keisuke.kuroda.3...@gmail.com
Re: Huge memory consumption on partitioned table with FKs
On Mon, Dec 7, 2020 at 11:01 PM Amit Langote wrote: > On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi > wrote: > > > Also, the comment that was in RI_ConstraintInfo now appears in > > > RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now > > > undocumented. What is the relationship between those two structs? I > > > see that they have pointers to each other, but I think the relationship > > > should be documented more clearly. > > > > I'm not sure the footprint of this patch worth doing but here is a bit > > more polished version. > > I noticed that the foreign_key test fails and it may have to do with > the fact that a partition's param info remains attached to the > parent's RI_ConstraintInfo even after it's detached from the parent > table using DETACH PARTITION. Just for (maybe not so distant) future reference, I managed to fix the failure with this: diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c index 187884f..c67f2a6 100644 --- a/src/backend/utils/adt/ri_triggers.c +++ b/src/backend/utils/adt/ri_triggers.c @@ -1932,7 +1932,7 @@ ri_BuildQueryKey(RI_QueryKey *key, const RI_ConstraintInfo *riinfo, * We assume struct RI_QueryKey contains no padding bytes, else we'd need * to use memset to clear them. */ - key->constr_id = riinfo->param->query_key; + key->constr_id = riinfo->constraint_id; key->constr_queryno = constr_queryno; } It seems the shareable part (param) is not properly invalidated after a partition's constraint is detached, apparently leaving query_key in it set to the parent's constraint id. -- Amit Langote EDB: http://www.enterprisedb.com
Re: Huge memory consumption on partitioned table with FKs
On Tue, Dec 8, 2020 at 12:04 PM Kyotaro Horiguchi wrote: > At Tue, 8 Dec 2020 01:16:00 +0900, Amit Langote > wrote in > > Hi Alvaro, > > > > On Mon, Dec 7, 2020 at 23:48 Alvaro Herrera wrote: > > > > > On 2020-Dec-07, Amit Langote wrote: > > > > > > > On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi > > > > wrote: > > > > > > Also, the comment that was in RI_ConstraintInfo now appears in > > > > > > RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now > > > > > > undocumented. What is the relationship between those two structs? > > > > > > I > > > > > > see that they have pointers to each other, but I think the > > > relationship > > > > > > should be documented more clearly. > > > > > > > > > > I'm not sure the footprint of this patch worth doing but here is a bit > > > > > more polished version. > > > > > > > > I noticed that the foreign_key test fails and it may have to do with > > > > the fact that a partition's param info remains attached to the > > > > parent's RI_ConstraintInfo even after it's detached from the parent > > > > table using DETACH PARTITION. > > > > > > I think this bit about splitting the struct is a distraction. Let's get > > > a patch that solves the bug first, and then we can discuss what further > > > refinements we want to do. I think we should get your patch in > > > CA+HiwqEOrfN9b=f3sdmyspgc4go-l_vmfhxllxvmmdp34e6...@mail.gmail.com > > > committed (which I have not read yet.) Do you agree with this plan? > > > > > > Yeah, I agree. > > Or > https://www.postgresql.org/message-id/ca+hiwqgrr2yoo6vobm6m_oac9w-wmxe1gouq-uydpin6zjt...@mail.gmail.com > ? > > +1 from me to either one. Oh, I hadn't actually checked the actual message that Alvaro mentioned, but yeah I too am fine with either that one or the latest one. -- Amit Langote EDB: http://www.enterprisedb.com
Re: Huge memory consumption on partitioned table with FKs
At Tue, 8 Dec 2020 01:16:00 +0900, Amit Langote wrote in > Hi Alvaro, > > On Mon, Dec 7, 2020 at 23:48 Alvaro Herrera wrote: > > > On 2020-Dec-07, Amit Langote wrote: > > > > > On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi > > > wrote: > > > > > Also, the comment that was in RI_ConstraintInfo now appears in > > > > > RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now > > > > > undocumented. What is the relationship between those two structs? I > > > > > see that they have pointers to each other, but I think the > > relationship > > > > > should be documented more clearly. > > > > > > > > I'm not sure the footprint of this patch worth doing but here is a bit > > > > more polished version. > > > > > > I noticed that the foreign_key test fails and it may have to do with > > > the fact that a partition's param info remains attached to the > > > parent's RI_ConstraintInfo even after it's detached from the parent > > > table using DETACH PARTITION. > > > > I think this bit about splitting the struct is a distraction. Let's get > > a patch that solves the bug first, and then we can discuss what further > > refinements we want to do. I think we should get your patch in > > CA+HiwqEOrfN9b=f3sdmyspgc4go-l_vmfhxllxvmmdp34e6...@mail.gmail.com > > committed (which I have not read yet.) Do you agree with this plan? > > > Yeah, I agree. Or https://www.postgresql.org/message-id/ca+hiwqgrr2yoo6vobm6m_oac9w-wmxe1gouq-uydpin6zjt...@mail.gmail.com ? +1 from me to either one. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Huge memory consumption on partitioned table with FKs
Hi Alvaro, On Mon, Dec 7, 2020 at 23:48 Alvaro Herrera wrote: > On 2020-Dec-07, Amit Langote wrote: > > > On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi > > wrote: > > > > Also, the comment that was in RI_ConstraintInfo now appears in > > > > RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now > > > > undocumented. What is the relationship between those two structs? I > > > > see that they have pointers to each other, but I think the > relationship > > > > should be documented more clearly. > > > > > > I'm not sure the footprint of this patch worth doing but here is a bit > > > more polished version. > > > > I noticed that the foreign_key test fails and it may have to do with > > the fact that a partition's param info remains attached to the > > parent's RI_ConstraintInfo even after it's detached from the parent > > table using DETACH PARTITION. > > I think this bit about splitting the struct is a distraction. Let's get > a patch that solves the bug first, and then we can discuss what further > refinements we want to do. I think we should get your patch in > CA+HiwqEOrfN9b=f3sdmyspgc4go-l_vmfhxllxvmmdp34e6...@mail.gmail.com > committed (which I have not read yet.) Do you agree with this plan? Yeah, I agree. - Amit > -- Amit Langote EDB: http://www.enterprisedb.com
Re: Huge memory consumption on partitioned table with FKs
On 2020-Dec-07, Amit Langote wrote: > On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi > wrote: > > > Also, the comment that was in RI_ConstraintInfo now appears in > > > RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now > > > undocumented. What is the relationship between those two structs? I > > > see that they have pointers to each other, but I think the relationship > > > should be documented more clearly. > > > > I'm not sure the footprint of this patch worth doing but here is a bit > > more polished version. > > I noticed that the foreign_key test fails and it may have to do with > the fact that a partition's param info remains attached to the > parent's RI_ConstraintInfo even after it's detached from the parent > table using DETACH PARTITION. I think this bit about splitting the struct is a distraction. Let's get a patch that solves the bug first, and then we can discuss what further refinements we want to do. I think we should get your patch in CA+HiwqEOrfN9b=f3sdmyspgc4go-l_vmfhxllxvmmdp34e6...@mail.gmail.com committed (which I have not read yet.) Do you agree with this plan?
Re: Huge memory consumption on partitioned table with FKs
On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi wrote: > > Also, the comment that was in RI_ConstraintInfo now appears in > > RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now > > undocumented. What is the relationship between those two structs? I > > see that they have pointers to each other, but I think the relationship > > should be documented more clearly. > > I'm not sure the footprint of this patch worth doing but here is a bit > more polished version. I noticed that the foreign_key test fails and it may have to do with the fact that a partition's param info remains attached to the parent's RI_ConstraintInfo even after it's detached from the parent table using DETACH PARTITION. -- Amit Langote EDB: http://www.enterprisedb.com
Re: Huge memory consumption on partitioned table with FKs
On Fri, Dec 4, 2020 at 2:48 PM Kyotaro Horiguchi wrote: > At Fri, 4 Dec 2020 12:00:09 +0900, Keisuke Kuroda > wrote in > > Hi Amit, > > > > > I have attached a patch in which I've tried to merge the ideas from > > > both my patch and Kuroda-san's. I liked that his patch added > > > conparentid to RI_ConstraintInfo because that saves a needless > > > syscache lookup for constraints that don't have a parent. I've kept > > > my idea to compute the root constraint id only once in > > > ri_LoadConstraint(), not on every invocation of ri_BuildQueryKey(). > > > Kuroda-san, anything you'd like to add to that? > > > > Thank you for the merge! It looks good to me. > > I think a fix for InvalidateConstraintCacheCallBack() is also good. > > > > I also confirmed that the patch passed the make check-world. > > It's fine that constraint_rood_id overrides constraint_id, but how > about that constraint_root_id stores constraint_id if it is not a > partition? That change makes the patch a bit simpler. My patch was like that before posting to this thread, but keeping constraint_id and constraint_root_id separate looked better for documenting the partitioning case as working differently from the regular table case. I guess a comment in ri_BuildQueryKey is enough for that though and it's not like we're using constraint_root_id in any other place to make matters confusing, so I changed it as you suggest. Updated patch attached. -- Amit Langote EDB: http://www.enterprisedb.com v3-0001-ri_triggers.c-Use-root-constraint-OID-as-key-to-r.patch Description: Binary data
Re: Huge memory consumption on partitioned table with FKs
At Fri, 4 Dec 2020 12:00:09 +0900, Keisuke Kuroda wrote in > Hi Amit, > > > I have attached a patch in which I've tried to merge the ideas from > > both my patch and Kuroda-san's. I liked that his patch added > > conparentid to RI_ConstraintInfo because that saves a needless > > syscache lookup for constraints that don't have a parent. I've kept > > my idea to compute the root constraint id only once in > > ri_LoadConstraint(), not on every invocation of ri_BuildQueryKey(). > > Kuroda-san, anything you'd like to add to that? > > Thank you for the merge! It looks good to me. > I think a fix for InvalidateConstraintCacheCallBack() is also good. > > I also confirmed that the patch passed the make check-world. It's fine that constraint_rood_id overrides constraint_id, but how about that constraint_root_id stores constraint_id if it is not a partition? That change makes the patch a bit simpler. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Huge memory consumption on partitioned table with FKs
Thanks, but sorry for the confusion. I intended just to show how it looks like if we share RI_ConstraintInfo among partition relations. At Thu, 3 Dec 2020 10:22:47 -0300, Alvaro Herrera wrote in > Hello > > I haven't followed this thread's latest posts, but I'm unclear on the > lifetime of the new struct that's being allocated in TopMemoryContext. > At what point are those structs freed? The choice of memory context is tentative and in order to shrink the patch'es footprint. I think we don't use CurrentDynaHashCxt for the additional struct so a context for this use is needed. The struct is freed only when the parent struct (RI_ConstraintInfo) is found to be able to share the child struct (RI_ConstraintParam) with the parent constraint. It seems like inefficient (or tending to make "hole"s in the heap area) but I chose it just to shrink the footprint. We could create the new RI_ConstraintInfo on stack then copy it to the cache after we find that the RI_ConstraintInfo needs its own RI_ConstriantParam. > Also, the comment that was in RI_ConstraintInfo now appears in > RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now > undocumented. What is the relationship between those two structs? I > see that they have pointers to each other, but I think the relationship > should be documented more clearly. I'm not sure the footprint of this patch worth doing but here is a bit more polished version. regards. -- Kyotaro Horiguchi NTT Open Source Software Center >From 55946e09d33fc7fa43bed04ef548bf8a3f67155d Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Thu, 3 Dec 2020 16:15:57 +0900 Subject: [PATCH 1/2] separte riinfo --- src/backend/utils/adt/ri_triggers.c | 290 1 file changed, 168 insertions(+), 122 deletions(-) diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c index 02b1a3868f..0306bf7739 100644 --- a/src/backend/utils/adt/ri_triggers.c +++ b/src/backend/utils/adt/ri_triggers.c @@ -97,14 +97,29 @@ * Information extracted from an FK pg_constraint entry. This is cached in * ri_constraint_cache. */ +struct RI_ConstraintParam; + typedef struct RI_ConstraintInfo { - Oid constraint_id; /* OID of pg_constraint entry (hash key) */ + Oid constraint_id; bool valid; /* successfully initialized? */ uint32 oidHashValue; /* hash value of pg_constraint OID */ NameData conname; /* name of the FK constraint */ - Oid pk_relid; /* referenced relation */ Oid fk_relid; /* referencing relation */ + struct RI_ConstraintParam *param; /* sharable part */ + dlist_node valid_link; /* Link in list of valid entries */ +} RI_ConstraintInfo; + +/* + * RI_ConstraintParam + * + * The part sharable among relations in a partitioned table of the cached + * constraint information. + */ +typedef struct RI_ConstraintParam +{ + /* begin with identity members */ + Oid pk_relid; /* referenced relation */ char confupdtype; /* foreign key's ON UPDATE action */ char confdeltype; /* foreign key's ON DELETE action */ char confmatchtype; /* foreign key's match type */ @@ -114,8 +129,12 @@ typedef struct RI_ConstraintInfo Oid pf_eq_oprs[RI_MAX_NUMKEYS]; /* equality operators (PK = FK) */ Oid pp_eq_oprs[RI_MAX_NUMKEYS]; /* equality operators (PK = PK) */ Oid ff_eq_oprs[RI_MAX_NUMKEYS]; /* equality operators (FK = FK) */ - dlist_node valid_link; /* Link in list of valid entries */ -} RI_ConstraintInfo; + + /* These should be at the end of struct, see ri_LoadConstraintInfo */ + Oid query_key; /* key for planned statment */ + RI_ConstraintInfo *ownerinfo; /* owner RI_ConstraintInfo of this param */ +} RI_ConstraintParam; + /* * RI_QueryKey @@ -163,6 +182,7 @@ typedef struct RI_CompareHashEntry /* * Local data */ +static MemoryContext ri_constraint_cache_cxt = NULL; static HTAB *ri_constraint_cache = NULL; static HTAB *ri_query_cache = NULL; static HTAB *ri_compare_cache = NULL; @@ -264,7 +284,7 @@ RI_FKey_check(TriggerData *trigdata) * SELECT FOR KEY SHARE will get on it. */ fk_rel = trigdata->tg_relation; - pk_rel = table_open(riinfo->pk_relid, RowShareLock); + pk_rel = table_open(riinfo->param->pk_relid, RowShareLock); switch (ri_NullCheck(RelationGetDescr(fk_rel), newslot, riinfo, false)) { @@ -283,7 +303,7 @@ RI_FKey_check(TriggerData *trigdata) * This is the only case that differs between the three kinds of * MATCH. */ - switch (riinfo->confmatchtype) + switch (riinfo->param->confmatchtype) { case FKCONSTR_MATCH_FULL: @@ -364,17 +384,17 @@ RI_FKey_check(TriggerData *trigdata) appendStringInfo(, "SELECT 1 FROM %s%s x", pk_only, pkrelname); querysep = "WHERE"; - for (int i = 0; i < riinfo->nkeys; i++) + for (int i = 0; i < riinfo->param->nkeys; i++) { - Oid pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]); - Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]); + Oid pk_type = RIAttType(pk_rel,
Re: Huge memory consumption on partitioned table with FKs
Hi Amit, > I have attached a patch in which I've tried to merge the ideas from > both my patch and Kuroda-san's. I liked that his patch added > conparentid to RI_ConstraintInfo because that saves a needless > syscache lookup for constraints that don't have a parent. I've kept > my idea to compute the root constraint id only once in > ri_LoadConstraint(), not on every invocation of ri_BuildQueryKey(). > Kuroda-san, anything you'd like to add to that? Thank you for the merge! It looks good to me. I think a fix for InvalidateConstraintCacheCallBack() is also good. I also confirmed that the patch passed the make check-world. Best Regards, -- Keisuke Kuroda NTT Software Innovation Center keisuke.kuroda.3...@gmail.com
Re: Huge memory consumption on partitioned table with FKs
At Thu, 3 Dec 2020 21:40:29 +0900, Amit Langote wrote in > On Thu, Dec 3, 2020 at 5:13 PM Kyotaro Horiguchi > wrote: > > At Thu, 3 Dec 2020 16:41:45 +0900, Amit Langote > > wrote in > > > Maybe I misread but I think you did in your email dated Dec 1 where you > > > said: > > > > > > "After an off-list discussion, we confirmed that even in that case the > > > patch works as is because fk_attnum (or contuple.conkey) always stores > > > key attnums compatible to the topmost parent when conparent has a > > > valid value (assuming the current usage of fk_attnum), but I still > > > feel uneasy to rely on that unclear behavior." > > > > fk_attnums *doesn't* refers to to top parent talbe of the referencing > > side. it refers to attributes of the partition that is compatible with > > the same element of fk_attnums of the topmost parent. Maybe I'm > > misreading. > > Yeah, no I am confused. Reading what I wrote, it seems I implied that > the referenced (PK relation's) partitions have RI_ConstraintInfo which > makes no sense, although there indeed is one pg_constraint entry that > is defined on the FK root table for every PK partition with its OID as > confrelid, which is in addition to an entry containing the root PK > table's OID as confrelid. I confused those PK-partition-referencing > entries as belonging to the partitions themselves. Although in my > defence, all of those entries' conkey contains the FK root table's > attributes, so at least that much holds. :) Yes. I think that that confusion doen't hurt the correctness of the discussion:) > > > > > On the topic of how we'd be able to share even the RI_ConstraintInfos > > > > > among partitions, that would indeed look a bit more elaborate than the > > > > > patch we have right now. > > > > > > > > Maybe just letting the hash entry for the child riinfo point to the > > > > parent riinfo if all members (other than constraint_id, of course) > > > > share the exactly the same values. No need to count references since > > > > we don't going to remove riinfos. > > > > > > Ah, something maybe worth trying. Although the memory we'd save by > > > sharing the RI_ConstraintInfos would not add that much to the savings > > > we're having by sharing the plan, because it's the plans that are a > > > memory hog AFAIK. > > > > I agree that plans are rather large but the sharable part of the > > RI_ConstraintInfos is 536 bytes, I'm not sure it is small enough > > comparing to the plans. But that has somewhat large footprint.. (See > > the attached) > > Thanks for the patch. That's only to show how that looks like. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Huge memory consumption on partitioned table with FKs
Hello I haven't followed this thread's latest posts, but I'm unclear on the lifetime of the new struct that's being allocated in TopMemoryContext. At what point are those structs freed? Also, the comment that was in RI_ConstraintInfo now appears in RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now undocumented. What is the relationship between those two structs? I see that they have pointers to each other, but I think the relationship should be documented more clearly. Thanks!
Re: Huge memory consumption on partitioned table with FKs
On Tue, Dec 1, 2020 at 12:25 PM Corey Huinker wrote: > On Mon, Nov 30, 2020 at 9:48 PM Tom Lane wrote: >> Corey Huinker writes: >> > Given that we're already looking at these checks, I was wondering if this >> > might be the time to consider implementing these checks by directly >> > scanning the constraint index. >> >> Yeah, maybe. Certainly ri_triggers is putting a huge amount of effort >> into working around the SPI/parser/planner layer, to not a lot of gain. >> >> However, it's not clear to me that that line of thought will work well >> for the statement-level-trigger approach. In that case you might be >> dealing with enough tuples to make a different plan advisable. > > Bypassing SPI would probably mean that we stay with row level triggers, and > the cached query plan would go away, perhaps replaced by an > already-looked-up-this-tuple hash sorta like what the cached nested loops > effort is doing. > > I've been meaning to give this a try when I got some spare time. This may > inspire me to try again. +1 for this line of work. -- Amit Langote EDB: http://www.enterprisedb.com
Re: Huge memory consumption on partitioned table with FKs
On Thu, Dec 3, 2020 at 5:13 PM Kyotaro Horiguchi wrote: > At Thu, 3 Dec 2020 16:41:45 +0900, Amit Langote > wrote in > > Maybe I misread but I think you did in your email dated Dec 1 where you > > said: > > > > "After an off-list discussion, we confirmed that even in that case the > > patch works as is because fk_attnum (or contuple.conkey) always stores > > key attnums compatible to the topmost parent when conparent has a > > valid value (assuming the current usage of fk_attnum), but I still > > feel uneasy to rely on that unclear behavior." > > fk_attnums *doesn't* refers to to top parent talbe of the referencing > side. it refers to attributes of the partition that is compatible with > the same element of fk_attnums of the topmost parent. Maybe I'm > misreading. Yeah, no I am confused. Reading what I wrote, it seems I implied that the referenced (PK relation's) partitions have RI_ConstraintInfo which makes no sense, although there indeed is one pg_constraint entry that is defined on the FK root table for every PK partition with its OID as confrelid, which is in addition to an entry containing the root PK table's OID as confrelid. I confused those PK-partition-referencing entries as belonging to the partitions themselves. Although in my defence, all of those entries' conkey contains the FK root table's attributes, so at least that much holds. :) > > > > On the topic of how we'd be able to share even the RI_ConstraintInfos > > > > among partitions, that would indeed look a bit more elaborate than the > > > > patch we have right now. > > > > > > Maybe just letting the hash entry for the child riinfo point to the > > > parent riinfo if all members (other than constraint_id, of course) > > > share the exactly the same values. No need to count references since > > > we don't going to remove riinfos. > > > > Ah, something maybe worth trying. Although the memory we'd save by > > sharing the RI_ConstraintInfos would not add that much to the savings > > we're having by sharing the plan, because it's the plans that are a > > memory hog AFAIK. > > I agree that plans are rather large but the sharable part of the > RI_ConstraintInfos is 536 bytes, I'm not sure it is small enough > comparing to the plans. But that has somewhat large footprint.. (See > the attached) Thanks for the patch. -- Amit Langote EDB: http://www.enterprisedb.com
Re: Huge memory consumption on partitioned table with FKs
At Thu, 03 Dec 2020 17:13:16 +0900 (JST), Kyotaro Horiguchi wrote in me> I agree that plans are rather large but the sharable part of the me> RI_ConstraintInfos is 536 bytes, I'm not sure it is small enough me> comparing to the plans. But that has somewhat large footprint.. (See me> the attached) 0001 contains a bug about query_key and get_ri_constaint_root (from your patch) is not needed there, but the core part is 0002 so please ignore them. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Huge memory consumption on partitioned table with FKs
At Thu, 3 Dec 2020 16:41:45 +0900, Amit Langote wrote in > On Thu, Dec 3, 2020 at 2:29 PM Kyotaro Horiguchi > wrote: > > At Thu, 3 Dec 2020 12:27:53 +0900, Amit Langote > > wrote in > > > On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi > > > wrote: > > > For the queries on the referencing side ("check" side), > > > type/collation/attribute name determined using the above are going to > > > be the same for all partitions in a given tree irrespective of the > > > attribute number, because they're logically the same column. On the > > > > Yes, I know that, which is what I meant by "practically" or > > "actually", but it is not explicitly defined AFAICS. > > Well, I think it's great that we don't have to worry *in this part of > the code* about partition's fk_attnums not being congruent with the > root parent's, because ensuring that is the responsibility of the > other parts of the system such as DDL. If we have any problems in > this area, they should be dealt with by ensuring that there are no > bugs in those other parts. Agreed. > > Thus that would be no longer an issue if we explicitly define that > > "When conpraentid stores a valid value, each element of fk_attnums > > points to logically the same attribute with the RI_ConstraintInfo for > > the parent constraint." Or I'd be happy if we have such a comment > > there instead. > > I saw a comment in Kuroda-san's v2 patch that is perhaps meant to > address this point, but the placement needs to be reconsidered: Ah, yes, that comes from my proposal. > @@ -366,6 +368,14 @@ RI_FKey_check(TriggerData *trigdata) > querysep = "WHERE"; > for (int i = 0; i < riinfo->nkeys; i++) > { > + > + /* > + * We share the same plan among all relations in a partition > + * hierarchy. The plan is guaranteed to be compatible since all of > + * the member relations are guaranteed to have the equivalent set > + * of foreign keys in fk_attnums[]. > + */ > + > Oid pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]); > Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]); > > A more appropriate place for this kind of comment would be where > fk_attnums is defined or in ri_BuildQueryKey() that is shared by > different RI query issuing functions. Yeah, I wanted more appropriate place for the comment. That place seems reasonable. > > > referenced side ("restrict", "cascade", "set" side), as you already > > > mentioned, fk_attnums refers to the top parent table of the > > > referencing side, so no possibility of they being different in the > > > various referenced partitions' RI_ConstraintInfos. > > > > Right. (I'm not sure I have mention that here, though:p)A > > Maybe I misread but I think you did in your email dated Dec 1 where you said: > > "After an off-list discussion, we confirmed that even in that case the > patch works as is because fk_attnum (or contuple.conkey) always stores > key attnums compatible to the topmost parent when conparent has a > valid value (assuming the current usage of fk_attnum), but I still > feel uneasy to rely on that unclear behavior." fk_attnums *doesn't* refers to to top parent talbe of the referencing side. it refers to attributes of the partition that is compatible with the same element of fk_attnums of the topmost parent. Maybe I'm misreading. > > > On the topic of how we'd be able to share even the RI_ConstraintInfos > > > among partitions, that would indeed look a bit more elaborate than the > > > patch we have right now. > > > > Maybe just letting the hash entry for the child riinfo point to the > > parent riinfo if all members (other than constraint_id, of course) > > share the exactly the same values. No need to count references since > > we don't going to remove riinfos. > > Ah, something maybe worth trying. Although the memory we'd save by > sharing the RI_ConstraintInfos would not add that much to the savings > we're having by sharing the plan, because it's the plans that are a > memory hog AFAIK. I agree that plans are rather large but the sharable part of the RI_ConstraintInfos is 536 bytes, I'm not sure it is small enough comparing to the plans. But that has somewhat large footprint.. (See the attached) > > > > About your patch, it calculates the root constrid at the time an > > > > riinfo is created, but when the root-partition is further attached to > > > > another partitioned-table after the riinfo creation, > > > > constraint_root_id gets stale. Of course that dones't matter > > > > practically, though. > > > > > > Maybe we could also store the hash value of the root constraint OID as > > > rootHashValue and check for that one too in > > > InvalidateConstraintCacheCallBack(). That would take care of this > > > unless I'm missing something. > > > > Seems to be sound. > > Okay, thanks. > > I have attached a patch in which I've tried to merge the ideas from > both my patch
Re: Huge memory consumption on partitioned table with FKs
On Thu, Dec 3, 2020 at 2:29 PM Kyotaro Horiguchi wrote: > At Thu, 3 Dec 2020 12:27:53 +0900, Amit Langote > wrote in > > On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi > > wrote: > > > I don't think you're missing something. As I (tried to..) mentioned in > > > another branch of this thread, you're right. On the otherhand > > > fk_attnums is actually used to generate the query, thus *issue* > > > potentially affects the query. Although that might be paranoic, that > > > possibility is eliminated by checking the congruency(?) of fk_attnums > > > (or other members). We might even be able to share a riinfo among > > > such children. > > > > Just to be sure I've not misunderstood you, let me mention why I think > > the queries used by different partitions all turn out to be the same > > despite attribute number differences among partitions. The places > > that uses fk_attnums when generating a query are these among others: > > > > Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]); > > ... > > Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]); > > ... > > quoteOneName(attname, > > RIAttName(fk_rel, riinfo->fk_attnums[i])); > > > > For the queries on the referencing side ("check" side), > > type/collation/attribute name determined using the above are going to > > be the same for all partitions in a given tree irrespective of the > > attribute number, because they're logically the same column. On the > > Yes, I know that, which is what I meant by "practically" or > "actually", but it is not explicitly defined AFAICS. Well, I think it's great that we don't have to worry *in this part of the code* about partition's fk_attnums not being congruent with the root parent's, because ensuring that is the responsibility of the other parts of the system such as DDL. If we have any problems in this area, they should be dealt with by ensuring that there are no bugs in those other parts. > Thus that would be no longer an issue if we explicitly define that > "When conpraentid stores a valid value, each element of fk_attnums > points to logically the same attribute with the RI_ConstraintInfo for > the parent constraint." Or I'd be happy if we have such a comment > there instead. I saw a comment in Kuroda-san's v2 patch that is perhaps meant to address this point, but the placement needs to be reconsidered: @@ -366,6 +368,14 @@ RI_FKey_check(TriggerData *trigdata) querysep = "WHERE"; for (int i = 0; i < riinfo->nkeys; i++) { + + /* + * We share the same plan among all relations in a partition + * hierarchy. The plan is guaranteed to be compatible since all of + * the member relations are guaranteed to have the equivalent set + * of foreign keys in fk_attnums[]. + */ + Oid pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]); Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]); A more appropriate place for this kind of comment would be where fk_attnums is defined or in ri_BuildQueryKey() that is shared by different RI query issuing functions. > > referenced side ("restrict", "cascade", "set" side), as you already > > mentioned, fk_attnums refers to the top parent table of the > > referencing side, so no possibility of they being different in the > > various referenced partitions' RI_ConstraintInfos. > > Right. (I'm not sure I have mention that here, though:p)A Maybe I misread but I think you did in your email dated Dec 1 where you said: "After an off-list discussion, we confirmed that even in that case the patch works as is because fk_attnum (or contuple.conkey) always stores key attnums compatible to the topmost parent when conparent has a valid value (assuming the current usage of fk_attnum), but I still feel uneasy to rely on that unclear behavior." > > On the topic of how we'd be able to share even the RI_ConstraintInfos > > among partitions, that would indeed look a bit more elaborate than the > > patch we have right now. > > Maybe just letting the hash entry for the child riinfo point to the > parent riinfo if all members (other than constraint_id, of course) > share the exactly the same values. No need to count references since > we don't going to remove riinfos. Ah, something maybe worth trying. Although the memory we'd save by sharing the RI_ConstraintInfos would not add that much to the savings we're having by sharing the plan, because it's the plans that are a memory hog AFAIK. > > > About your patch, it calculates the root constrid at the time an > > > riinfo is created, but when the root-partition is further attached to > > > another partitioned-table after the riinfo creation, > > > constraint_root_id gets stale. Of course that dones't matter > > > practically, though. > > > > Maybe we could also store the hash value of the root constraint OID as > > rootHashValue and check for that one too in > >
Re: Huge memory consumption on partitioned table with FKs
At Thu, 3 Dec 2020 12:27:53 +0900, Amit Langote wrote in > Hello, > > On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi > wrote: > > At Wed, 2 Dec 2020 22:02:50 +0900, Amit Langote > > wrote in > > > Hmm, I don't see what the problem is here, because it's not > > > RI_ConstraintInfo that is being shared among partitions of the same > > > parent, but the RI query (and the SPIPlanPtr for it) that is issued > > > through SPI. The query (and the plan) turns out to be the same no > > > matter what partition's RI_ConstraintInfo is first used to generate > > > it. What am I missing? > > > > I don't think you're missing something. As I (tried to..) mentioned in > > another branch of this thread, you're right. On the otherhand > > fk_attnums is actually used to generate the query, thus *issue* > > potentially affects the query. Although that might be paranoic, that > > possibility is eliminated by checking the congruency(?) of fk_attnums > > (or other members). We might even be able to share a riinfo among > > such children. > > Just to be sure I've not misunderstood you, let me mention why I think > the queries used by different partitions all turn out to be the same > despite attribute number differences among partitions. The places > that uses fk_attnums when generating a query are these among others: > > Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]); > ... > Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]); > ... > quoteOneName(attname, > RIAttName(fk_rel, riinfo->fk_attnums[i])); > > For the queries on the referencing side ("check" side), > type/collation/attribute name determined using the above are going to > be the same for all partitions in a given tree irrespective of the > attribute number, because they're logically the same column. On the Yes, I know that, which is what I meant by "practically" or "actually", but it is not explicitly defined AFAICS. Thus that would be no longer an issue if we explicitly define that "When conpraentid stores a valid value, each element of fk_attnums points to logically the same attribute with the RI_ConstraintInfo for the parent constraint." Or I'd be happy if we have such a comment there instead. > referenced side ("restrict", "cascade", "set" side), as you already > mentioned, fk_attnums refers to the top parent table of the > referencing side, so no possibility of they being different in the > various referenced partitions' RI_ConstraintInfos. Right. (I'm not sure I have mention that here, though:p)A > On the topic of how we'd be able to share even the RI_ConstraintInfos > among partitions, that would indeed look a bit more elaborate than the > patch we have right now. Maybe just letting the hash entry for the child riinfo point to the parent riinfo if all members (other than constraint_id, of course) share the exactly the same values. No need to count references since we don't going to remove riinfos. > > > BTW, one problem with Kuroda-san's patch is that it's using > > > conparentid as the shared key, which can still lead to multiple > > > queries being generated when using multi-level partitioning, because > > > there would be many (intermediate) parent constraints in that case. > > > We should really be using the "root" constraint OID as the key, > > > because there would only be one root from which all constraints in a > > > given partition hierarchy would've originated. Actually, I had > > > written a patch a few months back to do exactly that, a polished > > > version of which I've attached with this email. Please take a look. > > > > I don't like that the function self-recurses but > > ri_GetParentConstrOid() actually climbs up to the root constraint, so > > the patch covers the multilevel partitioning cases. > > Sorry, I got too easily distracted by the name Kuroda-san chose for > the field in RI_ConstraintInfo and the function. I thought the same at the first look to the function. > > About your patch, it calculates the root constrid at the time an > > riinfo is created, but when the root-partition is further attached to > > another partitioned-table after the riinfo creation, > > constraint_root_id gets stale. Of course that dones't matter > > practically, though. > > Maybe we could also store the hash value of the root constraint OID as > rootHashValue and check for that one too in > InvalidateConstraintCacheCallBack(). That would take care of this > unless I'm missing something. Seems to be sound. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Huge memory consumption on partitioned table with FKs
Hello, On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi wrote: > At Wed, 2 Dec 2020 22:02:50 +0900, Amit Langote > wrote in > > Hello, > > > > On Tue, Dec 1, 2020 at 9:40 AM Kyotaro Horiguchi > > wrote: > > > > > > At Mon, 30 Nov 2020 21:03:45 -0300, Alvaro Herrera > > > wrote in > > > > On 2020-Nov-26, Kyotaro Horiguchi wrote: > > > > > > > > > This shares RI_ConstraintInfo cache by constraints that shares the > > > > > same parent constraints. But you forgot that the cache contains some > > > > > members that can differ among partitions. > > > > > > > > > > Consider the case of attaching a partition that have experienced a > > > > > column deletion. > > > > > > > > I think this can be solved easily in the patch, by having > > > > ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if > > > > they are equal then use the parent's constaint_id, otherwise use the > > > > child constraint. That way, the cache entry is reused in the common > > > > case where they are identical. > > > > > > *I* think it's the direction. After an off-list discussion, we > > > confirmed that even in that case the patch works as is because > > > fk_attnum (or contuple.conkey) always stores key attnums compatible > > > to the topmost parent when conparent has a valid value (assuming the > > > current usage of fk_attnum), but I still feel uneasy to rely on that > > > unclear behavior. > > > > Hmm, I don't see what the problem is here, because it's not > > RI_ConstraintInfo that is being shared among partitions of the same > > parent, but the RI query (and the SPIPlanPtr for it) that is issued > > through SPI. The query (and the plan) turns out to be the same no > > matter what partition's RI_ConstraintInfo is first used to generate > > it. What am I missing? > > I don't think you're missing something. As I (tried to..) mentioned in > another branch of this thread, you're right. On the otherhand > fk_attnums is actually used to generate the query, thus *issue* > potentially affects the query. Although that might be paranoic, that > possibility is eliminated by checking the congruency(?) of fk_attnums > (or other members). We might even be able to share a riinfo among > such children. Just to be sure I've not misunderstood you, let me mention why I think the queries used by different partitions all turn out to be the same despite attribute number differences among partitions. The places that uses fk_attnums when generating a query are these among others: Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]); ... Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]); ... quoteOneName(attname, RIAttName(fk_rel, riinfo->fk_attnums[i])); For the queries on the referencing side ("check" side), type/collation/attribute name determined using the above are going to be the same for all partitions in a given tree irrespective of the attribute number, because they're logically the same column. On the referenced side ("restrict", "cascade", "set" side), as you already mentioned, fk_attnums refers to the top parent table of the referencing side, so no possibility of they being different in the various referenced partitions' RI_ConstraintInfos. On the topic of how we'd be able to share even the RI_ConstraintInfos among partitions, that would indeed look a bit more elaborate than the patch we have right now. > > BTW, one problem with Kuroda-san's patch is that it's using > > conparentid as the shared key, which can still lead to multiple > > queries being generated when using multi-level partitioning, because > > there would be many (intermediate) parent constraints in that case. > > We should really be using the "root" constraint OID as the key, > > because there would only be one root from which all constraints in a > > given partition hierarchy would've originated. Actually, I had > > written a patch a few months back to do exactly that, a polished > > version of which I've attached with this email. Please take a look. > > I don't like that the function self-recurses but > ri_GetParentConstrOid() actually climbs up to the root constraint, so > the patch covers the multilevel partitioning cases. Sorry, I got too easily distracted by the name Kuroda-san chose for the field in RI_ConstraintInfo and the function. > About your patch, it calculates the root constrid at the time an > riinfo is created, but when the root-partition is further attached to > another partitioned-table after the riinfo creation, > constraint_root_id gets stale. Of course that dones't matter > practically, though. Maybe we could also store the hash value of the root constraint OID as rootHashValue and check for that one too in InvalidateConstraintCacheCallBack(). That would take care of this unless I'm missing something. -- Amit Langote EDB: http://www.enterprisedb.com
Re: Huge memory consumption on partitioned table with FKs
Hi Hackers, >> I would embed all this knowledge in ri_BuildQueryKey though, without >> adding the new function ri_GetParentConstOid. I don't think that >> function meaningful abstraction value, and instead it would make what I >> suggest more difficult. > It seems to me reasonable. Indeed, I tried to get the conparentid with ri_BuildQueryKey. (v2_reduce_ri_SPI-plan-hash.patch) > Somewhat of a detour, but in reviewing the patch for > Statement-Level RI checks, Andres and I observed that SPI > made for a large portion of the RI overhead. If this can be resolved by reviewing the Statement-Level RI checks, then I think it would be better. > BTW, one problem with Kuroda-san's patch is that it's using > conparentid as the shared key, which can still lead to multiple > queries being generated when using multi-level partitioning, because > there would be many (intermediate) parent constraints in that case. Horiguchi-san also mentiond, In my patch, in ri_GetParentConstOid, iterate on getting the constraint OID until comparentid == InvalidOid. This should allow to get the "root constraint OID". However, the names "comparentid" and "ri_GetParentConstOid" are confusing. We should use names like "constraint_root_id", as in Amit-san's patch. Best Regards, -- Keisuke Kuroda NTT Software Innovation Center keisuke.kuroda.3...@gmail.com v2_reduce_ri_SPI-plan-hash.patch Description: Binary data
Re: Huge memory consumption on partitioned table with FKs
At Wed, 2 Dec 2020 22:02:50 +0900, Amit Langote wrote in > Hello, > > On Tue, Dec 1, 2020 at 9:40 AM Kyotaro Horiguchi > wrote: > > > > At Mon, 30 Nov 2020 21:03:45 -0300, Alvaro Herrera > > wrote in > > > On 2020-Nov-26, Kyotaro Horiguchi wrote: > > > > > > > This shares RI_ConstraintInfo cache by constraints that shares the > > > > same parent constraints. But you forgot that the cache contains some > > > > members that can differ among partitions. > > > > > > > > Consider the case of attaching a partition that have experienced a > > > > column deletion. > > > > > > I think this can be solved easily in the patch, by having > > > ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if > > > they are equal then use the parent's constaint_id, otherwise use the > > > child constraint. That way, the cache entry is reused in the common > > > case where they are identical. > > > > *I* think it's the direction. After an off-list discussion, we > > confirmed that even in that case the patch works as is because > > fk_attnum (or contuple.conkey) always stores key attnums compatible > > to the topmost parent when conparent has a valid value (assuming the > > current usage of fk_attnum), but I still feel uneasy to rely on that > > unclear behavior. > > Hmm, I don't see what the problem is here, because it's not > RI_ConstraintInfo that is being shared among partitions of the same > parent, but the RI query (and the SPIPlanPtr for it) that is issued > through SPI. The query (and the plan) turns out to be the same no > matter what partition's RI_ConstraintInfo is first used to generate > it. What am I missing? I don't think you're missing something. As I (tried to..) mentioned in another branch of this thread, you're right. On the otherhand fk_attnums is actually used to generate the query, thus *issue* potentially affects the query. Although that might be paranoic, that possibility is eliminated by checking the congruency(?) of fk_attnums (or other members). We might even be able to share a riinfo among such children. > BTW, one problem with Kuroda-san's patch is that it's using > conparentid as the shared key, which can still lead to multiple > queries being generated when using multi-level partitioning, because > there would be many (intermediate) parent constraints in that case. > We should really be using the "root" constraint OID as the key, > because there would only be one root from which all constraints in a > given partition hierarchy would've originated. Actually, I had > written a patch a few months back to do exactly that, a polished > version of which I've attached with this email. Please take a look. I don't like that the function self-recurses but ri_GetParentConstrOid() actually climbs up to the root constraint, so the patch covers the multilevel partitioning cases. About your patch, it calculates the root constrid at the time an riinfo is created, but when the root-partition is further attached to another partitioned-table after the riinfo creation, constraint_root_id gets stale. Of course that dones't matter practically, though. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Huge memory consumption on partitioned table with FKs
On Tue, 1 Dec 2020 at 00:03, Alvaro Herrera wrote: > cache entry is reused in the common case where they are identical. Does a similar situation exist for partition statistics accessed during planning? Or planning itself? It would be useful to avoid repeated access to similar statistics and repeated planning of sub-plans for similar partitions. -- Simon Riggshttp://www.EnterpriseDB.com/
Re: Huge memory consumption on partitioned table with FKs
Hello, On Tue, Dec 1, 2020 at 9:40 AM Kyotaro Horiguchi wrote: > > At Mon, 30 Nov 2020 21:03:45 -0300, Alvaro Herrera > wrote in > > On 2020-Nov-26, Kyotaro Horiguchi wrote: > > > > > This shares RI_ConstraintInfo cache by constraints that shares the > > > same parent constraints. But you forgot that the cache contains some > > > members that can differ among partitions. > > > > > > Consider the case of attaching a partition that have experienced a > > > column deletion. > > > > I think this can be solved easily in the patch, by having > > ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if > > they are equal then use the parent's constaint_id, otherwise use the > > child constraint. That way, the cache entry is reused in the common > > case where they are identical. > > *I* think it's the direction. After an off-list discussion, we > confirmed that even in that case the patch works as is because > fk_attnum (or contuple.conkey) always stores key attnums compatible > to the topmost parent when conparent has a valid value (assuming the > current usage of fk_attnum), but I still feel uneasy to rely on that > unclear behavior. Hmm, I don't see what the problem is here, because it's not RI_ConstraintInfo that is being shared among partitions of the same parent, but the RI query (and the SPIPlanPtr for it) that is issued through SPI. The query (and the plan) turns out to be the same no matter what partition's RI_ConstraintInfo is first used to generate it. What am I missing? BTW, one problem with Kuroda-san's patch is that it's using conparentid as the shared key, which can still lead to multiple queries being generated when using multi-level partitioning, because there would be many (intermediate) parent constraints in that case. We should really be using the "root" constraint OID as the key, because there would only be one root from which all constraints in a given partition hierarchy would've originated. Actually, I had written a patch a few months back to do exactly that, a polished version of which I've attached with this email. Please take a look. -- Amit Langote EDB: http://www.enterprisedb.com 0001-ri_triggers.c-Use-root-constraint-OID-as-key-to-ri_q.patch Description: Binary data
Re: Huge memory consumption on partitioned table with FKs
On Mon, Nov 30, 2020 at 9:48 PM Tom Lane wrote: > Corey Huinker writes: > > Given that we're already looking at these checks, I was wondering if this > > might be the time to consider implementing these checks by directly > > scanning the constraint index. > > Yeah, maybe. Certainly ri_triggers is putting a huge amount of effort > into working around the SPI/parser/planner layer, to not a lot of gain. > > However, it's not clear to me that that line of thought will work well > for the statement-level-trigger approach. In that case you might be > dealing with enough tuples to make a different plan advisable. > > regards, tom lane > Bypassing SPI would probably mean that we stay with row level triggers, and the cached query plan would go away, perhaps replaced by an already-looked-up-this-tuple hash sorta like what the cached nested loops effort is doing. I've been meaning to give this a try when I got some spare time. This may inspire me to try again.
Re: Huge memory consumption on partitioned table with FKs
Corey Huinker writes: > Given that we're already looking at these checks, I was wondering if this > might be the time to consider implementing these checks by directly > scanning the constraint index. Yeah, maybe. Certainly ri_triggers is putting a huge amount of effort into working around the SPI/parser/planner layer, to not a lot of gain. However, it's not clear to me that that line of thought will work well for the statement-level-trigger approach. In that case you might be dealing with enough tuples to make a different plan advisable. regards, tom lane
Re: Huge memory consumption on partitioned table with FKs
> > I think this can be solved easily in the patch, by having > ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if > they are equal then use the parent's constaint_id, otherwise use the > child constraint. That way, the cache entry is reused in the common > case where they are identical. > Somewhat of a detour, but in reviewing the patch for Statement-Level RI checks, Andres and I observed that SPI made for a large portion of the RI overhead. Given that we're already looking at these checks, I was wondering if this might be the time to consider implementing these checks by directly scanning the constraint index.
Re: Huge memory consumption on partitioned table with FKs
At Mon, 30 Nov 2020 21:03:45 -0300, Alvaro Herrera wrote in > On 2020-Nov-26, Kyotaro Horiguchi wrote: > > > This shares RI_ConstraintInfo cache by constraints that shares the > > same parent constraints. But you forgot that the cache contains some > > members that can differ among partitions. > > > > Consider the case of attaching a partition that have experienced a > > column deletion. > > I think this can be solved easily in the patch, by having > ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if > they are equal then use the parent's constaint_id, otherwise use the > child constraint. That way, the cache entry is reused in the common > case where they are identical. *I* think it's the direction. After an off-list discussion, we confirmed that even in that case the patch works as is because fk_attnum (or contuple.conkey) always stores key attnums compatible to the topmost parent when conparent has a valid value (assuming the current usage of fk_attnum), but I still feel uneasy to rely on that unclear behavior. > I would embed all this knowledge in ri_BuildQueryKey though, without > adding the new function ri_GetParentConstOid. I don't think that > function meaningful abstraction value, and instead it would make what I > suggest more difficult. It seems to me reasonable. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Huge memory consumption on partitioned table with FKs
On 2020-Nov-26, Kyotaro Horiguchi wrote: > This shares RI_ConstraintInfo cache by constraints that shares the > same parent constraints. But you forgot that the cache contains some > members that can differ among partitions. > > Consider the case of attaching a partition that have experienced a > column deletion. I think this can be solved easily in the patch, by having ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if they are equal then use the parent's constaint_id, otherwise use the child constraint. That way, the cache entry is reused in the common case where they are identical. I would embed all this knowledge in ri_BuildQueryKey though, without adding the new function ri_GetParentConstOid. I don't think that function meaningful abstraction value, and instead it would make what I suggest more difficult.
Re: Huge memory consumption on partitioned table with FKs
At Thu, 26 Nov 2020 09:59:28 +0900, Keisuke Kuroda wrote in > Hi Hackers, > > Analyzed the problem and created a patch to resolve it. > > # Problem 1 > > When you create a foreign key to a partitioned table, referential > integrity function is created for the number of partitions. > Internally, SPI_prepare() creates a plan and SPI_keepplan() > caches the plan. > > The more partitions in the referencing table, the more plans will > be cached. > > # Problem 2 > > When referenced table is partitioned table, the larger the number > of partitions, the larger the plan size to be cached. > > The actual plan processed is simple and small if pruning is > enabled. However, the cached plan will include all partition > information. > > The more partitions in the referenced table, the larger the plan > size to be cached. > > # Idea for solution > > Constraints with the same pg_constraint.parentid can be combined > into one plan with the same comparentid if we can guarantee that > all their contents are the same. The memory reduction this patch gives seems quite high with a small footprint. This shares RI_ConstraintInfo cache by constraints that shares the same parent constraints. But you forgot that the cache contains some members that can differ among partitions. Consider the case of attaching a partition that have experienced a column deletion. create table t (a int primary key); create table p (a int, r int references t(a)) partition by range(a); create table c1 partition of p for values from (0) to (10); create table c2 (a int, r int); alter table c2 drop column r; alter table c2 add column r int; alter table p attach partition c2 for values from (10) to (20); In that case riinfo->fk_attnums has different values from other partitions. =# select oid, conrelid::regclass, confrelid::regclass, conparentid, conname, conkey from pg_constraint where confrelid = 't'::regclass; oid | conrelid | confrelid | conparentid | conname | conkey ---+--+---+-+--+ 16620 | p| t | 0 | p_r_fkey | {2} 16626 | c1 | t | 16620 | p_r_fkey | {2} 16632 | c2 | t | 16620 | p_r_fkey | {3} (3 rows) conkey is copied onto riinfo->fk_attnums. > Problem 1 can be solved > and significant memory bloat can be avoided. > CachedPlan: 710MB -> 1466kB > > Solving Problem 2 could also reduce memory, > but I don't have a good idea. > > Currently, DISCARD ALL does not discard CachedPlan by SPI as in > this case. It may be possible to modify DISCARD ALL to discard > CachedPlan and run it periodically. However, we think the burden > on the user is high. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Huge memory consumption on partitioned table with FKs
Hi Hackers, Analyzed the problem and created a patch to resolve it. # Problem 1 When you create a foreign key to a partitioned table, referential integrity function is created for the number of partitions. Internally, SPI_prepare() creates a plan and SPI_keepplan() caches the plan. The more partitions in the referencing table, the more plans will be cached. # Problem 2 When referenced table is partitioned table, the larger the number of partitions, the larger the plan size to be cached. The actual plan processed is simple and small if pruning is enabled. However, the cached plan will include all partition information. The more partitions in the referenced table, the larger the plan size to be cached. # Idea for solution Constraints with the same pg_constraint.parentid can be combined into one plan with the same comparentid if we can guarantee that all their contents are the same. Problem 1 can be solved and significant memory bloat can be avoided. CachedPlan: 710MB -> 1466kB Solving Problem 2 could also reduce memory, but I don't have a good idea. Currently, DISCARD ALL does not discard CachedPlan by SPI as in this case. It may be possible to modify DISCARD ALL to discard CachedPlan and run it periodically. However, we think the burden on the user is high. # result with patch(PG14 HEAD(e522024b) + patch) name | bytes | pg_size_pretty --+-+ CachedPlanQuery | 12912 | 13 kB CachedPlanSource | 17448 | 17 kB CachedPlan | 1501192 | 1466 kB CachedPlan * 1 Record postgres=# SELECT count(*) FROM pg_backend_memory_contexts WHERE name = 'CachedPlan' AND ident LIKE 'SELECT 1 FROM%'; count --- 1 postgres=# SELECT * FROM pg_backend_memory_contexts WHERE name = 'CachedPlan' AND ident LIKE 'SELECT 1 FROM%'; -[ RECORD 1 ]-+-- name | CachedPlan ident | SELECT 1 FROM "public"."ps" x WHERE "c1" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x parent| CacheMemoryContext level | 2 total_bytes | 2101248 total_nblocks | 12 free_bytes| 613256 free_chunks | 1 used_bytes| 1487992 (1 record) # result without patch(PG14 HEAD(e522024b)) name | bytes | pg_size_pretty --+---+ CachedPlanQuery | 1326280 | 1295 kB CachedPlanSource | 1474528 | 1440 kB CachedPlan | 744009200 | 710 MB CachedPlan * 500 Records postgres=# SELECT count(*) FROM pg_backend_memory_contexts WHERE name = 'CachedPlan' AND ident LIKE 'SELECT 1 FROM%'; count --- 500 SELECT * FROM pg_backend_memory_contexts WHERE name = 'CachedPlan' AND ident LIKE 'SELECT 1 FROM%'; name | CachedPlan ident | SELECT 1 FROM "public"."ps" x WHERE "c1" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x parent| CacheMemoryContext level | 2 total_bytes | 2101248 total_nblocks | 12 free_bytes| 613256 free_chunks | 1 used_bytes| 1487992 ...(500 same records) Best Regards, -- Keisuke Kuroda NTT Software Innovation Center keisuke.kuroda.3...@gmail.com v1_reduce_ri_SPI-plan-hash.patch Description: Binary data
Huge memory consumption on partitioned table with FKs
Hi Hackers, My company (NTT Comware) and NTT OSS Center did verification of partitioned table on PG14dev, and we faced a problem that consumed huge memory when we created a Foreign key constraint on a partitioned table including 500 partitioning tables and inserted some data. We investigated it a little to use the "pg_backend_memory_contextes" command and realized a "CachedPlan" of backends increased dramatically. See below: Without FKs == CachedPlan 0kB With FKs (the problem is here) == CachedPlan 710MB Please find the attached file to reproduce the problem. We know two things as following: - Each backend uses the size of CachedPlan - The more increasing partitioning tables, the more CachedPlan consuming If there are many backends, it consumes about the size of CachedPlan x the number of backends. It may occur a shortage of memory and OOM killer. We think the affected version are PG12 or later. I believe it would be better to fix the problem. Any thoughts? Regards, Tatsuro Yamada DROP TABLE IF EXISTS pr CASCADE; DROP TABLE IF EXISTS ps CASCADE; CREATE TABLE ps (c1 INT PRIMARY KEY) PARTITION BY RANGE(c1); CREATE TABLE pr (c1 INT, c2 INT REFERENCES ps(c1)) PARTITION BY RANGE(c1); -- Show memory usage of 'Cached%' SELECT name, sum(used_bytes) as bytes, pg_size_pretty(sum(used_bytes)) FROM pg_backend_memory_contexts WHERE name LIKE 'Cached%' GROUP BY name; -- Procedure for creating partitioned table CREATE OR REPLACE PROCEDURE part_make(tbl text, num int) AS $$ DECLARE width int := 10; next int :=1; BEGIN FOR i in 1..num LOOP EXECUTE 'CREATE TABLE ' || tbl || '_' || i || ' partition of ' || tbl || ' FOR VALUES FROM (' || next || ') TO (' || i * width || ');'; next := i * width; END LOOP; END; $$ LANGUAGE plpgsql; -- Create partitioned tables named ps and pr. The each table has 500 partitioning tables. CALL part_make('ps', 500); CALL part_make('pr', 500); -- Insert data INSERT INTO ps SELECT generate_series(1,4999); INSERT INTO pr SELECT i, i from generate_series(1,4999)i; -- Show memory usages of 'Cached%' again -- You can see 'CachedPlan 710MB' SELECT name, sum(used_bytes) as bytes, pg_size_pretty(sum(used_bytes)) FROM pg_backend_memory_contexts WHERE name LIKE 'Cached%' GROUP BY name;