Re: Huge memory consumption on partitioned table with FKs

2021-03-11 Thread Tatsuro Yamada

On 2021/03/12 2:44, Tom Lane wrote:

I wrote:

Now, maybe it's a coincidence that husky failed on a
partitioned-foreign-key test right after this patch went in, but I bet
not.  Since husky runs CLOBBER_CACHE_ALWAYS, it looks to me like we've
overlooked some cache-reset scenario or other.


After reproducing it here, that *is* a coincidence.  I shall now go beat
up on the correct blame-ee, instead.



I did "make installcheck-parallel" on 7bb97211a, just in case. It was 
successful. :-D

Regards,
Tatsuro Yamada





Re: Huge memory consumption on partitioned table with FKs

2021-03-11 Thread Tom Lane
I wrote:
> Now, maybe it's a coincidence that husky failed on a
> partitioned-foreign-key test right after this patch went in, but I bet
> not.  Since husky runs CLOBBER_CACHE_ALWAYS, it looks to me like we've
> overlooked some cache-reset scenario or other.

After reproducing it here, that *is* a coincidence.  I shall now go beat
up on the correct blame-ee, instead.

regards, tom lane




Re: Huge memory consumption on partitioned table with FKs

2021-03-11 Thread Tom Lane
Tatsuro Yamada  writes:
> Thanks for fixing the problem! :-D

Hmm, I'm not sure we're done with this patch:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=husky=2021-03-10%2021%3A09%3A32

The critical log extract is

2021-03-11 05:10:13.012 CET [21574:1082] pg_regress/foreign_key LOG:  
statement: insert into fk_notpartitioned_pk (a, b)
  select 2048, x from generate_series(1,10) x;
2021-03-11 05:10:13.104 CET [11830:368] LOG:  server process (PID 21574) was 
terminated by signal 11: Segmentation fault
2021-03-11 05:10:13.104 CET [11830:369] DETAIL:  Failed process was running: 
insert into fk_notpartitioned_pk (a, b)
  select 2048, x from generate_series(1,10) x;

Now, maybe it's a coincidence that husky failed on a
partitioned-foreign-key test right after this patch went in, but I bet
not.  Since husky runs CLOBBER_CACHE_ALWAYS, it looks to me like we've
overlooked some cache-reset scenario or other.

regards, tom lane




Re: Huge memory consumption on partitioned table with FKs

2021-03-10 Thread Tatsuro Yamada

On 2021/03/11 9:39, Amit Langote wrote:

On Thu, Mar 11, 2021 at 4:25 AM Tom Lane  wrote:

Amit Langote  writes:

On Wed, Mar 10, 2021 at 8:37 AM Tom Lane  wrote:

Hmm.  So, the key point is that the values coming from the partitioned
child table are injected into the test query as parameters, not as
column references, thus it doesn't matter *to the test query* what
numbers the referencing columns have in that child.  We just have to
be sure we pass the right parameter values.



Right.


I did some cosmetic fooling with this (mostly, rewriting the comments
YA time) and pushed it.


Perfect.   Thanks for your time on this.



Thanks for fixing the problem! :-D


Regards,
Tatsuro Yamada





Re: Huge memory consumption on partitioned table with FKs

2021-03-10 Thread Keisuke Kuroda
Hi hackers,

> >
> > I did some cosmetic fooling with this (mostly, rewriting the comments
> > YA time) and pushed it.
>
> Perfect.   Thanks for your time on this.
>

Thank you for your help! I'm glad to solve it.

-- 
Keisuke Kuroda
NTT Software Innovation Center
keisuke.kuroda.3...@gmail.com




Re: Huge memory consumption on partitioned table with FKs

2021-03-10 Thread Amit Langote
On Thu, Mar 11, 2021 at 4:25 AM Tom Lane  wrote:
> Amit Langote  writes:
> > On Wed, Mar 10, 2021 at 8:37 AM Tom Lane  wrote:
> >> Hmm.  So, the key point is that the values coming from the partitioned
> >> child table are injected into the test query as parameters, not as
> >> column references, thus it doesn't matter *to the test query* what
> >> numbers the referencing columns have in that child.  We just have to
> >> be sure we pass the right parameter values.
>
> > Right.
>
> I did some cosmetic fooling with this (mostly, rewriting the comments
> YA time) and pushed it.

Perfect.   Thanks for your time on this.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Huge memory consumption on partitioned table with FKs

2021-03-10 Thread Tom Lane
Amit Langote  writes:
> On Wed, Mar 10, 2021 at 8:37 AM Tom Lane  wrote:
>> Hmm.  So, the key point is that the values coming from the partitioned
>> child table are injected into the test query as parameters, not as
>> column references, thus it doesn't matter *to the test query* what
>> numbers the referencing columns have in that child.  We just have to
>> be sure we pass the right parameter values.

> Right.

I did some cosmetic fooling with this (mostly, rewriting the comments
YA time) and pushed it.

regards, tom lane




Re: Huge memory consumption on partitioned table with FKs

2021-03-10 Thread Amit Langote
On Wed, Mar 10, 2021 at 8:37 AM Tom Lane  wrote:
> Amit Langote  writes:
> > On Fri, Mar 5, 2021 at 6:00 AM Tom Lane  wrote:
> >> This claim seems false on its face:
> >>> All child constraints of a given foreign key constraint can use the
> >>> same RI query and the resulting plan, that is, no need to create as
> >>> many copies of the query and the plan as there are partitions, as
> >>> happens now due to the child constraint OID being used in the key
> >>> for ri_query_cache.
>
> > The quoted comment could have been written to be clearer about this,
> > but it's not talking about the table that is to be queried, but the
> > table whose RI trigger is being executed.  In all the cases except one
> > (mentioned below), the table that is queried is the same irrespective
> > of which partition's trigger is being executed, so we're basically
> > creating the same plan separately for each partition.
>
> Hmm.  So, the key point is that the values coming from the partitioned
> child table are injected into the test query as parameters, not as
> column references, thus it doesn't matter *to the test query* what
> numbers the referencing columns have in that child.  We just have to
> be sure we pass the right parameter values.

Right.

>  But ... doesn't the code
> use riinfo->fk_attnums[] to pull out the values to be passed?

Yes, from a slot that belongs to the child table.

> IOW, I now get the point about being able to share the SPI plans,
> but I'm still dubious about sharing the RI_ConstraintInfo cache entries.

There was actually a proposal upthread about sharing the
RI_ConstraintInfo too, but we decided to not pursue that for now.

> It looks to me like the v4 patch is now actually not sharing the
> cache entries, ie their hash key is just the child constraint OID
> same as before;

Yeah, you may see that we're only changing ri_BuildQueryKey() in the
patch affecting only ri_query_cache, but not ri_LoadConstraintInfo()
which leaves ri_constraint_cache unaffected.

> but the comments are pretty confused about this.

I've tried improving the comment in ri_BuildQueryKey() a bit to make
clear what is and what is not being shared between partitions.

> It might be simpler if you add just one new field which is the OID of
> the constraint that we're building the SPI query from, which might be
> either equal to constraint_id, or the OID of some parent constraint.
> In particular it's not clear to me why we need both constraint_parent
> and constraint_root_id.

Yeah, I think constraint_parent is a leftover from some earlier
hacking.  I have removed it.

Attached updated patch.

--
Amit Langote
EDB: http://www.enterprisedb.com


v5-0001-Share-SPI-plan-between-partitions-in-some-RI-trig.patch
Description: Binary data


Re: Huge memory consumption on partitioned table with FKs

2021-03-09 Thread Tom Lane
Amit Langote  writes:
> On Fri, Mar 5, 2021 at 6:00 AM Tom Lane  wrote:
>> This claim seems false on its face:
>>> All child constraints of a given foreign key constraint can use the
>>> same RI query and the resulting plan, that is, no need to create as
>>> many copies of the query and the plan as there are partitions, as
>>> happens now due to the child constraint OID being used in the key
>>> for ri_query_cache.

> The quoted comment could have been written to be clearer about this,
> but it's not talking about the table that is to be queried, but the
> table whose RI trigger is being executed.  In all the cases except one
> (mentioned below), the table that is queried is the same irrespective
> of which partition's trigger is being executed, so we're basically
> creating the same plan separately for each partition.

Hmm.  So, the key point is that the values coming from the partitioned
child table are injected into the test query as parameters, not as
column references, thus it doesn't matter *to the test query* what
numbers the referencing columns have in that child.  We just have to
be sure we pass the right parameter values.  But ... doesn't the code
use riinfo->fk_attnums[] to pull out the values to be passed?

IOW, I now get the point about being able to share the SPI plans,
but I'm still dubious about sharing the RI_ConstraintInfo cache entries.

It looks to me like the v4 patch is now actually not sharing the
cache entries, ie their hash key is just the child constraint OID
same as before; but the comments are pretty confused about this.

It might be simpler if you add just one new field which is the OID of
the constraint that we're building the SPI query from, which might be
either equal to constraint_id, or the OID of some parent constraint.
In particular it's not clear to me why we need both constraint_parent
and constraint_root_id.

regards, tom lane




Re: Huge memory consumption on partitioned table with FKs

2021-03-08 Thread Amit Langote
On Mon, Mar 8, 2021 at 9:53 PM Andy Fan  wrote:
> On Mon, Mar 8, 2021 at 8:42 PM Amit Langote  wrote:
>> On Mon, Mar 8, 2021 at 8:39 PM Andy Fan  wrote:
>> > On Mon, Mar 8, 2021 at 3:43 PM Andy Fan  wrote:
>> >> My point below is a bit off-topic, but I want to share it here.  Since
>> >> we implement a partitioned table in PG with the inherited class, it has 
>> >> much
>> >> more flexibility than other databases.  Like in PG,  we allow different 
>> >> partitions
>> >> have different physical order,  different indexes,  maybe different index 
>> >> states.
>> >> that would cause our development work hard in many places and cause some
>> >> runtime issues as well (like catalog memory usage),  have we discussed
>> >> limiting some flexibility so that we can have better coding/running 
>> >> experience?
>> >> I want to do some research in this direction, but it would be better that 
>> >> I can
>> >> listen to any advice from others.  More specifically, I want to reduce 
>> >> the memory
>> >> usage of Partitioned table/index as the first step.  In my testing, each 
>> >> IndexOptInfo
>> >> will use 2kB memory in each backend.
>> >
>> >
>> > As for the compatible issue,  will it be ok to introduce a new  concept 
>> > like "
>> > CREATE TABLE p (a int) partitioned by list(a) RESTRICTED".  We can add 
>> > these
>> > limitation to restricted partitioned relation only.
>>
>> I think you'd agree that the topics you want to discuss deserve a
>> separate discussion thread.  You may refer to this discussion in that
>> new thread if you think that your proposals can solve the problem
>> being discussed here more generally, which would of course be great.
>
> Sure, I can prepare more data and start a new thread for this.

Great, thanks.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Huge memory consumption on partitioned table with FKs

2021-03-08 Thread Andy Fan
On Mon, Mar 8, 2021 at 8:42 PM Amit Langote  wrote:

> Hi Andy,
>
> On Mon, Mar 8, 2021 at 8:39 PM Andy Fan  wrote:
> > On Mon, Mar 8, 2021 at 3:43 PM Andy Fan 
> wrote:
> >> My point below is a bit off-topic, but I want to share it here.  Since
> >> we implement a partitioned table in PG with the inherited class, it has
> much
> >> more flexibility than other databases.  Like in PG,  we allow different
> partitions
> >> have different physical order,  different indexes,  maybe different
> index states.
> >> that would cause our development work hard in many places and cause some
> >> runtime issues as well (like catalog memory usage),  have we discussed
> >> limiting some flexibility so that we can have better coding/running
> experience?
> >> I want to do some research in this direction, but it would be better
> that I can
> >> listen to any advice from others.  More specifically, I want to reduce
> the memory
> >> usage of Partitioned table/index as the first step.  In my testing,
> each IndexOptInfo
> >> will use 2kB memory in each backend.
> >
> >
> > As for the compatible issue,  will it be ok to introduce a new  concept
> like "
> > CREATE TABLE p (a int) partitioned by list(a) RESTRICTED".  We can add
> these
> > limitation to restricted partitioned relation only.
>
> I think you'd agree that the topics you want to discuss deserve a
> separate discussion thread.  You may refer to this discussion in that
> new thread if you think that your proposals can solve the problem
> being discussed here more generally, which would of course be great.
>
>
Sure, I can prepare more data and start a new thread for this.

-- 
Best Regards
Andy Fan (https://www.aliyun.com/)


Re: Huge memory consumption on partitioned table with FKs

2021-03-08 Thread Amit Langote
Hi Andy,

On Mon, Mar 8, 2021 at 8:39 PM Andy Fan  wrote:
> On Mon, Mar 8, 2021 at 3:43 PM Andy Fan  wrote:
>> My point below is a bit off-topic, but I want to share it here.  Since
>> we implement a partitioned table in PG with the inherited class, it has much
>> more flexibility than other databases.  Like in PG,  we allow different 
>> partitions
>> have different physical order,  different indexes,  maybe different index 
>> states.
>> that would cause our development work hard in many places and cause some
>> runtime issues as well (like catalog memory usage),  have we discussed
>> limiting some flexibility so that we can have better coding/running 
>> experience?
>> I want to do some research in this direction, but it would be better that I 
>> can
>> listen to any advice from others.  More specifically, I want to reduce the 
>> memory
>> usage of Partitioned table/index as the first step.  In my testing, each 
>> IndexOptInfo
>> will use 2kB memory in each backend.
>
>
> As for the compatible issue,  will it be ok to introduce a new  concept like "
> CREATE TABLE p (a int) partitioned by list(a) RESTRICTED".  We can add these
> limitation to restricted partitioned relation only.

I think you'd agree that the topics you want to discuss deserve a
separate discussion thread.  You may refer to this discussion in that
new thread if you think that your proposals can solve the problem
being discussed here more generally, which would of course be great.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Huge memory consumption on partitioned table with FKs

2021-03-08 Thread Andy Fan
On Mon, Mar 8, 2021 at 3:43 PM Andy Fan  wrote:

>
>
> On Fri, Mar 5, 2021 at 5:00 AM Tom Lane  wrote:
>
>> Amit Langote  writes:
>> > Updated patch attached.
>>
>> This claim seems false on its face:
>>
>> > All child constraints of a given foreign key constraint can use the
>> > same RI query and the resulting plan, that is, no need to create as
>> > many copies of the query and the plan as there are partitions, as
>> > happens now due to the child constraint OID being used in the key
>> > for ri_query_cache.
>>
>> What if the child tables don't have the same physical column numbers
>> as the parent?
>
>
> My point below is a bit off-topic, but I want to share it here.  Since
> we implement a partitioned table in PG with the inherited class, it has
> much
> more flexibility than other databases.  Like in PG,  we allow different
> partitions
> have different physical order,  different indexes,  maybe different index
> states.
> that would cause our development work hard in many places and cause some
> runtime issues as well (like catalog memory usage),  have we discussed
> limiting some flexibility so that we can have better coding/running
> experience?
> I want to do some research in this direction, but it would be better that
> I can
> listen to any advice from others.  More specifically, I want to reduce the
> memory
> usage of Partitioned table/index as the first step.  In my testing, each
> IndexOptInfo
> will use 2kB memory in each backend.
>

As for the compatible issue,  will it be ok to introduce a new  concept
like "
CREATE TABLE p (a int) partitioned by list(a) RESTRICTED".  We can add these
limitation to restricted partitioned relation only.


>
The comment claiming that it's okay if riinfo->fk_attnums
>> doesn't match seems quite off point, because the query plan is still
>> going to need to use the correct column numbers.  Even if column numbers
>> are the same, the plan would also contain table and index OIDs that could
>> only be right for one partition.
>>
>> I could imagine translating a parent plan to apply to a child instead of
>> building it from scratch, but that would take a lot of code we don't have
>> (there's no rewriteHandler infrastructure for plan nodes).
>>
>> Maybe I'm missing something, because I see that the cfbot claims
>> this is passing, and I'd sure like to think that our test coverage
>> is not so thin that it'd fail to detect probing the wrong partition
>> for foreign key matches.  But that's what it looks like this patch
>> will do.
>>
>> regards, tom lane
>>
>>
>>
>
> --
> Best Regards
> Andy Fan (https://www.aliyun.com/)
>


-- 
Best Regards
Andy Fan (https://www.aliyun.com/)


Re: Huge memory consumption on partitioned table with FKs

2021-03-07 Thread Andy Fan
On Fri, Mar 5, 2021 at 5:00 AM Tom Lane  wrote:

> Amit Langote  writes:
> > Updated patch attached.
>
> This claim seems false on its face:
>
> > All child constraints of a given foreign key constraint can use the
> > same RI query and the resulting plan, that is, no need to create as
> > many copies of the query and the plan as there are partitions, as
> > happens now due to the child constraint OID being used in the key
> > for ri_query_cache.
>
> What if the child tables don't have the same physical column numbers
> as the parent?


My point below is a bit off-topic, but I want to share it here.  Since
we implement a partitioned table in PG with the inherited class, it has much
more flexibility than other databases.  Like in PG,  we allow different
partitions
have different physical order,  different indexes,  maybe different index
states.
that would cause our development work hard in many places and cause some
runtime issues as well (like catalog memory usage),  have we discussed
limiting some flexibility so that we can have better coding/running
experience?
I want to do some research in this direction, but it would be better that I
can
listen to any advice from others.  More specifically, I want to reduce the
memory
usage of Partitioned table/index as the first step.  In my testing, each
IndexOptInfo
will use 2kB memory in each backend.


> The comment claiming that it's okay if riinfo->fk_attnums
> doesn't match seems quite off point, because the query plan is still
> going to need to use the correct column numbers.  Even if column numbers
> are the same, the plan would also contain table and index OIDs that could
> only be right for one partition.
>
> I could imagine translating a parent plan to apply to a child instead of
> building it from scratch, but that would take a lot of code we don't have
> (there's no rewriteHandler infrastructure for plan nodes).
>
> Maybe I'm missing something, because I see that the cfbot claims
> this is passing, and I'd sure like to think that our test coverage
> is not so thin that it'd fail to detect probing the wrong partition
> for foreign key matches.  But that's what it looks like this patch
> will do.
>
> regards, tom lane
>
>
>

-- 
Best Regards
Andy Fan (https://www.aliyun.com/)


Re: Huge memory consumption on partitioned table with FKs

2021-03-07 Thread Amit Langote
On Fri, Mar 5, 2021 at 6:00 AM Tom Lane  wrote:
> Amit Langote  writes:
> > Updated patch attached.
>
> This claim seems false on its face:
>
> > All child constraints of a given foreign key constraint can use the
> > same RI query and the resulting plan, that is, no need to create as
> > many copies of the query and the plan as there are partitions, as
> > happens now due to the child constraint OID being used in the key
> > for ri_query_cache.
>
> What if the child tables don't have the same physical column numbers
> as the parent?  The comment claiming that it's okay if riinfo->fk_attnums
> doesn't match seems quite off point, because the query plan is still
> going to need to use the correct column numbers.  Even if column numbers
> are the same, the plan would also contain table and index OIDs that could
> only be right for one partition.
>
> I could imagine translating a parent plan to apply to a child instead of
> building it from scratch, but that would take a lot of code we don't have
> (there's no rewriteHandler infrastructure for plan nodes).
>
> Maybe I'm missing something, because I see that the cfbot claims
> this is passing, and I'd sure like to think that our test coverage
> is not so thin that it'd fail to detect probing the wrong partition
> for foreign key matches.  But that's what it looks like this patch
> will do.

The quoted comment could have been written to be clearer about this,
but it's not talking about the table that is to be queried, but the
table whose RI trigger is being executed.  In all the cases except one
(mentioned below), the table that is queried is the same irrespective
of which partition's trigger is being executed, so we're basically
creating the same plan separately for each partition.

create table foo (a int primary key) partition by range (a);
create table foo1 partition of foo for values from (minvalue) to (1);
create table foo2 partition of foo for values from (1) to (maxvalue);
create table bar (a int references foo) partition by range (a);
create table bar1 partition of bar for values from (minvalue) to (1);
create table bar2 partition of bar for values from (1) to (maxvalue);

insert into foo values (0), (1);
insert into bar values (0), (1);

For the 2nd insert statement, RI_FKey_check() issues the following
query irrespective of whether it is called from bar1's or bar2's
insert check RI trigger:

SELECT 1 FROM foo WHERE foo.a = $1;

Likewise for:

delete from foo;

ri_restrict() issues the following query irrespective of whether
called from foo1's or foo2's delete action trigger:

SELECT 1 FROM bar WHERE bar.a = $1;

The one case I found in which the queried table is the same as the
table whose trigger has been invoked is ri_Check_Pk_Match(), in which
case, it's actually wrong for partitions to share the plan, so the
patch was wrong in that case.  Apparently, we didn't test that case
with partitioning, so I added one as follows:

+-- test that ri_Check_Pk_Match() scans the correct partition for a deferred
+-- ON DELETE/UPDATE NO ACTION constraint
+CREATE SCHEMA fkpart10
+  CREATE TABLE tbl1(f1 int PRIMARY KEY) PARTITION BY RANGE(f1)
+  CREATE TABLE tbl1_p1 PARTITION OF tbl1 FOR VALUES FROM (minvalue) TO (1)
+  CREATE TABLE tbl1_p2 PARTITION OF tbl1 FOR VALUES FROM (1) TO (maxvalue)
+  CREATE TABLE tbl2(f1 int REFERENCES tbl1 DEFERRABLE INITIALLY DEFERRED);
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+INSERT INTO fkpart10.tbl2 VALUES (0), (1);
+BEGIN;
+DELETE FROM fkpart10.tbl1 WHERE f1 = 0;
+UPDATE fkpart10.tbl1 SET f1 = 2 WHERE f1 = 1;
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+COMMIT;
+DROP SCHEMA fkpart10 CASCADE;
+NOTICE:  drop cascades to 2 other objects
+DETAIL:  drop cascades to table fkpart10.tbl1
+drop cascades to table fkpart10.tbl2

With the v3 patch, the test case fails with the following error on COMMIT:

+ERROR:  update or delete on table "tbl1_p2" violates foreign key
constraint "tbl2_f1_fkey2" on table "tbl2"
+DETAIL:  Key (f1)=(1) is still referenced from table "tbl2".

That's because in ri_Check_Pk_Match(), partition tbl2 ends up using
the same cached plan as tbl1 and so scans tbl1 to check whether the
row deleted from tbl2 has reappeared, which is of course wrong.  I
have fixed that by continuing to use the individual partition's
constraint's OID as the query key for this particular query.

I have also removed the somewhat confusing comment about fk_attnums.
The point it was trying to make is that fk_attnums being possibly
different among partitions does not make a difference to what any
given shareable query's text ultimately looks like.  Those attribute
numbers are only used by the macros RIAttName(), RIAttType(),
RIAttCollation() when generating the query text and they'd all return
the same values no matter which partition's attribute numbers are
used.

Updated patch attached.

--
Amit Langote
EDB: http://www.enterprisedb.com
From b433aabbab5f74a4284fabdb078200c6c1e89e57 Mon Sep 17 00:00:00 2001
From: amitlan 
Date: Thu, 23 Apr 

Re: Huge memory consumption on partitioned table with FKs

2021-03-04 Thread Tom Lane
Amit Langote  writes:
> Updated patch attached.

This claim seems false on its face:

> All child constraints of a given foreign key constraint can use the
> same RI query and the resulting plan, that is, no need to create as
> many copies of the query and the plan as there are partitions, as
> happens now due to the child constraint OID being used in the key
> for ri_query_cache.

What if the child tables don't have the same physical column numbers
as the parent?  The comment claiming that it's okay if riinfo->fk_attnums
doesn't match seems quite off point, because the query plan is still
going to need to use the correct column numbers.  Even if column numbers
are the same, the plan would also contain table and index OIDs that could
only be right for one partition.

I could imagine translating a parent plan to apply to a child instead of
building it from scratch, but that would take a lot of code we don't have
(there's no rewriteHandler infrastructure for plan nodes).

Maybe I'm missing something, because I see that the cfbot claims
this is passing, and I'd sure like to think that our test coverage
is not so thin that it'd fail to detect probing the wrong partition
for foreign key matches.  But that's what it looks like this patch
will do.

regards, tom lane




Re: Huge memory consumption on partitioned table with FKs

2021-03-03 Thread Alvaro Herrera
On 2021-Mar-03, Amit Langote wrote:

> I don't know of any unaddressed comments on the patch, so I've marked
> the entry Ready for Committer.

Thanks, I'll look at it later this week.

-- 
Álvaro Herrera39°49'30"S 73°17'W
#error "Operator lives in the wrong universe"
  ("Use of cookies in real-time system development", M. Gleixner, M. Mc Guire)




Re: Huge memory consumption on partitioned table with FKs

2021-03-03 Thread Amit Langote
Hi David,

On Wed, Mar 3, 2021 at 10:21 PM David Steele  wrote:
> On 12/7/20 10:59 PM, Amit Langote wrote:
> > On Tue, Dec 8, 2020 at 12:04 PM Kyotaro Horiguchi
> >  wrote:
> >> At Tue, 8 Dec 2020 01:16:00 +0900, Amit Langote  
> >> wrote in
> >>> On Mon, Dec 7, 2020 at 23:48 Alvaro Herrera  
> >>> wrote:
> 
>  I think this bit about splitting the struct is a distraction.  Let's get
>  a patch that solves the bug first, and then we can discuss what further
>  refinements we want to do.  I think we should get your patch in
>  CA+HiwqEOrfN9b=f3sdmyspgc4go-l_vmfhxllxvmmdp34e6...@mail.gmail.com
>  committed (which I have not read yet.)  Do you agree with this plan?
> >>>
> >>>
> >>> Yeah, I agree.
> >>
> >> Or 
> >> https://www.postgresql.org/message-id/ca+hiwqgrr2yoo6vobm6m_oac9w-wmxe1gouq-uydpin6zjt...@mail.gmail.com
> >>  ?
> >>
> >> +1 from me to either one.
> >
> > Oh, I hadn't actually checked the actual message that Alvaro
> > mentioned, but yeah I too am fine with either that one or the latest
> > one.
>
> Any progress on the decision of how to handle this bug? Not sure if that
> was handled elsewhere but it appears to be key to making progress on
> this patch.

The v3 patch posted on Dec 4 is still what is being proposed to fix this.

I don't know of any unaddressed comments on the patch, so I've marked
the entry Ready for Committer.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Huge memory consumption on partitioned table with FKs

2021-03-03 Thread David Steele

On 12/7/20 10:59 PM, Amit Langote wrote:

On Tue, Dec 8, 2020 at 12:04 PM Kyotaro Horiguchi
 wrote:

At Tue, 8 Dec 2020 01:16:00 +0900, Amit Langote  wrote 
in

On Mon, Dec 7, 2020 at 23:48 Alvaro Herrera  wrote:


I think this bit about splitting the struct is a distraction.  Let's get
a patch that solves the bug first, and then we can discuss what further
refinements we want to do.  I think we should get your patch in
CA+HiwqEOrfN9b=f3sdmyspgc4go-l_vmfhxllxvmmdp34e6...@mail.gmail.com
committed (which I have not read yet.)  Do you agree with this plan?



Yeah, I agree.


Or 
https://www.postgresql.org/message-id/ca+hiwqgrr2yoo6vobm6m_oac9w-wmxe1gouq-uydpin6zjt...@mail.gmail.com
 ?

+1 from me to either one.


Oh, I hadn't actually checked the actual message that Alvaro
mentioned, but yeah I too am fine with either that one or the latest
one.


Any progress on the decision of how to handle this bug? Not sure if that 
was handled elsewhere but it appears to be key to making progress on 
this patch.


Regards,
--
-David
da...@pgmasters.net




Re: Huge memory consumption on partitioned table with FKs

2021-01-11 Thread Keisuke Kuroda
Hi Amit-san,

> Thanks for checking.  Indeed, it should have been added to the January
> commit-fest.  I've added it to the March one:
>
> https://commitfest.postgresql.org/32/2930/

Thank you for your quick response!

-- 
Keisuke Kuroda
NTT Software Innovation Center
keisuke.kuroda.3...@gmail.com




Re: Huge memory consumption on partitioned table with FKs

2021-01-07 Thread Amit Langote
Kuroda-san,

On Fri, Jan 8, 2021 at 1:02 PM Keisuke Kuroda
 wrote:
> Hi Amit-san,
>
> I noticed that this patch
> (v3-0001-ri_triggers.c-Use-root-constraint-OID-as-key-to-r.patch)
> is not registered in the commitfest. I think it needs to be registered for
> commit, is that correct?
>
> I have confirmed that this patch can be applied to HEAD (master),
> and that check-world PASS.

Thanks for checking.  Indeed, it should have been added to the January
commit-fest.  I've added it to the March one:

https://commitfest.postgresql.org/32/2930/

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Huge memory consumption on partitioned table with FKs

2021-01-07 Thread Keisuke Kuroda
Hi Amit-san,

I noticed that this patch
(v3-0001-ri_triggers.c-Use-root-constraint-OID-as-key-to-r.patch)
is not registered in the commitfest. I think it needs to be registered for
commit, is that correct?

I have confirmed that this patch can be applied to HEAD (master),
and that check-world PASS.

Best Regards,

-- 
Keisuke Kuroda
NTT Software Innovation Center
keisuke.kuroda.3...@gmail.com




Re: Huge memory consumption on partitioned table with FKs

2020-12-08 Thread Amit Langote
On Mon, Dec 7, 2020 at 11:01 PM Amit Langote  wrote:
> On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi
>  wrote:
> > > Also, the comment that was in RI_ConstraintInfo now appears in
> > > RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now
> > > undocumented.  What is the relationship between those two structs?  I
> > > see that they have pointers to each other, but I think the relationship
> > > should be documented more clearly.
> >
> > I'm not sure the footprint of this patch worth doing but here is a bit
> > more polished version.
>
> I noticed that the foreign_key test fails and it may have to do with
> the fact that a partition's param info remains attached to the
> parent's RI_ConstraintInfo even after it's detached from the parent
> table using DETACH PARTITION.

Just for (maybe not so distant) future reference, I managed to fix the
failure with this:

diff --git a/src/backend/utils/adt/ri_triggers.c
b/src/backend/utils/adt/ri_triggers.c
index 187884f..c67f2a6 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -1932,7 +1932,7 @@ ri_BuildQueryKey(RI_QueryKey *key, const
RI_ConstraintInfo *riinfo,
 * We assume struct RI_QueryKey contains no padding bytes, else we'd need
 * to use memset to clear them.
 */
-   key->constr_id = riinfo->param->query_key;
+   key->constr_id = riinfo->constraint_id;
key->constr_queryno = constr_queryno;
 }

It seems the shareable part (param) is not properly invalidated after
a partition's constraint is detached, apparently leaving query_key in
it set to the parent's constraint id.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Huge memory consumption on partitioned table with FKs

2020-12-07 Thread Amit Langote
On Tue, Dec 8, 2020 at 12:04 PM Kyotaro Horiguchi
 wrote:
> At Tue, 8 Dec 2020 01:16:00 +0900, Amit Langote  
> wrote in
> > Hi Alvaro,
> >
> > On Mon, Dec 7, 2020 at 23:48 Alvaro Herrera  wrote:
> >
> > > On 2020-Dec-07, Amit Langote wrote:
> > >
> > > > On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi
> > > >  wrote:
> > > > > > Also, the comment that was in RI_ConstraintInfo now appears in
> > > > > > RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now
> > > > > > undocumented.  What is the relationship between those two structs?  
> > > > > > I
> > > > > > see that they have pointers to each other, but I think the
> > > relationship
> > > > > > should be documented more clearly.
> > > > >
> > > > > I'm not sure the footprint of this patch worth doing but here is a bit
> > > > > more polished version.
> > > >
> > > > I noticed that the foreign_key test fails and it may have to do with
> > > > the fact that a partition's param info remains attached to the
> > > > parent's RI_ConstraintInfo even after it's detached from the parent
> > > > table using DETACH PARTITION.
> > >
> > > I think this bit about splitting the struct is a distraction.  Let's get
> > > a patch that solves the bug first, and then we can discuss what further
> > > refinements we want to do.  I think we should get your patch in
> > > CA+HiwqEOrfN9b=f3sdmyspgc4go-l_vmfhxllxvmmdp34e6...@mail.gmail.com
> > > committed (which I have not read yet.)  Do you agree with this plan?
> >
> >
> > Yeah, I agree.
>
> Or 
> https://www.postgresql.org/message-id/ca+hiwqgrr2yoo6vobm6m_oac9w-wmxe1gouq-uydpin6zjt...@mail.gmail.com
>  ?
>
> +1 from me to either one.

Oh, I hadn't actually checked the actual message that Alvaro
mentioned, but yeah I too am fine with either that one or the latest
one.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Huge memory consumption on partitioned table with FKs

2020-12-07 Thread Kyotaro Horiguchi
At Tue, 8 Dec 2020 01:16:00 +0900, Amit Langote  wrote 
in 
> Hi Alvaro,
> 
> On Mon, Dec 7, 2020 at 23:48 Alvaro Herrera  wrote:
> 
> > On 2020-Dec-07, Amit Langote wrote:
> >
> > > On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi
> > >  wrote:
> > > > > Also, the comment that was in RI_ConstraintInfo now appears in
> > > > > RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now
> > > > > undocumented.  What is the relationship between those two structs?  I
> > > > > see that they have pointers to each other, but I think the
> > relationship
> > > > > should be documented more clearly.
> > > >
> > > > I'm not sure the footprint of this patch worth doing but here is a bit
> > > > more polished version.
> > >
> > > I noticed that the foreign_key test fails and it may have to do with
> > > the fact that a partition's param info remains attached to the
> > > parent's RI_ConstraintInfo even after it's detached from the parent
> > > table using DETACH PARTITION.
> >
> > I think this bit about splitting the struct is a distraction.  Let's get
> > a patch that solves the bug first, and then we can discuss what further
> > refinements we want to do.  I think we should get your patch in
> > CA+HiwqEOrfN9b=f3sdmyspgc4go-l_vmfhxllxvmmdp34e6...@mail.gmail.com
> > committed (which I have not read yet.)  Do you agree with this plan?
> 
> 
> Yeah, I agree.

Or 
https://www.postgresql.org/message-id/ca+hiwqgrr2yoo6vobm6m_oac9w-wmxe1gouq-uydpin6zjt...@mail.gmail.com
 ?

+1 from me to either one.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Huge memory consumption on partitioned table with FKs

2020-12-07 Thread Amit Langote
Hi Alvaro,

On Mon, Dec 7, 2020 at 23:48 Alvaro Herrera  wrote:

> On 2020-Dec-07, Amit Langote wrote:
>
> > On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi
> >  wrote:
> > > > Also, the comment that was in RI_ConstraintInfo now appears in
> > > > RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now
> > > > undocumented.  What is the relationship between those two structs?  I
> > > > see that they have pointers to each other, but I think the
> relationship
> > > > should be documented more clearly.
> > >
> > > I'm not sure the footprint of this patch worth doing but here is a bit
> > > more polished version.
> >
> > I noticed that the foreign_key test fails and it may have to do with
> > the fact that a partition's param info remains attached to the
> > parent's RI_ConstraintInfo even after it's detached from the parent
> > table using DETACH PARTITION.
>
> I think this bit about splitting the struct is a distraction.  Let's get
> a patch that solves the bug first, and then we can discuss what further
> refinements we want to do.  I think we should get your patch in
> CA+HiwqEOrfN9b=f3sdmyspgc4go-l_vmfhxllxvmmdp34e6...@mail.gmail.com
> committed (which I have not read yet.)  Do you agree with this plan?


Yeah, I agree.

- Amit

> --
Amit Langote
EDB: http://www.enterprisedb.com


Re: Huge memory consumption on partitioned table with FKs

2020-12-07 Thread Alvaro Herrera
On 2020-Dec-07, Amit Langote wrote:

> On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi
>  wrote:
> > > Also, the comment that was in RI_ConstraintInfo now appears in
> > > RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now
> > > undocumented.  What is the relationship between those two structs?  I
> > > see that they have pointers to each other, but I think the relationship
> > > should be documented more clearly.
> >
> > I'm not sure the footprint of this patch worth doing but here is a bit
> > more polished version.
> 
> I noticed that the foreign_key test fails and it may have to do with
> the fact that a partition's param info remains attached to the
> parent's RI_ConstraintInfo even after it's detached from the parent
> table using DETACH PARTITION.

I think this bit about splitting the struct is a distraction.  Let's get
a patch that solves the bug first, and then we can discuss what further
refinements we want to do.  I think we should get your patch in
CA+HiwqEOrfN9b=f3sdmyspgc4go-l_vmfhxllxvmmdp34e6...@mail.gmail.com
committed (which I have not read yet.)  Do you agree with this plan?




Re: Huge memory consumption on partitioned table with FKs

2020-12-07 Thread Amit Langote
On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi
 wrote:
> > Also, the comment that was in RI_ConstraintInfo now appears in
> > RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now
> > undocumented.  What is the relationship between those two structs?  I
> > see that they have pointers to each other, but I think the relationship
> > should be documented more clearly.
>
> I'm not sure the footprint of this patch worth doing but here is a bit
> more polished version.

I noticed that the foreign_key test fails and it may have to do with
the fact that a partition's param info remains attached to the
parent's RI_ConstraintInfo even after it's detached from the parent
table using DETACH PARTITION.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Huge memory consumption on partitioned table with FKs

2020-12-03 Thread Amit Langote
On Fri, Dec 4, 2020 at 2:48 PM Kyotaro Horiguchi
 wrote:
> At Fri, 4 Dec 2020 12:00:09 +0900, Keisuke Kuroda 
>  wrote in
> > Hi Amit,
> >
> > > I have attached a patch in which I've tried to merge the ideas from
> > > both my patch and Kuroda-san's.  I liked that his patch added
> > > conparentid to RI_ConstraintInfo because that saves a needless
> > > syscache lookup for constraints that don't have a parent.  I've kept
> > > my idea to compute the root constraint id only once in
> > > ri_LoadConstraint(), not on every invocation of ri_BuildQueryKey().
> > > Kuroda-san, anything you'd like to add to that?
> >
> > Thank you for the merge! It looks good to me.
> > I think a fix for InvalidateConstraintCacheCallBack() is also good.
> >
> > I also confirmed that the patch passed the make check-world.
>
> It's fine that constraint_rood_id overrides constraint_id, but how
> about that constraint_root_id stores constraint_id if it is not a
> partition?  That change makes the patch a bit simpler.

My patch was like that before posting to this thread, but keeping
constraint_id and constraint_root_id separate looked better for
documenting the partitioning case as working differently from the
regular table case.  I guess a comment in ri_BuildQueryKey is enough
for that though and it's not like we're using constraint_root_id in
any other place to make matters confusing, so I changed it as you
suggest.  Updated patch attached.

-- 
Amit Langote
EDB: http://www.enterprisedb.com


v3-0001-ri_triggers.c-Use-root-constraint-OID-as-key-to-r.patch
Description: Binary data


Re: Huge memory consumption on partitioned table with FKs

2020-12-03 Thread Kyotaro Horiguchi
At Fri, 4 Dec 2020 12:00:09 +0900, Keisuke Kuroda 
 wrote in 
> Hi Amit,
> 
> > I have attached a patch in which I've tried to merge the ideas from
> > both my patch and Kuroda-san's.  I liked that his patch added
> > conparentid to RI_ConstraintInfo because that saves a needless
> > syscache lookup for constraints that don't have a parent.  I've kept
> > my idea to compute the root constraint id only once in
> > ri_LoadConstraint(), not on every invocation of ri_BuildQueryKey().
> > Kuroda-san, anything you'd like to add to that?
> 
> Thank you for the merge! It looks good to me.
> I think a fix for InvalidateConstraintCacheCallBack() is also good.
> 
> I also confirmed that the patch passed the make check-world.

It's fine that constraint_rood_id overrides constraint_id, but how
about that constraint_root_id stores constraint_id if it is not a
partition?  That change makes the patch a bit simpler.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Huge memory consumption on partitioned table with FKs

2020-12-03 Thread Kyotaro Horiguchi
Thanks, but sorry for the confusion.

I intended just to show how it looks like if we share
RI_ConstraintInfo among partition relations.

At Thu, 3 Dec 2020 10:22:47 -0300, Alvaro Herrera  
wrote in 
> Hello
> 
> I haven't followed this thread's latest posts, but I'm unclear on the
> lifetime of the new struct that's being allocated in TopMemoryContext.
> At what point are those structs freed?

The choice of memory context is tentative and in order to shrink the
patch'es footprint. I think we don't use CurrentDynaHashCxt for the
additional struct so a context for this use is needed.

The struct is freed only when the parent struct (RI_ConstraintInfo) is
found to be able to share the child struct (RI_ConstraintParam) with
the parent constraint.  It seems like inefficient (or tending to make
"hole"s in the heap area) but I chose it just to shrink the footprint.

We could create the new RI_ConstraintInfo on stack then copy it to the
cache after we find that the RI_ConstraintInfo needs its own
RI_ConstriantParam.

> Also, the comment that was in RI_ConstraintInfo now appears in
> RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now
> undocumented.  What is the relationship between those two structs?  I
> see that they have pointers to each other, but I think the relationship
> should be documented more clearly.

I'm not sure the footprint of this patch worth doing but here is a bit
more polished version.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 55946e09d33fc7fa43bed04ef548bf8a3f67155d Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Thu, 3 Dec 2020 16:15:57 +0900
Subject: [PATCH 1/2] separte riinfo

---
 src/backend/utils/adt/ri_triggers.c | 290 
 1 file changed, 168 insertions(+), 122 deletions(-)

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 02b1a3868f..0306bf7739 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -97,14 +97,29 @@
  * Information extracted from an FK pg_constraint entry.  This is cached in
  * ri_constraint_cache.
  */
+struct RI_ConstraintParam;
+
 typedef struct RI_ConstraintInfo
 {
-	Oid			constraint_id;	/* OID of pg_constraint entry (hash key) */
+	Oid			constraint_id;
 	bool		valid;			/* successfully initialized? */
 	uint32		oidHashValue;	/* hash value of pg_constraint OID */
 	NameData	conname;		/* name of the FK constraint */
-	Oid			pk_relid;		/* referenced relation */
 	Oid			fk_relid;		/* referencing relation */
+	struct RI_ConstraintParam *param;	/* sharable part  */
+	dlist_node	valid_link;		/* Link in list of valid entries */
+} RI_ConstraintInfo;
+
+/*
+ * RI_ConstraintParam
+ *
+ * The part sharable among relations in a partitioned table of the cached
+ * constraint information.
+ */
+typedef struct RI_ConstraintParam
+{
+	/* begin with identity members */
+	Oid			pk_relid;		/* referenced relation */
 	char		confupdtype;	/* foreign key's ON UPDATE action */
 	char		confdeltype;	/* foreign key's ON DELETE action */
 	char		confmatchtype;	/* foreign key's match type */
@@ -114,8 +129,12 @@ typedef struct RI_ConstraintInfo
 	Oid			pf_eq_oprs[RI_MAX_NUMKEYS]; /* equality operators (PK = FK) */
 	Oid			pp_eq_oprs[RI_MAX_NUMKEYS]; /* equality operators (PK = PK) */
 	Oid			ff_eq_oprs[RI_MAX_NUMKEYS]; /* equality operators (FK = FK) */
-	dlist_node	valid_link;		/* Link in list of valid entries */
-} RI_ConstraintInfo;
+
+	/* These should be at the end of struct, see ri_LoadConstraintInfo */
+	Oid			query_key;		/* key for planned statment */
+	RI_ConstraintInfo *ownerinfo; /* owner RI_ConstraintInfo of this param */
+} RI_ConstraintParam;
+
 
 /*
  * RI_QueryKey
@@ -163,6 +182,7 @@ typedef struct RI_CompareHashEntry
 /*
  * Local data
  */
+static MemoryContext ri_constraint_cache_cxt = NULL;
 static HTAB *ri_constraint_cache = NULL;
 static HTAB *ri_query_cache = NULL;
 static HTAB *ri_compare_cache = NULL;
@@ -264,7 +284,7 @@ RI_FKey_check(TriggerData *trigdata)
 	 * SELECT FOR KEY SHARE will get on it.
 	 */
 	fk_rel = trigdata->tg_relation;
-	pk_rel = table_open(riinfo->pk_relid, RowShareLock);
+	pk_rel = table_open(riinfo->param->pk_relid, RowShareLock);
 
 	switch (ri_NullCheck(RelationGetDescr(fk_rel), newslot, riinfo, false))
 	{
@@ -283,7 +303,7 @@ RI_FKey_check(TriggerData *trigdata)
 			 * This is the only case that differs between the three kinds of
 			 * MATCH.
 			 */
-			switch (riinfo->confmatchtype)
+			switch (riinfo->param->confmatchtype)
 			{
 case FKCONSTR_MATCH_FULL:
 
@@ -364,17 +384,17 @@ RI_FKey_check(TriggerData *trigdata)
 		appendStringInfo(, "SELECT 1 FROM %s%s x",
 		 pk_only, pkrelname);
 		querysep = "WHERE";
-		for (int i = 0; i < riinfo->nkeys; i++)
+		for (int i = 0; i < riinfo->param->nkeys; i++)
 		{
-			Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
+			Oid			pk_type = RIAttType(pk_rel, 

Re: Huge memory consumption on partitioned table with FKs

2020-12-03 Thread Keisuke Kuroda
Hi Amit,

> I have attached a patch in which I've tried to merge the ideas from
> both my patch and Kuroda-san's.  I liked that his patch added
> conparentid to RI_ConstraintInfo because that saves a needless
> syscache lookup for constraints that don't have a parent.  I've kept
> my idea to compute the root constraint id only once in
> ri_LoadConstraint(), not on every invocation of ri_BuildQueryKey().
> Kuroda-san, anything you'd like to add to that?

Thank you for the merge! It looks good to me.
I think a fix for InvalidateConstraintCacheCallBack() is also good.

I also confirmed that the patch passed the make check-world.

Best Regards,
-- 
Keisuke Kuroda
NTT Software Innovation Center
keisuke.kuroda.3...@gmail.com




Re: Huge memory consumption on partitioned table with FKs

2020-12-03 Thread Kyotaro Horiguchi
At Thu, 3 Dec 2020 21:40:29 +0900, Amit Langote  wrote 
in 
> On Thu, Dec 3, 2020 at 5:13 PM Kyotaro Horiguchi
>  wrote:
> > At Thu, 3 Dec 2020 16:41:45 +0900, Amit Langote  
> > wrote in
> > > Maybe I misread but I think you did in your email dated Dec 1 where you 
> > > said:
> > >
> > > "After an off-list discussion, we confirmed that even in that case the
> > > patch works as is because fk_attnum (or contuple.conkey) always stores
> > > key attnums compatible to the topmost parent when conparent has a
> > > valid value (assuming the current usage of fk_attnum), but I still
> > > feel uneasy to rely on that unclear behavior."
> >
> > fk_attnums *doesn't* refers to to top parent talbe of the referencing
> > side. it refers to attributes of the partition that is compatible with
> > the same element of fk_attnums of the topmost parent.  Maybe I'm
> > misreading.
> 
> Yeah, no I am confused.  Reading what I wrote, it seems I implied that
> the referenced (PK relation's) partitions have RI_ConstraintInfo which
> makes no sense, although there indeed is one pg_constraint entry that
> is defined on the FK root table for every PK partition with its OID as
> confrelid, which is in addition to an entry containing the root PK
> table's OID as confrelid.  I confused those PK-partition-referencing
> entries as belonging to the partitions themselves.  Although in my
> defence, all of those entries' conkey contains the FK root table's
> attributes, so at least that much holds. :)

Yes. I think that that confusion doen't hurt the correctness of the
discussion:)

> > > > > On the topic of how we'd be able to share even the RI_ConstraintInfos
> > > > > among partitions, that would indeed look a bit more elaborate than the
> > > > > patch we have right now.
> > > >
> > > > Maybe just letting the hash entry for the child riinfo point to the
> > > > parent riinfo if all members (other than constraint_id, of course)
> > > > share the exactly the same values.  No need to count references since
> > > > we don't going to remove riinfos.
> > >
> > > Ah, something maybe worth trying.  Although the memory we'd save by
> > > sharing the RI_ConstraintInfos would not add that much to the savings
> > > we're having by sharing the plan, because it's the plans that are a
> > > memory hog AFAIK.
> >
> > I agree that plans are rather large but the sharable part of the
> > RI_ConstraintInfos is 536 bytes, I'm not sure it is small enough
> > comparing to the plans.  But that has somewhat large footprint.. (See
> > the attached)
> 
> Thanks for the patch.

That's only to show how that looks like.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Huge memory consumption on partitioned table with FKs

2020-12-03 Thread Alvaro Herrera
Hello

I haven't followed this thread's latest posts, but I'm unclear on the
lifetime of the new struct that's being allocated in TopMemoryContext.
At what point are those structs freed?

Also, the comment that was in RI_ConstraintInfo now appears in
RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now
undocumented.  What is the relationship between those two structs?  I
see that they have pointers to each other, but I think the relationship
should be documented more clearly.

Thanks!





Re: Huge memory consumption on partitioned table with FKs

2020-12-03 Thread Amit Langote
On Tue, Dec 1, 2020 at 12:25 PM Corey Huinker  wrote:
> On Mon, Nov 30, 2020 at 9:48 PM Tom Lane  wrote:
>> Corey Huinker  writes:
>> > Given that we're already looking at these checks, I was wondering if this
>> > might be the time to consider implementing these checks by directly
>> > scanning the constraint index.
>>
>> Yeah, maybe.  Certainly ri_triggers is putting a huge amount of effort
>> into working around the SPI/parser/planner layer, to not a lot of gain.
>>
>> However, it's not clear to me that that line of thought will work well
>> for the statement-level-trigger approach.  In that case you might be
>> dealing with enough tuples to make a different plan advisable.
>
> Bypassing SPI would probably mean that we stay with row level triggers, and 
> the cached query plan would go away, perhaps replaced by an 
> already-looked-up-this-tuple hash sorta like what the cached nested loops 
> effort is doing.
>
> I've been meaning to give this a try when I got some spare time. This may 
> inspire me to try again.

+1 for this line of work.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Huge memory consumption on partitioned table with FKs

2020-12-03 Thread Amit Langote
On Thu, Dec 3, 2020 at 5:13 PM Kyotaro Horiguchi
 wrote:
> At Thu, 3 Dec 2020 16:41:45 +0900, Amit Langote  
> wrote in
> > Maybe I misread but I think you did in your email dated Dec 1 where you 
> > said:
> >
> > "After an off-list discussion, we confirmed that even in that case the
> > patch works as is because fk_attnum (or contuple.conkey) always stores
> > key attnums compatible to the topmost parent when conparent has a
> > valid value (assuming the current usage of fk_attnum), but I still
> > feel uneasy to rely on that unclear behavior."
>
> fk_attnums *doesn't* refers to to top parent talbe of the referencing
> side. it refers to attributes of the partition that is compatible with
> the same element of fk_attnums of the topmost parent.  Maybe I'm
> misreading.

Yeah, no I am confused.  Reading what I wrote, it seems I implied that
the referenced (PK relation's) partitions have RI_ConstraintInfo which
makes no sense, although there indeed is one pg_constraint entry that
is defined on the FK root table for every PK partition with its OID as
confrelid, which is in addition to an entry containing the root PK
table's OID as confrelid.  I confused those PK-partition-referencing
entries as belonging to the partitions themselves.  Although in my
defence, all of those entries' conkey contains the FK root table's
attributes, so at least that much holds. :)

> > > > On the topic of how we'd be able to share even the RI_ConstraintInfos
> > > > among partitions, that would indeed look a bit more elaborate than the
> > > > patch we have right now.
> > >
> > > Maybe just letting the hash entry for the child riinfo point to the
> > > parent riinfo if all members (other than constraint_id, of course)
> > > share the exactly the same values.  No need to count references since
> > > we don't going to remove riinfos.
> >
> > Ah, something maybe worth trying.  Although the memory we'd save by
> > sharing the RI_ConstraintInfos would not add that much to the savings
> > we're having by sharing the plan, because it's the plans that are a
> > memory hog AFAIK.
>
> I agree that plans are rather large but the sharable part of the
> RI_ConstraintInfos is 536 bytes, I'm not sure it is small enough
> comparing to the plans.  But that has somewhat large footprint.. (See
> the attached)

Thanks for the patch.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Huge memory consumption on partitioned table with FKs

2020-12-03 Thread Kyotaro Horiguchi
At Thu, 03 Dec 2020 17:13:16 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
me> I agree that plans are rather large but the sharable part of the
me> RI_ConstraintInfos is 536 bytes, I'm not sure it is small enough
me> comparing to the plans.  But that has somewhat large footprint.. (See
me> the attached)

0001 contains a bug about query_key and get_ri_constaint_root (from
your patch) is not needed there, but the core part is 0002 so please
ignore them.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Huge memory consumption on partitioned table with FKs

2020-12-03 Thread Kyotaro Horiguchi
At Thu, 3 Dec 2020 16:41:45 +0900, Amit Langote  wrote 
in 
> On Thu, Dec 3, 2020 at 2:29 PM Kyotaro Horiguchi
>  wrote:
> > At Thu, 3 Dec 2020 12:27:53 +0900, Amit Langote  
> > wrote in
> > > On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi
> > >  wrote:
> > > For the queries on the referencing side ("check" side),
> > > type/collation/attribute name determined using the above are going to
> > > be the same for all partitions in a given tree irrespective of the
> > > attribute number, because they're logically the same column.  On the
> >
> > Yes, I know that, which is what I meant by "practically" or
> > "actually", but it is not explicitly defined AFAICS.
> 
> Well, I think it's great that we don't have to worry *in this part of
> the code* about partition's fk_attnums not being congruent with the
> root parent's, because ensuring that is the responsibility of the
> other parts of the system such as DDL.  If we have any problems in
> this area, they should be dealt with by ensuring that there are no
> bugs in those other parts.

Agreed.

> > Thus that would be no longer an issue if we explicitly define that
> > "When conpraentid stores a valid value, each element of fk_attnums
> > points to logically the same attribute with the RI_ConstraintInfo for
> > the parent constraint."  Or I'd be happy if we have such a comment
> > there instead.
> 
> I saw a comment in Kuroda-san's v2 patch that is perhaps meant to
> address this point, but the placement needs to be reconsidered:

Ah, yes, that comes from my proposal.

> @@ -366,6 +368,14 @@ RI_FKey_check(TriggerData *trigdata)
> querysep = "WHERE";
> for (int i = 0; i < riinfo->nkeys; i++)
> {
> +
> +   /*
> +   * We share the same plan among all relations in a partition
> +   * hierarchy.  The plan is guaranteed to be compatible since all of
> +   * the member relations are guaranteed to have the equivalent set
> +   * of foreign keys in fk_attnums[].
> +   */
> +
> Oid pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
> Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
> 
> A more appropriate place for this kind of comment would be where
> fk_attnums is defined or in ri_BuildQueryKey() that is shared by
> different RI query issuing functions.

Yeah, I wanted more appropriate place for the comment.  That place
seems reasonable.

> > > referenced side ("restrict", "cascade", "set" side), as you already
> > > mentioned, fk_attnums refers to the top parent table of the
> > > referencing side, so no possibility of they being different in the
> > > various referenced partitions' RI_ConstraintInfos.
> >
> > Right. (I'm not sure I have mention that here, though:p)A
> 
> Maybe I misread but I think you did in your email dated Dec 1 where you said:
> 
> "After an off-list discussion, we confirmed that even in that case the
> patch works as is because fk_attnum (or contuple.conkey) always stores
> key attnums compatible to the topmost parent when conparent has a
> valid value (assuming the current usage of fk_attnum), but I still
> feel uneasy to rely on that unclear behavior."

fk_attnums *doesn't* refers to to top parent talbe of the referencing
side. it refers to attributes of the partition that is compatible with
the same element of fk_attnums of the topmost parent.  Maybe I'm
misreading.


> > > On the topic of how we'd be able to share even the RI_ConstraintInfos
> > > among partitions, that would indeed look a bit more elaborate than the
> > > patch we have right now.
> >
> > Maybe just letting the hash entry for the child riinfo point to the
> > parent riinfo if all members (other than constraint_id, of course)
> > share the exactly the same values.  No need to count references since
> > we don't going to remove riinfos.
> 
> Ah, something maybe worth trying.  Although the memory we'd save by
> sharing the RI_ConstraintInfos would not add that much to the savings
> we're having by sharing the plan, because it's the plans that are a
> memory hog AFAIK.

I agree that plans are rather large but the sharable part of the
RI_ConstraintInfos is 536 bytes, I'm not sure it is small enough
comparing to the plans.  But that has somewhat large footprint.. (See
the attached)

> > > > About your patch, it calculates the root constrid at the time an
> > > > riinfo is created, but when the root-partition is further attached to
> > > > another partitioned-table after the riinfo creation,
> > > > constraint_root_id gets stale.  Of course that dones't matter
> > > > practically, though.
> > >
> > > Maybe we could also store the hash value of the root constraint OID as
> > > rootHashValue and check for that one too in
> > > InvalidateConstraintCacheCallBack().  That would take care of this
> > > unless I'm missing something.
> >
> > Seems to be sound.
> 
> Okay, thanks.
> 
> I have attached a patch in which I've tried to merge the ideas from
> both my patch 

Re: Huge memory consumption on partitioned table with FKs

2020-12-02 Thread Amit Langote
On Thu, Dec 3, 2020 at 2:29 PM Kyotaro Horiguchi
 wrote:
> At Thu, 3 Dec 2020 12:27:53 +0900, Amit Langote  
> wrote in
> > On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi
> >  wrote:
> > > I don't think you're missing something. As I (tried to..) mentioned in
> > > another branch of this thread, you're right. On the otherhand
> > > fk_attnums is actually used to generate the query, thus *issue*
> > > potentially affects the query.  Although that might be paranoic, that
> > > possibility is eliminated by checking the congruency(?) of fk_attnums
> > > (or other members).  We might even be able to share a riinfo among
> > > such children.
> >
> > Just to be sure I've not misunderstood you, let me mention why I think
> > the queries used by different partitions all turn out to be the same
> > despite attribute number differences among partitions.  The places
> > that uses fk_attnums when generating a query are these among others:
> >
> > Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
> > ...
> > Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
> > ...
> > quoteOneName(attname,
> >  RIAttName(fk_rel, riinfo->fk_attnums[i]));
> >
> > For the queries on the referencing side ("check" side),
> > type/collation/attribute name determined using the above are going to
> > be the same for all partitions in a given tree irrespective of the
> > attribute number, because they're logically the same column.  On the
>
> Yes, I know that, which is what I meant by "practically" or
> "actually", but it is not explicitly defined AFAICS.

Well, I think it's great that we don't have to worry *in this part of
the code* about partition's fk_attnums not being congruent with the
root parent's, because ensuring that is the responsibility of the
other parts of the system such as DDL.  If we have any problems in
this area, they should be dealt with by ensuring that there are no
bugs in those other parts.

> Thus that would be no longer an issue if we explicitly define that
> "When conpraentid stores a valid value, each element of fk_attnums
> points to logically the same attribute with the RI_ConstraintInfo for
> the parent constraint."  Or I'd be happy if we have such a comment
> there instead.

I saw a comment in Kuroda-san's v2 patch that is perhaps meant to
address this point, but the placement needs to be reconsidered:

@@ -366,6 +368,14 @@ RI_FKey_check(TriggerData *trigdata)
querysep = "WHERE";
for (int i = 0; i < riinfo->nkeys; i++)
{
+
+   /*
+   * We share the same plan among all relations in a partition
+   * hierarchy.  The plan is guaranteed to be compatible since all of
+   * the member relations are guaranteed to have the equivalent set
+   * of foreign keys in fk_attnums[].
+   */
+
Oid pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);

A more appropriate place for this kind of comment would be where
fk_attnums is defined or in ri_BuildQueryKey() that is shared by
different RI query issuing functions.

> > referenced side ("restrict", "cascade", "set" side), as you already
> > mentioned, fk_attnums refers to the top parent table of the
> > referencing side, so no possibility of they being different in the
> > various referenced partitions' RI_ConstraintInfos.
>
> Right. (I'm not sure I have mention that here, though:p)A

Maybe I misread but I think you did in your email dated Dec 1 where you said:

"After an off-list discussion, we confirmed that even in that case the
patch works as is because fk_attnum (or contuple.conkey) always stores
key attnums compatible to the topmost parent when conparent has a
valid value (assuming the current usage of fk_attnum), but I still
feel uneasy to rely on that unclear behavior."

> > On the topic of how we'd be able to share even the RI_ConstraintInfos
> > among partitions, that would indeed look a bit more elaborate than the
> > patch we have right now.
>
> Maybe just letting the hash entry for the child riinfo point to the
> parent riinfo if all members (other than constraint_id, of course)
> share the exactly the same values.  No need to count references since
> we don't going to remove riinfos.

Ah, something maybe worth trying.  Although the memory we'd save by
sharing the RI_ConstraintInfos would not add that much to the savings
we're having by sharing the plan, because it's the plans that are a
memory hog AFAIK.

> > > About your patch, it calculates the root constrid at the time an
> > > riinfo is created, but when the root-partition is further attached to
> > > another partitioned-table after the riinfo creation,
> > > constraint_root_id gets stale.  Of course that dones't matter
> > > practically, though.
> >
> > Maybe we could also store the hash value of the root constraint OID as
> > rootHashValue and check for that one too in
> > 

Re: Huge memory consumption on partitioned table with FKs

2020-12-02 Thread Kyotaro Horiguchi
At Thu, 3 Dec 2020 12:27:53 +0900, Amit Langote  wrote 
in 
> Hello,
> 
> On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi
>  wrote:
> > At Wed, 2 Dec 2020 22:02:50 +0900, Amit Langote  
> > wrote in
> > > Hmm, I don't see what the problem is here, because it's not
> > > RI_ConstraintInfo that is being shared among partitions of the same
> > > parent, but the RI query (and the SPIPlanPtr for it) that is issued
> > > through SPI.  The query (and the plan) turns out to be the same no
> > > matter what partition's RI_ConstraintInfo is first used to generate
> > > it.  What am I missing?
> >
> > I don't think you're missing something. As I (tried to..) mentioned in
> > another branch of this thread, you're right. On the otherhand
> > fk_attnums is actually used to generate the query, thus *issue*
> > potentially affects the query.  Although that might be paranoic, that
> > possibility is eliminated by checking the congruency(?) of fk_attnums
> > (or other members).  We might even be able to share a riinfo among
> > such children.
> 
> Just to be sure I've not misunderstood you, let me mention why I think
> the queries used by different partitions all turn out to be the same
> despite attribute number differences among partitions.  The places
> that uses fk_attnums when generating a query are these among others:
> 
> Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
> ...
> Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
> ...
> quoteOneName(attname,
>  RIAttName(fk_rel, riinfo->fk_attnums[i]));
> 
> For the queries on the referencing side ("check" side),
> type/collation/attribute name determined using the above are going to
> be the same for all partitions in a given tree irrespective of the
> attribute number, because they're logically the same column.  On the

Yes, I know that, which is what I meant by "practically" or
"actually", but it is not explicitly defined AFAICS.

Thus that would be no longer an issue if we explicitly define that
"When conpraentid stores a valid value, each element of fk_attnums
points to logically the same attribute with the RI_ConstraintInfo for
the parent constraint."  Or I'd be happy if we have such a comment
there instead.

> referenced side ("restrict", "cascade", "set" side), as you already
> mentioned, fk_attnums refers to the top parent table of the
> referencing side, so no possibility of they being different in the
> various referenced partitions' RI_ConstraintInfos.

Right. (I'm not sure I have mention that here, though:p)A

> On the topic of how we'd be able to share even the RI_ConstraintInfos
> among partitions, that would indeed look a bit more elaborate than the
> patch we have right now.

Maybe just letting the hash entry for the child riinfo point to the
parent riinfo if all members (other than constraint_id, of course)
share the exactly the same values.  No need to count references since
we don't going to remove riinfos.

> > > BTW, one problem with Kuroda-san's patch is that it's using
> > > conparentid as the shared key, which can still lead to multiple
> > > queries being generated when using multi-level partitioning, because
> > > there would be many (intermediate) parent constraints in that case.
> > > We should really be using the "root" constraint OID as the key,
> > > because there would only be one root from which all constraints in a
> > > given partition hierarchy would've originated.  Actually, I had
> > > written a patch a few months back to do exactly that, a polished
> > > version of which I've attached with this email.  Please take a look.
> >
> > I don't like that the function self-recurses but
> > ri_GetParentConstrOid() actually climbs up to the root constraint, so
> > the patch covers the multilevel partitioning cases.
> 
> Sorry, I got too easily distracted by the name Kuroda-san chose for
> the field in RI_ConstraintInfo and the function.

I thought the same at the first look to the function.

> > About your patch, it calculates the root constrid at the time an
> > riinfo is created, but when the root-partition is further attached to
> > another partitioned-table after the riinfo creation,
> > constraint_root_id gets stale.  Of course that dones't matter
> > practically, though.
> 
> Maybe we could also store the hash value of the root constraint OID as
> rootHashValue and check for that one too in
> InvalidateConstraintCacheCallBack().  That would take care of this
> unless I'm missing something.

Seems to be sound.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Huge memory consumption on partitioned table with FKs

2020-12-02 Thread Amit Langote
Hello,

On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi
 wrote:
> At Wed, 2 Dec 2020 22:02:50 +0900, Amit Langote  
> wrote in
> > Hello,
> >
> > On Tue, Dec 1, 2020 at 9:40 AM Kyotaro Horiguchi
> >  wrote:
> > >
> > > At Mon, 30 Nov 2020 21:03:45 -0300, Alvaro Herrera 
> > >  wrote in
> > > > On 2020-Nov-26, Kyotaro Horiguchi wrote:
> > > >
> > > > > This shares RI_ConstraintInfo cache by constraints that shares the
> > > > > same parent constraints. But you forgot that the cache contains some
> > > > > members that can differ among partitions.
> > > > >
> > > > > Consider the case of attaching a partition that have experienced a
> > > > > column deletion.
> > > >
> > > > I think this can be solved easily in the patch, by having
> > > > ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if
> > > > they are equal then use the parent's constaint_id, otherwise use the
> > > > child constraint.  That way, the cache entry is reused in the common
> > > > case where they are identical.
> > >
> > > *I* think it's the direction.  After an off-list discussion, we
> > >  confirmed that even in that case the patch works as is because
> > >  fk_attnum (or contuple.conkey) always stores key attnums compatible
> > >  to the topmost parent when conparent has a valid value (assuming the
> > >  current usage of fk_attnum), but I still feel uneasy to rely on that
> > >  unclear behavior.
> >
> > Hmm, I don't see what the problem is here, because it's not
> > RI_ConstraintInfo that is being shared among partitions of the same
> > parent, but the RI query (and the SPIPlanPtr for it) that is issued
> > through SPI.  The query (and the plan) turns out to be the same no
> > matter what partition's RI_ConstraintInfo is first used to generate
> > it.  What am I missing?
>
> I don't think you're missing something. As I (tried to..) mentioned in
> another branch of this thread, you're right. On the otherhand
> fk_attnums is actually used to generate the query, thus *issue*
> potentially affects the query.  Although that might be paranoic, that
> possibility is eliminated by checking the congruency(?) of fk_attnums
> (or other members).  We might even be able to share a riinfo among
> such children.

Just to be sure I've not misunderstood you, let me mention why I think
the queries used by different partitions all turn out to be the same
despite attribute number differences among partitions.  The places
that uses fk_attnums when generating a query are these among others:

Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
...
Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
...
quoteOneName(attname,
 RIAttName(fk_rel, riinfo->fk_attnums[i]));

For the queries on the referencing side ("check" side),
type/collation/attribute name determined using the above are going to
be the same for all partitions in a given tree irrespective of the
attribute number, because they're logically the same column.  On the
referenced side ("restrict", "cascade", "set" side), as you already
mentioned, fk_attnums refers to the top parent table of the
referencing side, so no possibility of they being different in the
various referenced partitions' RI_ConstraintInfos.

On the topic of how we'd be able to share even the RI_ConstraintInfos
among partitions, that would indeed look a bit more elaborate than the
patch we have right now.

> > BTW, one problem with Kuroda-san's patch is that it's using
> > conparentid as the shared key, which can still lead to multiple
> > queries being generated when using multi-level partitioning, because
> > there would be many (intermediate) parent constraints in that case.
> > We should really be using the "root" constraint OID as the key,
> > because there would only be one root from which all constraints in a
> > given partition hierarchy would've originated.  Actually, I had
> > written a patch a few months back to do exactly that, a polished
> > version of which I've attached with this email.  Please take a look.
>
> I don't like that the function self-recurses but
> ri_GetParentConstrOid() actually climbs up to the root constraint, so
> the patch covers the multilevel partitioning cases.

Sorry, I got too easily distracted by the name Kuroda-san chose for
the field in RI_ConstraintInfo and the function.

> About your patch, it calculates the root constrid at the time an
> riinfo is created, but when the root-partition is further attached to
> another partitioned-table after the riinfo creation,
> constraint_root_id gets stale.  Of course that dones't matter
> practically, though.

Maybe we could also store the hash value of the root constraint OID as
rootHashValue and check for that one too in
InvalidateConstraintCacheCallBack().  That would take care of this
unless I'm missing something.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Huge memory consumption on partitioned table with FKs

2020-12-02 Thread Keisuke Kuroda
Hi Hackers,

>> I would embed all this knowledge in ri_BuildQueryKey though, without
>> adding the new function ri_GetParentConstOid.  I don't think that
>> function meaningful abstraction value, and instead it would make what I
>> suggest more difficult.

> It seems to me reasonable.

Indeed, I tried to get the conparentid with ri_BuildQueryKey.
(v2_reduce_ri_SPI-plan-hash.patch)

> Somewhat of a detour, but in reviewing the patch for
> Statement-Level RI checks, Andres and I observed that SPI
> made for a large portion of the RI overhead.

If this can be resolved by reviewing the Statement-Level RI checks,
then I think it would be better.

> BTW, one problem with Kuroda-san's patch is that it's using
> conparentid as the shared key, which can still lead to multiple
> queries being generated when using multi-level partitioning, because
> there would be many (intermediate) parent constraints in that case.

Horiguchi-san also mentiond,
In my patch, in ri_GetParentConstOid, iterate on getting
the constraint OID until comparentid == InvalidOid.
This should allow to get the "root constraint OID".

However, the names "comparentid" and "ri_GetParentConstOid"
are confusing. We should use names like "constraint_root_id",
as in Amit-san's patch.

Best Regards,
-- 
Keisuke Kuroda
NTT Software Innovation Center
keisuke.kuroda.3...@gmail.com


v2_reduce_ri_SPI-plan-hash.patch
Description: Binary data


Re: Huge memory consumption on partitioned table with FKs

2020-12-02 Thread Kyotaro Horiguchi
At Wed, 2 Dec 2020 22:02:50 +0900, Amit Langote  wrote 
in 
> Hello,
> 
> On Tue, Dec 1, 2020 at 9:40 AM Kyotaro Horiguchi
>  wrote:
> >
> > At Mon, 30 Nov 2020 21:03:45 -0300, Alvaro Herrera 
> >  wrote in
> > > On 2020-Nov-26, Kyotaro Horiguchi wrote:
> > >
> > > > This shares RI_ConstraintInfo cache by constraints that shares the
> > > > same parent constraints. But you forgot that the cache contains some
> > > > members that can differ among partitions.
> > > >
> > > > Consider the case of attaching a partition that have experienced a
> > > > column deletion.
> > >
> > > I think this can be solved easily in the patch, by having
> > > ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if
> > > they are equal then use the parent's constaint_id, otherwise use the
> > > child constraint.  That way, the cache entry is reused in the common
> > > case where they are identical.
> >
> > *I* think it's the direction.  After an off-list discussion, we
> >  confirmed that even in that case the patch works as is because
> >  fk_attnum (or contuple.conkey) always stores key attnums compatible
> >  to the topmost parent when conparent has a valid value (assuming the
> >  current usage of fk_attnum), but I still feel uneasy to rely on that
> >  unclear behavior.
> 
> Hmm, I don't see what the problem is here, because it's not
> RI_ConstraintInfo that is being shared among partitions of the same
> parent, but the RI query (and the SPIPlanPtr for it) that is issued
> through SPI.  The query (and the plan) turns out to be the same no
> matter what partition's RI_ConstraintInfo is first used to generate
> it.  What am I missing?

I don't think you're missing something. As I (tried to..) mentioned in
another branch of this thread, you're right. On the otherhand
fk_attnums is actually used to generate the query, thus *issue*
potentially affects the query.  Although that might be paranoic, that
possibility is eliminated by checking the congruency(?) of fk_attnums
(or other members).  We might even be able to share a riinfo among
such children.

> BTW, one problem with Kuroda-san's patch is that it's using
> conparentid as the shared key, which can still lead to multiple
> queries being generated when using multi-level partitioning, because
> there would be many (intermediate) parent constraints in that case.
> We should really be using the "root" constraint OID as the key,
> because there would only be one root from which all constraints in a
> given partition hierarchy would've originated.  Actually, I had
> written a patch a few months back to do exactly that, a polished
> version of which I've attached with this email.  Please take a look.

I don't like that the function self-recurses but
ri_GetParentConstrOid() actually climbs up to the root constraint, so
the patch covers the multilevel partitioning cases.

About your patch, it calculates the root constrid at the time an
riinfo is created, but when the root-partition is further attached to
another partitioned-table after the riinfo creation,
constraint_root_id gets stale.  Of course that dones't matter
practically, though.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Huge memory consumption on partitioned table with FKs

2020-12-02 Thread Simon Riggs
On Tue, 1 Dec 2020 at 00:03, Alvaro Herrera  wrote:
> cache entry is reused in the common case where they are identical.

Does a similar situation exist for partition statistics accessed
during planning? Or planning itself?

It would be useful to avoid repeated access to similar statistics and
repeated planning of sub-plans for similar partitions.

-- 
Simon Riggshttp://www.EnterpriseDB.com/




Re: Huge memory consumption on partitioned table with FKs

2020-12-02 Thread Amit Langote
Hello,

On Tue, Dec 1, 2020 at 9:40 AM Kyotaro Horiguchi
 wrote:
>
> At Mon, 30 Nov 2020 21:03:45 -0300, Alvaro Herrera  
> wrote in
> > On 2020-Nov-26, Kyotaro Horiguchi wrote:
> >
> > > This shares RI_ConstraintInfo cache by constraints that shares the
> > > same parent constraints. But you forgot that the cache contains some
> > > members that can differ among partitions.
> > >
> > > Consider the case of attaching a partition that have experienced a
> > > column deletion.
> >
> > I think this can be solved easily in the patch, by having
> > ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if
> > they are equal then use the parent's constaint_id, otherwise use the
> > child constraint.  That way, the cache entry is reused in the common
> > case where they are identical.
>
> *I* think it's the direction.  After an off-list discussion, we
>  confirmed that even in that case the patch works as is because
>  fk_attnum (or contuple.conkey) always stores key attnums compatible
>  to the topmost parent when conparent has a valid value (assuming the
>  current usage of fk_attnum), but I still feel uneasy to rely on that
>  unclear behavior.

Hmm, I don't see what the problem is here, because it's not
RI_ConstraintInfo that is being shared among partitions of the same
parent, but the RI query (and the SPIPlanPtr for it) that is issued
through SPI.  The query (and the plan) turns out to be the same no
matter what partition's RI_ConstraintInfo is first used to generate
it.  What am I missing?

BTW, one problem with Kuroda-san's patch is that it's using
conparentid as the shared key, which can still lead to multiple
queries being generated when using multi-level partitioning, because
there would be many (intermediate) parent constraints in that case.
We should really be using the "root" constraint OID as the key,
because there would only be one root from which all constraints in a
given partition hierarchy would've originated.  Actually, I had
written a patch a few months back to do exactly that, a polished
version of which I've attached with this email.  Please take a look.

--
Amit Langote
EDB: http://www.enterprisedb.com


0001-ri_triggers.c-Use-root-constraint-OID-as-key-to-ri_q.patch
Description: Binary data


Re: Huge memory consumption on partitioned table with FKs

2020-11-30 Thread Corey Huinker
On Mon, Nov 30, 2020 at 9:48 PM Tom Lane  wrote:

> Corey Huinker  writes:
> > Given that we're already looking at these checks, I was wondering if this
> > might be the time to consider implementing these checks by directly
> > scanning the constraint index.
>
> Yeah, maybe.  Certainly ri_triggers is putting a huge amount of effort
> into working around the SPI/parser/planner layer, to not a lot of gain.
>
> However, it's not clear to me that that line of thought will work well
> for the statement-level-trigger approach.  In that case you might be
> dealing with enough tuples to make a different plan advisable.
>
> regards, tom lane
>

Bypassing SPI would probably mean that we stay with row level triggers, and
the cached query plan would go away, perhaps replaced by an
already-looked-up-this-tuple hash sorta like what the cached nested loops
effort is doing.

I've been meaning to give this a try when I got some spare time. This may
inspire me to try again.


Re: Huge memory consumption on partitioned table with FKs

2020-11-30 Thread Tom Lane
Corey Huinker  writes:
> Given that we're already looking at these checks, I was wondering if this
> might be the time to consider implementing these checks by directly
> scanning the constraint index.

Yeah, maybe.  Certainly ri_triggers is putting a huge amount of effort
into working around the SPI/parser/planner layer, to not a lot of gain.

However, it's not clear to me that that line of thought will work well
for the statement-level-trigger approach.  In that case you might be
dealing with enough tuples to make a different plan advisable.

regards, tom lane




Re: Huge memory consumption on partitioned table with FKs

2020-11-30 Thread Corey Huinker
>
> I think this can be solved easily in the patch, by having
> ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if
> they are equal then use the parent's constaint_id, otherwise use the
> child constraint.  That way, the cache entry is reused in the common
> case where they are identical.
>

Somewhat of a detour, but in reviewing the patch for Statement-Level RI
checks, Andres and I observed that SPI made for a large portion of the RI
overhead.

Given that we're already looking at these checks, I was wondering if this
might be the time to consider implementing these checks by directly
scanning the constraint index.


Re: Huge memory consumption on partitioned table with FKs

2020-11-30 Thread Kyotaro Horiguchi
At Mon, 30 Nov 2020 21:03:45 -0300, Alvaro Herrera  
wrote in 
> On 2020-Nov-26, Kyotaro Horiguchi wrote:
> 
> > This shares RI_ConstraintInfo cache by constraints that shares the
> > same parent constraints. But you forgot that the cache contains some
> > members that can differ among partitions.
> > 
> > Consider the case of attaching a partition that have experienced a
> > column deletion.
> 
> I think this can be solved easily in the patch, by having
> ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if
> they are equal then use the parent's constaint_id, otherwise use the
> child constraint.  That way, the cache entry is reused in the common
> case where they are identical.

*I* think it's the direction.  After an off-list discussion, we
 confirmed that even in that case the patch works as is because
 fk_attnum (or contuple.conkey) always stores key attnums compatible
 to the topmost parent when conparent has a valid value (assuming the
 current usage of fk_attnum), but I still feel uneasy to rely on that
 unclear behavior.

> I would embed all this knowledge in ri_BuildQueryKey though, without
> adding the new function ri_GetParentConstOid.  I don't think that
> function meaningful abstraction value, and instead it would make what I
> suggest more difficult.

It seems to me reasonable.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Huge memory consumption on partitioned table with FKs

2020-11-30 Thread Alvaro Herrera
On 2020-Nov-26, Kyotaro Horiguchi wrote:

> This shares RI_ConstraintInfo cache by constraints that shares the
> same parent constraints. But you forgot that the cache contains some
> members that can differ among partitions.
> 
> Consider the case of attaching a partition that have experienced a
> column deletion.

I think this can be solved easily in the patch, by having
ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if
they are equal then use the parent's constaint_id, otherwise use the
child constraint.  That way, the cache entry is reused in the common
case where they are identical.

I would embed all this knowledge in ri_BuildQueryKey though, without
adding the new function ri_GetParentConstOid.  I don't think that
function meaningful abstraction value, and instead it would make what I
suggest more difficult.




Re: Huge memory consumption on partitioned table with FKs

2020-11-25 Thread Kyotaro Horiguchi
At Thu, 26 Nov 2020 09:59:28 +0900, Keisuke Kuroda 
 wrote in 
> Hi Hackers,
> 
> Analyzed the problem and created a patch to resolve it.
> 
> # Problem 1
> 
> When you create a foreign key to a partitioned table, referential
> integrity function is created for the number of partitions.
> Internally, SPI_prepare() creates a plan and SPI_keepplan()
> caches the plan.
> 
> The more partitions in the referencing table, the more plans will
> be cached.
> 
> # Problem 2
> 
> When referenced table is partitioned table, the larger the number
> of partitions, the larger the plan size to be cached.
> 
> The actual plan processed is simple and small if pruning is
> enabled. However, the cached plan will include all partition
> information.
> 
> The more partitions in the referenced table, the larger the plan
> size to be cached.
> 
> # Idea for solution
> 
> Constraints with the same pg_constraint.parentid can be combined
> into one plan with the same comparentid if we can guarantee that
> all their contents are the same.

The memory reduction this patch gives seems quite high with a small
footprint.

This shares RI_ConstraintInfo cache by constraints that shares the
same parent constraints. But you forgot that the cache contains some
members that can differ among partitions.

Consider the case of attaching a partition that have experienced a
column deletion.

create table t (a int primary key);
create table p (a int, r int references t(a)) partition by range(a);
create table c1 partition of p for values from (0) to (10);
create table c2 (a int, r int);
alter table c2 drop column r;
alter table c2 add column r int;
alter table p attach partition c2 for values from (10) to (20);

In that case riinfo->fk_attnums has different values from other
partitions.

=# select oid, conrelid::regclass, confrelid::regclass, conparentid, conname, 
conkey from pg_constraint where confrelid = 't'::regclass;

  oid  | conrelid | confrelid | conparentid | conname  | conkey 
---+--+---+-+--+
 16620 | p| t |   0 | p_r_fkey | {2}
 16626 | c1   | t |   16620 | p_r_fkey | {2}
 16632 | c2   | t |   16620 | p_r_fkey | {3}
(3 rows)

conkey is copied onto riinfo->fk_attnums.

> Problem 1 can be solved
> and significant memory bloat can be avoided.
> CachedPlan: 710MB -> 1466kB
> 
> Solving Problem 2 could also reduce memory,
> but I don't have a good idea.
> 
> Currently, DISCARD ALL does not discard CachedPlan by SPI as in
> this case. It may be possible to modify DISCARD ALL to discard
> CachedPlan and run it periodically. However, we think the burden
> on the user is high.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Huge memory consumption on partitioned table with FKs

2020-11-25 Thread Keisuke Kuroda
Hi Hackers,

Analyzed the problem and created a patch to resolve it.

# Problem 1

When you create a foreign key to a partitioned table, referential
integrity function is created for the number of partitions.
Internally, SPI_prepare() creates a plan and SPI_keepplan()
caches the plan.

The more partitions in the referencing table, the more plans will
be cached.

# Problem 2

When referenced table is partitioned table, the larger the number
of partitions, the larger the plan size to be cached.

The actual plan processed is simple and small if pruning is
enabled. However, the cached plan will include all partition
information.

The more partitions in the referenced table, the larger the plan
size to be cached.

# Idea for solution

Constraints with the same pg_constraint.parentid can be combined
into one plan with the same comparentid if we can guarantee that
all their contents are the same.

Problem 1 can be solved
and significant memory bloat can be avoided.
CachedPlan: 710MB -> 1466kB

Solving Problem 2 could also reduce memory,
but I don't have a good idea.

Currently, DISCARD ALL does not discard CachedPlan by SPI as in
this case. It may be possible to modify DISCARD ALL to discard
CachedPlan and run it periodically. However, we think the burden
on the user is high.

# result with patch(PG14 HEAD(e522024b) + patch)

   name   |  bytes  | pg_size_pretty
--+-+
 CachedPlanQuery  |   12912 | 13 kB
 CachedPlanSource |   17448 | 17 kB
 CachedPlan   | 1501192 | 1466 kB

CachedPlan * 1 Record

postgres=# SELECT count(*) FROM pg_backend_memory_contexts WHERE name
= 'CachedPlan' AND ident LIKE 'SELECT 1 FROM%';
 count
---
 1

postgres=# SELECT * FROM pg_backend_memory_contexts WHERE name =
'CachedPlan' AND ident LIKE 'SELECT 1 FROM%';
-[ RECORD 1 
]-+--
name  | CachedPlan
ident | SELECT 1 FROM "public"."ps" x WHERE "c1"
OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x
parent| CacheMemoryContext
level | 2
total_bytes   | 2101248
total_nblocks | 12
free_bytes| 613256
free_chunks   | 1
used_bytes| 1487992
(1 record)

# result without patch(PG14 HEAD(e522024b))

   name   |   bytes   | pg_size_pretty
--+---+
 CachedPlanQuery  |   1326280 | 1295 kB
 CachedPlanSource |   1474528 | 1440 kB
 CachedPlan   | 744009200 | 710 MB

CachedPlan * 500 Records

postgres=# SELECT count(*) FROM pg_backend_memory_contexts WHERE name
= 'CachedPlan' AND ident LIKE 'SELECT 1 FROM%';
 count
---
   500

SELECT * FROM pg_backend_memory_contexts WHERE name = 'CachedPlan' AND
ident LIKE 'SELECT 1 FROM%';
name  | CachedPlan
ident | SELECT 1 FROM "public"."ps" x WHERE "c1"
OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x
parent| CacheMemoryContext
level | 2
total_bytes   | 2101248
total_nblocks | 12
free_bytes| 613256
free_chunks   | 1
used_bytes| 1487992
...(500 same records)

Best Regards,
-- 
Keisuke Kuroda
NTT Software Innovation Center
keisuke.kuroda.3...@gmail.com


v1_reduce_ri_SPI-plan-hash.patch
Description: Binary data


Huge memory consumption on partitioned table with FKs

2020-11-23 Thread Tatsuro Yamada

Hi Hackers,

My company (NTT Comware) and NTT OSS Center did verification of
partitioned table on PG14dev, and we faced a problem that consumed
huge memory when we created a Foreign key constraint on a partitioned
table including 500 partitioning tables and inserted some data.

We investigated it a little to use the "pg_backend_memory_contextes"
command and realized a "CachedPlan" of backends increased dramatically.
See below:

Without FKs
==
CachedPlan 0kB

With FKs (the problem is here)
==
CachedPlan 710MB


Please find the attached file to reproduce the problem.

We know two things as following:
  - Each backend uses the size of CachedPlan
  - The more increasing partitioning tables, the more CachedPlan
consuming

If there are many backends, it consumes about the size of CachedPlan x
the number of backends. It may occur a shortage of memory and OOM killer.
We think the affected version are PG12 or later. I believe it would be
better to fix the problem. Any thoughts?

Regards,
Tatsuro Yamada
DROP TABLE IF EXISTS pr CASCADE;
DROP TABLE IF EXISTS ps CASCADE;
CREATE TABLE ps (c1 INT PRIMARY KEY) PARTITION BY RANGE(c1);
CREATE TABLE pr (c1 INT, c2 INT REFERENCES ps(c1)) PARTITION BY RANGE(c1);

-- Show memory usage of 'Cached%' 
SELECT name, sum(used_bytes) as bytes, pg_size_pretty(sum(used_bytes)) FROM 
pg_backend_memory_contexts
WHERE name LIKE 'Cached%' GROUP BY name;

-- Procedure for creating partitioned table
CREATE OR REPLACE PROCEDURE part_make(tbl text, num int) AS $$
 DECLARE width int := 10; next int :=1;
 BEGIN
   FOR i in 1..num LOOP
   EXECUTE 'CREATE TABLE ' || tbl || '_' || i || ' partition of ' || tbl || 
' FOR VALUES FROM (' || next || ') TO (' || i * width
   || ');';
   next := i * width;
   END LOOP;
 END;
$$ LANGUAGE plpgsql;

-- Create partitioned tables named ps and pr. The each table has 500 
partitioning tables. 
CALL part_make('ps', 500);
CALL part_make('pr', 500);

-- Insert data
INSERT INTO ps SELECT generate_series(1,4999);
INSERT INTO pr SELECT i, i from generate_series(1,4999)i;

-- Show memory usages of 'Cached%' again
-- You can see 'CachedPlan 710MB'
SELECT name, sum(used_bytes) as bytes, pg_size_pretty(sum(used_bytes)) FROM 
pg_backend_memory_contexts
WHERE name LIKE 'Cached%' GROUP BY name;