Re: First draft of PG 17 release notes
jian he wrote on Thu, May 9, 2024 at 18:00: > On Thu, May 9, 2024 at 12:04 PM Bruce Momjian wrote: > > > > I have committed the first draft of the PG 17 release notes; you can > > see the results here: > > > > https://momjian.us/pgsql_docs/release-17.html > > > > another potential incompatibilities issue: > ALTER TABLE DROP PRIMARY KEY > > see: > > https://www.postgresql.org/message-id/202404181849.6frtmajobe27%40alvherre.pgsql > > Since Alvaro has reverted all changes to not-null constraints, this potential incompatibility issue no longer exists. -- Tender Wang OpenPie: https://en.openpie.com/
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Andrey M. Borodin wrote on Mon, Apr 8, 2024 at 17:40: > > > > On 27 Sep 2023, at 16:06, tender wang wrote: > > > >Do you have any comments or suggestions on this issue? Thanks. > Hi Tender, > > there are some review comments in the thread, that you might be interested > in. > I'll mark this [0] entry "Waiting on Author" and move to next CF. > > Thanks! > > > Best regards, Andrey Borodin. > > [0]https://commitfest.postgresql.org/47/4549/ I have rebased onto master and fixed a plan diff case. -- Tender Wang OpenPie: https://en.openpie.com/ v3-0001-Support-materializing-inner-path-on-parallel-oute.patch Description: Binary data
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera wrote on Fri, Apr 19, 2024 at 02:49: > On 2024-Apr-13, jian he wrote: > > > I wonder is there any incompatibility issue, or do we need to say > something > > about the new behavior when dropping a key column? > > Umm, yeah, maybe we should document it in ALTER TABLE DROP PRIMARY KEY > and in the release notes to note the different behavior. > > > only minor cosmetic issue: > > + if (unconstrained_cols) > > i would like change it to > > + if (unconstrained_cols != NIL) > > > > + foreach(lc, unconstrained_cols) > > we can change to > > + foreach_int(attnum, unconstrained_cols) > > per commit > > > https://git.postgresql.org/cgit/postgresql.git/commit/?id=14dd0f27d7cd56ffae9ecdbe324965073d01a9ff > > Ah, yeah. I did that, rewrote some comments and refined the tests a > little bit to ensure the pg_upgrade behavior is sane. I intend to get > this pushed tomorrow, if nothing ugly comes up. > The new patch looks good to me. > > CI run: https://cirrus-ci.com/build/5471117953990656 > > -- Tender Wang OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera wrote on Thu, Apr 11, 2024 at 22:48: > On 2024-Apr-11, Alvaro Herrera wrote: > > > Well, I think you were right that we should try to handle the situation > > of unmarking attnotnull as much as possible, to decrease the chances > > that the problematic situation occurs. That means, we can use the > > earlier code to handle DROP COLUMN when it causes a PK to be dropped -- > > even though we still need to handle the situation of an attnotnull flag > > set with no pg_constraint row. I mean, we still have to handle DROP > > DOMAIN correctly (and there might be other cases that I haven't thought > > about) ... but surely this is a much less common situation than the one > > you reported. So I'll merge everything and post an updated patch. > > Here's roughly what I'm thinking. If we drop a constraint, we can still > reset attnotnull in RemoveConstraintById(), but only after checking that > it's not a generated column or a replica identity. If they are, we > don't error out -- just skip the attnotnull update. > > Now, about the code to allow ALTER TABLE DROP NOT NULL in case there's > no pg_constraint row, I think at this point it's mostly dead code, > because it can only happen when you have a replica identity or generated > column ... and the DROP NOT NULL should still prevent you from dropping > the flag anyway. But the case can still arise, if you change the > replica identity or ALTER TABLE ALTER COLUMN DROP DEFAULT, respectively. > > I'm still not ready with this -- still not convinced about the new AT > pass. Yeah, at first I was also hesitant; two reasons convinced me. In ATPostAlterTypeParse(): - else if (cmd->subtype == AT_SetAttNotNull) { /* * The parser will create AT_AttSetNotNull subcommands for * columns of PRIMARY KEY indexes/constraints, but we need * not do anything with them here, because the columns' * NOT NULL marks will already have been propagated into * the new table definition. 
*/ } --- The new table definition continues to use the old column's not-null marking, so nothing needs to be done here. But if we reset NOT NULL marks in RemoveConstraintById() when dropping a PK indirectly, we need to do something here or somewhere else. Except for AT_SetAttNotNull, the other subcommand types add an AT pass to tab->subcmds; because not-null must be set before re-adding the index, there is no suitable pass in the current AlterTablePass, so a new AT pass ahead of AT_PASS_OLD_INDEX is needed. The other reason is that this approach can use the ALTER TABLE framework to set not-null, which looks simpler and better than hardcoding the re-installation of not-null in some function. -- Tender Wang OpenPie: https://en.openpie.com/
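A minimal SQL sketch of the scenario the new AT pass is meant to serve (the table name is illustrative, not from the patch):

```sql
-- ALTER COLUMN TYPE on a PK column drops and rebuilds the PK index.
-- The rebuild in AT_PASS_OLD_INDEX requires attnotnull to already be
-- true, so if dropping the old PK reset attnotnull, a SetAttNotNull
-- step must run in an earlier pass.
CREATE TABLE at_pass_demo (c varchar(10) PRIMARY KEY);
ALTER TABLE at_pass_demo ALTER COLUMN c TYPE varchar(20);
-- Without such an earlier pass, the index rebuild fails with:
--   ERROR: primary key column "c" is not marked NOT NULL
```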
Re: Can't find not null constraint, but \d+ shows that
jian he wrote on Fri, Apr 12, 2024 at 10:12: > On Thu, Apr 11, 2024 at 10:48 PM Alvaro Herrera > wrote: > > > > > > I'm still not ready with this -- still not convinced about the new AT > > pass. Also, I want to add a test for the pg_dump behavior, and there's > > an XXX comment. > > > Now I am more confused... > > +-- make sure attnotnull is reset correctly when a PK is dropped > indirectly, > +-- or kept if there's a reason for that > +CREATE TABLE notnull_tbl1 (c0 int, c1 int, PRIMARY KEY (c0, c1)); > +ALTER TABLE notnull_tbl1 DROP c1; > +\d+ notnull_tbl1 > + Table "public.notnull_tbl1" > + Column | Type | Collation | Nullable | Default | Storage | Stats > target | Description > > ++-+---+--+-+-+--+- > + c0 | integer | | | | plain | > | > + > +DROP TABLE notnull_tbl1; > > same query, mysql make let "c0" be not null > mysql https://dbfiddle.uk/_ltoU7PO > > for postgre > https://dbfiddle.uk/ZHJXEqL1 > from 9.3 to 16 (click the link (https://dbfiddle.uk/ZHJXEqL1), then > click "9.3" choose which version you like) > all will make the remaining column "co" be not null. > > latest > 0001-Better-handle-indirect-constraint-drops.patch make c0 attnotnull be > false. > > previous patches: > v2-0001-Handle-ALTER-.-DROP-NOT-NULL-when-no-pg_constrain.patch make > c0 attnotnull be true. > 0001-Correctly-reset-attnotnull-when-constraints-dropped-.patch make > c0 attnotnull be false. > I'm not sure the SQL standard specifies what a database must do in this case. If the standard does not specify it, then it depends on each database vendor's decision: some people prefer the not-null retained, others may prefer it removed. I think it is fine as long as people can drop the not-null, or add it back again, after dropping the PK. In master, not-null is reset when we drop a PK directly; I hope dropping a PK indirectly is consistent with dropping it directly. -- Tender Wang OpenPie: https://en.openpie.com/
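The direct/indirect distinction discussed above can be sketched as follows (table names are illustrative; pk_direct_pkey assumes the default constraint-name convention):

```sql
-- Direct drop: ALTER TABLE ... DROP CONSTRAINT goes through
-- dropconstraint_internal(), which resets attnotnull on master.
CREATE TABLE pk_direct (c0 int, c1 int, PRIMARY KEY (c0, c1));
ALTER TABLE pk_direct DROP CONSTRAINT pk_direct_pkey;

-- Indirect drop: dropping a key column removes the PK through the
-- dependency machinery (RemoveConstraintById), bypassing
-- dropconstraint_internal(); the argument above is that c0's
-- attnotnull should end up the same in both cases.
CREATE TABLE pk_indirect (c0 int, c1 int, PRIMARY KEY (c0, c1));
ALTER TABLE pk_indirect DROP COLUMN c1;
```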
Re: Can't find not null constraint, but \d+ shows that
jian he wrote on Thu, Apr 11, 2024 at 14:40: > On Wed, Apr 10, 2024 at 2:10 PM jian he > wrote: > > > > DROP TABLE if exists notnull_tbl2; > > CREATE TABLE notnull_tbl2 (c0 int generated by default as IDENTITY, c1 > int); > > ALTER TABLE notnull_tbl2 ADD CONSTRAINT Q PRIMARY KEY(c0, c1); > > ALTER TABLE notnull_tbl2 DROP CONSTRAINT notnull_tbl2_c0_not_null; > > ALTER TABLE notnull_tbl2 DROP c1; > > \d notnull_tbl2 > > > ALTER TABLE notnull_tbl2 DROP CONSTRAINT notnull_tbl2_c0_not_null; > per above sequence execution order, this should error out? > > otherwise which "not null" (attribute|constraint) to anchor "generated > by default as identity" not null property? > "DROP c1" will drop the not null property for "c0" and "c1". > if "DROP CONSTRAINT notnull_tbl2_c0_not_nul" not error out, then > " ALTER TABLE notnull_tbl2 DROP c1;" > should either error out > or transform "c0" from "c0 int generated by default as identity" > to > "c0 int" > > I tried the above case on master, and on master with Alvaro's v2 patch, and both work correctly: \d+ notnull_tbl2 shows not-null for "c0". > > On Thu, Apr 11, 2024 at 1:23 AM Alvaro Herrera > wrote: > > > > On 2024-Apr-10, Alvaro Herrera wrote: > > > > > One thing missing here is pg_dump support. If you just dump this > table, > > > it'll end up with no constraint at all. That's obviously bad, so I > > > propose we have pg_dump add a regular NOT NULL constraint for those, to > > > avoid perpetuating the weird situation further. > > > > Here's another crude patchset, this time including the pg_dump aspect. 
> > > > +DROP TABLE notnull_tbl1; > +-- make sure attnotnull is reset correctly when a PK is dropped indirectly > +CREATE TABLE notnull_tbl1 (c0 int, c1 int, PRIMARY KEY (c0, c1)); > +ALTER TABLE notnull_tbl1 DROP c1; > +\d+ notnull_tbl1 > + Table "public.notnull_tbl1" > + Column | Type | Collation | Nullable | Default | Storage | Stats > target | Description > > ++-+---+--+-+-+--+- > + c0 | integer | | not null | | plain | > | > + > > this is not what we expected? > "not null" for "c0" now should be false? > am I missing something? > Yeah, this is now the expected behavior. Users can manually drop the not-null on "c0" if they want, and no error is reported. -- Tender Wang OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera wrote on Wed, Apr 10, 2024 at 21:58: > It turns out that trying to close all holes that lead to columns marked > not-null without a pg_constraint row is not possible within the ALTER > TABLE framework, because it can happen outside it also. Consider this > > CREATE DOMAIN dom1 AS integer; > CREATE TABLE notnull_tbl (a dom1, b int, PRIMARY KEY (a, b)); > DROP DOMAIN dom1 CASCADE; > > In this case you'll end up with b having attnotnull=true and no > constraint; and no amount of messing with tablecmds.c will fix it. > I tried the above case with my v4 patch [1], and it does not seem to produce the result you describe. But anyway, I now dislike updating other catalogs in RemoveConstraintById(), because it is unfriendly to callers that only want RemoveConstraintById() to remove the pg_constraint tuple, while it actually does more work stealthily. > So I propose to instead allow those constraints, and treat them as > second-class citizens. We allow dropping them with ALTER TABLE DROP NOT > NULL, and we allow to create a backing full-fledged constraint with SET > NOT NULL or ADD CONSTRAINT. So here's a partial crude initial patch to > do that. > Hmm, this patch looks like the one in my first email in this thread, though my v1 patch was a PoC at most. > > > One thing missing here is pg_dump support. If you just dump this table, > it'll end up with no constraint at all. That's obviously bad, so I > propose we have pg_dump add a regular NOT NULL constraint for those, to > avoid perpetuating the weird situation further. > > Another thing I wonder if whether I should use the existing > set_attnotnull() instead of adding drop_orphaned_notnull(). Or we could > just inline the code in ATExecDropNotNull, since it's small and > self-contained. > I like just inlining the code in ATExecDropNotNull(); as you said, it's small and self-contained. In ATExecDropNotNull() we have already opened the pg_attribute table with RowExclusiveLock held, and we already have the tuple. 
All we need to do is set attnotnull = false and call CatalogTupleUpdate(). -- > Álvaro Herrera 48°01'N 7°57'E — > https://www.EnterpriseDB.com/ > "Postgres is bloatware by design: it was built to house > PhD theses." (Joey Hellerstein, SIGMOD annual conference 2002) > [1] https://www.postgresql.org/message-id/CAHewXNn_So7LUCxxxyDNfdvCQp1TnD3gTVECBZX2bT_nbPgraQ%40mail.gmail.com -- Tender Wang OpenPie: https://en.openpie.com/
Re: Revise some error messages in split partition code
Richard Guo wrote on Wed, Apr 10, 2024 at 19:32: > I noticed some error messages in the split partition code that are not > up to par. Such as: > > "new partitions not have value %s but split partition has" > > how about we revise it to: > > "new partitions do not have value %s but split partition does" > > Another one is: > > "any partition in the list should be DEFAULT because split partition is > DEFAULT" > > how about we revise it to: > > "all partitions in the list should be DEFAULT because split partition is > DEFAULT" > > Another problem I noticed is that in the test files partition_split.sql > and partition_merge.sql, there are comments specifying the expected > error messages for certain test queries. However, in some cases, the > error message mentioned in the comment does not match the error message > actually generated by the query. Such as: > > -- ERROR: invalid partitions order, partition "sales_mar2022" can not be > merged > -- (space between sections sales_jan2022 and sales_mar2022) > ALTER TABLE sales_range MERGE PARTITIONS (sales_jan2022, sales_mar2022) > INTO sales_jan_mar2022; > ERROR: lower bound of partition "sales_mar2022" conflicts with upper > bound of previous partition "sales_jan2022" > > I'm not sure if it's a good practice to specify the expected error > message in the comment. But if we choose to do so, I think we at least > need to ensure that the specified error message in the comment remains > consistent with the error message produced by the query. > > Also there are some comments containing grammatical issues. Such as: > > -- no error: bounds of sales_noerror equals to lower and upper bounds of > sales_dec2022 and sales_feb2022 > > Attached is a patch to fix the issues I've observed. I suspect there > may be more to be found. > Yeah. The day before yesterday I found some grammar errors in error messages and code comments [1]. 
Besides those issues, @Alexander Lakhin has found some bugs [2], so I have some concerns about whether this patch is ready to commit. [1] https://www.postgresql.org/message-id/CAHewXNkGMPU50QG7V6Q60JGFORfo8LfYO1_GCkCa0VWbmB-fEw%40mail.gmail.com [2] https://www.postgresql.org/message-id/dbc8b96c-3cf0-d1ee-860d-0e491da20485%40gmail.com -- Tender Wang OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
jian he wrote on Wed, Apr 10, 2024 at 17:34: > > another related bug, in master. > > drop table if exists notnull_tbl1; > CREATE TABLE notnull_tbl1 (c0 int not null, c1 int); > ALTER TABLE notnull_tbl1 ADD CONSTRAINT Q PRIMARY KEY(c0, c1); > \d+ notnull_tbl1 > ALTER TABLE notnull_tbl1 ALTER c0 DROP NOT NULL; > ALTER TABLE notnull_tbl1 ALTER c1 DROP NOT NULL; > > "ALTER TABLE notnull_tbl1 ALTER c0 DROP NOT NULL;" > should fail? > Yeah, it should fail as before, because c0 is part of the primary key. In master, although c0's pg_attribute.attnotnull is still true, its not-null constraint has already been deleted in dropconstraint_internal(). If we drop column c1 after dropping c0's not-null, the primary key is dropped indirectly, and \d+ notnull_tbl1 still shows c0 as not-null, but altering c0 to drop not null then reports a "could not find not-null constraint" error. postgres=# ALTER TABLE notnull_tbl1 ALTER c0 DROP NOT NULL; ALTER TABLE postgres=# \d+ notnull_tbl1 Table "public.notnull_tbl1" Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description +-+---+--+-+-+-+--+- c0 | integer | | not null | | plain | | | c1 | integer | | not null | | plain | | | Indexes: "q" PRIMARY KEY, btree (c0, c1) Access method: heap postgres=# alter table notnull_tbl1 drop c1; ALTER TABLE postgres=# \d notnull_tbl1 Table "public.notnull_tbl1" Column | Type | Collation | Nullable | Default +-+---+--+- c0 | integer | | not null | postgres=# alter table notnull_tbl1 alter c0 drop not null; ERROR: could not find not-null constraint on column "c0", relation "notnull_tbl1" -- Tender Wang OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
jian he wrote on Wed, Apr 10, 2024 at 14:10: > On Wed, Apr 10, 2024 at 1:29 AM Alvaro Herrera > wrote: > > > > On 2024-Mar-29, Tender Wang wrote: > > > > > I think aboved case can explain what's meaning about comments in > > > dropconstraint_internal. > > > But here, in RemoveConstraintById() , we only care about primary key > case, > > > so NOT NULL is better to removed from comments. > > > > Actually, I think it's better if all the resets of attnotnull occur in > > RemoveConstraintById, for both types of constraints; we would keep that > > block in dropconstraint_internal only to raise errors in the cases where > > the constraint is protecting a replica identity or a generated column. > > Something like the attached, perhaps, may need more polish. > > > > DROP TABLE if exists notnull_tbl2; > CREATE TABLE notnull_tbl2 (c0 int generated by default as IDENTITY, c1 > int); > ALTER TABLE notnull_tbl2 ADD CONSTRAINT Q PRIMARY KEY(c0, c1); > ALTER TABLE notnull_tbl2 DROP CONSTRAINT notnull_tbl2_c0_not_null; > ALTER TABLE notnull_tbl2 DROP c1; > \d notnull_tbl2 > > last "\d notnull_tbl2" command, master output is: > Table "public.notnull_tbl2" > Column | Type | Collation | Nullable | Default > > +-+---+--+-- > c0 | integer | | not null | generated by default as identity > > > > last "\d notnull_tbl2" command, applying > 0001-Correctly-reset-attnotnull-when-constraints-dropped-.patch > output: > Table "public.notnull_tbl2" > Column | Type | Collation | Nullable | Default > > +-+---+--+-- > c0 | integer | | | generated by default as identity > Hmm, ALTER TABLE notnull_tbl2 DROP c1; does not call dropconstraint_internal(). When the PK constraint is dropped indirectly, c0's attnotnull is set to false in RemoveConstraintById(). -- Tender Wang OpenPie: https://en.openpie.com/
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Andrey M. Borodin wrote on Mon, Apr 8, 2024 at 17:40: > > > > On 27 Sep 2023, at 16:06, tender wang wrote: > > > >Do you have any comments or suggestions on this issue? Thanks. > Hi Tender, > > there are some review comments in the thread, that you might be interested > in. > I'll mark this [0] entry "Waiting on Author" and move to next CF. > Thank you for the reminder. I will update the patch later. I would also very much welcome more advice about this patch (even advice that it is not worth continuing to work on). Thanks! > > > Best regards, Andrey Borodin. > > [0]https://commitfest.postgresql.org/47/4549/ -- Tender Wang OpenPie: https://en.openpie.com/
Re: Add SPLIT PARTITION/MERGE PARTITIONS commands
Hi all, I went through the MERGE/SPLIT partition code today; thanks for the work. I found some grammar errors: i. in error messages (users can see these errors, so they are not friendly); ii. in code comments. Alexander Korotkov wrote on Sun, Apr 7, 2024 at 06:23: > Hi, Dmitry! > > On Fri, Apr 5, 2024 at 4:00 PM Dmitry Koval > wrote: > > > I've revised the patchset. > > > > Thanks for the corrections (especially ddl.sgml). > > Could you also look at a small optimization for the MERGE PARTITIONS > > command (in a separate file > > v31-0003-Additional-patch-for-ALTER-TABLE-.-MERGE-PARTITI.patch, I wrote > > about it in an email 2024-03-31 00:56:50)? > > > > Files v31-0001-*.patch, v31-0002-*.patch are the same as > > v30-0001-*.patch, v30-0002-*.patch (after rebasing because patch stopped > > applying due to changes in upstream). > > I've pushed 0001 and 0002. I didn't push 0003 for the following reasons. > 1) This doesn't keep functionality equivalent to 0001. With 0003, the > merged partition will inherit indexes, constraints, and so on from the > one of merging partitions. > 2) This is not necessarily an optimization. Without 0003 indexes on > the merged partition are created after moving the rows in > attachPartitionTable(). With 0003 we merge data into the existing > partition which saves its indexes. That might cause a significant > performance loss because mass inserts into indexes may be much slower > than building indexes from scratch. > I think both aspects need to be carefully considered. Even if we > accept them, this needs to be documented. I think now it's too late > for both of these. So, this should wait for v18. > > -- > Regards, > Alexander Korotkov > > > -- Tender Wang OpenPie: https://en.openpie.com/ 0001-Fix-some-grammer-errors-from-error-messages-and-code.patch Description: Binary data
Re: Can't find not null constraint, but \d+ shows that
It has been several days since the last email. Do you have any suggestions? What I'm concerned about is whether adding a new AT_PASS is a good fix, and whether it is too invasive a change. My bigger concern is whether it can cover all ALTER TABLE cases. Any thoughts? -- Tender Wang OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
jian he wrote on Fri, Mar 29, 2024 at 14:56: > hi. > about v4, i think, i understand the changes you made. > RemoveConstraintById(Oid conId) > will drop a single constraint record. > if the constraint is primary key, then primary key associated > attnotnull should set to false. > but sometimes it shouldn't. > Yeah, indeed. > > > for example: > drop table if exists t2; > CREATE TABLE t2(c0 int, c1 int); > ALTER TABLE t2 ADD CONSTRAINT t2_pk PRIMARY KEY(c0, c1); > ALTER TABLE t2 ALTER COLUMN c0 ADD GENERATED ALWAYS AS IDENTITY; > ALTER TABLE t2 DROP c1; > > + * If this was a NOT NULL or the primary key, the constrained columns must > + * have had pg_attribute.attnotnull set. See if we need to reset it, and > + * do so. > + */ > + if (unconstrained_cols != NIL) > > unconstrained_cols is not NIL, which means we have dropped a primary key. > I am confused by "If this was a NOT NULL or the primary key". > NOT NULL means the column definition has a not-null constraint. For example, create table t1(a int not null); sets pg_attribute.attnotnull to true. The primary key case is similar: create table t2(a int primary key); also sets pg_attribute.attnotnull to true. I think these cases explain what the comments in dropconstraint_internal mean. But here, in RemoveConstraintById(), we only care about the primary key case, so NOT NULL is better removed from the comments. > > + > + /* > + * Since the above deletion has been made visible, we can now > + * search for any remaining constraints on this column (or these > + * columns, in the case we're dropping a multicol primary key.) > + * Then, verify whether any further NOT NULL exists, and reset > + * attnotnull if none. > + */ > + contup = findNotNullConstraintAttnum(RelationGetRelid(rel), attnum); > + if (HeapTupleIsValid(contup)) > + { > + heap_freetuple(contup); > + heap_freetuple(atttup); > + continue; > + } > > I am a little bit confused by the above comment. 
> I think the comments should say, > if contup is valid, that means, we already have one "not null" > constraint associate with the attnum > in that condition, we must not set attnotnull, otherwise the > consistency between attnotnull and "not null" > table constraint will be broken. > > other than that, the change in RemoveConstraintById looks sane. > The above comments mean that after the PK constraint is dropped, if a tuple remains in pg_constraint, then the column definition has its own not-null constraint, so we can't set pg_attribute.attnotnull to false. For example: create table t1(a int not null); alter table t1 add constraint t1_pk primary key(a); alter table t1 drop constraint t1_pk; -- Tender Wang OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera wrote on Thu, Mar 28, 2024 at 17:18: > On 2024-Mar-28, Tender Wang wrote: > > > RemoveConstraintById() should think recurse(e.g. partition table)? I'm > not > > sure now. > > If we should think process recurse in RemoveConstraintById(), the > > function will look complicated than before. > > No -- this function handles just a single constraint, as identified by > OID. The recursion is handled by upper layers, which can be either > dependency.c or tablecmds.c. I think the problem you found is caused by > the fact that I worked with the tablecmds.c recursion and neglected the > one in dependency.c. > Indeed. create table skip_wal_skip_rewrite_index (c varchar(10) primary key); alter table skip_wal_skip_rewrite_index alter c type varchar(20); The SQL above needs attnotnull to be true when re-adding the index, but as far as I know RemoveConstraintById() can hardly recognize this scenario. We should reset attnotnull to true before re-adding the index. I added a new AT_PASS in the attached patch. Any thoughts? -- Tender Wang OpenPie: https://en.openpie.com/ v4-0001-Fix-pg_attribute-attnotnull-not-reset-when-droppe.patch Description: Binary data
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera wrote on Tue, Mar 26, 2024 at 23:25: > On 2024-Mar-26, Tender Wang wrote: > > > postgres=# CREATE TABLE t1(c0 int, c1 int); > > postgres=# ALTER TABLE t1 ADD CONSTRAINT Q PRIMARY KEY(c0, c1); > > postgres=# ALTER TABLE t1 DROP c1; > > > > postgres=# ALTER TABLE t1 ALTER c0 DROP NOT NULL; > > ERROR: could not find not-null constraint on column "c0", relation "t1" > > Ooh, hah, what happens here is that we drop the PK constraint > indirectly, so we only go via doDeletion rather than the tablecmds.c > code, so we don't check the attnotnull flags that the PK was protecting. > > > The attached patch is my workaround solution. Look forward your apply. > > Yeah, this is not a very good approach -- I think you're just guessing > that the column is marked NOT NULL because a PK was dropped in the > past -- but really what this catalog state is, is corrupted contents > because the PK drop was mishandled. At least in theory there are other > ways to drop a constraint other than dropping one of its columns (for > example, maybe this could happen if you drop a collation that the PK > depends on). The right approach is to ensure that the PK drop always > does the dance that ATExecDropConstraint does. A good fix probably just > moves some code from dropconstraint_internal to RemoveConstraintById. > Should RemoveConstraintById() handle recursion (e.g. for partitioned tables)? I'm not sure yet. If we have to handle recursion in RemoveConstraintById(), the function will look more complicated than before. -- Tender Wang OpenPie: https://en.openpie.com/
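A sketch of the recursion concern raised above, using an illustrative partitioned table (per the quoted reply, the recursion is handled by upper layers in dependency.c or tablecmds.c, not by RemoveConstraintById() itself):

```sql
-- The PK on the partitioned parent cascades to each partition, so
-- dropping it must also clear attnotnull on every partition's key
-- columns, not just the parent's.
CREATE TABLE part_tbl (a int, b int, PRIMARY KEY (a, b))
  PARTITION BY RANGE (a);
CREATE TABLE part_tbl_p1 PARTITION OF part_tbl
  FOR VALUES FROM (0) TO (10);
ALTER TABLE part_tbl DROP CONSTRAINT part_tbl_pkey;
```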
Re: Can't find not null constraint, but \d+ shows that
jian he wrote on Thu, Mar 28, 2024 at 13:18: > On Wed, Mar 27, 2024 at 10:26 PM Tender Wang wrote: > > > > Alvaro Herrera wrote on Tue, Mar 26, 2024 at 23:25: > >> > >> On 2024-Mar-26, Tender Wang wrote: > >> > >> > postgres=# CREATE TABLE t1(c0 int, c1 int); > >> > postgres=# ALTER TABLE t1 ADD CONSTRAINT Q PRIMARY KEY(c0, c1); > >> > postgres=# ALTER TABLE t1 DROP c1; > >> > > >> > postgres=# ALTER TABLE t1 ALTER c0 DROP NOT NULL; > >> > ERROR: could not find not-null constraint on column "c0", relation > "t1" > >> > >> Ooh, hah, what happens here is that we drop the PK constraint > >> indirectly, so we only go via doDeletion rather than the tablecmds.c > >> code, so we don't check the attnotnull flags that the PK was protecting. > >> > >> > The attached patch is my workaround solution. Look forward your > apply. > >> > > after applying > v2-0001-Fix-pg_attribute-attnotnull-not-reset-when-droppe.patch > > something is off, now i cannot drop a table. > demo: > CREATE TABLE t2(c0 int, c1 int); > ALTER TABLE t2 ADD CONSTRAINT t2_pk PRIMARY KEY(c0, c1); > ALTER TABLE t2 ALTER COLUMN c0 ADD GENERATED ALWAYS AS IDENTITY; > DROP TABLE t2 cascade; > Similarly, maybe there will be some issue with replica identity. > Thanks for reviewing this patch. Yeah, I can reproduce it. The error is reported in RemoveConstraintById(), where I moved some code from dropconstraint_internal(). But some of those checks seem unnecessary in RemoveConstraintById(). For example: /* * It's not valid to drop the not-null constraint for a GENERATED * AS IDENTITY column. */ if (attForm->attidentity) ereport(ERROR, errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), errmsg("column \"%s\" of relation \"%s\" is an identity column", get_attname(RelationGetRelid(rel), attnum, false), RelationGetRelationName(rel))); /* * It's not valid to drop the not-null constraint for a column in * the replica identity index, either. (FULL is not affected.) 
*/ if (bms_is_member(lfirst_int(lc) - FirstLowInvalidHeapAttributeNumber, ircols)) ereport(ERROR, errcode(ERRCODE_INVALID_TABLE_DEFINITION), errmsg("column \"%s\" is in index used as replica identity", get_attname(RelationGetRelid(rel), lfirst_int(lc), false))); Can the above two checks be removed from RemoveConstraintById()? I need to test more. > > + /* > + * If this was a NOT NULL or the primary key, the constrained columns must > + * have had pg_attribute.attnotnull set. See if we need to reset it, and > + * do so. > + */ > + if (unconstrained_cols) > it should be if (unconstrained_cols != NIL)?, > given unconstrained_cols is a List, also "unconstrained_cols" naming > seems not intuitive. > maybe pk_attnums or pk_cols or pk_columns. > As I said above, the code was copied from dropconstraint_internal(), and NOT NULL columns are not always PK columns, so I think "unconstrained_cols" is OK. > > + attrel = table_open(AttributeRelationId, RowExclusiveLock); + rel = table_open(con->conrelid, RowExclusiveLock); > I am not sure why we using RowExclusiveLock for con->conrelid? > given we use AccessExclusiveLock at: > /* > * If the constraint is for a relation, open and exclusive-lock the > * relation it's for. > */ > rel = table_open(con->conrelid, AccessExclusiveLock); > Yeah, you are right. > > > + /* + * Since the above deletion has been made visible, we can now + * search for any remaining constraints on this column (or these + * columns, in the case we're dropping a multicol primary key.) + * Then, verify whether any further NOT NULL or primary key + * exists, and reset attnotnull if none. + * + * However, if this is a generated identity column, abort the + * whole thing with a specific error message, because the + * constraint is required in that case. 
> + */ > + contup = findNotNullConstraintAttnum(RelationGetRelid(rel), attnum); > + if (contup || > + bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, > + pkcols)) > + continue; > > I didn't get this part. > if you drop delete a primary key, > the "NOT NULL" constraint within pg_constraint should definitely be > removed? > therefore contup should be pretty sure is NULL? > No. If the original column definition includes NOT NULL, we can't reset attnotnull to false when we drop the PK. > > /* > - * The parser will create AT_AttSetNotNull subcommands for > - * columns of PRIMARY KEY indexes/constraints, but we need > - * not do anything with them here, because the columns' > - * NOT NULL marks will already have been propagated into > - * the new table definition. > + * PK drop now will reset pg_attribute attnotnull to false. > + * We should set attnotnull to true again. > */ > PK drop now will reset pg_attribute attnotnull to false, > which is what we should be expecting. > the comment didn't explain why should set attnotnull to true again? > The v2 patch still needs more test cases and is probably not the right solution. Anyway, I will send a v3 patch after more testing. -- Tender Wang OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera wrote on Tue, Mar 26, 2024 at 23:25: > On 2024-Mar-26, Tender Wang wrote: > > > postgres=# CREATE TABLE t1(c0 int, c1 int); > > postgres=# ALTER TABLE t1 ADD CONSTRAINT Q PRIMARY KEY(c0, c1); > > postgres=# ALTER TABLE t1 DROP c1; > > > > postgres=# ALTER TABLE t1 ALTER c0 DROP NOT NULL; > > ERROR: could not find not-null constraint on column "c0", relation "t1" > > Ooh, hah, what happens here is that we drop the PK constraint > indirectly, so we only go via doDeletion rather than the tablecmds.c > code, so we don't check the attnotnull flags that the PK was protecting. > > > The attached patch is my workaround solution. Look forward your apply. > > Yeah, this is not a very good approach -- I think you're just guessing > that the column is marked NOT NULL because a PK was dropped in the > past -- but really what this catalog state is, is corrupted contents > because the PK drop was mishandled. At least in theory there are other > ways to drop a constraint other than dropping one of its columns (for > example, maybe this could happen if you drop a collation that the PK > depends on). The right approach is to ensure that the PK drop always > does the dance that ATExecDropConstraint does. A good fix probably just > moves some code from dropconstraint_internal to RemoveConstraintById. > I found that some DDL subcommand types check that a column's attnotnull is true, for example AT_ReAddIndex and AT_ReplicaIdentity. So we should add an AT_SetAttNotNull subcommand to the wqueue. I added a new AT_PASS_OLD_COL_ATTRS pass to make sure AT_SetAttNotNull has been executed before AT_ReAddIndex or AT_ReplicaIdentity runs. -- Tender Wang OpenPie: https://en.openpie.com/ v2-0001-Fix-pg_attribute-attnotnull-not-reset-when-droppe.patch Description: Binary data
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera 于2024年3月26日周二 23:25写道: > On 2024-Mar-26, Tender Wang wrote: > > > postgres=# CREATE TABLE t1(c0 int, c1 int); > > postgres=# ALTER TABLE t1 ADD CONSTRAINT Q PRIMARY KEY(c0, c1); > > postgres=# ALTER TABLE t1 DROP c1; > > > > postgres=# ALTER TABLE t1 ALTER c0 DROP NOT NULL; > > ERROR: could not find not-null constraint on column "c0", relation "t1" > > Ooh, hah, what happens here is that we drop the PK constraint > indirectly, so we only go via doDeletion rather than the tablecmds.c > code, so we don't check the attnotnull flags that the PK was protecting. > > > The attached patch is my workaround solution. Look forward your apply. > > Yeah, this is not a very good approach -- I think you're just guessing > that the column is marked NOT NULL because a PK was dropped in the > past -- but really what this catalog state is, is corrupted contents > because the PK drop was mishandled. At least in theory there are other > ways to drop a constraint other than dropping one of its columns (for > example, maybe this could happen if you drop a collation that the PK > depends on). The right approach is to ensure that the PK drop always > does the dance that ATExecDropConstraint does. A good fix probably just > moves some code from dropconstraint_internal to RemoveConstraintById. > Thinking about it again, the solution below may look better: i. Move some code from dropconstraint_internal to RemoveConstraintById without changing the RemoveConstraintById interface, ensuring that the PK drop always does the dance that ATExecDropConstraint does. ii. After phase i, the attnotnull of some primary key columns may be reset to false, as in the example I provided in my last email. We can set attnotnull back to true in some place. -- Tender Wang OpenPie: https://en.openpie.com/
Re: Can't find not null constraint, but \d+ shows that
Alvaro Herrera 于2024年3月26日周二 23:25写道: > On 2024-Mar-26, Tender Wang wrote: > > > postgres=# CREATE TABLE t1(c0 int, c1 int); > > postgres=# ALTER TABLE t1 ADD CONSTRAINT Q PRIMARY KEY(c0, c1); > > postgres=# ALTER TABLE t1 DROP c1; > > > > postgres=# ALTER TABLE t1 ALTER c0 DROP NOT NULL; > > ERROR: could not find not-null constraint on column "c0", relation "t1" > > Ooh, hah, what happens here is that we drop the PK constraint > indirectly, so we only go via doDeletion rather than the tablecmds.c > code, so we don't check the attnotnull flags that the PK was protecting. > Yeah, indeed, as you said. > The attached patch is my workaround solution. Look forward your apply. > > Yeah, this is not a very good approach -- I think you're just guessing > that the column is marked NOT NULL because a PK was dropped in the > past -- but really what this catalog state is, is corrupted contents > because the PK drop was mishandled. At least in theory there are other > ways to drop a constraint other than dropping one of its columns (for > example, maybe this could happen if you drop a collation that the PK > depends on). The right approach is to ensure that the PK drop always > does the dance that ATExecDropConstraint does. A good fix probably just > moves some code from dropconstraint_internal to RemoveConstraintById. > Agreed, that looks better. But it will not work to simply move some code from dropconstraint_internal to RemoveConstraintById. I tried that fix before the 0001 patch, but it failed. For example: create table skip_wal_skip_rewrite_index (c varchar(10) primary key); alter table skip_wal_skip_rewrite_index alter c type varchar(20); ERROR: primary key column "c" is not marked NOT NULL index_check_primary_key() in index.c has the comment below: "We check for a pre-existing primary key, and that all columns of the index are simple column references (not expressions), and that all those columns are marked NOT NULL. If not, fail." 
So in the above example, RemoveConstraintById() can't reset attnotnull. We could pass some information to RemoveConstraintById(), like a boolean flag, to indicate whether attnotnull should be reset. -- Tender Wang OpenPie: https://en.openpie.com/
Can't find not null constraint, but \d+ shows that
Hi Alvaro, I met an issue related to the catalog not-null commit on HEAD. postgres=# CREATE TABLE t1(c0 int, c1 int); CREATE TABLE postgres=# ALTER TABLE t1 ADD CONSTRAINT Q PRIMARY KEY(c0, c1); ALTER TABLE postgres=# \d+ t1 Table "public.t1" Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description +-+---+--+-+-+-+--+- c0 | integer | | not null | | plain | | | c1 | integer | | not null | | plain | | | Indexes: "q" PRIMARY KEY, btree (c0, c1) Access method: heap postgres=# ALTER TABLE t1 DROP c1; ALTER TABLE postgres=# \d+ t1 Table "public.t1" Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description +-+---+--+-+-+-+--+- c0 | integer | | not null | | plain | | | Access method: heap postgres=# ALTER TABLE t1 ALTER c0 DROP NOT NULL; ERROR: could not find not-null constraint on column "c0", relation "t1" postgres=# insert into t1 values (NULL); ERROR: null value in column "c0" of relation "t1" violates not-null constraint DETAIL: Failing row contains (null). I couldn't reproduce the above issue on older versions (12.12 ... 16.1); to be more precise, it appears since commit b0e96f3119. Without b0e96f3119, when we drop a not-null constraint, we just update pg_attribute's attnotnull to false in ATExecDropNotNull(). But now we first check whether pg_constraint has the tuple; if attnotnull is true but pg_constraint doesn't have that tuple, the above error is reported. This will be confusing for users: \d shows that column c0 is not null, and we can't insert a NULL value, but an error is reported when users drop the NOT NULL constraint. The attached patch is my workaround solution. I look forward to your applying it. -- Tender Wang OpenPie: https://en.openpie.com/ 0001-Fix-attnotnull-not-correct-reset-issue.patch Description: Binary data
Re: "type with xxxx does not exist" when doing ExecMemoize()
David Rowley 于2024年3月11日周一 13:25写道: > On Thu, 7 Mar 2024 at 22:50, David Rowley wrote: > > > > On Thu, 7 Mar 2024 at 15:24, Tender Wang wrote: > > > > > > Andrei Lepikhov 于2024年3月6日周三 11:37写道: > > >> I think, it is a bug. Should it be fixed (and back-patched) earlier? > > > > > > Agreed. Need David to review it as he knows this area best. > > > > This is on my list of things to do. Just not at the top yet. > > I've gone over this patch and I'm happy with the changes to > nodeMemoize.c. The thing I did change was the newly added test. The > problem there was the test was passing for me with and without the > code fix. I ended up changing the test so the cache hits and misses > are reported. That required moving the test to above where the > work_mem is set to 64KB so we can be certain the values will all be > cached and the cache hits are predictable. > > My other changes were just cosmetic. > > Thanks for working on this fix. I've pushed the patch. > > David > Thanks for pushing the patch. -- Tender Wang OpenPie: https://en.openpie.com/
Re: "type with xxxx does not exist" when doing ExecMemoize()
Andrei Lepikhov 于2024年3月6日周三 11:37写道: > I think, it is a bug. Should it be fixed (and back-patched) earlier? > Agreed. Need David to review it as he knows this area best. -- Tender Wang OpenPie: https://en.openpie.com/
Re: "type with xxxx does not exist" when doing ExecMemoize()
Andrei Lepikhov 于2024年3月5日周二 17:36写道: > On 1/3/2024 14:18, Tender Wang wrote: > > During debug, I learned that numeric_add doesn't have type check like > > rangetype, so aboved query will not report "type with xxx does not > exist". > > > > And I realize that the test case added by Andrei Lepikhov in v3 is > > right. So in v6 patch I add Andrei Lepikhov's test case. Thanks a lot. > > > > Now I think the v6 version patch seems to be complete now. > I've passed through the patch, and it looks okay. Although I am afraid > of the same problems that future changes can cause and how to detect > them, it works correctly. > Thanks for reviewing it; I've added it to commitfest 2024-07. -- Tender Wang OpenPie: https://en.openpie.com/
Re: "type with xxxx does not exist" when doing ExecMemoize()
Hi, While thinking about how to add a test case for the v5 patch, I wanted to test whether it has a memory leak. Thread [1] provided a way to reproduce the memory leak, so I used it to test the v5 patch. I didn't find a memory leak in the v5 patch, but I found another interesting issue. When I changed the whereClause in [1], the query reported the error below: "ERROR could not find memoization table entry" the query: EXPLAIN analyze select sum(q.id_table1) from ( SELECT t2.* FROM table1 t1 JOIN table2 t2 ON (t2.id_table1 + t2.id_table1) = t1.id) q; But with the v5 patch, it didn't report an error. I guess it is for the same reason: the data in probeslot was reset in the hash function. I debugged the above query and got this: before (gdb) p *(DatumGetNumeric(mstate->probeslot->tts_values[0])) $1 = {vl_len_ = 48, choice = {n_header = 32770, n_long = {n_sign_dscale = 32770, n_weight = 60, n_data = 0x564632ebd708}, n_short = {n_header = 32770, n_data = 0x564632ebd706}}} after (gdb) p *(DatumGetNumeric(mstate->probeslot->tts_values[0])) $2 = {vl_len_ = 264, choice = {n_header = 32639, n_long = {n_sign_dscale = 32639, n_weight = 32639, n_data = 0x564632ebd6a8}, n_short = {n_header = 32639, n_data = 0x564632ebd6a6}}} So after calling ResetExprContext() in the hash function, the data in probeslot is corrupted. It is not certain what error will happen when executing on corrupted data. During debugging, I learned that numeric_add doesn't have a type check like rangetype does, so the above query will not report "type with xxx does not exist". And I realized that the test case added by Andrei Lepikhov in v3 is right, so in the v6 patch I added Andrei Lepikhov's test case. Thanks a lot. I think the v6 patch is complete now. [1] https://www.postgresql.org/message-id/83281eed63c74e4f940317186372abfd%40cft.ru -- Tender Wang OpenPie: https://en.openpie.com/ v6-0001-Fix-wrong-used-ResetExprContext-in-ExecMemoize.patch Description: Binary data
Re: "type with xxxx does not exist" when doing ExecMemoize()
I recently read the Memoize code and how other nodes use ResetExprContext(). The comments about per_tuple_memory say: * ecxt_per_tuple_memory is a short-term context for expression results. * As the name suggests, it will typically be reset once per tuple, * before we begin to evaluate expressions for that tuple. Each * ExprContext normally has its very own per-tuple memory context. So ResetExprContext() should be called once per tuple, but not in the hash and equal functions, just as Richard said before. ExecResult() and ExecProjectSet() call ResetExprContext() once upon entering those functions, so I think ExecMemoize() can do the same. The attached patch includes the modifications below: 1. While reading the code in nodeMemoize.c, I found a typo: outer should be inner, if I don't misunderstand the intent of Memoize. 2. I found that almost every executor node calls CHECK_FOR_INTERRUPTS(), so I added it. Is it right to add it for ExecMemoize()? 3. I removed ResetExprContext() from the hash and equal functions, and I call it upon entering ExecMemoize(), just like ExecProjectSet() does. ExecQualAndReset() is replaced with ExecQual(). 4. This patch doesn't include a test case. I used Andrei's test case, but I couldn't reproduce the above issue; I may need to spend some more time thinking about how to reproduce it easily. So, what do you think about the approach proposed in v5? @Andrei Lepikhov @Richard Guo @David Rowley . I don't want to continue working based on the v3 patch. As Andrei Lepikhov said, using mstate->tableContext for probeslot is not good; v5 looks simpler. v5-0001-Fix-wrong-used-ResetExprContext-in-ExecMemoize.patch Description: Binary data
Re: "type with xxxx does not exist" when doing ExecMemoize()
The attached patch is a new version based on v3 (not including Andrei's test case). There is no need to call datumCopy when isnull is true. I have not added a new runtime memory context so far and continue to use mstate->tableContext; I'm not sure whether the memory used by probeslot will affect mstate->mem_limit. Maybe adding a new memory context is better. I think I should spend a little time learning nodeMemoize.c more deeply. Andrei Lepikhov 于2024年2月26日周一 20:29写道: > On 26/2/2024 18:34, Richard Guo wrote: > > > > On Mon, Feb 26, 2024 at 3:54 PM Andrei Lepikhov > > mailto:a.lepik...@postgrespro.ru>> wrote: > > > > On 26/2/2024 12:44, Tender Wang wrote: > > > Make sense. I found MemoizeState already has a MemoryContext, so > > I used it. > > > I update the patch. > > This approach is better for me. In the next version of this patch, I > > included a test case. I am still unsure about the context chosen and > > the > > stability of the test case. Richard, you recently fixed some Memoize > > issues, could you look at this problem and patch? > > > > > > I looked at this issue a bit. It seems to me what happens is that at > > first the memory areas referenced by probeslot->tts_values[] are > > allocated in the per tuple context (see prepare_probe_slot). And then > > in MemoizeHash_hash, after we've calculated the hashkey, we will reset > > the per tuple context. However, later in MemoizeHash_equal, we still > > need to reference the values in probeslot->tts_values[], which have been > > cleared. > Agree > > > > Actually the context would always be reset in MemoizeHash_equal, for > > both binary and logical mode. So I kind of wonder if it's necessary to > > reset the context in MemoizeHash_hash. > I can only provide one thought against this solution: what if we have a > lot of unique hash values, maybe all of them? In that case, we still > have a kind of 'leak' David fixed by the commit 0b053e78b5. > Also, I have a segfault report of one client. 
As I see, it was caused by > too long text column in the table slot. As I see, key value, stored in > the Memoize hash table, was corrupted, and the most plain reason is this > bug. Should we add a test on this bug, and what do you think about the > one proposed in v3? > > -- > regards, > Andrei Lepikhov > Postgres Professional > > -- Tender Wang OpenPie: https://en.openpie.com/ v4-0001-Fix-RangeType-oid-not-found-when-doing-Memoize.patch Description: Binary data
"type with xxxx does not exist" when doing ExecMemoize()
Hi, I hit a Memoize node failure when I used sqlancer to test postgres. database0=# explain select t0.c0 from t0 join t5 on t0.c0 = (t5.c0 - t5.c0); QUERY PLAN -- Nested Loop (cost=0.17..21.20 rows=4 width=32) -> Seq Scan on t5 (cost=0.00..1.04 rows=4 width=14) -> Memoize (cost=0.17..6.18 rows=1 width=32) Cache Key: (t5.c0 - t5.c0) Cache Mode: logical -> Index Only Scan using t0_c0_key on t0 (cost=0.15..6.17 rows=1 width=32) Index Cond: (c0 = (t5.c0 - t5.c0)) (7 rows) database0=# select t0.c0 from t0 join t5 on t0.c0 = (t5.c0 - t5.c0); ERROR: type with OID 2139062143 does not exist How to repeat: The attached database0.log (created by sqlancer) includes statements to reproduce this issue. First, create database test; then: psql postgres \i /xxx/database0.log I analyzed the above issue this weekend, and I found that after calling ResetExprContext() in MemoizeHash_hash(), the data in mstate->probeslot was corrupted. In prepare_probe_slot(), the data is as below: (gdb) p *(DatumGetRangeTypeP(pslot->tts_values[0])) $1 = {vl_len_ = 36, rangetypid = 3904} After calling ResetExprContext() in MemoizeHash_hash: (gdb) p *(DatumGetRangeTypeP(pslot->tts_values[0])) $3 = {vl_len_ = 264, rangetypid = 2139062143} I think prepare_probe_slot() should call datumCopy(), as the attached patch does. Any thoughts? Thanks. -- Tender Wang OpenPie: https://en.openpie.com/ database0.log Description: Binary data 0001-Fix-RangeType-oid-not-found-when-doing-Memoize.patch Description: Binary data
Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Andrey M. Borodin 于2023年12月14日周四 17:35写道: > > > > On 14 Dec 2023, at 14:28, tender wang wrote: > > > > Now that AND is more faster, Can we replace the '% > SLRU_MAX_BANKLOCKS' operation in SimpleLruGetBankLock() with '& 127' > > unsigned int GetBankno1(unsigned int pageno) { > return pageno & 127; > } > > unsigned int GetBankno2(unsigned int pageno) { > return pageno % 128; > } > > Generates with -O2 > > GetBankno1(unsigned int): > mov eax, edi > and eax, 127 > ret > GetBankno2(unsigned int): > mov eax, edi > and eax, 127 > ret > > > Compiler is smart enough with constants. > Yeah, that's true. int GetBankno(long pageno) { unsigned short bank_mask = 128; int bankno = (pageno & bank_mask) % 128; return bankno; } With -O2 enabled, this compiles to only one instruction: xor eax, eax But if we use '%' everywhere, things change as below: int GetBankno(long pageno) { unsigned short bank_mask = 128; int bankno = (pageno % bank_mask) % 128; return bankno; } mov rdx, rdi sar rdx, 63 shr rdx, 57 lea rax, [rdi+rdx] and eax, 127 sub eax, edx > Best regards, Andrey Borodin.
Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Andrey M. Borodin 于2023年12月14日周四 17:02写道: > > > > On 14 Dec 2023, at 08:12, Amul Sul wrote: > > > > > > + int bankno = pageno & ctl->bank_mask; > > > > I am a bit uncomfortable seeing it as a mask, why can't it be simply a > number > > of banks (num_banks) and get the bank number through modulus op (pageno % > > num_banks) instead of bitwise & operation (pageno & ctl->bank_mask) > which is a > > bit difficult to read compared to modulus op which is quite simple, > > straightforward and much common practice in hashing. > > > > Are there any advantages of using & over % ? > Using the Compiler Explorer[1] tool, '%' generates more assembly instructions than '&'. int GetBankno1(int pageno) { return pageno & 127; } int GetBankno2(int pageno) { return pageno % 127; } Under clang 13.0: GetBankno1: # @GetBankno1 push rbp mov rbp, rsp mov dword ptr [rbp - 4], edi mov eax, dword ptr [rbp - 4] and eax, 127 pop rbp ret GetBankno2: # @GetBankno2 push rbp mov rbp, rsp mov dword ptr [rbp - 4], edi mov eax, dword ptr [rbp - 4] mov ecx, 127 cdq idiv ecx mov eax, edx pop rbp ret Under gcc 13.2: GetBankno1: push rbp mov rbp, rsp mov DWORD PTR [rbp-4], edi mov eax, DWORD PTR [rbp-4] and eax, 127 pop rbp ret GetBankno2: push rbp mov rbp, rsp mov DWORD PTR [rbp-4], edi mov eax, DWORD PTR [rbp-4] movsx rdx, eax imul rdx, rdx, -2130574327 shr rdx, 32 add edx, eax mov ecx, edx sar ecx, 6 cdq sub ecx, edx mov edx, ecx sal edx, 7 sub edx, ecx sub eax, edx mov ecx, eax mov eax, ecx pop rbp ret [1] https://godbolt.org/ The instruction AND is ~20 times faster than IDIV [0]. This is relatively > hot function, worth sacrificing some readability to save ~ten nanoseconds > on each check of a status of a transaction. > Now that AND is faster, can we replace the '% SLRU_MAX_BANKLOCKS' operation in SimpleLruGetBankLock() with '& 127'? SimpleLruGetBankLock() { int banklockno = (pageno & ctl->bank_mask) % SLRU_MAX_BANKLOCKS; /* use '&' */ return &(ctl->shared->bank_locks[banklockno].lock); } Thoughts? 
> > [0] https://www.agner.org/optimize/instruction_tables.pdf > >
Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
The v8-0001 patch failed to apply in my local repo as below: git apply v8-0001-Make-all-SLRU-buffer-sizes-configurable.patch error: patch failed: src/backend/access/transam/multixact.c:1851 error: src/backend/access/transam/multixact.c: patch does not apply error: patch failed: src/backend/access/transam/subtrans.c:184 error: src/backend/access/transam/subtrans.c: patch does not apply error: patch failed: src/backend/commands/async.c:117 error: src/backend/commands/async.c: patch does not apply error: patch failed: src/backend/storage/lmgr/predicate.c:808 error: src/backend/storage/lmgr/predicate.c: patch does not apply error: patch failed: src/include/commands/async.h:15 error: src/include/commands/async.h: patch does not apply My local head commit is 15c9ac36299. Is there something I missed? Dilip Kumar 于2023年11月24日周五 17:08写道: > On Fri, Nov 24, 2023 at 10:17 AM Dilip Kumar > wrote: > > > > On Thu, Nov 23, 2023 at 11:34 AM Dilip Kumar > wrote: > > > > > > Note: With this testing, we have found a bug in the bank-wise > > > approach, basically we are clearing a procglobal->clogGroupFirst, even > > > before acquiring the bank lock that means in most of the cases there > > > will be a single process in each group as a group leader > > > > I realized that the bug fix I have done is not proper, so will send > > the updated patch set with the proper fix soon. > > PFA, updated patch set fixes the bug found during the testing of the > group update using the injection point. Also attached a path to test > the injection point but for that, we need to apply the injection point > patches [1] > > [1] https://www.postgresql.org/message-id/ZWACtHPetBFIvP61%40paquier.xyz > > -- > Regards, > Dilip Kumar > EnterpriseDB: http://www.enterprisedb.com >
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Here is the v2 patch. A patch without a test case is not a good patch; I'm just sharing my idea about this issue. Hope to get your reply. Alvaro Herrera 于2023年10月25日周三 20:13写道: > On 2023-Oct-25, tender wang wrote: > > > Hi > >Is there any conclusion to this issue? > > None yet. I intend to work on this at some point, hopefully soon. > > -- > Álvaro Herrera PostgreSQL Developer — > https://www.EnterpriseDB.com/ > From 24491419ad65871e54207d3ef481d8abe529e1e1 Mon Sep 17 00:00:00 2001 From: "tender.wang" Date: Fri, 27 Oct 2023 13:48:48 +0800 Subject: [PATCH v2] Fix partition detach issue. --- src/backend/commands/tablecmds.c | 21 + 1 file changed, 21 insertions(+) diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 416a98e7ce..3b897b620a 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -19356,7 +19356,9 @@ DetachPartitionFinalize(Relation rel, Relation partRel, bool concurrent, { ForeignKeyCacheInfo *fk = lfirst(cell); HeapTuple contup; + HeapTuple parentConTup; Form_pg_constraint conform; + Form_pg_constraint parentConForm; Constraint *fkconstraint; Oid insertTriggerOid, updateTriggerOid; @@ -19374,6 +19376,24 @@ DetachPartitionFinalize(Relation rel, Relation partRel, bool concurrent, continue; } + /* For referenced-side, if it is partitioned table, each partition + * has one row in pg_constraint. But it doesn't have INSERT CHECK trigger + */ + Assert(OidIsValid(conform->conparentid)); + parentConTup = SearchSysCache1(CONSTROID, + ObjectIdGetDatum(conform->conparentid)); + if (!HeapTupleIsValid(parentConTup)) + elog(ERROR, "cache lookup failed for constraint %u", + conform->conparentid); + parentConForm = (Form_pg_constraint)GETSTRUCT(parentConTup); + if (parentConForm->confrelid != conform->confrelid && + parentConForm->conrelid == conform->conrelid) + { + ReleaseSysCache(contup); + ReleaseSysCache(parentConTup); + continue; + } + /* unset conparentid and adjust conislocal, coninhcount, etc. 
*/ ConstraintSetParentConstraint(fk->conoid, InvalidOid, InvalidOid); @@ -19421,6 +19441,7 @@ DetachPartitionFinalize(Relation rel, Relation partRel, bool concurrent, NULL, NULL); ReleaseSysCache(contup); + ReleaseSysCache(parentConTup); } list_free_deep(fks); if (trigrel) -- 2.25.1
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Hi Is there any conclusion to this issue? Jehan-Guillaume de Rorthais 于2023年8月10日周四 23:03写道: > On Thu, 3 Aug 2023 11:02:43 +0200 > Alvaro Herrera wrote: > > > On 2023-Aug-03, tender wang wrote: > > > > > I think old "sub-FK" should not be dropped, that will be violates > foreign > > > key constraint. > > > > Yeah, I've been playing more with the patch and it is definitely not > > doing the right things. Just eyeballing the contents of pg_trigger and > > pg_constraint for partitions added by ALTER...ATTACH shows that the > > catalog contents are inconsistent with those added by CREATE TABLE > > PARTITION OF. > > Well, as stated in my orignal message, at the patch helps understanding the > problem and sketch a possible solution. It definitely is not complete. > > After DETACHing the table, we surely needs to check everything again and > recreating what is needed to keep the FK consistent. > > But should we keep the FK after DETACH? Did you check the two other > discussions > related to FK, self-FK & partition? Unfortunately, as Tender experienced, > the > more we dig the more we find bugs. Moreover, the second one might seems > unsolvable and deserve a closer look. See: > > * FK broken after DETACHing referencing part > https://www.postgresql.org/message-id/20230420144344.40744130%40karst > * Issue attaching a table to a partitioned table with an auto-referenced > foreign key > https://www.postgresql.org/message-id/20230707175859.17c91538%40karst > >
Re: Problem, partition pruning for prepared statement with IS NULL clause.
David Rowley 于2023年10月11日周三 15:52写道: > On Wed, 11 Oct 2023 at 15:49, David Rowley wrote: > > It might have been better if PartClauseInfo could also describe IS > > NULL quals, but I feel if we do that now then it would require lots of > > careful surgery in partprune.c to account for that. Probably the fix > > should be localised to get_steps_using_prefix_recurse() to have it do > > something like pass the keyno to try and work on rather than trying to > > get that from the "prefix" list. That way if there's no item in that > > list for that keyno, we can check in step_nullkeys for the keyno. > > > > I'll continue looking. > > The fix seems to amount to the attached. The following condition > assumes that by not recursively processing step_lastkeyno - 1 that > there will be at least one more PartClauseInfo in the prefix List to > process. However, that didn't work when that partition key clause was > covered by an IS NULL clause. > > If we adjust the following condition: > > if (cur_keyno < step_lastkeyno - 1) > > to become: > > final_keyno = ((PartClauseInfo *) llast(prefix))->keyno; > if (cur_keyno < final_keyno) > Yeah, agreed. > then that ensures that the else clause can pick up any clauses for the > final column mentioned in the 'prefix' list, plus any nullkeys if > there happens to be any of those too. > > For testing, given that 13838740f (from 2020) had a go at fixing this > already, I'm kinda thinking that it's not overkill to test all > possible 16 combinations of IS NULL and equality equals on the 4 > partition key column partitioned table that commit added in > partition_prune.sql. > > I added some tests there using \gexec to prevent having to write out > each of the 16 queries by hand. 
I tested that pruning worked (i.e 1 > matching partition in EXPLAIN), and that we get the correct results > (i.e we pruned the correct partition) by running the query and we get > the expected 1 row after having inserted 16 rows, one for each > combination of quals to test. > > I wanted to come up with some tests that test for multiple quals > matching the same partition key. This is tricky due to the > equivalence class code being smart and removing any duplicates or > marking the rel as dummy when it finds conflicting quals. With hash > partitioning, we're limited to just equality quals, so maybe something > could be done with range-partitioned tables instead. I see there are > some tests just above the ones I modified which try to cover this. > > I also tried to outsmart the planner by using Params and prepared > queries. Namely: > > set plan_cache_mode = 'force_generic_plan'; > prepare q1 (int, int, int, int, int, int, int, int) as select > tableoid::regclass,* from hp_prefix_test where a = $1 and b = $2 and c > = $3 and d = $4 and a = $5 and b = $6 and c = $7 and d = $8; > explain (costs off) execute q1 (1,2,3,4,1,2,3,4); > > But I was outsmarted again with a gating qual which checked the pairs > match before doing the scan :-( > > Append >Subplans Removed: 15 >-> Result > One-Time Filter: (($1 = $5) AND ($2 = $6) AND ($3 = $7) AND ($4 = > $8)) > -> Seq Scan on hp_prefix_test_p14 hp_prefix_test_1 >Filter: ((a = $5) AND (b = $6) AND (c = $7) AND (d = $8)) > > I'm aiming to commit these as two separate fixes, so I'm going to go > look again at the first one and wait to see if anyone wants to comment > on this patch in the meantime. > +1, LGTM > David >
Re: Problem, partition pruning for prepared statement with IS NULL clause.
For a hash-partitioned table, if a partition key is covered by an IS NULL clause, the if condition in get_steps_using_prefix_recurse(): if (cur_keyno < step_lastkeyno - 1) is not enough. As in the reported crash case — explain select * from hp where a = 1 and b is null and c = 1; — the prefix list just has the a = 1 clause. I try to fix this in the attached patch. David Rowley 于2023年10月11日周三 10:50写道: > On Tue, 10 Oct 2023 at 21:31, Sergei Glukhov > wrote: > > create table hp (a int, b text, c int, d int) > >partition by hash (a part_test_int4_ops, b part_test_text_ops, c > > part_test_int4_ops); > > create table hp0 partition of hp for values with (modulus 4, remainder > 0); > > create table hp3 partition of hp for values with (modulus 4, remainder > 3); > > create table hp1 partition of hp for values with (modulus 4, remainder > 1); > > create table hp2 partition of hp for values with (modulus 4, remainder > 2); > > > > > > Another crash in the different place even with the fix: > > explain select * from hp where a = 1 and b is null and c = 1; > > Ouch. It looks like 13838740f tried to fix things in this area before > and even added a regression test for it. Namely: > > -- Test that get_steps_using_prefix() handles non-NULL step_nullkeys > explain (costs off) select * from hp_prefix_test where a = 1 and b is > null and c = 1 and d = 1; > > I guess that one does not crash because of the "d = 1" clause is in > the "start" ListCell in get_steps_using_prefix_recurse(), whereas, > with your case start is NULL which is an issue for cur_keyno = > ((PartClauseInfo *) lfirst(start))->keyno;. > > It might have been better if PartClauseInfo could also describe IS > NULL quals, but I feel if we do that now then it would require lots of > careful surgery in partprune.c to account for that. Probably the fix > should be localised to get_steps_using_prefix_recurse() to have it do > something like pass the keyno to try and work on rather than trying to > get that from the "prefix" list. 
That way if there's no item in that > list for that keyno, we can check in step_nullkeys for the keyno. > > I'll continue looking. > > David > > > From 1640c0d05269c3368fb41fcffc66e38630ff522d Mon Sep 17 00:00:00 2001 From: "tender.wang" Date: Wed, 11 Oct 2023 11:32:04 +0800 Subject: [PATCH] Fix null partition key pruning for hash parittion table. --- src/backend/partitioning/partprune.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c index 7179b22a05..c2a388d454 100644 --- a/src/backend/partitioning/partprune.c +++ b/src/backend/partitioning/partprune.c @@ -2438,6 +2438,7 @@ get_steps_using_prefix_recurse(GeneratePruningStepsContext *context, List *result = NIL; ListCell *lc; int cur_keyno; + int prefix_lastkeyno; /* Actually, recursion would be limited by PARTITION_MAX_KEYS. */ check_stack_depth(); @@ -2445,7 +2446,11 @@ get_steps_using_prefix_recurse(GeneratePruningStepsContext *context, /* Check if we need to recurse. */ Assert(start != NULL); cur_keyno = ((PartClauseInfo *) lfirst(start))->keyno; - if (cur_keyno < step_lastkeyno - 1) + /* Note that for hash partitioning, if partition key is IS NULL clause, +* that partition key will not present in prefix List. +*/ + prefix_lastkeyno = ((PartClauseInfo *) llast(prefix))->keyno; + if (cur_keyno < step_lastkeyno - 1 && cur_keyno != prefix_lastkeyno) { PartClauseInfo *pc; ListCell *next_start; -- 2.25.1
Re: Problem, partition pruning for prepared statement with IS NULL clause.
The comment /* not needed for Consts */ might be better placed close to if (!IsA(expr, Const)). The rest looks good to me. David Rowley 于2023年10月9日周一 07:28写道: > On Sat, 7 Oct 2023 at 03:11, Sergei Glukhov > wrote: > > I noticed that combination of prepared statement with generic plan and > > 'IS NULL' clause could lead partition pruning to crash. > > > Test case: > > -- > > set plan_cache_mode to force_generic_plan; > > prepare stmt AS select * from hp where a is null and b = $1; > > explain execute stmt('xxx'); > > Thanks for the detailed report and proposed patch. > > I think your proposed fix isn't quite correct. I think the problem > lies in InitPartitionPruneContext() where we assume that the list > positions of step->exprs are in sync with the keyno. If you look at > perform_pruning_base_step() the code there makes a special effort to > skip over any keyno when a bit is set in opstep->nullkeys. > > It seems that your patch is adjusting the keyno that's given to the > PruneCxtStateIdx() and it looks like (for your test case) it'll end up > passing keyno==0 when it should be passing keyno==1. keyno is the > index of the partition key, so you can't pass 0 when it's for key > index 1. > > I wonder if it's worth expanding the tests further to cover more of > the pruning cases to cover run-time pruning too. > I think it's worth doing that. David >
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Hi tom, Do you have any comments or suggestions on this issue? Thanks. Richard Guo 于2023年9月8日周五 14:06写道: > > On Fri, Sep 8, 2023 at 3:15 AM Robert Haas wrote: > >> The example query provided here seems rather artificial. Surely few >> people write a join clause that references neither of the tables being >> joined. Is there a more realistic case where this makes a big >> difference? > > > Yes the given example query is not that convincing. I tried a query > with plans as below (after some GUC setting) which might be more > realistic in real world. > > unpatched: > > explain select * from partsupp join lineitem on l_partkey > ps_partkey; > QUERY PLAN > > -- > Gather (cost=0.00..5522666.44 rows=16047 width=301) >Workers Planned: 4 >-> Nested Loop (cost=0.00..5522666.44 rows=40116667 width=301) > Join Filter: (lineitem.l_partkey > partsupp.ps_partkey) > -> Parallel Seq Scan on lineitem (cost=0.00..1518.44 rows=15044 > width=144) > -> Seq Scan on partsupp (cost=0.00..267.00 rows=8000 width=157) > (6 rows) > > patched: > > explain select * from partsupp join lineitem on l_partkey > ps_partkey; > QUERY PLAN > > -- > Gather (cost=0.00..1807085.44 rows=16047 width=301) >Workers Planned: 4 >-> Nested Loop (cost=0.00..1807085.44 rows=40116667 width=301) > Join Filter: (lineitem.l_partkey > partsupp.ps_partkey) > -> Parallel Seq Scan on lineitem (cost=0.00..1518.44 rows=15044 > width=144) > -> Materialize (cost=0.00..307.00 rows=8000 width=157) >-> Seq Scan on partsupp (cost=0.00..267.00 rows=8000 > width=157) > (7 rows) > > The execution time (ms) are (avg of 3 runs): > > unpatched: 71769.21 > patched:65510.04 > > So we can see some (~9%) performance gains in this case. > > Thanks > Richard >
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Richard Guo 于2023年9月5日周二 18:51写道: > > On Tue, Sep 5, 2023 at 4:52 PM tender wang wrote: > >>I recently run benchmark[1] on master, but I found performance problem >> as below: >> ... >> >> I debug the code and find consider_parallel_nestloop() doesn't consider >> materialized form of the cheapest inner path. >> > > Yeah, this seems an omission in commit 45be99f8. I reviewed the patch > and here are some comments. > > * I think we should not consider materializing the cheapest inner path > if we're doing JOIN_UNIQUE_INNER, because in this case we have to > unique-ify the inner path. > That's right. The V2 patch has been fixed. > * I think we can check if it'd be parallel safe before creating the > material path, thus avoid the creation in unsafe cases. > Agreed. > * I don't think the test case you added works for the code changes. > Maybe a plan likes below is better: > Agreed. explain (costs off) > select * from tenk1, tenk2 where tenk1.two = tenk2.two; > QUERY PLAN > -- > Gather >Workers Planned: 4 >-> Nested Loop > Join Filter: (tenk1.two = tenk2.two) > -> Parallel Seq Scan on tenk1 > -> Materialize >-> Seq Scan on tenk2 > (7 rows) > > Thanks > Richard > From 096c645d7c8d06df3a4e6a8aa7fc4edd1c0a3755 Mon Sep 17 00:00:00 2001 From: "tender.wang" Date: Thu, 7 Sep 2023 17:55:04 +0800 Subject: [PATCH v2] Parallel seq scan should consider materila inner path in nestloop case. 
--- src/backend/optimizer/path/joinpath.c | 19 +++ src/test/regress/expected/select_parallel.out | 24 +++ src/test/regress/sql/select_parallel.sql | 9 +++ 3 files changed, 52 insertions(+) diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c index 821d282497..87111ad643 100644 --- a/src/backend/optimizer/path/joinpath.c +++ b/src/backend/optimizer/path/joinpath.c @@ -2004,10 +2004,25 @@ consider_parallel_nestloop(PlannerInfo *root, { JoinTypesave_jointype = jointype; ListCell *lc1; + Path *matpath = NULL; + Path *inner_cheapest_total = innerrel->cheapest_total_path; if (jointype == JOIN_UNIQUE_INNER) jointype = JOIN_INNER; + /* +* Consider materializing the cheapest inner path, unless we're +* doing JOIN_UNIQUE_INNER or enable_material is off or the subpath +* is parallel unsafe or the path in question materializes its output anyway. +*/ + if (save_jointype != JOIN_UNIQUE_INNER && + enable_material && + inner_cheapest_total != NULL && + inner_cheapest_total->parallel_safe && + !ExecMaterializesOutput(inner_cheapest_total->pathtype)) + matpath = (Path *) + create_material_path(innerrel, inner_cheapest_total); + foreach(lc1, outerrel->partial_pathlist) { Path *outerpath = (Path *) lfirst(lc1); @@ -2064,6 +2079,10 @@ consider_parallel_nestloop(PlannerInfo *root, try_partial_nestloop_path(root, joinrel, outerpath, mpath, pathkeys, jointype, extra); } + /* Also consider materialized form of the cheapest inner path */ + if (matpath != NULL && matpath->parallel_safe) + try_partial_nestloop_path(root, joinrel, outerpath, matpath, + pathkeys, jointype, extra); } } diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out index d88353d496..5b9f5c58cc 100644 --- a/src/test/regress/expected/select_parallel.out +++ b/src/test/regress/expected/select_parallel.out @@ -844,6 +844,30 @@ select * from (12 rows) reset enable_material; +-- test materialized form of the cheapest inner path +set 
min_parallel_table_scan_size = '512kB'; +explain(costs off) +select count(*) from tenk1, int4_tbl where tenk1.two < int4_tbl.f1; + QUERY PLAN + + Finalize Aggregate + -> Gather + Workers Planned: 4 + -> Partial Aggregate + -> Nested Loop + Join Filter: (tenk1.two < int4_tbl.f1) + -> Parallel Seq Scan on tenk1 + -> Materialize + -> Seq Scan on int4_tbl +(9 rows) + +select count(*) from tenk1, int4_
Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()
After using patch, the result as below : QUERY PLAN Limit (cost=1078.00..26630101.20 rows=1 width=27) (actual time=160571.005..160571.105 rows=0 loops=1) InitPlan 1 (returns $0) -> Limit (cost=78.00..78.00 rows=1 width=8) (actual time=1.065..1.066 rows=1 loops=1) -> Sort (cost=78.00..83.00 rows=2000 width=8) (actual time=1.064..1.065 rows=1 loops=1) Sort Key: part.p_partkey Sort Method: top-N heapsort Memory: 25kB -> Seq Scan on part (cost=0.00..68.00 rows=2000 width=8) (actual time=0.046..0.830 rows=2000 loops=1) -> Gather (cost=1000.00..26630023.20 rows=1 width=27) (actual time=160571.003..160571.102 rows=0 loops=1) Workers Planned: 1 Params Evaluated: $0 Workers Launched: 1 -> Nested Loop Left Join (cost=0.00..26629023.10 rows=1 width=27) (actual time=160549.257..160549.258 rows=0 loops=2) Join Filter: ($0 IS NOT NULL) Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode IS NULL)) Rows Removed by Filter: 1810515312 -> Parallel Seq Scan on lineitem ref_0 (cost=0.00..1721.97 rows=35397 width=11) (actual time=0.010..3.393 rows=30088 loops=2) -> Materialize (cost=0.00..2270.62 rows=60175 width=27) (actual time=0.000..2.839 rows=60175 loops=60175) -> Seq Scan on lineitem sample_0 (cost=0.00..1969.75 rows=60175 width=27) (actual time=0.008..11.381 rows=60175 loops=2) Planning Time: 0.174 ms Execution Time: 160571.476 ms (20 rows) tender wang 于2023年9月5日周二 16:52写道: > Hi all, > >I recently run benchmark[1] on master, but I found performance problem > as below: > > explain analyze select > subq_0.c0 as c0, > subq_0.c1 as c1, > subq_0.c2 as c2 > from > (select > ref_0.l_shipmode as c0, > sample_0.l_orderkey as c1, > sample_0.l_quantity as c2, > ref_0.l_orderkey as c3, > sample_0.l_shipmode as c5, > ref_0.l_shipinstruct as c6 > from > public.lineitem as ref_0 > left join public.lineitem as sample_0 > on ((select p_partkey from public.part order by p_partkey limit > 1) > is not NULL) > where sample_0.l_orderkey is NULL) as subq_0 > where subq_0.c5 is NULL > limit 
1; >QUERY PLAN > > > - > Limit (cost=78.00..45267050.75 rows=1 width=27) (actual > time=299695.097..299695.099 rows=0 loops=1) >InitPlan 1 (returns $0) > -> Limit (cost=78.00..78.00 rows=1 width=8) (actual > time=0.651..0.652 rows=1 loops=1) >-> Sort (cost=78.00..83.00 rows=2000 width=8) (actual > time=0.650..0.651 rows=1 loops=1) > Sort Key: part.p_partkey > Sort Method: top-N heapsort Memory: 25kB > -> Seq Scan on part (cost=0.00..68.00 rows=2000 > width=8) (actual time=0.013..0.428 rows=2000 loops=1) >-> Nested Loop Left Join (cost=0.00..45266972.75 rows=1 width=27) > (actual time=299695.096..299695.096 rows=0 loops=1) > Join Filter: ($0 IS NOT NULL) > Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode > IS NULL)) > Rows Removed by Filter: 3621030625 > -> Seq Scan on lineitem ref_0 (cost=0.00..1969.75 rows=60175 > width=11) (actual time=0.026..6.225 rows=60175 loops=1) > -> Materialize (cost=0.00..2270.62 rows=60175 width=27) (actual > time=0.000..2.554 rows=60175 loops=60175) >-> Seq Scan on lineitem sample_0 (cost=0.00..1969.75 > rows=60175 width=27) (actual time=0.004..8.169 rows=60175 loops=1) > Planning Time: 0.172 ms > Execution Time: 299695.501 ms > (16 rows) > > After I set enable_material to off, the same query run faster, as below: > set enable_material = off; > explain analyze select > subq_0.c0 as c0, > subq_0.c1 as c1, > subq_0.c2 as c2 > from > (select > ref_0.l_shipmode as c0, > sample_0.l_orderkey as c1, > sample_0.l_quantity as c2, > ref_0.l_orderkey as c3, > sample_0.l_shipmode as c5, > ref_0.l_shipinstruct as c6 > from > public.lineitem as ref_0 > left join public.lineitem as sample_0 > o
Should consider materializing the cheapest inner path in consider_parallel_nestloop()
Hi all, I recently run benchmark[1] on master, but I found performance problem as below: explain analyze select subq_0.c0 as c0, subq_0.c1 as c1, subq_0.c2 as c2 from (select ref_0.l_shipmode as c0, sample_0.l_orderkey as c1, sample_0.l_quantity as c2, ref_0.l_orderkey as c3, sample_0.l_shipmode as c5, ref_0.l_shipinstruct as c6 from public.lineitem as ref_0 left join public.lineitem as sample_0 on ((select p_partkey from public.part order by p_partkey limit 1) is not NULL) where sample_0.l_orderkey is NULL) as subq_0 where subq_0.c5 is NULL limit 1; QUERY PLAN - Limit (cost=78.00..45267050.75 rows=1 width=27) (actual time=299695.097..299695.099 rows=0 loops=1) InitPlan 1 (returns $0) -> Limit (cost=78.00..78.00 rows=1 width=8) (actual time=0.651..0.652 rows=1 loops=1) -> Sort (cost=78.00..83.00 rows=2000 width=8) (actual time=0.650..0.651 rows=1 loops=1) Sort Key: part.p_partkey Sort Method: top-N heapsort Memory: 25kB -> Seq Scan on part (cost=0.00..68.00 rows=2000 width=8) (actual time=0.013..0.428 rows=2000 loops=1) -> Nested Loop Left Join (cost=0.00..45266972.75 rows=1 width=27) (actual time=299695.096..299695.096 rows=0 loops=1) Join Filter: ($0 IS NOT NULL) Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode IS NULL)) Rows Removed by Filter: 3621030625 -> Seq Scan on lineitem ref_0 (cost=0.00..1969.75 rows=60175 width=11) (actual time=0.026..6.225 rows=60175 loops=1) -> Materialize (cost=0.00..2270.62 rows=60175 width=27) (actual time=0.000..2.554 rows=60175 loops=60175) -> Seq Scan on lineitem sample_0 (cost=0.00..1969.75 rows=60175 width=27) (actual time=0.004..8.169 rows=60175 loops=1) Planning Time: 0.172 ms Execution Time: 299695.501 ms (16 rows) After I set enable_material to off, the same query run faster, as below: set enable_material = off; explain analyze select subq_0.c0 as c0, subq_0.c1 as c1, subq_0.c2 as c2 from (select ref_0.l_shipmode as c0, sample_0.l_orderkey as c1, sample_0.l_quantity as c2, ref_0.l_orderkey as c3, 
sample_0.l_shipmode as c5, ref_0.l_shipinstruct as c6 from public.lineitem as ref_0 left join public.lineitem as sample_0 on ((select p_partkey from public.part order by p_partkey limit 1) is not NULL) where sample_0.l_orderkey is NULL) as subq_0 where subq_0.c5 is NULL limit 1; QUERY PLAN --- Limit (cost=1078.00..91026185.57 rows=1 width=27) (actual time=192669.605..192670.425 rows=0 loops=1) InitPlan 1 (returns $0) -> Limit (cost=78.00..78.00 rows=1 width=8) (actual time=0.662..0.663 rows=1 loops=1) -> Sort (cost=78.00..83.00 rows=2000 width=8) (actual time=0.661..0.662 rows=1 loops=1) Sort Key: part.p_partkey Sort Method: top-N heapsort Memory: 25kB -> Seq Scan on part (cost=0.00..68.00 rows=2000 width=8) (actual time=0.017..0.430 rows=2000 loops=1) -> Gather (cost=1000.00..91026107.57 rows=1 width=27) (actual time=192669.604..192670.422 rows=0 loops=1) Workers Planned: 1 Params Evaluated: $0 Workers Launched: 1 -> Nested Loop Left Join (cost=0.00..91025107.47 rows=1 width=27) (actual time=192588.143..192588.144 rows=0 loops=2) Join Filter: ($0 IS NOT NULL) Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode IS NULL)) Rows Removed by Filter: 1810515312 -> Parallel Seq Scan on lineitem ref_0 (cost=0.00..1721.97 rows=35397 width=11) (actual time=0.007..3.797 rows=30088 loops=2) -> Seq Scan on lineitem sample_0 (cost=0.00..1969.75 rows=60175 width=27) (actual time=0.000..2.637 rows=60175 loops=60175) Planning Time: 0.174 ms Execution Time: 192670.458 ms (19 rows) I debug the code and find consider_parallel_nestloop() doesn't consider materialized form of the cheapest inner path. When enable_material = true, we can see Material path won in first plan, but Parallel Seq Scan node doesn't add as outer path, which because in try_partial_nestloop_path() , the cost of nestloop wat computed using seq scan path not material path. [1] include test table schema and data, you can repeat above problem. 
I try to fix this problem in the attached patch, and I found that pg12.12 also had this
Re: Improve heapgetpage() performance, overhead from serializable
This thread [1] also Improving the heapgetpage function, and looks like this thread. [1] https://www.postgresql.org/message-id/a9f40066-3d25-a240-4229-ec2fbe94e7a5%40yeah.net Muhammad Malik 于2023年9月1日周五 04:04写道: > Hi, > > Is there a plan to merge this patch in PG16? > > Thanks, > Muhammad > > -- > *From:* Andres Freund > *Sent:* Saturday, July 15, 2023 6:56 PM > *To:* pgsql-hack...@postgresql.org > *Cc:* Thomas Munro > *Subject:* Improve heapgetpage() performance, overhead from serializable > > Hi, > > Several loops which are important for query performance, like > heapgetpage()'s > loop over all tuples, have to call functions like > HeapCheckForSerializableConflictOut() and PredicateLockTID() in every > iteration. > > When serializable is not in use, all those functions do is to to return. > But > being situated in a different translation unit, the compiler can't inline > (without LTO at least) the check whether serializability is needed. It's > not > just the function call overhead that's noticable, it's also that registers > have to be spilled to the stack / reloaded from memory etc. > > On a freshly loaded pgbench scale 100, with turbo mode disabled, postgres > pinned to one core. Parallel workers disabled to reduce noise. All times > are > the average of 15 executions with pgbench, in a newly started, but > prewarmed > postgres. 
> > SELECT * FROM pgbench_accounts OFFSET 1000; > HEAD: > 397.977 > > removing the HeapCheckForSerializableConflictOut() from heapgetpage() > (incorrect!), to establish the baseline of what serializable costs: > 336.695 > > pulling out CheckForSerializableConflictOutNeeded() from > HeapCheckForSerializableConflictOut() in heapgetpage(), and avoiding > calling > HeapCheckForSerializableConflictOut() in the loop: > 339.742 > > moving the loop into a static inline function, marked as pg_always_inline, > called with static arguments for always_visible, check_serializable: > 326.546 > > marking the always_visible, !check_serializable case likely(): > 322.249 > > removing TestForOldSnapshot() calls, which we pretty much already decided > on: > 312.987 > > > FWIW, there's more we can do, with some hacky changes I got the time down > to > 273.261, but the tradeoffs start to be a bit more complicated. And > 397->320ms > for something as core as this, is imo worth considering on its own. > > > > > Now, this just affects the sequential scan case. heap_hot_search_buffer() > shares many of the same pathologies. I find it a bit harder to improve, > because the compiler's code generation seems to switch between good / bad > with > changes that seems unrelated... > > > I wonder why we haven't used PageIsAllVisible() in > heap_hot_search_buffer() so > far? > > > Greetings, > > Andres Freund >
[question] multi-column range partition pruning
I have a range-partitioned table and query as below: create table p_range(a int, b int) partition by range (a,b); create table p_range1 partition of p_range for values from (1,1) to (3,3); create table p_range2 partition of p_range for values from (4,4) to (6,6); explain select * from p_range where b =2; QUERY PLAN -- Append (cost=0.00..76.61 rows=22 width=8) -> Seq Scan on p_range1 p_range_1 (cost=0.00..38.25 rows=11 width=8) Filter: (b = 2) -> Seq Scan on p_range2 p_range_2 (cost=0.00..38.25 rows=11 width=8) Filter: (b = 2) (5 rows) The EXPLAIN result shows that no partition pruning happened, and gen_prune_steps_from_opexps() has a comment that explains this result. /* * For range partitioning, if we have no clauses for the current key, * we can't consider any later keys either, so we can stop here. */ if (part_scheme->strategy == PARTITION_STRATEGY_RANGE && clauselist == NIL) break; But I want to know why we don't prune when the WHERE clause constrains only a later partition key. Thanks.
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
The foreign key still works even though partition was detached. Is this behavior expected? I can't find the answer in the document. If it is expected behavior , please ignore the bug I reported a few days ago. tender wang 于2023年8月4日周五 17:04写道: > Oversight the DetachPartitionFinalize(), I found another bug below: > > postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id); > CREATE TABLE > postgres=# CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1); > CREATE TABLE > postgres=# CREATE TABLE r_1 ( > postgres(# id bigint PRIMARY KEY, > postgres(# p_id bigint NOT NULL > postgres(# ); > CREATE TABLE > postgres=# CREATE TABLE r ( > postgres(# id bigint PRIMARY KEY, > postgres(# p_id bigint NOT NULL, > postgres(# FOREIGN KEY (p_id) REFERENCES p (id) > postgres(# ) PARTITION BY list (id); > CREATE TABLE > postgres=# ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1); > ALTER TABLE > postgres=# ALTER TABLE r DETACH PARTITION r_1; > ALTER TABLE > postgres=# insert into r_1 values(1,1); > ERROR: insert or update on table "r_1" violates foreign key constraint > "r_p_id_fkey" > DETAIL: Key (p_id)=(1) is not present in table "p". > > After detach operation, r_1 is normal relation and the inherited foreign > key 'r_p_id_fkey' should be removed. > > > tender wang 于2023年8月3日周四 17:34写道: > >> I think the code to determine that fk of a partition is inherited or not >> is not enough. >> For example, in this case, foreign key r_1_p_id_fkey1 is not inherited >> from parent. >> >> If conform->conparentid(in DetachPartitionFinalize func) is valid, we >> should recheck confrelid(pg_constraint) field. >> >> I try to fix this problem in the attached patch. >> Any thoughts. >> >> Alvaro Herrera 于2023年8月3日周四 17:02写道: >> >>> On 2023-Aug-03, tender wang wrote: >>> >>> > I think old "sub-FK" should not be dropped, that will be violates >>> foreign >>> > key constraint. >>> >>> Yeah, I've been playing more with the patch and it is definitely not >>> doing the right things. 
Just eyeballing the contents of pg_trigger and >>> pg_constraint for partitions added by ALTER...ATTACH shows that the >>> catalog contents are inconsistent with those added by CREATE TABLE >>> PARTITION OF. >>> >>> -- >>> Álvaro Herrera PostgreSQL Developer — >>> https://www.EnterpriseDB.com/ >>> >>
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Looking over DetachPartitionFinalize() again, I found the root cause why 'r_p_id_fkey' was not removed. DetachPartitionFinalize() calls GetParentedForeignKeyRefs() to get the pg_constraint tuples that should be deleted, but the lookup fails: according to its comments, GetParentedForeignKeyRefs() returns the constraints that reference me, not the constraints through which I reference others. I try to fix this bug as follows: i. ConstraintSetParentConstraint() should not be called first in DetachPartitionFinalize(), because after conparentid is set to 0 we can no longer find the inherited foreign keys. ii. Create another function like GetParentedForeignKeyRefs(), but with the ScanKey on the conrelid field instead of confrelid. I quickly tested this solution in my environment, and it solves the above issue. tender wang 于2023年8月4日周五 17:04写道: > Oversight the DetachPartitionFinalize(), I found another bug below: > > postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id); > CREATE TABLE > postgres=# CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1); > CREATE TABLE > postgres=# CREATE TABLE r_1 ( > postgres(# id bigint PRIMARY KEY, > postgres(# p_id bigint NOT NULL > postgres(# ); > CREATE TABLE > postgres=# CREATE TABLE r ( > postgres(# id bigint PRIMARY KEY, > postgres(# p_id bigint NOT NULL, > postgres(# FOREIGN KEY (p_id) REFERENCES p (id) > postgres(# ) PARTITION BY list (id); > CREATE TABLE > postgres=# ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1); > ALTER TABLE > postgres=# ALTER TABLE r DETACH PARTITION r_1; > ALTER TABLE > postgres=# insert into r_1 values(1,1); > ERROR: insert or update on table "r_1" violates foreign key constraint > "r_p_id_fkey" > DETAIL: Key (p_id)=(1) is not present in table "p". > > After detach operation, r_1 is normal relation and the inherited foreign > key 'r_p_id_fkey' should be removed. > > > tender wang 于2023年8月3日周四 17:34写道: > >> I think the code to determine that fk of a partition is inherited or not >> is not enough. 
>> For example, in this case, foreign key r_1_p_id_fkey1 is not inherited >> from parent. >> >> If conform->conparentid(in DetachPartitionFinalize func) is valid, we >> should recheck confrelid(pg_constraint) field. >> >> I try to fix this problem in the attached patch. >> Any thoughts. >> >> Alvaro Herrera 于2023年8月3日周四 17:02写道: >> >>> On 2023-Aug-03, tender wang wrote: >>> >>> > I think old "sub-FK" should not be dropped, that will be violates >>> foreign >>> > key constraint. >>> >>> Yeah, I've been playing more with the patch and it is definitely not >>> doing the right things. Just eyeballing the contents of pg_trigger and >>> pg_constraint for partitions added by ALTER...ATTACH shows that the >>> catalog contents are inconsistent with those added by CREATE TABLE >>> PARTITION OF. >>> >>> -- >>> Álvaro Herrera PostgreSQL Developer — >>> https://www.EnterpriseDB.com/ >>> >>
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Looking over DetachPartitionFinalize(), I found another bug below: postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id); CREATE TABLE postgres=# CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1); CREATE TABLE postgres=# CREATE TABLE r_1 ( postgres(# id bigint PRIMARY KEY, postgres(# p_id bigint NOT NULL postgres(# ); CREATE TABLE postgres=# CREATE TABLE r ( postgres(# id bigint PRIMARY KEY, postgres(# p_id bigint NOT NULL, postgres(# FOREIGN KEY (p_id) REFERENCES p (id) postgres(# ) PARTITION BY list (id); CREATE TABLE postgres=# ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1); ALTER TABLE postgres=# ALTER TABLE r DETACH PARTITION r_1; ALTER TABLE postgres=# insert into r_1 values(1,1); ERROR: insert or update on table "r_1" violates foreign key constraint "r_p_id_fkey" DETAIL: Key (p_id)=(1) is not present in table "p". After the detach operation, r_1 is a normal relation, and the inherited foreign key 'r_p_id_fkey' should have been removed. tender wang 于2023年8月3日周四 17:34写道: > I think the code to determine that fk of a partition is inherited or not > is not enough. > For example, in this case, foreign key r_1_p_id_fkey1 is not inherited > from parent. > > If conform->conparentid(in DetachPartitionFinalize func) is valid, we > should recheck confrelid(pg_constraint) field. > > I try to fix this problem in the attached patch. > Any thoughts. > > Alvaro Herrera 于2023年8月3日周四 17:02写道: > >> On 2023-Aug-03, tender wang wrote: >> >> > I think old "sub-FK" should not be dropped, that will be violates >> foreign >> > key constraint. >> >> Yeah, I've been playing more with the patch and it is definitely not >> doing the right things. Just eyeballing the contents of pg_trigger and >> pg_constraint for partitions added by ALTER...ATTACH shows that the >> catalog contents are inconsistent with those added by CREATE TABLE >> PARTITION OF. >> >> -- >> Álvaro Herrera PostgreSQL Developer — >> https://www.EnterpriseDB.com/ >> >
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
I think the code that determines whether a partition's foreign key is inherited or not is not sufficient. For example, in this case, foreign key r_1_p_id_fkey1 is not inherited from the parent. If conform->conparentid (in DetachPartitionFinalize()) is valid, we should also recheck the confrelid (pg_constraint) field. I try to fix this problem in the attached patch. Any thoughts? Alvaro Herrera 于2023年8月3日周四 17:02写道: > On 2023-Aug-03, tender wang wrote: > > > I think old "sub-FK" should not be dropped, that will be violates > foreign > > key constraint. > > Yeah, I've been playing more with the patch and it is definitely not > doing the right things. Just eyeballing the contents of pg_trigger and > pg_constraint for partitions added by ALTER...ATTACH shows that the > catalog contents are inconsistent with those added by CREATE TABLE > PARTITION OF. > > -- > Álvaro Herrera PostgreSQL Developer — > https://www.EnterpriseDB.com/ > From d84395c7321c201a78661de0b41b76e71ab10678 Mon Sep 17 00:00:00 2001 From: "tender.wang" Date: Thu, 3 Aug 2023 17:23:06 +0800 Subject: [PATCH] Recheck whether a partition's foreign key is inherited from its parent. Previously, an FK was treated as inherited if conparentid (pg_constraint) is valid. That is not enough; we should also compare the confrelid field to determine whether the FK is inherited. 
--- src/backend/commands/tablecmds.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 727f151750..1447433109 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -18556,6 +18556,8 @@ DetachPartitionFinalize(Relation rel, Relation partRel, bool concurrent, ForeignKeyCacheInfo *fk = lfirst(cell); HeapTuple contup; Form_pg_constraint conform; + HeapTuple parentTup; + Form_pg_constraint parentForm; Constraint *fkconstraint; Oid insertTriggerOid, updateTriggerOid; @@ -18573,6 +18575,23 @@ DetachPartitionFinalize(Relation rel, Relation partRel, bool concurrent, continue; } + /* recheck confrelid field */ + if (OidIsValid(conform->conparentid)) + { + parentTup = SearchSysCache1(CONSTROID, ObjectIdGetDatum(conform->conparentid)); + if (!HeapTupleIsValid(parentTup)) + elog(ERROR, "cache lookup failed for constraint %u", conform->conparentid); + parentForm = (Form_pg_constraint) GETSTRUCT(parentTup); + /* It is not inherited foreign keys */ + if (parentForm->confrelid != conform->confrelid) + { + ReleaseSysCache(contup); + ReleaseSysCache(parentTup); + continue; + } + ReleaseSysCache(parentTup); + } + /* unset conparentid and adjust conislocal, coninhcount, etc. */ ConstraintSetParentConstraint(fk->conoid, InvalidOid, InvalidOid); -- 2.25.1
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
I think the old "sub-FK"s should not be dropped; that would violate the foreign key constraint. For example: postgres=# insert into r values(1,1); INSERT 0 1 postgres=# ALTER TABLE r DETACH PARTITION r_1; ALTER TABLE postgres=# delete from p_1 where id = 1; DELETE 1 postgres=# select * from r_1; id | p_id +-- 1 |1 (1 row) If I run the above SQL on PG 12.12, it reports the error below: postgres=# delete from p_1 where id = 1; ERROR: update or delete on table "p_1" violates foreign key constraint "r_1_p_id_fkey1" on table "r_1" DETAIL: Key (id)=(1) is still referenced from table "r_1". Alvaro Herrera 于2023年7月31日周一 20:58写道: > On 2023-Jul-05, Jehan-Guillaume de Rorthais wrote: > > > ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1); > > > > The old sub-FKs (below 18289) created in this table to enforce the action > > triggers on referenced partitions are not deleted when the table becomes a > > partition. Because of this, we have additional and useless triggers on the > > referenced partitions and we can not DETACH this partition on the referencing > > side anymore: > > Oh, hm, interesting. Thanks for the report and patch. I found a couple > of minor issues with it (most serious one: nkeys should be 3, not 2; > also sysscan should use conrelid index), but I'll try and complete it so > that it's ready for 2023-08-10's releases. > > Regards > > -- > Álvaro Herrera 48°01'N 7°57'E — > https://www.EnterpriseDB.com/ > > >
Re: Improve join_search_one_level readability (one line change)
謝東霖 于2023年6月3日周六 23:21写道: > Hello hackers > > Attached is my first patch for PostgreSQL, which is a simple one-liner > that I believe can improve the code. > > In the "join_search_one_level" function, I noticed that the variable > "other_rels_list" always refers to "joinrels[1]" even when the (level > == 2) condition is met. I propose changing: > > "other_rels_list = joinrels[level - 1]" to "other_rels_list = joinrels[1]" > > This modification aims to enhance clarity and avoid unnecessary > instructions. > I guess the compiler can already make that code efficient, but from the point of view of code readability, I agree with you. As Julien Rouhaud said, it would be better to move the other_rels_list initialization out of the if statement and put it with the variable declaration. I would greatly appreciate any review and feedback on this patch as I > am a newcomer to PostgreSQL contributions. Your input will help me > improve and contribute effectively to the project. > > I have followed the excellent guide "How to submit a patch by email, > 2023 edition" by Peter Eisentraut. > > Additionally, if anyone has any tips on ensuring that Gmail recognizes > my attached patches as the "text/x-patch" MIME type when sending them > from the Chrome client, I would be grateful for the advice. > > Or maybe the best practice is to use Git send-email ? > > Thank you for your time. > > Best regards > Alex Hsieh >
ERROR: wrong varnullingrels (b 5 7) (expected (b)) for Var 3/3
Hi hackers, I found $subject problem when using SQLancer. How to repeat: CREATE TEMP TABLE t0(c0 inet, c1 money, c2 TEXT); CREATE TEMP TABLE IF NOT EXISTS t1(LIKE t0); CREATE TEMP TABLE t2(c0 boolean , c1 DECIMAL NOT NULL UNIQUE); CREATE TEMPORARY TABLE t3(LIKE t1); CREATE VIEW v0(c0) AS (SELECT DISTINCT 0 FROM t3); SELECT SUM(count) FROM (SELECT (((t1.c2)LIKE(t0.c2)||(((103556691)-(v0.c0)||(v0.c0)::INT as count FROM t0, ONLY t1 RIGHT OUTER JOIN ONLY t2 ON t2.c0 RIGHT OUTER JOIN v0 ON ((t2.c1)=(0.08182310538090898))) as res; psql (16devel) Type "help" for help. postgres=# \d Did not find any relations. postgres=# CREATE TEMP TABLE t0(c0 inet, c1 money, c2 TEXT); CREATE TABLE postgres=# CREATE TEMP TABLE IF NOT EXISTS t1(LIKE t0); CREATE TABLE postgres=# CREATE TEMP TABLE t2(c0 boolean , c1 DECIMAL NOT NULL UNIQUE); CREATE TABLE postgres=# CREATE TEMPORARY TABLE t3(LIKE t1); CREATE TABLE postgres=# CREATE VIEW v0(c0) AS (SELECT DISTINCT 0 FROM t3); NOTICE: view "v0" will be a temporary view CREATE VIEW postgres=# SELECT SUM(count) FROM (SELECT (((t1.c2)LIKE(t0.c2)||(((103556691)-(v0.c0)||(v0.c0)::INT as count FROM t0, ONLY t1 RIGHT OUTER JOIN ONLY t2 ON t2.c0 RIGHT OUTER JOIN v0 ON ((t2.c1)=(0.08182310538090898))) as res; ERROR: wrong varnullingrels (b 5 7) (expected (b)) for Var 3/3 regards, tender wang
Re: [PATCH] Use RelationClose rather than table_close in heap_create_with_catalog
Tom Lane 于2023年5月11日周四 00:32写道: > Bharath Rupireddy writes: > > And, the /* do not unlock till end of xact */, it looks like it's been > > there from day 1. It may be indicating that the ref count fo the new > > relation created in heap_create_with_catalog() will be decremented at > > the end of xact, but I'm not sure what it means. > > Hmm, I think it's been copied-and-pasted from somewhere. It's quite > common for us to not release locks on modified tables until end of > transaction. However, that's not what's happening here, because we > actually *don't have any such lock* at this point, as you can easily > prove by stepping through this code and watching the contents of > pg_locks from another session. We do acquire AccessExclusiveLock > on the new table eventually, but not till control returns to > DefineRelation. > > I'm not real sure that I like the proposed code change: it's unclear > to me whether it's an unwise piercing of a couple of abstraction > layers or an okay change because those abstraction layers haven't > yet been applied to the new relation at all. However, I think the > existing comment is actively misleading and needs to be changed. > Maybe something like > > /* > * Close the relcache entry, since we return only an OID not a > * relcache reference. Note that we do not yet hold any lockmanager > * lock on the new rel, so there's nothing to release. > */ > table_close(new_rel_desc, NoLock); > > /* > * ok, the relation has been cataloged, so close catalogs and return > * the OID of the newly created relation. > */ > table_close(pg_class_desc, RowExclusiveLock); > +1 Personally, I prefer above code. Given these comments, maybe changing the first call to RelationClose > would be sensible, but I'm still not quite convinced. > > regards, tom lane > > >
Re: [PATCH] Use RelationClose rather than table_close in heap_create_with_catalog
Bharath Rupireddy 于2023年5月10日周三 22:17写道:

> On Sat, Mar 18, 2023 at 12:34 PM Xiaoran Wang wrote:
> >
> > Hi hackers,
> >
> > In heap_create_with_catalog, the Relation new_rel_desc is created
> > by RelationBuildLocalRelation, not table_open. So it's better to
> > call RelationClose to release it.
> >
> > What's more, the comment for it seems useless, just delete it.
>
> Essentially, all the close functions are the same with NoLock, IOW,
> table_close(relation, NoLock) = relation_close(relation, NoLock) =
> RelationClose(relation). Therefore, table_close(new_rel_desc, NoLock);
> looks fine to me.

Agreed.

> And, the /* do not unlock till end of xact */, it looks like it's been
> there from day 1. It may be indicating that the ref count of the new
> relation created in heap_create_with_catalog() will be decremented at
> the end of xact, but I'm not sure what it means.

Me too.

> --
> Bharath Rupireddy
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com
Re: [PATCH] Use RelationClose rather than table_close in heap_create_with_catalog
Xiaoran Wang 于2023年3月18日周六 15:04写道:

> Hi hackers,
>
> In heap_create_with_catalog, the Relation new_rel_desc is created
> by RelationBuildLocalRelation, not table_open. So it's better to
> call RelationClose to release it.

Why is it better to call RelationClose? Is there a problem with using table_close()?

> What's more, the comment for it seems useless, just delete it.
>
> Thanks!

regards, tender wang
Re: Improve list manipulation in several places
Richard Guo 于2023年4月21日周五 15:35写道:

> There was discussion in [1] about improvements to list manipulation in
> several places. But since the discussion is not related to the topic in
> that thread, fork a new thread here and attach a patch to show my
> thoughts.
>
> Some are just cosmetic changes by using macros. The others should have
> performance gain from the avoidance of moving list entries. But I doubt
> the performance gain can be noticed or measured, as currently there are
> only a few places affected by the change. I still think the changes are
> worthwhile though, because it is very likely that future usage of the
> same scenario can benefit from these changes.

I doubt the performance gain from the lappend_copy function. new_tail_cell in lappend may not enter enlarge_list in most cases, because we may allocate extra cells in new_list (see the comment in new_list).

> (Copying in David and Ranier. Ranier provided a patch about the changes
> in list.c, but I'm not using that one.)
>
> [1]
> https://www.postgresql.org/message-id/CAMbWs49aakL%3DPP7NcTajCtDyaVUE-NMVMGpaLEKreYbQknkQWA%40mail.gmail.com
>
> Thanks
> Richard
Re: same query but different result on pg16devel and pg15.2
Attached file included table schema information, but no data. tender wang 于2023年4月4日周二 10:53写道: > Hi hackers, > I encounter a problem, as shown below: > > query: > select > ref_0.ps_suppkey as c0, > ref_1.c_acctbal as c1, > ref_2.o_totalprice as c2, > ref_2.o_orderpriority as c3, > ref_2.o_clerk as c4 > from > public.partsupp as ref_0 > left join public.nation as sample_0 > inner join public.customer as sample_1 > on (false) > on (true) > left join public.customer as ref_1 > right join public.orders as ref_2 > on (false) > left join public.supplier as ref_3 > on (false) > on (sample_0.n_comment = ref_1.c_name ) > where (8 <= NULLIF(CASE WHEN (o_orderkey IS NOT NULL) THEN 4 ELSE 4 END, > CASE WHEN (o_orderdate >= o_orderdate) THEN 95 ELSE 95 END)) > order by c0, c1, c2, c3, c4 limit 1; > > on pg16devel: > c0 | c1 | c2 | c3 | c4 > ++++ > 1 |||| > (1 row) > plan: > QUERY PLAN > > > --- > Limit >-> Sort > Sort Key: ref_0.ps_suppkey, c_acctbal, o_totalprice, > o_orderpriority, o_clerk > -> Nested Loop Left Join >-> Seq Scan on partsupp ref_0 >-> Result > One-Time Filter: false > (7 rows) > > on pg15.2: > c0 | c1 | c2 | c3 | c4 > ++++ > (0 rows) > plan: > > QUERY PLAN > > > --- > Limit >-> Sort > Sort Key: ref_0.ps_suppkey, c_acctbal, o_totalprice, > o_orderpriority, o_clerk > -> Hash Left Join >Hash Cond: ((n_comment)::text = (c_name)::text) >Filter: (8 <= NULLIF(CASE WHEN (o_orderkey IS NOT NULL) > THEN 4 ELSE 4 END, CASE WHEN (o_orderdate >= o_orderdate) THEN 95 ELSE 95 > END)) >-> Nested Loop Left Join > -> Seq Scan on partsupp ref_0 > -> Result > One-Time Filter: false >-> Hash > -> Result >One-Time Filter: false > (13 rows) > > > > regards, tender > wang > dbt3-s0.01-janm.sql Description: Binary data
same query but different result on pg16devel and pg15.2
Hi hackers,
I encountered a problem, as shown below.

query:
select
    ref_0.ps_suppkey as c0,
    ref_1.c_acctbal as c1,
    ref_2.o_totalprice as c2,
    ref_2.o_orderpriority as c3,
    ref_2.o_clerk as c4
from
    public.partsupp as ref_0
    left join public.nation as sample_0
        inner join public.customer as sample_1
        on (false)
    on (true)
    left join public.customer as ref_1
        right join public.orders as ref_2
        on (false)
        left join public.supplier as ref_3
        on (false)
    on (sample_0.n_comment = ref_1.c_name )
where (8 <= NULLIF(CASE WHEN (o_orderkey IS NOT NULL) THEN 4 ELSE 4 END, CASE WHEN (o_orderdate >= o_orderdate) THEN 95 ELSE 95 END))
order by c0, c1, c2, c3, c4 limit 1;

on pg16devel:
 c0 | c1 | c2 | c3 | c4
----+----+----+----+----
  1 |    |    |    |
(1 row)

plan:
 Limit
   ->  Sort
         Sort Key: ref_0.ps_suppkey, c_acctbal, o_totalprice, o_orderpriority, o_clerk
         ->  Nested Loop Left Join
               ->  Seq Scan on partsupp ref_0
               ->  Result
                     One-Time Filter: false
(7 rows)

on pg15.2:
 c0 | c1 | c2 | c3 | c4
----+----+----+----+----
(0 rows)

plan:
 Limit
   ->  Sort
         Sort Key: ref_0.ps_suppkey, c_acctbal, o_totalprice, o_orderpriority, o_clerk
         ->  Hash Left Join
               Hash Cond: ((n_comment)::text = (c_name)::text)
               Filter: (8 <= NULLIF(CASE WHEN (o_orderkey IS NOT NULL) THEN 4 ELSE 4 END, CASE WHEN (o_orderdate >= o_orderdate) THEN 95 ELSE 95 END))
               ->  Nested Loop Left Join
                     ->  Seq Scan on partsupp ref_0
                     ->  Result
                           One-Time Filter: false
               ->  Hash
                     ->  Result
                           One-Time Filter: false
(13 rows)

regards, tender wang
Re: wrong results due to qual pushdown
Results on 16devel: c0 | c3 | c6 | c7 |c8 ++++--- 0 |||| ALGERIA 0 |||| ETHIOPIA 0 |||| KENYA 0 |||| MOROCCO 0 |||| MOZAMBIQUE 1 |||| ARGENTINA 1 |||| BRAZIL 1 |||| CANADA 1 |||| PERU 1 |||| UNITED STATES 2 |||| CHINA 2 |||| INDIA 2 |||| INDONESIA 2 |||| JAPAN 2 |||| VIETNAM 3 |||| FRANCE 3 |||| GERMANY 3 |||| ROMANIA 3 |||| RUSSIA 3 |||| UNITED KINGDOM 4 |||| EGYPT 4 |||| IRAN 4 |||| IRAQ 4 |||| JORDAN 4 |||| SAUDI ARABIA (25 rows) Results on 15.2: c0 | c3 | c6 | c7 | c8 ++++ (0 rows) tender wang 于2023年3月6日周一 22:48写道: > Results on 16devel: > c0 | c3 | c6 | c7 |c8 > ++++--- > 0 |||| ALGERIA > 0 |||| ETHIOPIA > 0 |||| KENYA > 0 |||| MOROCCO > 0 |||| MOZAMBIQUE > 1 |||| ARGENTINA > 1 |||| BRAZIL > 1 |||| CANADA > 1 |||| PERU > 1 |||| UNITED STATES > 2 |||| CHINA > 2 |||| INDIA > 2 |||| INDONESIA > 2 |||| JAPAN > 2 |||| VIETNAM > 3 |||| FRANCE > 3 |||| GERMANY > 3 |||| ROMANIA > 3 |||| RUSSIA > 3 |||| UNITED KINGDOM > 4 |||| EGYPT > 4 |||| IRAN > 4 |||| IRAQ > 4 |||| JORDAN > 4 |||| SAUDI ARABIA > (25 rows) > > Results on 15.2: > c0 | c3 | c6 | c7 | c8 > ++++---- > (0 rows) > > Ashutosh Bapat 于2023年3月6日周一 22:14写道: > >> >> >> On Mon, Mar 6, 2023 at 3:00 PM tender wang wrote: >> >>> tender wang >>> [image: 附件]14:51 (2小时前) >>> 发送至 pgsql-hackers >>> Hi hackers. >>>This query has different result on 16devel and 15.2. 
>>> select >>> sample_3.n_regionkey as c0, >>> ref_7.l_linenumber as c3, >>> sample_4.l_quantity as c6, >>> sample_5.n_nationkey as c7, >>> sample_3.n_name as c8 >>> from >>> public.nation as sample_3 >>> left join public.lineitem as ref_5 >>> on ((cast(null as text) ~>=~ cast(null as text)) >>> or (ref_5.l_discount is NULL)) >>> left join public.time_statistics as ref_6 >>> inner join public.lineitem as ref_7 >>> on (ref_7.l_returnflag = ref_7.l_linestatus) >>> right join public.lineitem as sample_4 >>> left join public.nation as sample_5 >>> on (cast(null as tsquery) = cast(null as tsquery)) >>> on (cast(null as "time") <= cast(null as "time")) >>> right join public.customer as ref_8 >>> on (sample_4.l_comment = ref_8.c_name ) >>> on (ref_5.l_quantity = ref_7.l_quantity ) >>> where (ref_7.l_suppkey is not NULL) >>> or ((case when cast(null as lseg) >= cast(null as lseg) then >>> cast(null as inet) else cast(null as inet) end >>>&& cast(null as inet)) >>> or (pg_catalog.getdatabaseencoding() !~~ case when (cast(null as >>> int2) <= cast(null as int8)) >>> or (EXISTS ( >>> select >>> ref_9.ps_comment as c0, >>> 5 as c1, >>> ref_8.c_address as c2, >>> 58 as c3, >>> ref_8.c_acctbal as c4, >>> ref_7.l_orderkey as c5, >>> ref_7.l_shipmode as c6, >>> ref_5.l_commitdate as c7, >>> ref_8.c_custkey as c8, >>> sample_3.n_nationkey as c9 >>> from >>> public.partsupp as ref_9 >>> where cast(null as tsquery) @> cast(null as tsquery) >>> order by c0, c1, c2, c3, c4, c5, c6, c7, c8, c9 limit >>> 38)) then cast(null as text) else cast(null as text) end >>> )) >>> order by c0, c3, c6, c7, c8 limit 137; >>> >>> plan on 16devel: >>> >>>QUERY PLAN >>> >>&
wrong results due to qual pushdown
Hi hackers,
This query has different results on 16devel and 15.2.

select
    sample_3.n_regionkey as c0,
    ref_7.l_linenumber as c3,
    sample_4.l_quantity as c6,
    sample_5.n_nationkey as c7,
    sample_3.n_name as c8
from
    public.nation as sample_3
    left join public.lineitem as ref_5
        on ((cast(null as text) ~>=~ cast(null as text))
            or (ref_5.l_discount is NULL))
    left join public.time_statistics as ref_6
        inner join public.lineitem as ref_7
        on (ref_7.l_returnflag = ref_7.l_linestatus)
        right join public.lineitem as sample_4
            left join public.nation as sample_5
            on (cast(null as tsquery) = cast(null as tsquery))
        on (cast(null as "time") <= cast(null as "time"))
        right join public.customer as ref_8
        on (sample_4.l_comment = ref_8.c_name )
    on (ref_5.l_quantity = ref_7.l_quantity )
where (ref_7.l_suppkey is not NULL)
    or ((case when cast(null as lseg) >= cast(null as lseg) then cast(null as inet) else cast(null as inet) end
        && cast(null as inet))
    or (pg_catalog.getdatabaseencoding() !~~ case when (cast(null as int2) <= cast(null as int8))
        or (EXISTS (
            select
                ref_9.ps_comment as c0,
                5 as c1,
                ref_8.c_address as c2,
                58 as c3,
                ref_8.c_acctbal as c4,
                ref_7.l_orderkey as c5,
                ref_7.l_shipmode as c6,
                ref_5.l_commitdate as c7,
                ref_8.c_custkey as c8,
                sample_3.n_nationkey as c9
            from
                public.partsupp as ref_9
            where cast(null as tsquery) @> cast(null as tsquery)
            order by c0, c1, c2, c3, c4, c5, c6, c7, c8, c9 limit 38)) then cast(null as text) else cast(null as text) end
    ))
order by c0, c3, c6, c7, c8 limit 137;

plan on 16devel:
 Limit
   InitPlan 1 (returns $0)
     ->  Result
           One-Time Filter: false
   ->  Sort
         Sort Key: sample_3.n_regionkey, l_linenumber, l_quantity, n_nationkey, sample_3.n_name
         ->  Nested Loop Left Join
               ->  Seq Scan on nation sample_3
               ->  Materialize
                     ->  Nested Loop Left Join
                           Join Filter: (ref_5.l_quantity = l_quantity)
                           Filter: ((l_suppkey IS NOT NULL) OR (getdatabaseencoding() !~~ CASE WHEN ($0 OR NULL::boolean) THEN NULL::text ELSE NULL::text END))
                           ->  Seq Scan on lineitem ref_5
                                 Filter: (l_discount IS NULL)
                           ->  Result
                                 One-Time Filter: false
(16 rows)

plan on 15.2:
 Limit
   InitPlan 1 (returns $0)
     ->  Result
           One-Time Filter: false
   ->  Sort
         Sort Key: sample_3.n_regionkey, l_linenumber, l_quantity, n_nationkey, sample_3.n_name
         ->  Nested Loop Left Join
               Filter: ((l_suppkey IS NOT NULL) OR (getdatabaseencoding() !~~ CASE WHEN ($0 OR NULL::boolean) THEN NULL::text ELSE NULL::text END))
               ->  Seq Scan on nation sample_3
               ->  Materialize
                     ->  Nested Loop Left Join
                           Join Filter: (ref_5.l_quantity = l_quantity)
                           ->  Seq Scan on lineitem ref_5
                                 Filter: (l_discount IS NULL)
                           ->  Result
                                 One-Time Filter: false
(16 rows)

It looks wrong that the qual (e.g. ((l_suppkey IS NOT NULL) OR (getdatabaseencoding() !~~ CASE WHEN ($0 OR NULL::boolean) THEN NULL::text ELSE NULL::text END))) is pushed down.

regards, tender wang
wrong query result due to wang plan
Hi hackers,
I have this query, as shown below:

select
    ref_1.r_comment as c0,
    subq_0.c1 as c1
from
    public.region as sample_0
    right join public.partsupp as sample_1
        right join public.lineitem as sample_2
        on (cast(null as path) = cast(null as path))
    on (cast(null as "timestamp") < cast(null as "timestamp"))
    inner join public.lineitem as ref_0
    on (true)
    left join (select
                sample_3.ps_availqty as c1,
                sample_3.ps_comment as c2
            from
                public.partsupp as sample_3
            where false
            order by c1, c2 ) as subq_0
    on (sample_1.ps_supplycost = subq_0.c1 )
    right join public.region as ref_1
    on (sample_1.ps_availqty = ref_1.r_regionkey )
where ref_1.r_comment is not NULL
order by c0, c1;

*This query has different results on pg12.12 and on HEAD*. On pg12.12:

 c0 | c1
----+----
 even, ironic theodolites according to the bold platelets wa |
 furiously unusual packages use carefully above the unusual, exp |
 silent, bold requests sleep slyly across the quickly sly dependencies. furiously silent instructions alongside |
 special, bold deposits haggle foxes. platelet |
 special Tiresias about the furiously even dolphins are furi |
(5 rows)

its plan:
 Sort
   Sort Key: ref_1.r_comment, c1
   ->  Hash Left Join
         Hash Cond: (ref_1.r_regionkey = ps_availqty)
         ->  Seq Scan on region ref_1
               Filter: (r_comment IS NOT NULL)
         ->  Hash
               ->  Result
                     One-Time Filter: false
(9 rows)

But on HEAD (pg16devel), its results are:

 c0 | c1
----+----
(0 rows)

its plan:
 Sort
   Sort Key: ref_1.r_comment, subq_0.c1
   ->  Result
         One-Time Filter: false
(4 rows)

Attached file includes the table schema info.

regards, tender wang

dbt3-s0.01-janm.sql
Description: Binary data
ERROR: permission info at index 1 ....
Hi hackers,
After commit a61b1f74823c, the query below reports an error:

create table perm_test1(a int);
create table perm_test2(b int);
select subq.c0
from (select (select a from perm_test1 order by a limit 1) as c0, b as c1
      from perm_test2 where false order by c0, c1) as subq
where false;
ERROR: permission info at index 1 (with relid=16457) does not match provided RTE (with relid=16460)

The code below can fix this:

--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -512,11 +512,16 @@ flatten_rtes_walker(Node *node, flatten_rtes_walker_context *cxt)
 		 * Recurse into subselects. Must update cxt->query to this query so
 		 * that the rtable and rteperminfos correspond with each other.
 		 */
+		Query	   *current_query = cxt->query;
+		bool		result;
+
 		cxt->query = (Query *) node;
-		return query_tree_walker((Query *) node,
+		result = query_tree_walker((Query *) node,
 								 flatten_rtes_walker,
 								 (void *) cxt,
 								 QTW_EXAMINE_RTES_BEFORE);
+		cxt->query = current_query;
+		return result;

regards, tender wang
Why cann't simplify stable function in planning phase?
Hi hackers,
In evaluate_function(), I find the code shown below:

	/*
	 * Ordinarily we are only allowed to simplify immutable functions. But for
	 * purposes of estimation, we consider it okay to simplify functions that
	 * are merely stable; the risk that the result might change from planning
	 * time to execution time is worth taking in preference to not being able
	 * to estimate the value at all.
	 */
	if (funcform->provolatile == PROVOLATILE_IMMUTABLE)
		 /* okay */ ;
	else if (context->estimate &&
			 funcform->provolatile == PROVOLATILE_STABLE)
		 /* okay */ ;
	else
		return NULL;

The code says that a stable function cannot be simplified here (i.e. in the planning phase). I want to know why. Could someone show me an example of a query that would return incorrect results if a stable function were simplified during planning?

With kindest regards,
tender wang