Re: SQL/JSON revisited

2022-12-27 Thread Amit Langote
On Wed, Dec 28, 2022 at 4:28 PM Amit Langote  wrote:
>
> Hi,
>
> Rebased the SQL/JSON patches over the latest HEAD.  I've decided to
> keep the same division of code into individual commits as that
> mentioned in the revert commit 2f2b18bd3f, squashing fixup commits in
> that list into the appropriate feature commits.
>
> The main difference from the patches as they were committed into v15
> is that JsonExpr evaluation no longer needs to use sub-transactions,
> thanks to the work done recently to handle type errors softly.  I've
> made the new code pass an ErrorSaveContext into the type-conversion
> related functions as needed and also added an ExecEvalExprSafe() to
> evaluate sub-expressions of JsonExpr that might contain expressions
> that call type-conversion functions, such as CoerceViaIO contained in
> JsonCoercion nodes.  ExecEvalExprSafe() is based on one of the patches
> that Nikita Glukhov had submitted in a previous discussion about
> redesigning SQL/JSON expression evaluation [1].  Though, I think that
> new interface will become unnecessary after I have finished rebasing
> my patches to remove subsidiary ExprStates of JsonExprState that we
> had also discussed back in [2].
>
> Adding this to January CF.

Done.

https://commitfest.postgresql.org/41/4086/

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com




Re: Error-safe user functions

2022-12-27 Thread Amul Sul
On Tue, Dec 27, 2022 at 11:17 PM Tom Lane  wrote:
>
> Andrew Dunstan  writes:
> > Here's a patch that covers the ltree and intarray contrib modules.
>
> I would probably have done this a little differently --- I think
> the added "res" parameters aren't really necessary for most of
> these.  But it's not worth arguing over.
>

Also, it would be good if we could pass "escontext" through the "state"
argument of makepol(), as commit 78212f210114 did for makepol() of
tsquery.c.  The attached patch is an updated version that does the same.

Regards,
Amul


v2-ltree-intarray-error-safe.patch
Description: Binary data


CFM for 2023-01

2022-12-27 Thread vignesh C
Hi All,

If no one has volunteered for the upcoming (January 2023) commitfest,
I would like to volunteer for it.

Regards,
Vignesh




Re: Data loss on logical replication, 12.12 to 14.5, ALTER SUBSCRIPTION

2022-12-27 Thread Amit Kapila
On Tue, Dec 27, 2022 at 5:49 PM Michail Nikolaev wrote:
>
>
> Probably a small part of WAL was somehow skipped by logical worker in
> all that mess.
>

None of these entries are from the point mentioned by you [1]
yesterday where you didn't find the corresponding data in the
subscriber. How did you identify that the entries corresponding to
that timing were missing?

[1] -
> 2022-12-14 09:49:31.340
> 2022-12-14 09:49:41.683

-- 
With Regards,
Amit Kapila.




Re: Allow placeholders in ALTER ROLE w/o superuser

2022-12-27 Thread Tom Lane
Justin Pryzby  writes:
> This fails when run more than once:
time meson test --setup running --print test_pg_db_role_setting-running/regress

Ah.

> It didn't fail for you because it says:
> ./src/test/modules/test_pg_db_role_setting/Makefile
> +# disable installcheck for now
> +NO_INSTALLCHECK = 1

So ... exactly why is the meson infrastructure failing to honor that?
This test looks sufficiently careless about its side-effects that
I completely agree with the decision to not run it in pre-existing
installations.  Failing to drop a created superuser is just one of
its risk factors --- it also leaves around pg_db_role_setting entries.

regards, tom lane




Re: POC, WIP: OR-clause support for indexes

2022-12-27 Thread Andrey Lepikhov

On 12/26/15 23:04, Teodor Sigaev wrote:
I'd like to present OR-clause support for indexes. Although OR-clauses 
could be supported by a BitmapOr index scan, that isn't very effective, 
and such a scan loses any ordering present in the index. We (with 
Alexander Korotkov) presented results at this year's conference in 
Vienna. In short, it provides a performance improvement:


EXPLAIN ANALYZE
SELECT count(*) FROM tst WHERE id = 5 OR id = 500 OR id = 5000;
...
The problems I see on the way for now:
1. Calculating cost. Right now it's just a simple transformation of the 
costs computed for the BitmapOr path. I'd like to hope that remains 
possible, so the index estimation functions can be left untouched and 
can keep believing that all clauses are implicitly ANDed.
2. I'd like to add such support to btree, but it seems that should be a 
separate patch. The btree search algorithm doesn't use any kind of page 
stack, and the algorithm to walk over the btree isn't clear to me yet.
3. I may have missed some places that still assume an implicitly-ANDed 
list of clauses, although the regression tests pass fine.
I support such a cunning approach. But the specific case you 
demonstrated above could be optimized independently at an earlier 
stage. Converting

(F(A) = ConstStableExpr_1) OR (F(A) = ConstStableExpr_2)
to
F(A) IN (ConstStableExpr_1, ConstStableExpr_2)

yields a significant execution speedup. For example, using the attached 
demo.sql to estimate the maximum positive effect, we see about a 40% 
execution speedup and a 100% planning speedup.
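As a self-contained sketch of the rewrite being discussed (using the tst table from the example above; the IN form shown here is an illustration of the proposal, not code from the patch):

```sql
-- Before: a chain of OR'ed equality conditions on the same expression,
-- which today tends to be planned as a BitmapOr over several index scans.
SELECT count(*) FROM tst WHERE id = 5 OR id = 500 OR id = 5000;

-- After: the equivalent IN form, which the planner can evaluate as a
-- single ScalarArrayOpExpr against one index.
SELECT count(*) FROM tst WHERE id IN (5, 500, 5000);
```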


To avoid unnecessary overhead induced by the optimization, such a 
transformation could be made at the planning stage (when we have 
cardinality estimates and have pruned partitions) but before the 
creation of relation scan paths. That way we can avoid planning 
overhead and a non-optimal BitmapOr in the case of many ORs, possibly 
aggravated by many indexes on the relation.
For example, such an operation could be executed in create_index_paths() 
before passing rel->indexlist.


--
Regards
Andrey Lepikhov
Postgres Professional


demo.sql
Description: application/sql


Re: Exit walsender before confirming remote flush in logical replication

2022-12-27 Thread Amit Kapila
On Wed, Dec 28, 2022 at 8:19 AM Hayato Kuroda (Fujitsu) wrote:
>
> > In logical replication, it can happen today as well without
> > time-delayed replication. Basically, say apply worker is waiting to
> > acquire some lock that is already acquired by some backend then it
> > will have the same behavior. I have not verified this, so you may want
> > to check it once.
>
> Right, I could reproduce the scenario with following steps.
>
> 1. Construct pub -> sub logical replication system with streaming = off.
> 2. Define a table on both nodes.
>
> ```
> CREATE TABLE tbl (id int PRIMARY KEY);
> ```
>
> 3. Execute concurrent transactions.
>
> Tx-1 (on subscriber)
> BEGIN;
> INSERT INTO tbl SELECT i FROM generate_series(1, 5000) s(i);
>
> Tx-2 (on publisher)
> INSERT INTO tbl SELECT i FROM generate_series(1, 5000) s(i);
>
> 4. Try to shut down the publisher, but it will fail.
>
> ```
> $ pg_ctl stop -D publisher
> waiting for server to shut down... failed
> pg_ctl: server does not shut down
> ```

Thanks for the verification. BTW, do you think we should document this
either with time-delayed replication or otherwise, unless it is already
documented?

Another thing we can investigate here is why we need to ensure that
there is no pending send before shutdown.

-- 
With Regards,
Amit Kapila.




RE: Exit walsender before confirming remote flush in logical replication

2022-12-27 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

> In logical replication, it can happen today as well without
> time-delayed replication. Basically, say apply worker is waiting to
> acquire some lock that is already acquired by some backend then it
> will have the same behavior. I have not verified this, so you may want
> to check it once.

Right, I could reproduce the scenario with following steps.

1. Construct pub -> sub logical replication system with streaming = off.
2. Define a table on both nodes.

```
CREATE TABLE tbl (id int PRIMARY KEY);
```

3. Execute concurrent transactions.

Tx-1 (on subscriber)
BEGIN;
INSERT INTO tbl SELECT i FROM generate_series(1, 5000) s(i);

Tx-2 (on publisher)
INSERT INTO tbl SELECT i FROM generate_series(1, 5000) s(i);

4. Try to shut down the publisher, but it will fail.

```
$ pg_ctl stop -D publisher
waiting for server to shut down... failed
pg_ctl: server does not shut down
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



RE: Exit walsender before confirming remote flush in logical replication

2022-12-27 Thread Hayato Kuroda (Fujitsu)
Dear Amit,

> > Firstly I considered 2, but I thought 1 seemed to be better.
> > PSA the updated patch.
> >
> 
> I think even for logical replication we should check whether there is
> any pending WAL (via pq_is_send_pending()) to be sent. Otherwise, what
> is the point to send the done message? Also, the caller of
> WalSndDone() already has that check which is another reason why I
> can't see why you didn't have the same check in function WalSndDone().

I did not have strong opinion around here. Fixed.

> BTW, even after fixing this, I think logical replication will behave
> differently when due to some reason (like time-delayed replication)
> send buffer gets full and walsender is not able to send data. I think
> this will be less of an issue with physical replication because there
> is a separate walreceiver process to flush the WAL which doesn't wait
> but the same is not true for logical replication. Do you have any
> thoughts on this matter?

Yes, it may happen even after this work is done. And your analysis is
correct that the receive buffer is rarely full in physical replication
because the walreceiver immediately flushes WALs.
I think this is an architectural problem. Maybe we have assumed that the
decoded WALs are consumed in a short time. I do not have a good idea,
but one approach is introducing a new process, a logical walreceiver.
It would record the decoded WALs to persistent storage, and workers
would consume and then remove them. It may have a huge impact on other
features and might not be accepted...


Best Regards,
Hayato Kuroda
FUJITSU LIMITED



v3-0001-Exit-walsender-before-confirming-remote-flush-in-.patch
Description:  v3-0001-Exit-walsender-before-confirming-remote-flush-in-.patch


Re: Force streaming every change in logical decoding

2022-12-27 Thread Amit Kapila
On Wed, Dec 28, 2022 at 1:42 AM Andres Freund  wrote:
>
> On 2022-12-26 14:04:28 +0530, Amit Kapila wrote:
> > Pushed.
>
> I did not follow this thread but saw the commit. Could you explain why a GUC
> is the right mechanism here? The commit message didn't explain why a GUC was
> chosen.
>
> To me an option like this should be passed in when decoding rather than a GUC.
>

For that, we would need a subscription option as well, since we also
need to reduce test timing for streaming TAP tests (by streaming, I
mean replication of large in-progress transactions). Currently, there
is no subscription option that is used merely for testing/debugging
purposes, and there doesn't seem to be a use for this for users. So, we
didn't want to expose it as a user option. There is also a risk that
users who use it will face a slowdown.

-- 
With Regards,
Amit Kapila.




Re: Allow placeholders in ALTER ROLE w/o superuser

2022-12-27 Thread Justin Pryzby
On Tue, Dec 27, 2022 at 01:58:14AM -0500, Tom Lane wrote:
> Justin Pryzby  writes:
> > FYI: this causes meson test running ("installcheck") fail when run
> > twice.  I guess that's expected to work, per:
> 
> We do indeed expect that to work ... but I don't see any problem
> with repeat "make installcheck" on HEAD.  Can you provide more
> detail about what you're seeing?

This fails when run more than once:
time meson test --setup running --print test_pg_db_role_setting-running/regress

@@ -1,12 +1,13 @@
 CREATE EXTENSION test_pg_db_role_setting;
 CREATE USER regress_super_user SUPERUSER;
+ERROR:  role "regress_super_user" already exists
 CREATE USER regress_regular_user;
+ERROR:  role "regress_regular_user" already exists
...

It didn't fail for you because it says:

./src/test/modules/test_pg_db_role_setting/Makefile
+# disable installcheck for now
+NO_INSTALLCHECK = 1

It also says:
+# and also for now force NO_LOCALE and UTF8
+ENCODING = UTF8
+NO_LOCALE = 1

which was evidently copied from the "oat" tests, which have said that
since March (5b29a9f77, 7c51b7f7c).

It fails the same way with "make" if you change it to not disable
installcheck:

EXTRA_REGRESS_OPTS="--bindir=`pwd`/tmp_install/usr/local/pgsql/bin" PGHOST=/tmp \
make installcheck -C src/test/modules/test_pg_db_role_setting

-- 
Justin




Re: recovery modules

2022-12-27 Thread Andres Freund
Hi,

On 2022-12-27 15:04:28 -0800, Nathan Bossart wrote:
> I'm sorry, I'm still lost here.  Wouldn't restoration via library tend to
> improve latency?  Is your point that clusters may end up depending on this
> improvement so much that a shell command would no longer be able to keep
> up?

Yes.


> I might be creating a straw man, but this seems like less of a concern
> for pg_rewind since it isn't meant for continuous, ongoing restoration.

pg_rewind is in the critical path of a bunch of HA scenarios, so I wouldn't
say that restore performance isn't important...

Greetings,

Andres Freund




Re: recovery modules

2022-12-27 Thread Michael Paquier
On Tue, Dec 27, 2022 at 02:11:11PM -0800, Andres Freund wrote:
> On 2022-12-27 11:24:49 -0800, Nathan Bossart wrote:
>> * Unlike archive modules, recovery libraries cannot be changed at runtime.
>> There isn't a safe way to unload a library, and archive libraries work
>> around this restriction by restarting the archiver process.  Since recovery
>> libraries are loaded via the startup and checkpointer processes (which
>> cannot be trivially restarted like the archiver), the same workaround is
>> not feasible.
> 
> I don't think that's a convincing reason to not support configuration
> changes. Sure, libraries cannot be unloaded, but an unnecessarily loaded
> library is cheap. All that's needed is to redirect the relevant function
> calls.

Agreed.  That seems worth the cost to switching this stuff to be
SIGHUP-able.

>> * pg_rewind uses restore_command, but there isn't a straightforward path to
>> support restore_library.  I haven't addressed this in the attached patches,
>> but perhaps this is a reason to allow specifying both restore_command and
>> restore_library at the same time.  pg_rewind would use restore_command, and
>> the server would use restore_library.
> 
> That seems problematic, leading to situations where one might not be able to
> use restore_command anymore, because it's not feasible to do
> segment-by-segment restoration.

Do you mean that supporting restore_library in pg_rewind is a hard
requirement?  I fail to see why this should be the case.  Note that
having the possibility to do dlopen() calls in the frontend (aka
libpq) for loading some callbacks is something I've been looking for
in the past.  Having consistency in terms of restrictions between
library and command like for archives would make sense I guess (FWIW,
I mentioned on the thread of d627ce3 that we'd better not put any
restrictions for the archives).
--
Michael


signature.asc
Description: PGP signature


Re: [PATCH] Support using "all" for the db user in pg_ident.conf

2022-12-27 Thread Michael Paquier
On Tue, Dec 27, 2022 at 03:54:46PM +, Jelte Fennema wrote:
> This change makes it much easier to have a certain database
> administrator peer or cert authentication, that allows connecting as
> any user. Without this change you would need to add a line to
> pg_ident.conf for every user that is in the database.
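To make the proposal concrete, a mapping enabled by the patch might look like this hypothetical sketch ("all" in the PG-USERNAME column is the proposed new keyword; map and user names are invented for illustration):

```
# pg_ident.conf
# MAPNAME       SYSTEM-USERNAME         PG-USERNAME
admin_map       dba                     all

# pg_hba.conf
# TYPE  DATABASE        USER            ADDRESS         METHOD
local   all             all                             peer map=admin_map
```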

That seems pretty dangerous to me.  For one, how does this work in
cases where we expect the ident entry to be case-sensitive, aka
authentication methods where check_ident_usermap() and check_usermap()
use case_insensitive = false?

Anyway, it is a bit confusing to see a patch touching parts of the
ident code related to the system-username while it claims to provide a
means to shortcut a check on the database-username.  If you think that
some renames should be done to IdentLine, these ought to be done
first.
--
Michael


signature.asc
Description: PGP signature


Re: Removing redundant grouping columns

2022-12-27 Thread Tom Lane
I wrote:
> This patch is aimed at being smarter about cases where we have
> redundant GROUP BY entries, for example
> SELECT ... WHERE a.x = b.y GROUP BY a.x, b.y;

The cfbot didn't like this, because of a variable that wasn't
used in non-assert builds.  Fixed in v2.

regards, tom lane

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 9524765650..450f724c49 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -3667,6 +3667,13 @@ appendGroupByClause(List *tlist, deparse_expr_cxt *context)
 	 */
 	Assert(!query->groupingSets);
 
+	/*
+	 * We intentionally print query->groupClause not processed_groupClause,
+	 * leaving it to the remote planner to get rid of any redundant GROUP BY
+	 * items again.  This is necessary in case processed_groupClause reduced
+	 * to empty, and in any case the redundancy situation on the remote might
+	 * be different than what we think here.
+	 */
 	foreach(lc, query->groupClause)
 	{
 		SortGroupClause *grp = (SortGroupClause *) lfirst(lc);
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 1a2c2a665c..befb2fd4d3 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2864,16 +2864,13 @@ select c2 * (random() <= 1)::int as c2 from ft2 group by c2 * (random() <= 1)::i
 -- GROUP BY clause in various forms, cardinal, alias and constant expression
 explain (verbose, costs off)
 select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;
-  QUERY PLAN   

- Sort
+ QUERY PLAN 
+
+ Foreign Scan
Output: (count(c2)), c2, 5, 7.0, 9
-   Sort Key: ft1.c2
-   ->  Foreign Scan
- Output: (count(c2)), c2, 5, 7.0, 9
- Relations: Aggregate on (public.ft1)
- Remote SQL: SELECT count(c2), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5
-(7 rows)
+   Relations: Aggregate on (public.ft1)
+   Remote SQL: SELECT count(c2), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5 ORDER BY c2 ASC NULLS LAST
+(4 rows)
 
 select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;
   w  | x | y |  z  
@@ -2990,14 +2987,13 @@ select exists(select 1 from pg_enum), sum(c1) from ft1 group by 1;
 QUERY PLAN 
 ---
  GroupAggregate
-   Output: ($0), sum(ft1.c1)
-   Group Key: $0
+   Output: $0, sum(ft1.c1)
InitPlan 1 (returns $0)
  ->  Seq Scan on pg_catalog.pg_enum
->  Foreign Scan on public.ft1
- Output: $0, ft1.c1
+ Output: ft1.c1
  Remote SQL: SELECT "C 1" FROM "S 1"."T 1"
-(8 rows)
+(7 rows)
 
 select exists(select 1 from pg_enum), sum(c1) from ft1 group by 1;
  exists |  sum   
@@ -3382,14 +3378,13 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
 --
  GroupAggregate
Output: array_agg(c1 ORDER BY c1 USING <^ NULLS LAST), c2
-   Group Key: ft2.c2
->  Sort
- Output: c2, c1
+ Output: c1, c2
  Sort Key: ft2.c1 USING <^
  ->  Foreign Scan on public.ft2
-   Output: c2, c1
+   Output: c1, c2
Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (("C 1" < 100)) AND ((c2 = 6))
-(9 rows)
+(8 rows)
 
 -- This should not be pushed either.
 explain (verbose, costs off)
@@ -3458,14 +3453,13 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
 --
  GroupAggregate
Output: array_agg(c1 ORDER BY c1 USING <^ NULLS LAST), c2
-   Group Key: ft2.c2
->  Sort
- Output: c2, c1
+ Output: c1, c2
  Sort Key: ft2.c1 USING <^
  ->  Foreign Scan on public.ft2
-   Output: c2, c1
+   Output: c1, c2
Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (("C 1" < 100)) AND ((c2 = 6))
-(9 rows)
+(8 rows)
 
 -- Cleanup
 drop operator class my_op_class using btree;
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index b9268e32dd..722ebc932b 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -3345,9 +3345,9 @@ estimate_path_cost_size(PlannerInfo *root,
 			}
 
 			/* Get number of grouping columns and possible number of groups */
-			numGroupCols = list_length(root->parse->groupClause);
+			numGroupCols = 

Re: build gcc warning

2022-12-27 Thread Tom Lane
Andres Freund  writes:
> On 2022-12-27 01:55:06 -0500, Tom Lane wrote:
>> A couple of buildfarm animals are warning about that too ... but
>> only a couple.

> I'm a bit confused by gcc getting confused here - the condition for
> sub_rteperminfos getting initialized and used are the same. Most of the time
> the maybe-uninitialized logic seems to be better than this.

Apparently the key phrase there is "most of the time" ;-).

I see that we've had an equally "unnecessary" initialization of the
sibling variable sub_rtable for a long time, so the problem's been
there for some people before.  I made it initialize sub_rteperminfos
the same way.

regards, tom lane




Re: recovery modules

2022-12-27 Thread Nathan Bossart
On Tue, Dec 27, 2022 at 02:45:30PM -0800, Andres Freund wrote:
> On 2022-12-27 14:37:11 -0800, Nathan Bossart wrote:
>> On Tue, Dec 27, 2022 at 02:11:11PM -0800, Andres Freund wrote:
>> > On 2022-12-27 11:24:49 -0800, Nathan Bossart wrote:
>> >> * pg_rewind uses restore_command, but there isn't a straightforward path to
>> >> support restore_library.  I haven't addressed this in the attached patches,
>> >> but perhaps this is a reason to allow specifying both restore_command and
>> >> restore_library at the same time.  pg_rewind would use restore_command, and
>> >> the server would use restore_library.
>> > 
>> > That seems problematic, leading to situations where one might not be able to
>> > use restore_command anymore, because it's not feasible to do
>> > segment-by-segment restoration.
>> 
>> I'm not following why this would make segment-by-segment restoration
>> infeasible.  Would you mind elaborating?
> 
Latency effects, for example, can make segment-by-segment restoration
infeasible performance-wise. On the most extreme end, imagine WAL
archived to tape or such.

I'm sorry, I'm still lost here.  Wouldn't restoration via library tend to
improve latency?  Is your point that clusters may end up depending on this
improvement so much that a shell command would no longer be able to keep
up?  I might be creating a straw man, but this seems like less of a concern
for pg_rewind since it isn't meant for continuous, ongoing restoration.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: recovery modules

2022-12-27 Thread Andres Freund
Hi,

On 2022-12-27 14:37:11 -0800, Nathan Bossart wrote:
> On Tue, Dec 27, 2022 at 02:11:11PM -0800, Andres Freund wrote:
> > On 2022-12-27 11:24:49 -0800, Nathan Bossart wrote:
> >> I've attached a patch set that adds the restore_library,
> >> archive_cleanup_library, and recovery_end_library parameters to allow
> >> archive recovery via loadable modules.  This is a follow-up to the
> >> archive_library parameter added in v15 [0] [1].
> > 
> > Why do we need N parameters for this? To me it seems more sensible to have 
> > one
> > parameter that then allows a library to implement all these (potentially
> > optionally).
> 
> The main reason is flexibility.  Separate parameters allow using a library
> for one thing and a command for another, or different libraries for
> different things.  If that isn't a use-case we wish to support, I don't
> mind combining all three into a single recovery_library parameter.

I think the configuration complexity is a sufficient concern to not go that
direction...


> >> * Unlike archive modules, recovery libraries cannot be changed at runtime.
> >> There isn't a safe way to unload a library, and archive libraries work
> >> around this restriction by restarting the archiver process.  Since recovery
> >> libraries are loaded via the startup and checkpointer processes (which
> >> cannot be trivially restarted like the archiver), the same workaround is
> >> not feasible.
> > 
> > I don't think that's a convincing reason to not support configuration
> > changes. Sure, libraries cannot be unloaded, but an unnecessarily loaded
> > library is cheap. All that's needed is to redirect the relevant function
> > calls.
> 
> This might leave some stuff around (e.g., GUCs, background workers), but if
> that isn't a concern, I can adjust it to work as you describe.

You can still have a shutdown hook re background workers. I don't think the
GUCs matter, given that it's the startup/checkpointer processes.


> >> * pg_rewind uses restore_command, but there isn't a straightforward path to
> >> support restore_library.  I haven't addressed this in the attached patches,
> >> but perhaps this is a reason to allow specifying both restore_command and
> >> restore_library at the same time.  pg_rewind would use restore_command, and
> >> the server would use restore_library.
> > 
> > That seems problematic, leading to situations where one might not be able to
> > use restore_command anymore, because it's not feasible to do
> > segment-by-segment restoration.
> 
> I'm not following why this would make segment-by-segment restoration
> infeasible.  Would you mind elaborating?

Latency effects, for example, can make segment-by-segment restoration
infeasible performance-wise. On the most extreme end, imagine WAL
archived to tape or such.

Greetings,

Andres Freund




Re: Make EXPLAIN generate a generic plan for a parameterized query

2022-12-27 Thread Michel Pelletier
On Wed, Dec 7, 2022 at 3:23 AM Laurenz Albe wrote:

> On Tue, 2022-12-06 at 10:17 -0800, Andres Freund wrote:
> > On 2022-10-29 10:35:26 +0200, Laurenz Albe wrote:
> > > > > Here is a patch that
> > > > > implements it with an EXPLAIN option named GENERIC_PLAN.
> >
> > This fails to build the docs:
> >
> > https://cirrus-ci.com/task/5609301511766016
> >
> > [17:47:01.064] ref/explain.sgml:179: parser error : Opening and ending
> tag mismatch: likeral line 179 and literal
> > [17:47:01.064]   ANALYZE, since a statement with
> unknown parameters
> > [17:47:01.064] ^
>
> *blush* Here is a fixed version.
>

I built and tested this patch for review and it works well, although I got
the following warning when building:

analyze.c: In function 'transformStmt':
analyze.c:2919:35: warning: 'generic_plan' may be used uninitialized in
this function [-Wmaybe-uninitialized]
 2919 | pstate->p_generic_explain = generic_plan;
  | ~~^~
analyze.c:2909:25: note: 'generic_plan' was declared here
 2909 | boolgeneric_plan;
  | ^~~~


Re: recovery modules

2022-12-27 Thread Nathan Bossart
On Tue, Dec 27, 2022 at 02:11:11PM -0800, Andres Freund wrote:
> On 2022-12-27 11:24:49 -0800, Nathan Bossart wrote:
>> I've attached a patch set that adds the restore_library,
>> archive_cleanup_library, and recovery_end_library parameters to allow
>> archive recovery via loadable modules.  This is a follow-up to the
>> archive_library parameter added in v15 [0] [1].
> 
> Why do we need N parameters for this? To me it seems more sensible to have one
> parameter that then allows a library to implement all these (potentially
> optionally).

The main reason is flexibility.  Separate parameters allow using a library
for one thing and a command for another, or different libraries for
different things.  If that isn't a use-case we wish to support, I don't
mind combining all three into a single recovery_library parameter.

>> * Unlike archive modules, recovery libraries cannot be changed at runtime.
>> There isn't a safe way to unload a library, and archive libraries work
>> around this restriction by restarting the archiver process.  Since recovery
>> libraries are loaded via the startup and checkpointer processes (which
>> cannot be trivially restarted like the archiver), the same workaround is
>> not feasible.
> 
> I don't think that's a convincing reason to not support configuration
> changes. Sure, libraries cannot be unloaded, but an unnecessarily loaded
> library is cheap. All that's needed is to redirect the relevant function
> calls.

This might leave some stuff around (e.g., GUCs, background workers), but if
that isn't a concern, I can adjust it to work as you describe.

>> * pg_rewind uses restore_command, but there isn't a straightforward path to
>> support restore_library.  I haven't addressed this in the attached patches,
>> but perhaps this is a reason to allow specifying both restore_command and
>> restore_library at the same time.  pg_rewind would use restore_command, and
>> the server would use restore_library.
> 
> That seems problematic, leading to situations where one might not be able to
> use restore_command anymore, because it's not feasible to do
> segment-by-segment restoration.

I'm not following why this would make segment-by-segment restoration
infeasible.  Would you mind elaborating?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Removing redundant grouping columns

2022-12-27 Thread Tom Lane
This patch is aimed at being smarter about cases where we have
redundant GROUP BY entries, for example

SELECT ... WHERE a.x = b.y GROUP BY a.x, b.y;

It's clearly not necessary to perform grouping using both columns.
Grouping by either one alone would produce the same results,
assuming compatible equality semantics.  I'm not sure how often
such cases arise in the wild; but we have about ten of them in our
regression tests, which makes me think it's worth the trouble to
de-duplicate as long as it doesn't cost too much.  And it doesn't,
because PathKey construction already detects exactly this sort of
redundancy.  We need only do something with the knowledge.

We can't simply make the planner replace parse->groupClause with
a shortened list of non-redundant columns, because it's possible
that we prove all the columns redundant, as in

SELECT ... WHERE a.x = 1 GROUP BY a.x;

If we make parse->groupClause empty then some subsequent tests
will think no grouping was requested, leading to incorrect results.
So what I've done in the attached is to invent a new PlannerInfo
field processed_groupClause to hold the shortened list, and then run
around and use that instead of parse->groupClause where appropriate.

(Another way could be to invent a bool hasGrouping to remember whether
groupClause was initially nonempty, analogously to hasHavingQual.
I rejected that idea after finding that there were still a few
places where it's advantageous to use the original full list.)

Beyond that, there's not too much to this patch.  I had to fix
nodeAgg.c to not crash when grouping on zero columns, and I spent
some effort on refactoring the grouping-clause preprocessing
logic in planner.c because it seemed to me to have gotten rather
unintelligible.  I didn't add any new test cases, because the changes
in existing results seem to sufficiently prove that it works.

I'll stick this in the January CF.

regards, tom lane

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 9524765650..450f724c49 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -3667,6 +3667,13 @@ appendGroupByClause(List *tlist, deparse_expr_cxt *context)
 	 */
 	Assert(!query->groupingSets);
 
+	/*
+	 * We intentionally print query->groupClause not processed_groupClause,
+	 * leaving it to the remote planner to get rid of any redundant GROUP BY
+	 * items again.  This is necessary in case processed_groupClause reduced
+	 * to empty, and in any case the redundancy situation on the remote might
+	 * be different than what we think here.
+	 */
 	foreach(lc, query->groupClause)
 	{
 		SortGroupClause *grp = (SortGroupClause *) lfirst(lc);
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 1a2c2a665c..befb2fd4d3 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2864,16 +2864,13 @@ select c2 * (random() <= 1)::int as c2 from ft2 group by c2 * (random() <= 1)::i
 -- GROUP BY clause in various forms, cardinal, alias and constant expression
 explain (verbose, costs off)
 select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;
-                  QUERY PLAN                   
------------------------------------------------
- Sort
+                  QUERY PLAN                  
+----------------------------------------------
+ Foreign Scan
    Output: (count(c2)), c2, 5, 7.0, 9
-   Sort Key: ft1.c2
-   ->  Foreign Scan
-         Output: (count(c2)), c2, 5, 7.0, 9
-         Relations: Aggregate on (public.ft1)
-         Remote SQL: SELECT count(c2), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5
-(7 rows)
+   Relations: Aggregate on (public.ft1)
+   Remote SQL: SELECT count(c2), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5 ORDER BY c2 ASC NULLS LAST
+(4 rows)
 
 select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;
   w  | x | y |  z  
@@ -2990,14 +2987,13 @@ select exists(select 1 from pg_enum), sum(c1) from ft1 group by 1;
             QUERY PLAN             
 -----------------------------------
  GroupAggregate
-   Output: ($0), sum(ft1.c1)
-   Group Key: $0
+   Output: $0, sum(ft1.c1)
    InitPlan 1 (returns $0)
      ->  Seq Scan on pg_catalog.pg_enum
    ->  Foreign Scan on public.ft1
-         Output: $0, ft1.c1
+         Output: ft1.c1
          Remote SQL: SELECT "C 1" FROM "S 1"."T 1"
-(8 rows)
+(7 rows)
 
 select exists(select 1 from pg_enum), sum(c1) from ft1 group by 1;
  exists |  sum   
@@ -3382,14 +3378,13 @@ select array_agg(c1 order by c1 using operator(public.<^)) from ft2 where c2 = 6
 

Re: build gcc warning

2022-12-27 Thread Andres Freund
Hi,

On 2022-12-27 01:55:06 -0500, Tom Lane wrote:
> Pavel Stehule  writes:
> > I got new warning
> > analyze.c: In function ‘transformStmt’:
> > analyze.c:550:21: warning: ‘sub_rteperminfos’ may be used uninitialized
> > [-Wmaybe-uninitialized]
> 
> A couple of buildfarm animals are warning about that too ... but
> only a couple.

I'm a bit confused by gcc getting confused here - the condition for
sub_rteperminfos getting initialized and used are the same. Most of the time
the maybe-uninitialized logic seems to be better than this.

Greetings,

Andres Freund




Re: recovery modules

2022-12-27 Thread Andres Freund
Hi,

On 2022-12-27 11:24:49 -0800, Nathan Bossart wrote:
> I've attached a patch set that adds the restore_library,
> archive_cleanup_library, and recovery_end_library parameters to allow
> archive recovery via loadable modules.  This is a follow-up to the
> archive_library parameter added in v15 [0] [1].

Why do we need N parameters for this? To me it seems more sensible to have one
parameter that then allows a library to implement all these (potentially
optionally).


> * Unlike archive modules, recovery libraries cannot be changed at runtime.
> There isn't a safe way to unload a library, and archive libraries work
> around this restriction by restarting the archiver process.  Since recovery
> libraries are loaded via the startup and checkpointer processes (which
> cannot be trivially restarted like the archiver), the same workaround is
> not feasible.

I don't think that's a convincing reason to not support configuration
changes. Sure, libraries cannot be unloaded, but an unnecessarily loaded
library is cheap. All that's needed is to redirect the relevant function
calls.


> * pg_rewind uses restore_command, but there isn't a straightforward path to
> support restore_library.  I haven't addressed this in the attached patches,
> but perhaps this is a reason to allow specifying both restore_command and
> restore_library at the same time.  pg_rewind would use restore_command, and
> the server would use restore_library.

That seems problematic, leading to situations where one might not be able to
use restore_command anymore, because it's not feasible to do
segment-by-segment restoration.


Greetings,

Andres Freund




Re: Patch: Global Unique Index

2022-12-27 Thread David Zhang

On 2022-12-19 7:51 a.m., Nikita Malakhov wrote:

Sorry to bother - but is this patch used in IvorySQL?
Here:
https://www.ivorysql.org/docs/Global%20Unique%20Index/create_global_unique_index
According to syntax it definitely looks like this patch.


The global unique index is one of the features required in IvorySQL 
development. We want to share it with the community to get more 
feedback, and then hopefully we can better contribute it back to 
PostgreSQL.


Best regards,

David





Re: Force streaming every change in logical decoding

2022-12-27 Thread Andres Freund
Hi,

On 2022-12-26 14:04:28 +0530, Amit Kapila wrote:
> Pushed.

I did not follow this thread but saw the commit. Could you explain why a GUC
is the right mechanism here? The commit message didn't explain why a GUC was
chosen.

To me an option like this should be passed in when decoding rather than a GUC.

Greetings,

Andres Freund




Re: Error-safe user functions

2022-12-27 Thread Tom Lane
Andrew Dunstan  writes:
> On Dec 27, 2022, at 12:47 PM, Tom Lane  wrote:
>> Andrew Dunstan  writes:
>>> I think that would leave just hstore to be done.

>> Yeah, that matches my scoreboard.  Are you going to look at
>> hstore, or do you want me to?

> Go for it. 

Done.

regards, tom lane




recovery modules

2022-12-27 Thread Nathan Bossart
I've attached a patch set that adds the restore_library,
archive_cleanup_library, and recovery_end_library parameters to allow
archive recovery via loadable modules.  This is a follow-up to the
archive_library parameter added in v15 [0] [1].

The motivation behind this change is similar to that of archive_library
(e.g., robustness, performance).  The recovery functions are provided via a
similar interface to archive modules (i.e., an initialization function that
returns the function pointers).  Also, I've extended basic_archive to work
as a restore_library, which makes it easy to demonstrate both archiving and
recovery via a loadable module in a TAP test.

A few miscellaneous design notes:

* Unlike archive modules, recovery libraries cannot be changed at runtime.
There isn't a safe way to unload a library, and archive libraries work
around this restriction by restarting the archiver process.  Since recovery
libraries are loaded via the startup and checkpointer processes (which
cannot be trivially restarted like the archiver), the same workaround is
not feasible.

* pg_rewind uses restore_command, but there isn't a straightforward path to
support restore_library.  I haven't addressed this in the attached patches,
but perhaps this is a reason to allow specifying both restore_command and
restore_library at the same time.  pg_rewind would use restore_command, and
the server would use restore_library.

* I've combined the documentation to create one "Archive and Recovery
Modules" chapter.  They are similar enough that it felt silly to write a
separate chapter for recovery modules.  However, I've still split them up
within the chapter, and they have separate initialization functions.  This
retains backward compatibility with v15 archive modules, keeps them
logically separate, and hopefully hints at the functional differences.
Even so, if you want to create one library for both archive and recovery,
there is nothing stopping you.

* Unlike archive modules, I didn't add any sort of "check" or "shutdown"
callbacks.  The recovery_end_library parameter makes a "shutdown" callback
largely redundant, and I couldn't think of any use-case for a "check"
callback.  However, new callbacks could be added in the future if needed.

* Unlike archive modules, restore_library and recovery_end_library may be
loaded in single-user mode.  I believe this works out-of-the-box, but it's
an extra thing to be cognizant of.

* If both the library and command parameters for a recovery action are
specified, the server should fail to start up, but if a misconfiguration is
detected after SIGHUP, we emit a WARNING and continue using the library.  I
originally thought about emitting an ERROR like the archiver does in this
case, but failing recovery and stopping the server felt a bit too harsh.
I'm curious what folks think about this.

* It would be nice to rewrite pg_archivecleanup for use as an
archive_cleanup_library, but I don't think that needs to be a part of this
patch set.

[0] https://postgr.es/m/668D2428-F73B-475E-87AE-F89D67942270%40amazon.com
[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=5ef1eef

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From b2e826baef398998ba93b27c5d68e89d439b4962 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Fri, 23 Dec 2022 16:35:25 -0800
Subject: [PATCH v1 1/3] Move the code to restore files via the shell to a
 separate file.

This is preparatory work for allowing more extensibility in this
area.
---
 src/backend/access/transam/Makefile|   1 +
 src/backend/access/transam/meson.build |   1 +
 src/backend/access/transam/shell_restore.c | 194 +
 src/backend/access/transam/xlog.c  |  44 -
 src/backend/access/transam/xlogarchive.c   | 158 +
 src/include/access/xlogarchive.h   |   7 +-
 6 files changed, 240 insertions(+), 165 deletions(-)
 create mode 100644 src/backend/access/transam/shell_restore.c

diff --git a/src/backend/access/transam/Makefile b/src/backend/access/transam/Makefile
index 661c55a9db..099c315d03 100644
--- a/src/backend/access/transam/Makefile
+++ b/src/backend/access/transam/Makefile
@@ -19,6 +19,7 @@ OBJS = \
 	multixact.o \
 	parallel.o \
 	rmgr.o \
+	shell_restore.o \
 	slru.o \
 	subtrans.o \
 	timeline.o \
diff --git a/src/backend/access/transam/meson.build b/src/backend/access/transam/meson.build
index 65c77531be..a0870217b8 100644
--- a/src/backend/access/transam/meson.build
+++ b/src/backend/access/transam/meson.build
@@ -7,6 +7,7 @@ backend_sources += files(
   'multixact.c',
   'parallel.c',
   'rmgr.c',
+  'shell_restore.c',
   'slru.c',
   'subtrans.c',
   'timeline.c',
diff --git a/src/backend/access/transam/shell_restore.c b/src/backend/access/transam/shell_restore.c
new file mode 100644
index 00..3ddcabd969
--- /dev/null
+++ b/src/backend/access/transam/shell_restore.c
@@ -0,0 +1,194 @@

Re: Error-safe user functions

2022-12-27 Thread Andrew Dunstan



> On Dec 27, 2022, at 12:47 PM, Tom Lane  wrote:
> 
> Andrew Dunstan  writes:
>> Here's a patch that covers the ltree and intarray contrib modules.
> 
> I would probably have done this a little differently --- I think
> the added "res" parameters aren't really necessary for most of
> these.  But it's not worth arguing over.

I’ll take another look 


> 
>> I think that would leave just hstore to be done.
> 
> Yeah, that matches my scoreboard.  Are you going to look at
> hstore, or do you want me to?
> 
>  

Go for it. 

Cheers

Andrew 



Re: Error-safe user functions

2022-12-27 Thread Tom Lane
Andrew Dunstan  writes:
> Here's a patch that covers the ltree and intarray contrib modules.

I would probably have done this a little differently --- I think
the added "res" parameters aren't really necessary for most of
these.  But it's not worth arguing over.

> I think that would leave just hstore to be done.

Yeah, that matches my scoreboard.  Are you going to look at
hstore, or do you want me to?

regards, tom lane




Re: [PATCH] Support using "all" for the db user in pg_ident.conf

2022-12-27 Thread Isaac Morland
On Tue, 27 Dec 2022 at 10:54, Jelte Fennema 
wrote:

This change makes it much easier to have a certain database
> administrator peer or cert authentication, that allows connecting as
> any user. Without this change you would need to add a line to
> pg_ident.conf for every user that is in the database.
>
> In some small sense this is a breaking change if anyone is using "all"
> as a user currently and has pg_ident.conf rules for it. This seems
> unlikely, since "all" was already handled specially in pg_hba.conf.
> Also it can easily be worked around by quoting the all token in
> pg_ident.conf. As long as this is called out in the release notes
> it seems okay to me. However, if others disagree there would
> be the option of changing the token to "pg_all". Since any
> pg_ prefixed users are reserved by postgres there can be no user.
> For now I used "all" though to stay consistent with pg_hba.conf.


+1 from me. I was recently setting up a Vagrant VM for testing and wanted
to allow the OS user which runs the application to connect to the database
as whatever user it wants, and was surprised to find I had to list all the
potential target DB users in pg_ident.conf (in production it uses
password authentication, and each server gets just the passwords it needs
stored in ~/.pgpass). I like the idea that both config files would be
consistent, although the use of keywords such as "replication" in the DB
column has always made me a bit uncomfortable.

Related question: is there a reason why pg_ident.conf can't/shouldn't be
replaced by a system table? As far as I can tell, it's just a 3-column
table, essentially, with all columns in the primary key. This latest
proposal changes that a little; strictly, it should probably introduce a
second table with just two columns identifying which OS users can connect
as any user, but existing system table style seems to suggest that we would
just use a special value in the DB user column for "all".


[PATCH] Support using "all" for the db user in pg_ident.conf

2022-12-27 Thread Jelte Fennema
While pg_hba.conf has supported the "all" keyword since a very long
time, pg_ident.conf doesn't have this same functionality. This changes
permission checking in pg_ident.conf to handle "all" differently from
any other value in the database-username column. If "all" is specified
and the system-user matches the identifier, then the user is allowed to
authenticate no matter what user it tries to authenticate as.

This change makes it much easier to give a certain database
administrator peer or cert authentication that allows connecting as
any user. Without this change you would need to add a line to
pg_ident.conf for every user that is in the database.

In some small sense this is a breaking change if anyone is using "all"
as a user currently and has pg_ident.conf rules for it. This seems
unlikely, since "all" was already handled specially in pg_hba.conf.
Also it can easily be worked around by quoting the "all" token in
pg_ident.conf. As long as this is called out in the release notes
it seems okay to me. However, if others disagree there would
be the option of changing the token to "pg_all". Since any
pg_-prefixed users are reserved by postgres, there can be no such user.
For now I used "all" though, to stay consistent with pg_hba.conf.
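To illustrate the intended usage, a map along these lines (the map name and OS user are hypothetical, not taken from the patch's tests):

```
# pg_ident.conf
# MAPNAME   SYSTEM-USERNAME   PG-USERNAME
admins      dba               all      # dba may connect as any database user
quoted      joe               "all"    # quoted: matches only a user literally named "all"
```

A matching pg_hba.conf line would then reference the map (e.g. map=admins) with peer or cert authentication.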

v1-0001-Support-using-all-for-the-db-user-in-pg_ident.con.patch


False positive warning in verify_heapam.c with GCC 03

2022-12-27 Thread Maxim Orlov
Hi!

While playing with some stuff unrelated to the topic, I've noticed a
strange warning from verify_heapam.c:730:25:
warning: ‘xmax_status’ may be used uninitialized in this function.

This happens only when get_xid_status is inlined, and only in GCC with -O3.
I use GCC version 11.3.0.

For the purpose of investigation, I've created a PFA patch to force
get_xid_status to be inlined.

$ CFLAGS="-O3" ./configure -q && make -s -j12 >/dev/null && make -s -j12 -C
contrib
verify_heapam.c: In function ‘check_tuple_visibility’:
verify_heapam.c:730:25: warning: ‘xmax_status’ may be used uninitialized in
this function [-Wmaybe-uninitialized]
  730 | XidCommitStatus xmax_status;
  | ^~~
verify_heapam.c:909:25: warning: ‘xmin_status’ may be used uninitialized in
this function [-Wmaybe-uninitialized]
  909 | else if (xmin_status != XID_COMMITTED)
  |

I believe this warning is a false positive, since the mentioned row is
unreachable. If the xid is invalid, get_xid_status returns XID_INVALID
and we cannot get past
 770 if (HeapTupleHeaderXminInvalid(tuphdr))
 771     return false;   /* inserter aborted, don't check */

On the other hand, we could simply initialize the status variable in
get_xid_status and make this code more errors/warnings safe. Thoughts?

-- 
Best regards,
Maxim Orlov.


0002-Init-status-var-in-get_xid_status.patch

0001-Force-inline-of-get_xid_status.patch


Re: Making Vars outer-join aware

2022-12-27 Thread Tom Lane
Richard Guo  writes:
> For 0012, I'm still trying to understand JoinDomain.  AFAIU all EC
> members of the same EC should have the same JoinDomain, because for
> constants we match EC members only within the same JoinDomain, and for
> Vars if they come from different join domains they will have different
> nullingrels and thus will not match.  So I wonder if we can have the
> JoinDomain kept in EquivalenceClass rather than in each
> EquivalenceMembers.

Yeah, I tried to do it like that at first, and failed.  There is
some sort of association between ECs and join domains, for sure,
but exactly what it is seems to need more elucidation.

The thing that I couldn't get around before is that if you have,
say, a mergejoinable equality clause in an outer join:

select ... from a left join b on a.x = b.y;

that equality clause can only be associated with the join domain
for B, because it certainly can't be enforced against A.  However,
you'd still wish to be able to do a mergejoin using indexes on
a.x and b.y, and this means that we have to understand the ordering
induced by a PathKey based on this EC as applicable to A, even
though that relation is not in the same join domain.  So there are
situations where sort orderings apply across domain boundaries even
though equalities don't.  We might have to split the notion of
EquivalenceClass into two sorts of objects, and somewhere right
about here is where I realized that this wasn't getting finished
for v16 :-(.

So the next pass at this is likely going to involve some more
refactoring, and maybe we'll end up saying that an EquivalenceClass
is tightly bound to a join domain or maybe we won't.  For the moment
it seemed to work better to associate domains with only the const
members of ECs.  (As written, the patch does fill em_jdomain even
for non-const members, but that was just for simplicity.  I'd
originally meant to make it NULL for non-const members, but that
turned out to be a bit too tedious because the responsibility for
marking a member as const or not is split among several places.)

Another part of the motivation for doing it like that is that
I've been thinking about having just a single common pool of
EquivalenceMember objects, and turning EquivalenceClasses into
bitmapsets of indexes into the shared EquivalenceMember list.
This would support having ECs that share some member(s) without
being exactly the same thing, which I think might be necessary
to get to the point of treating outer-join clauses as creating
EC equalities.

BTW, I can't escape the suspicion that I've reinvented an idea
that's already well known in the literature.  Has anyone seen
something like this "join domain" concept before, and if so
what was it called?

regards, tom lane




Re: Underscores in numeric literals

2022-12-27 Thread Justin Pryzby
On Tue, Dec 27, 2022 at 09:55:32AM -0500, Tom Lane wrote:
> Peter Eisentraut  writes:
> > Here is a patch to add support for underscores in numeric literals, for 
> > visual grouping, like
> 
> >  1_500_000_000
> >  0b10001000_
> >  0o_1_755
> >  0x_
> >  1.618_034
> 
> > per SQL:202x draft.
> 
> > This adds support in the lexer as well as in the integer type input 
> > functions.
> > TODO: float/numeric type input support
> 
> Hmm ... I'm on board with allowing this in SQL if the committee says
> so.

> I'm not especially on board with accepting it in datatype input
> functions.  There's been zero demand for that AFAIR.  Moreover,
> I don't think we need the inevitable I/O performance hit, nor the
> increased risk of accepting garbage, nor the certainty of
> inconsistency with other places that don't get converted (because
> they depend on strtoul() or whatever).

+1 to accept underscores only in literals and leave input functions
alone.

(When I realized that python3.6 changed to accept things like
int("3_5"), I felt compelled to write a wrapper to check for embedded
underscores and raise an exception in that case.  And I'm sure it
affected performance.)

-- 
Justin




Re: Underscores in numeric literals

2022-12-27 Thread Tom Lane
Peter Eisentraut  writes:
> Here is a patch to add support for underscores in numeric literals, for 
> visual grouping, like

>  1_500_000_000
>  0b10001000_
>  0o_1_755
>  0x_
>  1.618_034

> per SQL:202x draft.

> This adds support in the lexer as well as in the integer type input 
> functions.
> TODO: float/numeric type input support

Hmm ... I'm on board with allowing this in SQL if the committee says
so.  I'm not especially on board with accepting it in datatype input
functions.  There's been zero demand for that AFAIR.  Moreover,
I don't think we need the inevitable I/O performance hit, nor the
increased risk of accepting garbage, nor the certainty of
inconsistency with other places that don't get converted (because
they depend on strtoul() or whatever).

We already accept that numeric input is different from numeric
literals: you can't write Infinity or NaN in SQL without quotes.
So I don't see an argument that we have to allow this in numeric
input for consistency.

regards, tom lane




Re: [BUG] pg_upgrade test fails from older versions.

2022-12-27 Thread Anton A. Melnikov

On 27.12.2022 16:50, Michael Paquier wrote:


If there are no other considerations could you close the corresponding
record on the January CF, please?


Indeed, now marked as committed.

-

Thanks a lot!

Merry Christmas!

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [BUG] pg_upgrade test fails from older versions.

2022-12-27 Thread Michael Paquier
On Tue, Dec 27, 2022 at 03:26:10PM +0300, Anton A. Melnikov wrote:
> I would like to try realize this, better in a separate thread.

I don't think that this should be added into the tree, but if you have
per-version filtering rules, of course feel free to publish that to
the lists.  I am sure that this could be helpful for others.

> If there are no other considerations could you close the corresponding
> record on the January CF, please?

Indeed, now marked as committed.
--
Michael




Re: Error-safe user functions

2022-12-27 Thread Andrew Dunstan


On 2022-12-26 Mo 18:00, Tom Lane wrote:
> I wrote:
>> (Perhaps we should go further than this, and convert all these
>> functions to just be DirectInputFunctionCallSafe wrappers
>> around the corresponding input functions?  That would save
>> some duplicative code, but I've not done it here.)
> I looked closer at that idea, and realized that it would do more than
> just save some code: it'd cause the to_regfoo functions to accept
> numeric OIDs, as they did not before (and are documented not to).
> It is unclear to me whether that inconsistency with the input
> functions is really desirable or not --- but I don't offhand see a
> good argument for it.  If we change this though, it should probably
> happen in a separate commit.  Accordingly, here's a delta patch
> doing that.
>
>   


+1 for doing this. The code simplification is nice too.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Error-safe user functions

2022-12-27 Thread Andrew Dunstan

On 2022-12-26 Mo 14:12, Andrew Dunstan wrote:
> On 2022-12-26 Mo 12:47, Tom Lane wrote:
>> Here's a proposed patch for making tsvectorin and tsqueryin
>> report errors softly.  We have to take the changes down a
>> couple of levels of subroutines, but it's not hugely difficult.
>
> Great!
>
>
>> With the other patches I've posted recently, this covers all
>> of the core datatype input functions.  There are still half
>> a dozen to tackle in contrib.
>>
>>  
>
> Yeah, I'm currently looking at those in ltree.
>
>

Here's a patch that covers the ltree and intarray contrib modules. I
think that would leave just hstore to be done.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
diff --git a/contrib/intarray/_int_bool.c b/contrib/intarray/_int_bool.c
index 3ed88af76d..77bbf77c43 100644
--- a/contrib/intarray/_int_bool.c
+++ b/contrib/intarray/_int_bool.c
@@ -149,8 +149,8 @@ pushquery(WORKSTATE *state, int32 type, int32 val)
 /*
  * make polish notation of query
  */
-static int32
-makepol(WORKSTATE *state)
+static bool
+makepol(WORKSTATE *state, int32 *res, struct Node *escontext)
 {
 	int32		val,
 type;
@@ -179,7 +179,7 @@ makepol(WORKSTATE *state)
 else
 {
 	if (lenstack == STACKDEPTH)
-		ereport(ERROR,
+		ereturn(escontext, false,
 (errcode(ERRCODE_STATEMENT_TOO_COMPLEX),
  errmsg("statement too complex")));
 	stack[lenstack] = val;
@@ -187,8 +187,10 @@ makepol(WORKSTATE *state)
 }
 break;
 			case OPEN:
-if (makepol(state) == ERR)
-	return ERR;
+if (!makepol(state, res, escontext))
+	return false;
+if (*res == ERR)
+	return true;
 while (lenstack && (stack[lenstack - 1] == (int32) '&' ||
 	stack[lenstack - 1] == (int32) '!'))
 {
@@ -202,14 +204,14 @@ makepol(WORKSTATE *state)
 	lenstack--;
 	pushquery(state, OPR, stack[lenstack]);
 };
-return END;
+*res = END;
+return true;
 break;
 			case ERR:
 			default:
-ereport(ERROR,
+ereturn(escontext, false,
 		(errcode(ERRCODE_SYNTAX_ERROR),
 		 errmsg("syntax error")));
-return ERR;
 		}
 	}
 
@@ -218,7 +220,8 @@ makepol(WORKSTATE *state)
 		lenstack--;
 		pushquery(state, OPR, stack[lenstack]);
 	};
-	return END;
+	*res = END;
+	return true;
 }
 
 typedef struct
@@ -483,6 +486,8 @@ bqarr_in(PG_FUNCTION_ARGS)
 	ITEM	   *ptr;
 	NODE	   *tmp;
 	int32		pos = 0;
+	int32		polres;
+	struct Node *escontext = fcinfo->context;
 
 #ifdef BS_DEBUG
 	StringInfoData pbuf;
@@ -495,14 +500,15 @@ bqarr_in(PG_FUNCTION_ARGS)
 	state.str = NULL;
 
 	/* make polish notation (postfix, but in reverse order) */
-	makepol(&state);
+	if (!makepol(&state, &polres, escontext))
+		PG_RETURN_NULL();
 	if (!state.num)
-		ereport(ERROR,
+		ereturn(escontext, (Datum) 0,
 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
  errmsg("empty query")));
 
 	if (state.num > QUERYTYPEMAXITEMS)
-		ereport(ERROR,
+		ereturn(escontext, (Datum) 0,
 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
  errmsg("number of query items (%d) exceeds the maximum allowed (%d)",
 		state.num, (int) QUERYTYPEMAXITEMS)));
diff --git a/contrib/intarray/expected/_int.out b/contrib/intarray/expected/_int.out
index a09d40efa1..c953065a5c 100644
--- a/contrib/intarray/expected/_int.out
+++ b/contrib/intarray/expected/_int.out
@@ -398,6 +398,21 @@ SELECT '1&(2&(4&(5|!6)))'::query_int;
  1 & 2 & 4 & ( 5 | !6 )
 (1 row)
 
+-- test non-error-throwing input
+SELECT str as "query_int",
+   pg_input_is_valid(str,'query_int') as ok,
+   pg_input_error_message(str,'query_int') as errmsg
+FROM (VALUES ('1&(2&(4&(5|6)))'),
+ ('1#(2&(4&(5&6)))'),
+ ('foo'))
+  AS a(str);
+    query_int    | ok |    errmsg    
+-----------------+----+--------------
+ 1&(2&(4&(5|6))) | t  | 
+ 1#(2&(4&(5&6))) | f  | syntax error
+ foo             | f  | syntax error
+(3 rows)
+
 CREATE TABLE test__int( a int[] );
 \copy test__int from 'data/test__int.data'
 ANALYZE test__int;
diff --git a/contrib/intarray/sql/_int.sql b/contrib/intarray/sql/_int.sql
index b26fc57e4d..4c9ba4c1fb 100644
--- a/contrib/intarray/sql/_int.sql
+++ b/contrib/intarray/sql/_int.sql
@@ -75,6 +75,17 @@ SELECT '1&2&4&5&6'::query_int;
 SELECT '1&(2&(4&(5|6)))'::query_int;
 SELECT '1&(2&(4&(5|!6)))'::query_int;
 
+-- test non-error-throwing input
+
+SELECT str as "query_int",
+   pg_input_is_valid(str,'query_int') as ok,
+   pg_input_error_message(str,'query_int') as errmsg
+FROM (VALUES ('1&(2&(4&(5|6)))'),
+ ('1#(2&(4&(5&6)))'),
+ ('foo'))
+  AS a(str);
+
+
 
 CREATE TABLE test__int( a int[] );
 \copy test__int from 'data/test__int.data'
diff --git a/contrib/ltree/expected/ltree.out b/contrib/ltree/expected/ltree.out
index c6d8f3ef75..b95be71c78 100644
--- a/contrib/ltree/expected/ltree.out
+++ b/contrib/ltree/expected/ltree.out
@@ -8084,3 +8084,28 @@ SELECT count(*) FROM _ltreetest WHERE t ? '{23.*.1,23.*.2}' ;
 15
 (1 row)
 
+-- test 

Re: [BUG] pg_upgrade test fails from older versions.

2022-12-27 Thread Anton A. Melnikov

Hello!

On 27.12.2022 08:44, Michael Paquier wrote:


It is worth noting that perlcritic was complaining here, as eval is
getting used with a string.  I have spent a few days looking at that,
and I really want a maximum of flexibility in the rules that can be
applied so I have put a "no critic" rule, which is fine by me as this
extra file is something owned by the user and it would apply only to
cross-version upgrades.


I think it's a very smart decision. Thank you very much!


So it looks like we are now done here..  With all these pieces in
place in the tests, I don't see why it would not be possible to
automate the cross-version tests of pg_upgrade.


I've checked the cross-version upgrade test from 9.5+ to current master
and have found no problems, up to dump filtering.
For cross-version test automation one would have to write additional
filtering rules in the external files.

I would like to try realize this, better in a separate thread.
If there are no other considerations could you close the corresponding
record on the January CF, please?


With the best wishes!

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Data loss on logical replication, 12.12 to 14.5, ALTER SUBSCRIPTION

2022-12-27 Thread Michail Nikolaev
Hello, Amit!

> IUC, this is the time when only table B's initial sync was
> in-progress. Table A's initial sync was finished by that time and for
> Table C, it is yet not started.
Yes, it is correct. C was started too, but unsuccessfully (it was
restarted afterwards, see below).

> During the time of
> the initial sync of B, are there any other operations on that table
> apart from the missing ones?
Yes, a lot of them. Tuples were created all the time without any pauses;
only a ~10s interval is missing.

> You may want to see the LOGs of
> subscribers during the initial sync time for any errors or other
> messages.

Looks like I have found something interesting. Below, sorted by time
with some comments.

Restart of the logical apply worker the day before the issue.

-2022-12-13 21:04:25 UTC-6398e8d9.4ec3-LOG: logical
replication apply worker for subscription
"cloud_production_main_sub_v4" has started
-2022-12-13 21:04:26 UTC-6398e8d9.4ec3-ERROR: could not start
WAL streaming: FATAL: out of relcache_callback_list slots CONTEXT:
slot "cloud_production_main_sub_v4", output plugin "pgoutput", in the
startup callback ERROR: odyssey: c1747b31d0187: remote server
read/write error s721c052ace56: (null) SSL SYSCALL error: EOF detected
-2022-12-13 21:04:26 UTC-639872fa.1-LOG: background worker
"logical replication worker" (PID 20163) exited with exit code 1
-2022-12-13 21:04:31 UTC-6398e8df.4f7c-LOG: logical
replication apply worker for subscription
"cloud_production_main_sub_v4" has started

Start of B and C table initial sync (adding tables to the
subscription, table A already in streaming mode):

-2022-12-14 08:08:34 UTC-63998482.7d10-LOG: logical
replication table synchronization worker for subscription
"cloud_production_main_sub_v4", table "B" has started
-2022-12-14 08:08:34 UTC-63998482.7d13-LOG: logical
replication table synchronization worker for subscription
"cloud_production_main_sub_v4", table "C" has started

B is synchronized without any errors:

-2022-12-14 10:19:08 UTC-63998482.7d10-LOG: logical
replication table synchronization worker for subscription
"cloud_production_main_sub_v4", table "B" has finished

After about 15 seconds, the replication worker is restarted because of
I/O issues:

-2022-12-14 10:19:22 UTC-6398e8df.4f7c-ERROR: could not
receive data from WAL stream: SSL SYSCALL error: EOF detected
-2022-12-14 10:19:22 UTC-6399a32a.6af3-LOG: logical
replication apply worker for subscription
"cloud_production_main_sub_v4" has started
-2022-12-14 10:19:22 UTC-639872fa.1-LOG: background worker
"logical replication worker" (PID 20348) exited with exit code 1

Then a query cancellation (something about an insert into public.lsnmover):

-2022-12-14 10:21:03 UTC-63999c6a.f25d-ERROR: canceling
statement due to user request
-2022-12-14 10:21:39 UTC-639997f9.fd6e-ERROR: canceling
statement due to user request

After a small amount of time, synchronous replication seems to be
broken; we see tons of:

-2022-12-14 10:24:18 UTC-63999d2c.2020-WARNING: shutdown
requested while waiting for synchronous replication ack.
-2022-12-14 10:24:18 UTC-63999d2c.2020-WARNING: user requested
cancel while waiting for synchronous

After a few minutes, at 10:36:05, we initiated a database restart:

-2022-12-14 10:35:20 UTC-63999c6a.f25d-ERROR: canceling
statement due to user request
-2022-12-14 10:37:25 UTC-6399a765.1-LOG: pgms_stats: Finishing PG_init
-2022-12-14 10:37:26 UTC-6399a765.1-LOG: listening on IPv4
address "0.0.0.0", port 5432
-2022-12-14 10:37:26 UTC-6399a765.1-LOG: listening on IPv6
address "::", port 5432
-2022-12-14 10:37:26 UTC-6399a765.1-LOG: starting PostgreSQL
14.4 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
-2022-12-14 10:37:26 UTC-6399a765.1-LOG: listening on Unix
socket "/tmp/.s.PGSQL.5432"
-2022-12-14 10:37:26 UTC-6399a766.103-LOG: database system was
interrupted; last known up at 2022-12-14 10:36:47 UTC
-2022-12-14 10:37:26 UTC-6399a766.105-FATAL: the database
system is starting up
-2022-12-14 10:37:26 UTC-6399a766.107-FATAL: the database
system is starting up
-2022-12-14 10:37:27 UTC-6399a767.108-FATAL: the database
system is starting up
-2022-12-14 10:37:27 UTC-6399a767.109-FATAL: the database
system is starting up
-2022-12-14 10:37:27 UTC-6399a767.10a-FATAL: the database
system is starting up
-2022-12-14 10:37:27 UTC-6399a767.10b-FATAL: the database
system is starting up
-2022-12-14 10:37:27 UTC-6399a767.113-FATAL: the database
system is starting up
-2022-12-14 10:37:29 UTC-6399a766.103-LOG: recovered
replication state of node 4 to 185F6/CBF4BBB8
-2022-12-14 10:37:29 UTC-6399a766.103-LOG: recovered
replication state of node 1 to 18605/FE229E10
-2022-12-14 10:37:29 UTC-6399a766.103-LOG: recovered
replication state of node 6 to 185F8/86AEC880

Re: Passing relation metadata to Exec routine

2022-12-27 Thread Nikita Malakhov
Hi Tom!

Thank you for your feedback. I agree that for complex columns
created with joins, grouping, etc considering properties of the base
table does not make sense at all.

But for CREATE TABLE LIKE and simple columns that are inherited
from some existing relations, it does make sense: if we consider some
advanced properties, then from the user's perspective we want our new
table columns to behave exactly like the base ones (in ways like
encryption, storage, compression methods, etc.). LIKE options are a
good idea, thank you, but when we CREATE TABLE AS, maybe we should
take it into account too?

I agree that passing these parameters in a backdoor fashion is not
very transparent and user-friendly, too.

On Tue, Dec 27, 2022 at 12:56 AM Tom Lane  wrote:

> Nikita Malakhov  writes:
> > While working on Pluggable TOAST [1] we found out that creation
> > of new relation with CREATE TABLE AS... or CREATE TABLE LIKE -
> > method
> > static ObjectAddress create_ctas_internal(List *attrList, IntoClause
> *into)
> > does not receive any metadata from columns or tables used in query
> > (if any). It makes sense to pass not only column type and size, but
> > all other metadata - like attoptions,base relation OID (and, maybe,
> > reloptions), if the column from existing relation was used.
>
> I am very, very skeptical of the premise here.
>
> CREATE TABLE AS creates a table based on the output of a query.
> That query could involve arbitrarily complex processing -- joins,
> grouping, what-have-you.  I don't see how it makes sense to
> consider the properties of the base table(s) in deciding how to
> create the output table.  I certainly do not think it'd be sane for
> that to behave differently depending on how complicated the query is.
>
> As for CREATE TABLE LIKE, the point of that is to create a table
> by copying a *specified* set of properties of a reference table.
> I don't think letting an access method copy some other properties
> behind the user's back is a great idea.  If you've got things that
> it'd make sense to be able to inherit, let's discuss adding more
> LIKE options to support that --- in which case the implementation
> would presumably pass the data through in a non-back-door fashion.
>
> regards, tom lane
>


-- 
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/


Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL

2022-12-27 Thread Bharath Rupireddy
On Mon, Dec 26, 2022 at 4:18 PM Bharath Rupireddy
 wrote:
>
> On Mon, Dec 26, 2022 at 12:59 PM Michael Paquier  wrote:
>
> +1. I think this feature will also be useful in pg_walinspect.

Just for the record - here's the pg_walinspect function to extract
FPIs from WAL records -
https://www.postgresql.org/message-id/CALj2ACVCcvzd7WiWvD%3D6_7NBvVB_r6G0EGSxL4F8vosAi6Se4g%40mail.gmail.com.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Add a new pg_walinspect function to extract FPIs from WAL records

2022-12-27 Thread Bharath Rupireddy
Hi,

Here's a patch that implements the idea of extracting full page images
from WAL records [1] [2] with a function in pg_walinspect. This new
function accepts start and end lsn and returns full page image info
such as WAL record lsn, tablespace oid, database oid, relfile number,
block number, fork name and the raw full page (as bytea). I'll
register this in the next commitfest.

Thoughts?

[1] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=d497093cbecccf6df26365e06a5f8f8614b591c8
[2] 
https://postgr.es/m/CAOxo6XKjQb2bMSBRpePf3ZpzfNTwjQUc4Tafh21=jzjx6bx...@mail.gmail.com
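
As a rough illustration of what such a function does (the record and block
structures below are invented for the sketch; the real implementation is C
code inside pg_walinspect), it conceptually scans decoded WAL records
between two LSNs and emits one row per full page image found:

```python
def extract_fpis(records, start_lsn, end_lsn):
    """Yield one tuple per full page image in [start_lsn, end_lsn)."""
    for rec in records:
        if not (start_lsn <= rec["lsn"] < end_lsn):
            continue
        for blk in rec["blocks"]:
            if blk.get("fpi") is not None:  # block carries a full page image
                yield (rec["lsn"], blk["relfilenumber"], blk["blockno"],
                       blk["fork"], blk["fpi"])
```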

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v1-0001-Add-FPI-extract-function-to-pg_walinspect.patch
Description: Binary data


Re: Perform streaming logical transactions by background workers and parallel apply

2022-12-27 Thread Amit Kapila
On Tue, Dec 27, 2022 at 10:24 AM wangw.f...@fujitsu.com
 wrote:
>
> Attach the new version patch which addressed all above comments and part of
> comments from [1] except one comment that are being discussed.
>

1.
+# Test that the deadlock is detected among leader and parallel apply workers.
+
+$node_subscriber->append_conf('postgresql.conf', "deadlock_timeout = 1ms");
+$node_subscriber->reload;
+

A. I see that the other existing tests have deadlock_timeout set as
10ms, 100ms, 100s, etc. Is there a reason to keep it so low here? Shall
we keep it as 10ms?
B. /among leader/among the leader

2. Can we leave having tests in 022_twophase_cascade to be covered by
parallel mode? The two-phase and parallel apply will be covered by
023_twophase_stream, so not sure if we get any extra coverage by
022_twophase_cascade.

3. Let's combine 0001 and 0002 as both have got reviewed independently.

-- 
With Regards,
Amit Kapila.




Re: [feature]COPY FROM enable FORCE_NULL/FORCE_NOT_NULL on all columns

2022-12-27 Thread Melih Mutlu
Hi,

Having FORCE_NULL(*) and FORCE_NOT_NULL(*) sounds good, since postgres
already has FORCE_QUOTE(*).

I just quickly tried out your patch. It worked for me as expected.

One little suggestion:

> + if (cstate->opts.force_notnull_all)
> + {
> + int i;
> + for(i = 0; i < num_phys_attrs; i++)
> + cstate->opts.force_notnull_flags[i] = true;
> + }


Instead of setting force_null/force_notnull flags for all columns, what
about simply setting the "attnums" list to cstate->attnumlist?
Something like the following should be enough:

> if (cstate->opts.force_null_all)
>attnums = cstate->attnumlist;
> else
>attnums = CopyGetAttnums(tupDesc, cstate->rel, cstate->opts.force_null);
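
In pseudocode terms (an illustrative Python sketch only; the field and
function names here loosely mimic the C code and are not the actual
CopyFromState API), the difference between the two approaches is roughly:

```python
def force_null_attnums(all_attnums, force_null_all, force_null_cols):
    """Return the column numbers to which FORCE_NULL applies."""
    if force_null_all:
        # The suggestion above: with FORCE_NULL(*), simply reuse the
        # full column list instead of setting a per-column flag array.
        return list(all_attnums)
    # Otherwise, only the explicitly listed columns apply.
    return [a for a in all_attnums if a in force_null_cols]
```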


Thanks,
-- 
Melih Mutlu
Microsoft


RE: Time delayed LR (WAS Re: logical replication restrictions)

2022-12-27 Thread Hayato Kuroda (Fujitsu)
Dear Dilip,

Thanks for reviewing our patch! PSA new version patch set.
Again, 0001 is not made by us, brought from [1].

> I have done some review for the patch and I have a few comments.
>
> 1.
> A.
> +   wal_sender_timeout on the publisher. Otherwise, the
> +   walsender repeatedly terminates due to timeout during the delay of
> +   the subscriber.
> 
> 
> B.
> +/*
> + * In order to avoid walsender's timeout during time delayed replication,
> + * it's necessaary to keep sending feedbacks during the delay from the worker
> + * process. Meanwhile, the feature delays the apply before starting the
> + * transaction and thus we don't write WALs for the suspended changes during
> + * the wait. Hence, in the case the worker process sends a feedback during 
> the
> + * delay, avoid having positions of the flushed and apply LSN overwritten by
> + * the latest LSN.
> + */
> 
> - Seems like these two statements are conflicting, I mean if we are
> sending feedback then why the walsender will timeout?

A timeout can still occur because the interval between feedback
messages may become longer than wal_sender_timeout. Reworded and added
descriptions.

> - Typo /necessaary/necessary

Fixed.

> 2.
> + *
> + * During the time delayed replication, avoid reporting the suspeended
> + * latest LSN are already flushed and written, to the publisher.
>   */
> Typo /suspeended/suspended

Fixed.

> 3.
> +if (wal_receiver_status_interval > 0
> +&& diffms > wal_receiver_status_interval)
> +{
> +WaitLatch(MyLatch,
> +  WL_LATCH_SET | WL_TIMEOUT |
> WL_EXIT_ON_PM_DEATH,
> +  (long) wal_receiver_status_interval,
> +  WAIT_EVENT_RECOVERY_APPLY_DELAY);
> +send_feedback(last_received, true, false);
> +}
> +else
> +WaitLatch(MyLatch,
> +  WL_LATCH_SET | WL_TIMEOUT |
> WL_EXIT_ON_PM_DEATH,
> +  diffms,
> +  WAIT_EVENT_RECOVERY_APPLY_DELAY);
> 
> I think here we should add some comments to explain about sending
> feedback, something like what we have explained at the time of
> defining the "in_delaying_apply" variable.

Added.

> 4.
> 
> + * Although the delay is applied in BEGIN messages, streamed transactions
> + * apply the delay in a STREAM COMMIT message. That's ok because no
> + * changes have been applied yet (apply_spooled_messages() will do it).
> + * The STREAM START message would be a natural choice for this delay
> but
> + * there is no commit time yet (it will be available when the in-progress
> + * transaction finishes), hence, it was not possible to apply a delay at
> + * that time.
> + */
> +maybe_delay_apply(commit_data.committime);
> 
> I am wondering how this will interact with the parallel apply worker
> where we do not spool the data in file?  How are we going to get the
> commit time of the transaction without applying the changes?

We think that parallel apply workers should not delay applying changes,
because if they delay transactions before committing, they may hold locks
for a very long time.

> 5.
> +/*
> + * The following operations use these special functions to detect
> + * overflow. Number of ms per informed days.
> + */
> 
> This comment doesn't make much sense, I think this needs to be rephrased.

Changed to simpler expression.

We have also fixed an incorrect usage of wal_receiver_status_interval:
the unit must be converted from seconds to milliseconds when it is passed
to WaitLatch().
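
To illustrate the interaction under review (a hedged Python sketch, not
the apply-worker C code; function names are invented), the wait loop keeps
the walsender alive by waking early to send feedback whenever the remaining
delay exceeds the status interval, with the seconds-to-milliseconds
conversion applied before waiting:

```python
def delay_with_feedback(total_delay_ms, status_interval_s, wait, send_feedback):
    """Sleep for total_delay_ms, sending periodic feedback so the
    publisher's walsender does not hit wal_sender_timeout."""
    remaining = total_delay_ms
    interval_ms = status_interval_s * 1000   # GUC is in seconds; wait takes ms
    while remaining > 0:
        if status_interval_s > 0 and remaining > interval_ms:
            wait(interval_ms)                # wake early ...
            send_feedback()                  # ... to keep the connection alive
            remaining -= interval_ms
        else:
            wait(remaining)                  # final stretch of the delay
            remaining = 0
```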


Note that more than half of the modifications are done by Osumi-san.

[1]: https://www.postgresql.org/message-id/20221215224721.GA694065%40nathanxps13

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



v12-0001-wake-up-logical-workers-as-needed-instead-of-rel.patch
Description:  v12-0001-wake-up-logical-workers-as-needed-instead-of-rel.patch


v12-0002-Time-delayed-logical-replication-subscriber.patch
Description:  v12-0002-Time-delayed-logical-replication-subscriber.patch


Re: Exit walsender before confirming remote flush in logical replication

2022-12-27 Thread Amit Kapila
On Tue, Dec 27, 2022 at 2:50 PM Amit Kapila  wrote:
>
> On Tue, Dec 27, 2022 at 1:44 PM Hayato Kuroda (Fujitsu)
>  wrote:
> >
> > Thanks for checking my proposal!
> >
> > > - * Note that if we determine that there's still more data to send, this
> > > - * function will return control to the caller.
> > > + * Note that if we determine that there's still more data to send or we 
> > > are in
> > > + * the physical replication more, this function will return control to 
> > > the
> > > + * caller.
> > >
> > > I think in this comment you meant to say
> > >
> > > 1. "or we are in physical replication mode and all WALs are not yet 
> > > replicated"
> > > 2. Typo /replication more/replication mode
> >
> > Firstly I considered 2, but I thought 1 seemed to be better.
> > PSA the updated patch.
> >
>
> I think even for logical replication we should check whether there is
> any pending WAL (via pq_is_send_pending()) to be sent. Otherwise, what
> is the point to send the done message? Also, the caller of
> WalSndDone() already has that check which is another reason why I
> can't see why you didn't have the same check in function WalSndDone().
>
> BTW, even after fixing this, I think logical replication will behave
> differently when due to some reason (like time-delayed replication)
> send buffer gets full and walsender is not able to send data. I think
> this will be less of an issue with physical replication because there
> is a separate walreceiver process to flush the WAL which doesn't wait
> but the same is not true for logical replication. Do you have any
> thoughts on this matter?
>

In logical replication, it can happen today as well without
time-delayed replication. Basically, say the apply worker is waiting to
acquire some lock that is already acquired by some backend; then it
will have the same behavior. I have not verified this, so you may want
to check it once.

-- 
With Regards,
Amit Kapila.




Re: Exit walsender before confirming remote flush in logical replication

2022-12-27 Thread Amit Kapila
On Tue, Dec 27, 2022 at 1:44 PM Hayato Kuroda (Fujitsu)
 wrote:
>
> Thanks for checking my proposal!
>
> > - * Note that if we determine that there's still more data to send, this
> > - * function will return control to the caller.
> > + * Note that if we determine that there's still more data to send or we 
> > are in
> > + * the physical replication more, this function will return control to the
> > + * caller.
> >
> > I think in this comment you meant to say
> >
> > 1. "or we are in physical replication mode and all WALs are not yet 
> > replicated"
> > 2. Typo /replication more/replication mode
>
> Firstly I considered 2, but I thought 1 seemed to be better.
> PSA the updated patch.
>

I think even for logical replication we should check whether there is
any pending WAL (via pq_is_send_pending()) to be sent. Otherwise, what
is the point to send the done message? Also, the caller of
WalSndDone() already has that check which is another reason why I
can't see why you didn't have the same check in function WalSndDone().

BTW, even after fixing this, I think logical replication will behave
differently when due to some reason (like time-delayed replication)
send buffer gets full and walsender is not able to send data. I think
this will be less of an issue with physical replication because there
is a separate walreceiver process to flush the WAL which doesn't wait
but the same is not true for logical replication. Do you have any
thoughts on this matter?
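
The exit condition being discussed can be sketched roughly as follows
(illustrative Python, not the actual WalSndDone() code; the
logical-replication branch shown is the proposal under discussion, not
current PostgreSQL behavior):

```python
def can_exit_walsender(is_logical, send_pending, sent_lsn, replicated_lsn):
    """Decide whether the walsender may send the done message and exit."""
    if send_pending:          # pq_is_send_pending(): data still in send buffer
        return False
    if is_logical:
        return True           # proposal: don't wait for remote flush confirmation
    return replicated_lsn >= sent_lsn   # physical: wait until all WAL replicated
```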

-- 
With Regards,
Amit Kapila.




Underscores in numeric literals

2022-12-27 Thread Peter Eisentraut
Here is a patch to add support for underscores in numeric literals, for 
visual grouping, like


1_500_000_000
0b10001000_
0o_1_755
0x_
1.618_034

per SQL:202x draft.

This adds support in the lexer as well as in the integer type input 
functions.


TODO: float/numeric type input support

I did some performance tests similar to what was done in [0] and didn't 
find any problematic deviations.  Other tests would be welcome.
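
The flex rules in the patch (for example, decinteger becoming
{decdigit}(_?{decdigit})*) can be mirrored in ordinary regular
expressions. The following Python sketch (not the PostgreSQL lexer itself)
shows the acceptance rules, including the rejection of leading, trailing,
and doubled underscores:

```python
import re

# Mirrors of the patched flex rules; illustrative only.
DECINTEGER = re.compile(r'[0-9](_?[0-9])*\Z')
HEXINTEGER = re.compile(r'0[xX](_?[0-9A-Fa-f])+\Z')
OCTINTEGER = re.compile(r'0[oO](_?[0-7])+\Z')
BININTEGER = re.compile(r'0[bB](_?[01])+\Z')

def is_valid_integer_literal(tok):
    """True if tok is a valid integer literal under the proposed rules."""
    return any(p.match(tok)
               for p in (HEXINTEGER, OCTINTEGER, BININTEGER, DECINTEGER))
```

Note that a single underscore directly after the radix prefix is allowed
by the (_?digit)+ form, while doubled, leading, and trailing underscores
are all rejected.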


[0]: 
https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb...@enterprisedb.com

From 36aef78423e9adc6ebe72fb2a3cf43e385a2caca Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 27 Dec 2022 10:10:18 +0100
Subject: [PATCH v1] Underscores in numeric literals

Add support for underscores in numeric literals, for visual grouping,
like

1_500_000_000
0b10001000_
0o_1_755
0x_
1.618_034

per SQL:202x draft.

This adds support in the lexer as well as in the integer type input
functions.

TODO: float/numeric type input support

Discussion: 
https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb...@enterprisedb.com
---
 doc/src/sgml/syntax.sgml  |  14 ++
 src/backend/catalog/sql_features.txt  |   1 +
 src/backend/parser/scan.l |  63 ++--
 src/backend/utils/adt/numutils.c  | 144 --
 src/fe_utils/psqlscan.l   |  16 +-
 src/interfaces/ecpg/preproc/pgc.l |  16 +-
 src/pl/plpgsql/src/expected/plpgsql_trap.out  |   2 +-
 src/pl/plpgsql/src/sql/plpgsql_trap.sql   |   2 +-
 src/test/regress/expected/int2.out|  44 ++
 src/test/regress/expected/int4.out|  44 ++
 src/test/regress/expected/int8.out|  44 ++
 src/test/regress/expected/numerology.out  |  92 ++-
 src/test/regress/expected/partition_prune.out |   6 +-
 src/test/regress/sql/int2.sql |  14 ++
 src/test/regress/sql/int4.sql |  14 ++
 src/test/regress/sql/int8.sql |  14 ++
 src/test/regress/sql/numerology.sql   |  24 ++-
 src/test/regress/sql/partition_prune.sql  |   6 +-
 18 files changed, 509 insertions(+), 51 deletions(-)

diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index 956182e7c6..27e53b4b46 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -728,6 +728,20 @@ Numeric Constants
  
 
 
+
+ For visual grouping, underscores can be inserted between digits.  These
+ have no further effect on the value of the literal.  For example:
+1_500_000_000
+0b10001000_
+0o_1_755
+0x_
+1.618_034
+
+ Underscores are not allowed at the start or end of a numeric constant or
+ a group of digits (that is, immediately before or after a period or the
+ e), and more than one underscore in a row is not allowed.
+
+
 
  integer
  bigint
diff --git a/src/backend/catalog/sql_features.txt b/src/backend/catalog/sql_features.txt
index abad216b7e..3766762ae3 100644
--- a/src/backend/catalog/sql_features.txt
+++ b/src/backend/catalog/sql_features.txt
@@ -528,6 +528,7 @@ T653 SQL-schema statements in external routines YES
 T654   SQL-dynamic statements in external routines NO  
 T655   Cyclically dependent routines   YES 
 T661   Non-decimal integer literalsYES SQL:202x draft
+T662   Underscores in integer literals YES SQL:202x draft
 T811   Basic SQL/JSON constructor functionsNO  
 T812   SQL/JSON: JSON_OBJECTAGGNO  
 T813   SQL/JSON: JSON_ARRAYAGG with ORDER BY   NO  
diff --git a/src/backend/parser/scan.l b/src/backend/parser/scan.l
index 9ad9e0c8ba..a1ea94ef06 100644
--- a/src/backend/parser/scan.l
+++ b/src/backend/parser/scan.l
static void addlit(char *ytext, int yleng, core_yyscan_t yyscanner);
static void addlitchar(unsigned char ychar, core_yyscan_t yyscanner);
static char *litbufdup(core_yyscan_t yyscanner);
static unsigned char unescape_single_char(unsigned char c, core_yyscan_t yyscanner);
+static char *strip_underscores(const char *in);
static int process_integer_literal(const char *token, YYSTYPE *lval, int base);
 static void addunicode(pg_wchar c, yyscan_t yyscanner);
 
@@ -395,19 +396,19 @@ hexdigit  [0-9A-Fa-f]
 octdigit   [0-7]
 bindigit   [0-1]
 
-decinteger {decdigit}+
-hexinteger 0[xX]{hexdigit}+
-octinteger 0[oO]{octdigit}+
-bininteger 0[bB]{bindigit}+
+decinteger {decdigit}(_?{decdigit})*
+hexinteger 0[xX](_?{hexdigit})+
+octinteger 0[oO](_?{octdigit})+
+bininteger 0[bB](_?{bindigit})+
 
-hexfail0[xX]
-octfail   

Refactor recordExtObjInitPriv()

2022-12-27 Thread Peter Eisentraut

Another aclchk.c refactoring patch, similar to [0] and [1].

Refactor recordExtObjInitPriv():  Instead of half a dozen of 
mostly-duplicate conditional branches, write one common one that can 
handle most catalogs.  We already have all the information we need, such 
as which system catalog corresponds to which catalog table and which 
column is the ACL column.
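
The shape of the refactoring can be illustrated with a toy sketch
(Python with invented data; the real change is C code in aclchk.c):
instead of one near-identical branch per catalog, a single generic path
is driven by per-catalog metadata saying which column holds the ACL.

```python
# Invented metadata table for illustration; in PostgreSQL this information
# comes from the syscache / catalog descriptors.
CATALOG_ACL_INFO = {
    "pg_foreign_data_wrapper": "fdwacl",
    "pg_foreign_server": "srvacl",
    "pg_language": "lanacl",
}

def record_ext_obj_init_priv(catalog, rows, record):
    """Generic path: look up the ACL column for the catalog, then record
    the ACL of every row that has one (NULL ACLs are skipped)."""
    acl_col = CATALOG_ACL_INFO[catalog]
    for oid, row in rows.items():
        acl = row.get(acl_col)
        if acl is not None:
            record(oid, acl)
```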



[0]: 
https://www.postgresql.org/message-id/95c30f96-4060-2f48-98b5-a4392d3b6...@enterprisedb.com
[1]: 
https://www.postgresql.org/message-id/flat/22c7e802-4e7d-8d87-8b71-cba95e6f4bcf%40enterprisedb.com

From 2982380987066d43e4d35db2157f98498a2a3185 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 27 Dec 2022 09:38:54 +0100
Subject: [PATCH] Refactor recordExtObjInitPriv()

Instead of half a dozen of mostly-duplicate conditional branches,
write one common one that can handle most catalogs.  We already have
all the information we need, such as which system catalog corresponds
to which catalog table and which column is the ACL column.
---
 src/backend/catalog/aclchk.c | 153 ++-
 1 file changed, 8 insertions(+), 145 deletions(-)

diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index cdfd637815..cf18401449 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -4241,9 +4241,6 @@ recordDependencyOnNewAcl(Oid classId, Oid objectId, int32 
objsubId,
  *
  * For the object passed in, this will record its ACL (if any) and the ACLs of
  * any sub-objects (eg: columns) into pg_init_privs.
- *
- * Any new kinds of objects which have ACLs associated with them and can be
- * added to an extension should be added to the if-else tree below.
  */
 void
 recordExtObjInitPriv(Oid objoid, Oid classoid)
@@ -4336,74 +4333,6 @@ recordExtObjInitPriv(Oid objoid, Oid classoid)
 
ReleaseSysCache(tuple);
}
-   /* pg_foreign_data_wrapper */
-   else if (classoid == ForeignDataWrapperRelationId)
-   {
-   Datum   aclDatum;
-   boolisNull;
-   HeapTuple   tuple;
-
-   tuple = SearchSysCache1(FOREIGNDATAWRAPPEROID,
-   
ObjectIdGetDatum(objoid));
-   if (!HeapTupleIsValid(tuple))
-   elog(ERROR, "cache lookup failed for foreign data wrapper %u",
-objoid);
-
-   aclDatum = SysCacheGetAttr(FOREIGNDATAWRAPPEROID, tuple,
-  
Anum_pg_foreign_data_wrapper_fdwacl,
-  );
-
-   /* Add the record, if any, for the top-level object */
-   if (!isNull)
-   recordExtensionInitPrivWorker(objoid, classoid, 0,
-   
  DatumGetAclP(aclDatum));
-
-   ReleaseSysCache(tuple);
-   }
-   /* pg_foreign_server */
-   else if (classoid == ForeignServerRelationId)
-   {
-   Datum   aclDatum;
-   boolisNull;
-   HeapTuple   tuple;
-
-   tuple = SearchSysCache1(FOREIGNSERVEROID, ObjectIdGetDatum(objoid));
-   if (!HeapTupleIsValid(tuple))
-   elog(ERROR, "cache lookup failed for foreign server %u",
-objoid);
-
-   aclDatum = SysCacheGetAttr(FOREIGNSERVEROID, tuple,
-  
Anum_pg_foreign_server_srvacl,
-  );
-
-   /* Add the record, if any, for the top-level object */
-   if (!isNull)
-   recordExtensionInitPrivWorker(objoid, classoid, 0,
-   
  DatumGetAclP(aclDatum));
-
-   ReleaseSysCache(tuple);
-   }
-   /* pg_language */
-   else if (classoid == LanguageRelationId)
-   {
-   Datum   aclDatum;
-   boolisNull;
-   HeapTuple   tuple;
-
-   tuple = SearchSysCache1(LANGOID, ObjectIdGetDatum(objoid));
-   if (!HeapTupleIsValid(tuple))
-   elog(ERROR, "cache lookup failed for language %u", objoid);
-
-   aclDatum = SysCacheGetAttr(LANGOID, tuple, 
Anum_pg_language_lanacl,
-  );
-
-   /* Add the record, if any, for the top-level object */
-   if (!isNull)
-   recordExtensionInitPrivWorker(objoid, classoid, 0,
-   
  DatumGetAclP(aclDatum));
-
-   ReleaseSysCache(tuple);
-   }

Re: Making Vars outer-join aware

2022-12-27 Thread Richard Guo
On Sat, Dec 24, 2022 at 2:20 AM Tom Lane  wrote:

> I shoved some preliminary refactoring into the 0001 patch,
> notably splitting deconstruct_jointree into two passes.
> 0002-0009 cover the same ground as they did before, though
> with some differences in detail.  0010-0012 are new work
> mostly aimed at removing kluges we no longer need.


I'm looking at 0010-0012 and I really like the changes and removals
there.  Thanks for the great work!

For 0010, the change seems quite independent.  I think maybe we can
apply it to HEAD directly.

For 0011, I found that some clauses that were outerjoin_delayed and thus
not equivalent before might be treated as being equivalent now.  For
example

explain (costs off)
select * from a left join b on a.i = b.i where coalesce(b.j, 0) = 0 and
coalesce(b.j, 0) = a.j;
QUERY PLAN
--
 Hash Right Join
   Hash Cond: (b.i = a.i)
   Filter: (COALESCE(b.j, 0) = 0)
   ->  Seq Scan on b
   ->  Hash
 ->  Seq Scan on a
   Filter: (j = 0)
(7 rows)

This is different behavior from HEAD.  But I think it's an improvement.

For 0012, I'm still trying to understand JoinDomain.  AFAIU all EC
members of the same EC should have the same JoinDomain, because for
constants we match EC members only within the same JoinDomain, and for
Vars if they come from different join domains they will have different
nullingrels and thus will not match.  So I wonder if we can have the
JoinDomain kept in EquivalenceClass rather than in each
EquivalenceMember.

Thanks
Richard


RE: Exit walsender before confirming remote flush in logical replication

2022-12-27 Thread Hayato Kuroda (Fujitsu)
Dear Dilip,

Thanks for checking my proposal!

> - * Note that if we determine that there's still more data to send, this
> - * function will return control to the caller.
> + * Note that if we determine that there's still more data to send or we are 
> in
> + * the physical replication more, this function will return control to the
> + * caller.
> 
> I think in this comment you meant to say
> 
> 1. "or we are in physical replication mode and all WALs are not yet 
> replicated"
> 2. Typo /replication more/replication mode

Firstly I considered 2, but I thought 1 seemed to be better.
PSA the updated patch.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



v2-0001-Exit-walsender-before-confirming-remote-flush-in-.patch
Description:  v2-0001-Exit-walsender-before-confirming-remote-flush-in-.patch