Re: table inheritance versus column compression and storage settings

2024-01-30 Thread Ashutosh Bapat
On Tue, Dec 5, 2023 at 6:28 PM Peter Eisentraut  wrote:
>
> On 05.12.23 05:26, Ashutosh Bapat wrote:
> >> - When inheriting from multiple parents with different settings, an
> >> explicit setting in the child is required.
> > When no explicit setting for child is specified, it will throw an
> > error as it does today. Right?
>
> Yes, it would throw an error, but a different error than today, saying
> something like "the settings in the parents conflict, so you need to
> specify one here to override the conflict".
>

PFA patch fixing inheritance and compression. It also fixes a crash
reported in [1].

The storage looks more involved. The way it has been coded, the child
always inherits the parent's storage properties.
#create table t1 (a text storage plain);
CREATE TABLE
#create table c1 (b text storage main) inherits(t1);
CREATE TABLE
#create table c1 (a text storage main) inherits(t1);
NOTICE:  merging column "a" with inherited definition
CREATE TABLE
#\d+ t1
                                          Table "public.t1"
 Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+------+-----------+----------+---------+---------+-------------+--------------+-------------
 a      | text |           |          |         | plain   |             |              |
Child tables: c1
Access method: heap
#\d+ c1
                                          Table "public.c1"
 Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+------+-----------+----------+---------+---------+-------------+--------------+-------------
 a      | text |           |          |         | plain   |             |              |
Inherits: t1
Access method: heap

Observe that c1.a does not have storage "main" but instead inherits
"plain" from t1.

According to the code at
https://github.com/postgres/postgres/blob/6a1ea02c491d16474a6214603dce40b5b122d4d1/src/backend/commands/tablecmds.c#L3253,
there is supposed to be a conflict error. But that does not happen
since the child's storage specification is in ColumnDef::storage_name,
which is never consulted. ColumnDef::storage_name is converted to
ColumnDef::storage only in BuildDescForRelation(), after
MergeAttribute() has finished. There are two ways to fix this:
1. In MergeChildAttribute(), resolve ColumnDef::storage_name to
ColumnDef::storage before comparing it against the inherited property. I
don't like this approach since a) we would duplicate the conversion logic in
MergeChildAttribute() and BuildDescForRelation(), and b) the conversion
would happen only for the attributes which are inherited.

2. Deal with it the same way as compression. Get rid of
ColumnDef::storage altogether. Instead set ColumnDef::storage_name at
https://github.com/postgres/postgres/blob/6a1ea02c491d16474a6214603dce40b5b122d4d1/src/backend/commands/tablecmds.c#L2723.

I am inclined to take the second approach. Let me know if you feel otherwise.

[1] 
https://www.postgresql.org/message-id/b22a6834-aacb-7b18-0424-a3f5fe889...@gmail.com

-- 
Best Wishes,
Ashutosh Bapat




Re: Possibility to disable `ALTER SYSTEM`

2024-01-30 Thread Peter Eisentraut

On 31.01.24 06:28, Tom Lane wrote:

The idea of adding a file to the data directory appeals to me.

optional_runtime_features.conf
alter_system=enabled
copy_from_program=enabled
copy_to_program=disabled

... so, exactly what keeps an uncooperative superuser from
overwriting that file?


The point of this feature would be to keep the honest people honest.

The first thing I did when ALTER SYSTEM came out however many years ago 
was to install Nagios checks to warn when postgresql.auto.conf exists. 
Because the thing is an attractive nuisance, especially when you want to 
do centralized configuration control.  Of course you can bypass it using 
COPY PROGRAM etc., but then you *know* that you are *bypassing* 
something.  If you just see ALTER SYSTEM, you'll think, "that is 
obviously the appropriate tool", and there is no generally accepted way 
to communicate that, in a particular environment, it might not be.
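
For what it's worth, the core of such a check can be a one-liner along
these lines (a sketch only; it warns on any active, non-comment line,
and assumes $PGDATA points at the data directory):

grep -q '^[^#]' "$PGDATA/postgresql.auto.conf" \
  && echo "WARNING: ALTER SYSTEM settings present in postgresql.auto.conf"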






RE: Synchronizing slots from primary to standby

2024-01-30 Thread Zhijie Hou (Fujitsu)
On Wednesday, January 31, 2024 9:57 AM Peter Smith  
wrote:
> 
> Hi,
> 
> I saw that v73-0001 was pushed, but it included some last-minute
> changes that I did not get a chance to check yesterday.
> 
> Here are some review comments for the new parts of that patch.
> 
> ==
> doc/src/sgml/ref/create_subscription.sgml
> 
> 1.
> connect (boolean)
> 
> Specifies whether the CREATE SUBSCRIPTION command should connect
> to the publisher at all. The default is true. Setting this to false
> will force the values of create_slot, enabled, copy_data, and failover
> to false. (You cannot combine setting connect to false with setting
> create_slot, enabled, copy_data, or failover to true.)
> 
> ~
> 
> I don't think the first part "Setting this to false will force the
> values ... failover to false." is strictly correct.
> 
> I think it is correct to say all those *other* properties (create_slot,
> enabled, copy_data) are forced to false because those otherwise have
> default true values. But the 'failover' has default false, so it
> cannot get force-changed at all because you can't set connect to false
> when failover is true as the second part ("You cannot combine...")
> explains.
> 
> IMO remove 'failover' from that first sentence.
> 
> 
> 3.
> dump can be restored without requiring network access to the remote
> servers.  It is then up to the user to reactivate the subscriptions in a
> suitable way.  If the involved hosts have changed, the connection
> -   information might have to be changed.  It might also be appropriate to
> +   information might have to be changed.  If the subscription needs to
> +   be enabled for
> +   <link linkend="sql-createsubscription-params-with-failover"><literal>failover</literal></link>,
> +   then same needs to be done by executing
> +   
> +   ALTER SUBSCRIPTION ... SET(failover = true)
> +   after the slot has been created.  It might also be appropriate to
> 
> "then same needs to be done" (English?)
> 
> BEFORE
> If the subscription needs to be enabled for failover, then same needs
> to be done by executing ALTER SUBSCRIPTION ... SET(failover = true)
> after the slot has been created.
> 
> SUGGESTION
> If the subscription needs to be enabled for failover, execute ALTER
> SUBSCRIPTION ... SET(failover = true) after the slot has been created.
> 
> ==
> src/backend/commands/subscriptioncmds.c
> 
> 4.
>  #define SUBOPT_RUN_AS_OWNER 0x1000
> -#define SUBOPT_LSN 0x2000
> -#define SUBOPT_ORIGIN 0x4000
> +#define SUBOPT_FAILOVER 0x2000
> +#define SUBOPT_LSN 0x4000
> +#define SUBOPT_ORIGIN 0x8000
> +
> 
> A spurious blank line was added.
> 

Here is a small patch to address comments 3 and 4.
The discussion for comment 1 is still going on, so we can
update the patch once it's concluded.

Best Regards,
Hou zj



0001-clean-up-for-776621a5.patch
Description: 0001-clean-up-for-776621a5.patch


Re: Apply the "LIMIT 1" optimization to partial DISTINCT

2024-01-30 Thread Richard Guo
On Wed, Jan 31, 2024 at 12:26 PM David Rowley  wrote:

> On Fri, 26 Jan 2024 at 21:14, David Rowley  wrote:
> > However, having said that. Parallel plans are often picked when there
> > is some highly selective qual as parallel_tuple_cost has to be applied
> > to fewer tuples for such plans, so probably this is worth doing.
>
> I was messing around with your test case and didn't manage to get any
> plan that had any rows to use the partial path with the LIMIT.  I
> ended up dropping the test that was checking the results were empty as
> I didn't think it added much more value over the EXPLAIN output.
>
> I pushed the result.


Thanks for pushing it!

Thanks
Richard


Re: Some revises in adding sorting path

2024-01-30 Thread Richard Guo
On Wed, Jan 31, 2024 at 5:13 AM David Rowley  wrote:

> On Wed, 31 Jan 2024 at 00:44, Richard Guo  wrote:
> > This patchset does not aim to introduce anything new; it simply
> > refactors the existing code.  The newly added tests are used to show
> > that the code that is touched here is not redundant, but rather
> > essential for generating certain paths.  I remember the tests were added
> > per your comment in [3].
> >
> > [3]
> https://www.postgresql.org/message-id/CAApHDvo%2BFagxVSGmvt-LUrhLZQ0KViiLvX8dPaG3ZzWLNd-Zpg%40mail.gmail.com
>
> OK.  I've pushed the patched based on it being a simplification of the
> partial path generation.


Thanks for pushing it!

Thanks
Richard


Re: Should we remove -Wdeclaration-after-statement?

2024-01-30 Thread Peter Eisentraut

On 29.01.24 16:03, Jelte Fennema-Nio wrote:

I feel like this is the type of change where there's not much
discussion to be had. And the only way to resolve it is to use some
voting to gauge community opinion.

So my suggestion is for people to respond with -1, -0.5, +-0, +0.5, or
+1 to indicate support against/for the change.

I'll start: +1

Attached is a patch that removes -Wdeclaration-after-statement in the
codebase. This is mainly to be able to add it to the commitfest, to
hopefully get a decent amount of responses.


-1, mostly for the various reasons explained by others.





Re: Incorrect cost for MergeAppend

2024-01-30 Thread Ashutosh Bapat
On Wed, Jan 31, 2024 at 12:12 PM David Rowley  wrote:
>
> What is relevant are things like:
>
> For:
> * It's a clear bug and what's happening now is clearly wrong.
> * inheritance/partitioned table plan changes for the better in minor versions
>
> Against:
> * Nobody has complained for 13 years, so maybe it's unlikely anyone is
> suffering too much.
> * Possibility of inheritance/partitioned table plans changing for the
> worse in minor versions
>

That's what I am thinking as well. And the plans that may change for
the worse are the ones where the costs with and without the patch are
close.

Just to be clear, the change is a good one and should be committed to
master. It's the backpatching I am worried about.

-- 
Best Wishes,
Ashutosh Bapat




Re: Incorrect cost for MergeAppend

2024-01-30 Thread David Rowley
On Wed, 31 Jan 2024 at 18:58, Ashutosh Bapat
 wrote:
> with patch
>  Merge Append  (cost=6.94..18.90 rows=198 width=4)

> without patch
>  Sort  (cost=19.04..19.54 rows=198 width=4)

> Those numbers are higher than 1% (#define STD_FUZZ_FACTOR 1.01) but
> slight variation in all the GUCs that affect cost, might bring the
> difference closer to STD_FUZZ_FACTOR.
>
> Given how close they are, maybe it's not such a good idea to
> backpatch.

The reason those numbers are close is because I reduced the row count
on the test tables to a point where we only just get the Merge Append
plan with a small margin.  I don't see the test case costs as a
relevant factor in whether we backpatch or not.

What is relevant are things like:

For:
* It's a clear bug and what's happening now is clearly wrong.
* inheritance/partitioned table plan changes for the better in minor versions

Against:
* Nobody has complained for 13 years, so maybe it's unlikely anyone is
suffering too much.
* Possibility of inheritance/partitioned table plans changing for the
worse in minor versions

For now, I'm on the fence on this one.

David




Re: Oversight in reparameterize_path_by_child leading to executor crash

2024-01-30 Thread Richard Guo
On Wed, Jan 31, 2024 at 5:12 AM Tom Lane  wrote:

> Richard Guo  writes:
> > On Wed, Jan 17, 2024 at 5:01 PM Richard Guo 
> wrote:
> >> Sure, here it is:
> >> v10-0001-Avoid-reparameterizing-Paths-when-it-s-not-suitable.patch
>
> > I forgot to mention that this patch applies on v16 not master.
>
> I spent some time looking at this patch (which seems more urgent than
> the patch for master, given that we have back-branch releases coming
> up).


Thanks for looking at this patch!


> There are two things I'm not persuaded about:
>
> * Why is it okay to just use pull_varnos on a tablesample expression,
> when contain_references_to() does something different?


Hmm, the main reason is that the expression_tree_walker() function does
not handle T_RestrictInfo nodes.  So we cannot just use pull_varnos or
pull_var_clause on a restrictinfo list.  So contain_references_to()
calls extract_actual_clauses() first before it calls pull_var_clause().


> * Is contain_references_to() doing the right thing by specifying
> PVC_RECURSE_PLACEHOLDERS?  That causes it to totally ignore a
> PlaceHolderVar's ph_eval_at, and I'm not convinced that's correct.


I was thinking that we should recurse into the PHV's contents to see if
there are any lateral references to the other join relation.  But now I
realize that checking phinfo->ph_lateral, as you suggested, might be a
better way to do that.

But I'm not sure about checking phinfo->ph_eval_at.  It seems to me that
the ph_eval_at could not overlap the other join relation; otherwise the
clause would not be a restriction clause but rather a join clause.


> Ideally it seems to me that we want to reject a PlaceHolderVar
> if either its ph_eval_at or ph_lateral overlap the other join
> relation; if it was coded that way then we'd not need to recurse
> into the PHV's contents.   pull_varnos isn't directly amenable
> to this, but I think we could use pull_var_clause with
> PVC_INCLUDE_PLACEHOLDERS and then iterate through the resulting
> list manually.  (If this patch were meant for HEAD, I'd think
> about extending the var.c code to support this usage more directly.
> But as things stand, this is a one-off so I think we should just do
> what we must in reparameterize_path_by_child.)


Thanks for the suggestion.  I've coded it this way in the v11 patch, and
left an XXX comment about checking phinfo->ph_eval_at.
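
For anyone following along, the shape of that check is roughly as below
(a sketch, not necessarily the exact v11 code; 'clause' and
'other_relids' are placeholders, and the find_placeholder_info()
signature varies by branch):

static bool
phv_references_other_rel(PlannerInfo *root, Node *clause, Relids other_relids)
{
    List       *vars = pull_var_clause(clause, PVC_INCLUDE_PLACEHOLDERS);
    ListCell   *lc;

    foreach(lc, vars)
    {
        Node       *node = (Node *) lfirst(lc);

        if (IsA(node, PlaceHolderVar))
        {
            PlaceHolderInfo *phinfo =
                find_placeholder_info(root, (PlaceHolderVar *) node);

            /*
             * Reject if the PHV is evaluated at, or laterally references,
             * the other join relation.
             */
            if (bms_overlap(phinfo->ph_eval_at, other_relids) ||
                bms_overlap(phinfo->ph_lateral, other_relids))
                return true;
        }
    }
    return false;
}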


> BTW, it shouldn't be necessary to write either PVC_RECURSE_AGGREGATES
> or PVC_RECURSE_WINDOWFUNCS, because neither kind of node should ever
> appear in a scan-level expression.  I'd leave those out so that we
> get an error if something unexpected happens.


Good point.

Thanks
Richard


v11-0001-Avoid-reparameterizing-Paths-when-it-s-not-suitable.patch
Description: Binary data


Re: src/bin/pg_upgrade/t/004_subscription.pl test comment fix

2024-01-30 Thread Peter Smith
On Wed, Jan 31, 2024 at 4:51 PM vignesh C  wrote:
>
> On Wed, 31 Jan 2024 at 10:27, Peter Smith  wrote:
> >
> > Hi,
> >
> > PSA a small fix for a misleading comment found in the pg_upgrade test code.
>
> Is the double # intentional here, as I did not see this style of
> commenting used elsewhere:
> +# # Upgraded regress_sub1 should still have enabled and failover as true.
> +# # Upgraded regress_sub2 should still have enabled and failover as false.
>

Unintentional; caused by copy/paste. Thanks for reporting. I will post
a fixed patch tomorrow.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




RE: Documentation to upgrade logical replication cluster

2024-01-30 Thread Hayato Kuroda (Fujitsu)
Dear Vignesh,

Thanks for updating the patch! Here are my comments for v6.

01.
```
+   Logical replication cluster
+   
+
+ A set of publisher and subscriber instance with publisher instance
+ replicating changes to the subscriber instance.
+
+   
```

Should we say a 1:N relationship is allowed?

02.
```
@@ -70,6 +70,7 @@ PostgreSQL documentation
pg_upgrade supports upgrades from 9.2.X and later to the current
major release of PostgreSQL, including snapshot 
and beta releases.
   
+
  
```

Unnecessary blank line.

03.
```
   
-   These are the steps to perform an upgrade
-   with pg_upgrade:
+   Below are the steps to perform an upgrade
+   with pg_upgrade.
   
```

I'm not sure it should be included in this patch.

04.
```
+   If the old primary is prior to version 17.0, then no slots on the 
primary
+   are copied to the new standby, so all the slots on the old standby must
+   be recreated manually.
```

I think that "all the slots on the old standby" must be created manually in any
cases. Therefore, the preposition ", so" seems not correct.

05.
```
If the old primary is version 17.0 or later, then
+   only logical slots on the primary are copied to the new standby, but
+   other slots on the old standby are not copied, so must be recreated
+   manually.
```

How about replacing this paragraph with the below?

```
All the slots on the old standby must be recreated manually. If the old primary
is version 17.0 or later, then only logical slots on the primary are copied to
the new standby.
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



Re: int4->bool test coverage

2024-01-30 Thread Michael Paquier
On Fri, Dec 22, 2023 at 11:48:11AM +0900, Michael Paquier wrote:
> On Thu, Dec 21, 2023 at 11:56:22AM +0100, Christoph Berg wrote:
>> The first cast is the int4_bool function, but it isn't covered by the
>> regression tests at all. The attached patch adds tests.
> 
> I don't see why not.

And one month later, done.

> Interesting that there are a few more of these in int.c, like int2up,
> int4inc, int2smaller, int{2,4}shr, int{2,4}not, etc.

I've left these out for now.
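
For reference, the coverage boils down to exercising casts like this
(illustrative, not the exact committed test):

-- int4 -> bool: zero is false, everything else is true
SELECT 0::int4::boolean AS f, 1::int4::boolean AS t, 42::int4::boolean AS also_t;
-- and the reverse direction
SELECT true::int4 AS one, false::int4 AS zero;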
--
Michael


signature.asc
Description: PGP signature


possible inter-gcc-version incompatibility regarding fat objects

2024-01-30 Thread Kyotaro Horiguchi
Hello.

I would like to share some information regarding a recent issue we
encountered.

While compiling an extension against the RPM package
postgresql16-devel-16.1-2PGDG.rhel9.x86_64.rpm we faced a problem that
appears to be caused by the combination of LTO and constant
folding/propagation, leading to a SEGV. This crash didn't occur when
we gave -fno-lto instead of -flto=auto to the compiler.

To put this case briefly:
postgresql16-devel-16.1-2PGDG.rhel9.x86_64.rpm + gcc 11.3
  => building the extension completes but the binary may crash with SEGV
     -fno-lto prevents this crash from occurring.

The PG package is built with gcc-11.4, and the compiler in our build
environment for the extension was gcc-11.3. A quick search suggests
that the LTO bytecode format was optimized in gcc-11.4. Conversely,
when we built the extension with postgresql-16-16.0 and gcc-11.4, the
build failed. This failure seems to stem from LTO binary format
incompatibility.

> Error: bad register name `%'
> Error: open CFI at the end of file; missing .cfi_endproc directive
> /usr/bin/ld: error: lto-wrapper failed

To put this case briefly:
postgresql16-devel-16.0-1PGDG.rhel9.x86_64.rpm + gcc 11.4
  => build failure regarding LTO

Given these issues, it seems there might be some issues with including
fat objects in the -devel package.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Incorrect cost for MergeAppend

2024-01-30 Thread Ashutosh Bapat
On Wed, Jan 31, 2024 at 4:33 AM David Rowley  wrote:
>
> On Wed, 31 Jan 2024 at 02:23, Alexander Kuzmenkov
>  wrote:
> >
> > On Tue, Jan 30, 2024 at 1:20 PM David Rowley  wrote:
> > > You should likely focus on trying to find a test that does not require
> > > making 2 tables with 10k rows.
> >
> > Is 1k smallint OK? It should fit in one page. Still reproduces the
> > error, and the entire test case runs in under 10 ms.
>
> I had a go at making it a bit smaller without going dangerously close
> to where the plan might change. The following seems to work.
>
> create table ma0(a int primary key);
> create table ma1() inherits (ma0);
> insert into ma0 select generate_series(1, 400);
> insert into ma1 select generate_series(1, 200);
> analyze ma0;
> analyze ma1;
>
> explain (costs off) select * from ma0 where a < 100 order by a;
>
> drop table ma0 cascade;

On my laptop with all default settings

with patch
postgres@231714=#explain select * from ma0 where a < 100 order by a;
  QUERY PLAN
--------------------------------------------------------------------
 Merge Append  (cost=6.94..18.90 rows=198 width=4)
   Sort Key: ma0.a
   ->  Index Only Scan using ma0_pkey on ma0 ma0_1  (cost=0.15..9.88
rows=99 width=4)
 Index Cond: (a < 100)
   ->  Sort  (cost=6.78..7.03 rows=99 width=4)
 Sort Key: ma0_2.a
 ->  Seq Scan on ma1 ma0_2  (cost=0.00..3.50 rows=99 width=4)
   Filter: (a < 100)
(8 rows)

without patch
#explain select * from ma0 where a < 100 order by a;
  QUERY PLAN
--------------------------------------------------------------------
 Sort  (cost=19.04..19.54 rows=198 width=4)
   Sort Key: ma0.a
   ->  Append  (cost=0.00..11.49 rows=198 width=4)
 ->  Seq Scan on ma0 ma0_1  (cost=0.00..7.00 rows=99 width=4)
   Filter: (a < 100)
 ->  Seq Scan on ma1 ma0_2  (cost=0.00..3.50 rows=99 width=4)
   Filter: (a < 100)
(7 rows)

postgres@233864=#select (19.54 - 18.90)/19.54, (19.04 - 18.09)/19.04;
        ?column?        |        ?column?
------------------------+------------------------
 0.03275332650972364381 | 0.04989495798319327731
(1 row)

Those numbers are higher than 1% (#define STD_FUZZ_FACTOR 1.01) but
slight variation in all the GUCs that affect cost, might bring the
difference closer to STD_FUZZ_FACTOR.

Given how close they are, maybe it's not such a good idea to
backpatch. If the plans change for better, users won't complain but
otherwise they will complain and will have no way to go back to their
good plans in production.

-- 
Best Wishes,
Ashutosh Bapat




Re: Improve the connection failure error messages

2024-01-30 Thread Peter Smith
AFAIK some recent commits (e.g. [1], for the "slot sync"
development) have created some more cases of "could not connect..."
messages. So, you might need to enhance your patch to deal with any
new ones in the latest HEAD.

==
[1] 
https://github.com/postgres/postgres/commit/776621a5e4796fa214b6b29a7ca134f6c138572a

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Documentation: warn about two_phase when altering a subscription

2024-01-30 Thread Bertrand Drouvot
Hi,

On Wed, Jan 31, 2024 at 01:47:16PM +1100, Peter Smith wrote:
> Hi, thanks for the patch. Here are some review comments for v1

Thanks for the review!

> 
> ==
> 
> (below is not showing the links and other sgml rendering -- all those LGTM)
> 
> BEFORE
> When altering the slot_name, the failover and two_phase properties
> values of the named slot may differ from their counterparts failover
> and two_phase parameters specified in the subscription. When creating
> the slot, ensure the slot failover and two_phase properties match
> their counterparts parameters values of the subscription.
> 
> SUGGESTION
> When altering the slot_name, the failover and two_phase property
> values of the named slot may differ from the counterpart failover and
> two_phase parameters specified by the subscription. When creating the
> slot, ensure the slot properties failover and two_phase match their
> counterpart parameters of the subscription.
> 
> ~
> 
> BEFORE
> Otherwise, the slot on the publisher may behave differently from what
> subscription's failover and two_phase options say: for example, the
> slot on the publisher could ...
> 
> SUGGESTION:
> Otherwise, the slot on the publisher may behave differently from what
> these subscription options say: for example, the slot on the publisher
> could ...
> 

As a non-native English speaker, I have to rely on you for those
suggestions ;-)

They make sense to me so applied both in v2 attached.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
>From 29ef47f6a201a81557ce1b4b37b414118a623634 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot 
Date: Tue, 30 Jan 2024 14:48:16 +
Subject: [PATCH v2] Documentation: warn about two_phase when altering a
 subscription's slot name.

776621a5e4 added a warning about the newly failover option. Doing the same
for the already existing two_phase one.
---
 doc/src/sgml/ref/alter_subscription.sgml | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)
 100.0% doc/src/sgml/ref/

diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index e9e6d9d74a..11f69f330d 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -235,15 +235,15 @@ ALTER SUBSCRIPTION name RENAME TO <
  
   When altering the
   slot_name,
-  the failover property value of the named slot may differ from the
+  the failover and two_phase property
+  values of the named slot may differ from the counterpart
   failover
-  parameter specified in the subscription. When creating the slot,
-  ensure the slot failover property matches the
-  failover
-  parameter value of the subscription. Otherwise, the slot on the
-  publisher may behave differently from what subscription's
-  failover
-  option says. The slot on the publisher could either be
+  and two_phase
+  parameters specified by the subscription. When creating the slot, ensure
+  the slot properties failover and two_phase
+  match their counterpart parameters of the subscription.
+  Otherwise, the slot on the publisher may behave differently from what these
+  subscription options say: for example, the slot on the publisher could either be
   synced to the standbys even when the subscription's
   failover
   option is disabled or could be disabled for sync
-- 
2.34.1



Re: src/bin/pg_upgrade/t/004_subscription.pl test comment fix

2024-01-30 Thread vignesh C
On Wed, 31 Jan 2024 at 10:27, Peter Smith  wrote:
>
> Hi,
>
> PSA a small fix for a misleading comment found in the pg_upgrade test code.

Is the double # intentional here, as I did not see this style of
commenting used elsewhere:
+# # Upgraded regress_sub1 should still have enabled and failover as true.
+# # Upgraded regress_sub2 should still have enabled and failover as false.

Regards,
Vignesh




Re: [PoC] Improve dead tuple storage for lazy vacuum

2024-01-30 Thread Masahiko Sawada
On Tue, Jan 30, 2024 at 7:20 PM John Naylor  wrote:
>
> On Tue, Jan 30, 2024 at 7:56 AM Masahiko Sawada  wrote:
> >
> > On Mon, Jan 29, 2024 at 8:48 PM John Naylor  wrote:
>
> > > I meant the macro could probably be
> > >
> > > Max(SLAB_DEFAULT_BLOCK_SIZE, (size) * N)
> > >
> > > (Right now N=32). I also realize I didn't answer your question earlier
> > > about block sizes being powers of two. I was talking about PG in
> > > general -- I was thinking all block sizes were powers of two. If
> > > that's true, I'm not sure if it's because programmers find the macro
> > > calculations easy to reason about, or if there was an implementation
> > > reason for it (e.g. libc behavior). 32*2088 bytes is about 65kB, or
> > > just above a power of two, so if we did  round that up it would be
> > > 128kB.
> >
> > Thank you for your explanation. It might be better to follow other
> > codes. Does the calculation below make sense to you?
> >
> > RT_SIZE_CLASS_ELEM size_class = RT_SIZE_CLASS_INFO[i];
> > Size inner_blocksize = SLAB_DEFAULT_BLOCK_SIZE;
> > while (inner_blocksize < 32 * size_class.allocsize)
> >  inner_blocksize <<= 1;
>
> It does make sense, but we can do it more simply:
>
> Max(SLAB_DEFAULT_BLOCK_SIZE, pg_nextpower2_32(size * 32))

Thanks!

I've attached the new patch set (v56). I've squashed previous updates
and addressed review comments on v55 in separate patches. Here is the
update summary:

0004: fix compiler warning caught by ci test.
0005-0008: address review comments on radix tree codes.
0009: cleanup #define and #undef
0010: use TEST_SHARED_RT macro for shared radix tree test. RT_SHMEM is
undefined after including radixtree.h so we should not use it in test
code.
0013-0015: address review comments on tidstore codes.
0017-0018: address review comments on vacuum integration codes.

Looking at overall changes, there are still XXX and TODO comments in
radixtree.h:

---
 * XXX There are 4 node kinds, and this should never be increased,
 * for several reasons:
 * 1. With 5 or more kinds, gcc tends to use a jump table for switch
 *statements.
 * 2. The 4 kinds can be represented with 2 bits, so we have the option
 *in the future to tag the node pointer with the kind, even on
 *platforms with 32-bit pointers. This might speed up node traversal
 *in trees with highly random node kinds.
 * 3. We can have multiple size classes per node kind.

Can we just remove "XXX"?

---
 * WIP: notes about traditional radix tree trading off span vs height...

Are you going to write it?

---
#ifdef RT_SHMEM
/*  WIP: do we really need this? */
typedef dsa_pointer RT_HANDLE;
#endif

I think it's worth having it.

---
 * WIP: The paper uses at most 64 for this node kind. "isset" happens to fit
 * inside a single bitmapword on most platforms, so it's a good starting
 * point. We can make it higher if we need to.
 */
#define RT_FANOUT_48_MAX (RT_NODE_MAX_SLOTS / 4)

Are you going to work something on this?

---
/* WIP: We could go first to the higher node16 size class */
newnode = RT_ALLOC_NODE(tree, RT_NODE_KIND_16, RT_CLASS_16_LO);

Does it mean to go to RT_CLASS_16_HI and then further go to
RT_CLASS_16_LO upon further deletion?

---
 * TODO: The current locking mechanism is not optimized for high concurrency
 * with mixed read-write workloads. In the future it might be worthwhile
 * to replace it with the Optimistic Lock Coupling or ROWEX mentioned in
 * the paper "The ART of Practical Synchronization" by the same authors as
 * the ART paper, 2016.

I think it's not TODO for now, but a future improvement. We can remove it.

---
/* TODO: consider 5 with subclass 1 or 2. */
#define RT_FANOUT_4 4

Is there something we need to do here?

---
/*
 * Return index of the chunk and slot arrays for inserting into the node,
 * such that the chunk array remains ordered.
 * TODO: Improve performance for non-SIMD platforms.
 */

Are you going to work on this?

---
/* Delete the element at 'idx' */
/*  TODO: replace slow memmove's */

Are you going to work on this?

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


v56-ART.tar.gz
Description: GNU Zip compressed data


Re: Possibility to disable `ALTER SYSTEM`

2024-01-30 Thread David G. Johnston
On Tuesday, January 30, 2024, Tom Lane  wrote:

> "David G. Johnston"  writes:
> > On Tuesday, January 30, 2024, Tom Lane  wrote:
> >> My larger point here is that trying to enforce restrictions on
> >> superusers *within* Postgres is simply not a good plan, for
> >> largely the same reasons that Robert questioned making the
> >> GUC mechanism police itself.  It needs to be done outside,
> >> either at the filesystem level or via some other kernel-level
> >> security system.
>
> > The idea of adding a file to the data directory appeals to me.
> >
> > optional_runtime_features.conf
> > alter_system=enabled
> > copy_from_program=enabled
> > copy_to_program=disabled
>
> ... so, exactly what keeps an uncooperative superuser from
> overwriting that file?
>
> You cannot enforce such restrictions within Postgres.
> It has to be done by an outside mechanism.  If you think
> different, you are mistaken.
>

If the only user on the OS that can modify that file is root, how does the
superuser, who is hard coded to not be root, modify it?  The root/admin
user on the OS and its filesystem permissions are the outside mechanism
being suggested here.

If the complaint is that the in-memory boolean is changeable by the
superuser, or even the logic pertaining to the error branch of the code,
then yes this is a lost cause.

But root prevents superuser from controlling that file and then that file
can prevent the superuser from escaping to the operating system and
leveraging its OS postgres user.

David J.


Re: [17] CREATE SUBSCRIPTION ... SERVER

2024-01-30 Thread Ashutosh Bapat
On Wed, Jan 31, 2024 at 2:16 AM Jeff Davis  wrote:
>
> On Tue, 2024-01-30 at 16:17 +0530, Ashutosh Bapat wrote:
> > Converting a server and user mapping to
> > conninfo should be delegated to the FDW being used since that FDW
> > knows best how to use those options.
>
> If I understand you correctly, you mean that there would be a new
> optional function associated with an FDW (in addition to the HANDLER
> and VALIDATOR) like "CONNECTION", which would be able to return the
> conninfo from a server using that FDW. Is that right?

I am not sure whether it fits the {HANDLER, VALIDATOR} set or should be
part of FdwRoutine or a new set of hooks similar to FdwRoutine. But
something like that. Since the hooks for query planning and execution
have different characteristics from the ones used for replication, it
might make sense to create a new set of hooks similar to FdwRoutine,
say FdwReplicationRoutines and rename FdwRoutines to FdwQueryRoutines.
This way, we know whether an FDW can handle subscription connections
or not. A SERVER whose FDW does not support replication routines
should not be used with a subscription.

>
> I like the idea -- it further decouples the logic from the core server.
> I suspect it will make postgres_fdw the primary way (though not the
> only possible way) to use this feature. There would be little need to
> create a new builtin FDW to make this work.

That's what I see as well. I am glad that we are on the same page.

>
> To get the subscription invalidation right, we'd need to make the
> (reasonable) assumption that the connection information is based only
> on the FDW, server, and user mapping. A FDW wouldn't be able to use,
> for example, some kind of configuration table or GUC to control how the
> connection string gets created. That's easy enough to solve with
> documentation.
>

I think that's true for postgres_fdw as well, right? But I think it's
more important for a subscription, since it's expected to live very
long, almost as long as the server itself does. So I agree. But that's
the FDW's responsibility.

-- 
Best Wishes,
Ashutosh Bapat




Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-30 Thread Michael Paquier
On Wed, Jan 31, 2024 at 02:11:22PM +0900, Sutou Kouhei wrote:
> Ah, yes. defel->location is used in later patches. For
> example, it's used when a COPY handler for the specified
> FORMAT isn't found.

I see.

> I've prepared the v10 patch set. Could you try this?

Thanks, I'm looking into that now.

> FYI: Here are Copy{From,To}Routine in the v10 patch set. I
> think that only Copy{From,To}OneRow are minimal callbacks
> for the performance gain. But can we keep Copy{From,To}Start
> and Copy{From,To}End for consistency? We can remove a few
> {csv_mode,binary} conditions by Copy{From,To}{Start,End}. It
> doesn't depend on the number of COPY target tuples. So they
> will not affect performance.

I think I'm OK to keep the start/end callbacks.  This makes the code
more consistent as a whole, as well.
--
Michael


signature.asc
Description: PGP signature


Re: Possibility to disable `ALTER SYSTEM`

2024-01-30 Thread Tom Lane
"David G. Johnston"  writes:
> On Tuesday, January 30, 2024, Tom Lane  wrote:
>> My larger point here is that trying to enforce restrictions on
>> superusers *within* Postgres is simply not a good plan, for
>> largely the same reasons that Robert questioned making the
>> GUC mechanism police itself.  It needs to be done outside,
>> either at the filesystem level or via some other kernel-level
>> security system.

> The idea of adding a file to the data directory appeals to me.
>
> optional_runtime_features.conf
> alter_system=enabled
> copy_from_program=enabled
> copy_to_program=disabled

... so, exactly what keeps an uncooperative superuser from
overwriting that file?

You cannot enforce such restrictions within Postgres.
It has to be done by an outside mechanism.  If you think
different, you are mistaken.

regards, tom lane




RE: Improve eviction algorithm in ReorderBuffer

2024-01-30 Thread Hayato Kuroda (Fujitsu)
Dear Sawada-san,

I have started to read your patches. Here are my initial comments.
At least, all subscription tests were passed on my env.

A comment for 0001:

01.
```
+static void
+bh_enlarge_node_array(binaryheap *heap)
+{
+	if (heap->bh_size < heap->bh_space)
+		return;
+
+	heap->bh_space *= 2;
+	heap->bh_nodes = repalloc(heap->bh_nodes,
+							  sizeof(bh_node_type) * heap->bh_space);
+}
```

I'm not sure it is OK to use repalloc() for enlarging bh_nodes. This data
structure is a public one, and arbitrary code and extensions can refer to
bh_nodes directly. But if the array is repalloc()'d, the pointer would be
updated, so such references would become dangling pointers.
I think the internals of the structure should be private in this case.
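
To illustrate the hazard (hypothetical extension code; 'heap' and
'some_node' are placeholders):

bh_node_type *nodes = heap->bh_nodes;	/* extension caches the array pointer */

binaryheap_add(heap, some_node);		/* may repalloc() bh_nodes internally */
/* 'nodes' may now point into freed memory */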

Comments for 0002:

02.
```
+#include "utils/palloc.h"
```

Is it really needed? I'm not sure who refers to it.

03.
```
typedef struct bh_nodeidx_entry
{
	bh_node_type	key;
	char			status;
	int				idx;
} bh_nodeidx_entry;
```

Sorry if it is a stupid question. Can you tell me how "status" is used?
None of the binaryheap and reorderbuffer components refer to it.

04.
```
 extern binaryheap *binaryheap_allocate(int capacity,
binaryheap_comparator compare,
-   void *arg);
+   bool indexed, void *arg);
```

I felt the pre-existing API should not be changed. How about adding
binaryheap_allocate_extended() or something which can specify `bool
indexed`, as sketched below?
binaryheap_allocate() would keep setting heap->bh_indexed to false.
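
I.e., keep the old declaration and add something like this (names
assumed from the patch):

extern binaryheap *binaryheap_allocate(int capacity,
                                       binaryheap_comparator compare,
                                       void *arg);
extern binaryheap *binaryheap_allocate_extended(int capacity,
                                                binaryheap_comparator compare,
                                                void *arg, bool indexed);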

05.
```
+extern void binaryheap_update_up(binaryheap *heap, bh_node_type d);
+extern void binaryheap_update_down(binaryheap *heap, bh_node_type d);
```

IIUC, callers must consider whether the node should be shifted up or down
and use the appropriate function, right? I felt it may not be user-friendly.

Comments for 0003:

06.
```
This commit changes the eviction algorithm in ReorderBuffer to use
max-heap with transaction size,a nd use two strategies depending on
the number of transactions being decoded.
```

s/a nd/ and/

07.
```
It could be too expensive to pudate max-heap while preserving the heap
property each time the transaction's memory counter is updated, as it
could happen very frquently. So when the number of transactions being
decoded is small, we add the transactions to max-heap but don't
preserve the heap property, which is O(1). We heapify the max-heap
just before picking the largest transaction, which is O(n). This
strategy minimizes the overheads of updating the transaction's memory
counter.
```

s/pudate/update/

08.
IIUC, if more than 1024 transactions are running but they have a small
amount of changes, the performance may be degraded, right? Do you have
results for such a case?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-30 Thread Sutou Kouhei
Hi,

In 
  "Re: Make COPY format extendable: Extract COPY TO format implementations" on 
Tue, 30 Jan 2024 17:37:35 +0900,
  Michael Paquier  wrote:

>> We use defel->location for an error message. (We don't need
>> to set location for the default "text" DefElem.)
> 
> Yeah, but you should not need to have this error in the paths that set
> the callback routines in opts_out if the same validation happens a few
> lines before, in copy.c.

Ah, yes. defel->location is used in later patches. For
example, it's used when a COPY handler for the specified
FORMAT isn't found.

>   I am really worried about the complexities
> this thread is getting into because we are trying to shape the
> callbacks in the most generic way possible based on *two* use cases.
> This is going to be a never-ending discussion.  I'd rather get some
> simple basics, and then we can discuss if tweaking the callbacks is
> really necessary or not.  Even after introducing the pg_proc lookups
> to get custom callbacks.

I understand your concern. Let's introduce minimal callbacks
as the first step. I think that we completed our design
discussion for this feature. We can choose minimal callbacks
based on the discussion.

> The custom options don't seem like an absolute strong
> requirement for the first shot with the callbacks or even the
> possibility to retrieve the callbacks from a function call.  I mean,
> you could provide some control with SET commands and a few GUCs, at
> least, even if that would be strange.  Manipulations with a list of
> DefElems is the intuitive way to have custom options at query level,
> but we also have to guess the set of callbacks from this list of
> DefElems coming from the query.  You see my point, I am not sure 
> if it would be the best thing to process twice the options, especially
> when it comes to decide if a DefElem should be valid or not depending
> on the callbacks used.  Or we could use a kind of "special" DefElem
> where we could store a set of key:value fed to a callback :)

Interesting. Let's remove custom options support from the
initial minimal callbacks.

>> Can I prepare the v10 patch set as "the v7 patch set" + "the
>> removal of the "if" checks in NextCopyFromRawFields()"?
>> (+ reverting opts_out->{csv_mode,binary} changes in
>> ProcessCopyOptions().)
> 
> Yep, if I got it that would make sense to me.  If you can do that,
> that would help quite a bit.  :)

I've prepared the v10 patch set. Could you try this?

Changes since the v7 patch set:

0001:

* Remove CopyToProcessOption() callback
* Remove CopyToGetFormat() callback
* Revert passing CopyToState to ProcessCopyOptions()
* Revert moving "opts_out->{csv_mode,binary} = true" to
  ProcessCopyOptionFormatTo()
* Change to receive "const char *format" instead "DefElem  *defel"
  by ProcessCopyOptionFormatTo()

0002:

* Remove CopyFromProcessOption() callback
* Remove CopyFromGetFormat() callback
* Change to receive "const char *format" instead "DefElem
  *defel" by ProcessCopyOptionFormatFrom()
* Remove "if (cstate->opts.csv_mode)" branches from
  NextCopyFromRawFields()



FYI: Here are Copy{From,To}Routine in the v10 patch set. I
think that only Copy{From,To}OneRow are minimal callbacks
for the performance gain. But can we keep Copy{From,To}Start
and Copy{From,To}End for consistency? We can remove a few
{csv_mode,binary} conditions by Copy{From,To}{Start,End}. It
doesn't depend on the number of COPY target tuples. So they
will not affect performance.

/* Routines for a COPY FROM format implementation. */
typedef struct CopyFromRoutine
{
	/*
	 * Called when COPY FROM is started. This will initialize something and
	 * receive a header.
	 */
	void		(*CopyFromStart) (CopyFromState cstate, TupleDesc tupDesc);

	/* Copy one row. It returns false if no more tuples. */
	bool		(*CopyFromOneRow) (CopyFromState cstate, ExprContext *econtext,
								   Datum *values, bool *nulls);

	/* Called when COPY FROM is ended. This will finalize something. */
	void		(*CopyFromEnd) (CopyFromState cstate);
} CopyFromRoutine;

/* Routines for a COPY TO format implementation. */
typedef struct CopyToRoutine
{
	/* Called when COPY TO is started. This will send a header. */
	void		(*CopyToStart) (CopyToState cstate, TupleDesc tupDesc);

	/* Copy one row for COPY TO. */
	void		(*CopyToOneRow) (CopyToState cstate, TupleTableSlot *slot);

	/* Called when COPY TO is ended. This will send a trailer. */
	void		(*CopyToEnd) (CopyToState cstate);
} CopyToRoutine;
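
As an illustration only (hypothetical, not part of the patch set), a
minimal format implementation would fill the TO-side struct like this:

static void
my_copy_to_start(CopyToState cstate, TupleDesc tupDesc)
{
	/* this toy format has no header to send */
}

static void
my_copy_to_one_row(CopyToState cstate, TupleTableSlot *slot)
{
	/* serialize one row in the format-specific representation */
}

static void
my_copy_to_end(CopyToState cstate)
{
	/* no trailer either */
}

static const CopyToRoutine my_copy_to_routine = {
	.CopyToStart = my_copy_to_start,
	.CopyToOneRow = my_copy_to_one_row,
	.CopyToEnd = my_copy_to_end,
};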




Thanks,
-- 
kou
>From f827f1f1632dc330ef5d78141b85df8ca1bce63b Mon Sep 17 00:00:00 2001
From: Sutou Kouhei 
Date: Wed, 31 Jan 2024 13:22:04 +0900
Subject: [PATCH v10 1/2] Extract COPY TO format implementations

This doesn't change the current behavior. This just introduces

Re: Synchronizing slots from primary to standby

2024-01-30 Thread Peter Smith
On Wed, Jan 31, 2024 at 2:18 PM Amit Kapila  wrote:
>
> On Wed, Jan 31, 2024 at 7:27 AM Peter Smith  wrote:
> >
> > I saw that v73-0001 was pushed, but it included some last-minute
> > changes that I did not get a chance to check yesterday.
> >
> > Here are some review comments for the new parts of that patch.
> >
> > ==
> > doc/src/sgml/ref/create_subscription.sgml
> >
> > 1.
> > connect (boolean)
> >
> > Specifies whether the CREATE SUBSCRIPTION command should connect
> > to the publisher at all. The default is true. Setting this to false
> > will force the values of create_slot, enabled, copy_data, and failover
> > to false. (You cannot combine setting connect to false with setting
> > create_slot, enabled, copy_data, or failover to true.)
> >
> > ~
> >
> > I don't think the first part "Setting this to false will force the
> > values ... failover to false." is strictly correct.
> >
> > I think it is correct to say all those *other* properties (create_slot,
> > enabled, copy_data) are forced to false because those otherwise have
> > default true values.
> >
>
> So, when connect=false, won't the user have to explicitly provide such
> values (create_slot, enabled, etc.) as false? If so, is using 'force'
> strictly correct?

Perhaps the original docs text could be worded differently; I think
the word "force" here just meant that setting connect=false
forces/causes/makes those other options behave "as if" they had been
set to false without the user explicitly doing anything to them. The
point is they are made to behave *differently* from their normal
defaults.

So,
connect=false ==> this actually sets enabled=false (you can see this
with \dRs+), which is different to the default setting of 'enabled'
connect=false ==> will not create a slot (because there is no
connection), which is different to the default behaviour for
'create-slot'
connect=false ==> will not copy tables (because there is no
connection), which is different to the default behaviour for
'copy_data'

OTOH, failover is different
connect=false ==> failover is not possible (because there is no
connection), but the 'failover' default is false anyway, so no change
to the default behaviour for this one.
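
To illustrate (a sketch; names and the connection string are placeholders):

CREATE SUBSCRIPTION mysub
    CONNECTION 'host=pub.example.com dbname=postgres'
    PUBLICATION mypub
    WITH (connect = false);

-- no slot is created, no tables are copied, and \dRs+ shows the
-- subscription disabled; to start replication later:
-- (on the publisher) SELECT pg_create_logical_replication_slot('mysub', 'pgoutput');
ALTER SUBSCRIPTION mysub ENABLE;
ALTER SUBSCRIPTION mysub REFRESH PUBLICATION;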

>
> > ~~~
> >
> > 2.
> >   
> >Since no connection is made when this option is
> >false, no tables are subscribed. To initiate
> >replication, you must manually create the replication slot, 
> > enable
> > -  the subscription, and refresh the subscription. See
> > +  the failover if required, enable the subscription, and refresh 
> > the
> > +  subscription. See
> > > linkend="logical-replication-subscription-examples-deferred-slot"/>
> >for examples.
> >   
> >
> > IMO "see the failover if required" is very vague. See what failover?
> >
>
> AFAICS, the committed docs says: "To initiate replication, you must
> manually create the replication slot, enable the failover if required,
> ...". I am not sure what you are referring to.
>

My mistake. I was misreading the patch code without applying the
patch. Sorry for the noise.

> >
> > ==
> > src/bin/pg_upgrade/t/004_subscription.pl
> >
> > 5.
> > -# The subscription's running status should be preserved. Old subscription
> > -# regress_sub1 should be enabled and old subscription regress_sub2 should 
> > be
> > -# disabled.
> > +# The subscription's running status and failover option should be 
> > preserved.
> > +# Old subscription regress_sub1 should have enabled and failover as true 
> > while
> > +# old subscription regress_sub2 should have enabled and failover as false.
> >  $result =
> >$new_sub->safe_psql('postgres',
> > - "SELECT subname, subenabled FROM pg_subscription ORDER BY subname");
> > -is( $result, qq(regress_sub1|t
> > -regress_sub2|f),
> > + "SELECT subname, subenabled, subfailover FROM pg_subscription ORDER
> > BY subname");
> > +is( $result, qq(regress_sub1|t|t
> > +regress_sub2|f|f),
> >   "check that the subscription's running status are preserved");
> >
> > ~
> >
> > Calling those "old subscriptions" seems misleading. Aren't these the
> > new/upgraded subscriptions being checked here?
> >
>
> Again the quoted wording is not introduced by this patch. But, I see
> your point and it is better if you can start a separate thread for it.
>

OK. I created a separate thread for this [1]

==
[1] 
https://www.postgresql.org/message-id/cahut+pu1uslphrysptacy1k_q-ddsrxnfhmj_2u1nfqbc1y...@mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: Possibility to disable `ALTER SYSTEM`

2024-01-30 Thread David G. Johnston
On Tuesday, January 30, 2024, Tom Lane  wrote:
>
>
> My larger point here is that trying to enforce restrictions on
> superusers *within* Postgres is simply not a good plan, for
> largely the same reasons that Robert questioned making the
> GUC mechanism police itself.  It needs to be done outside,
> either at the filesystem level or via some other kernel-level
> security system.
>
>
The idea of adding a file to the data directory appeals to me.

optional_runtime_features.conf
alter_system=enabled
copy_from_program=enabled
copy_to_program=disabled

If anyone tries to use a disabled feature, the system emits an error:

ERROR: Cannot send copy output to program, action disabled by host.

My main usability question is whether requiring a restart is an acceptable
restriction.

Making said file owned by root (or equivalent) and only readable by the
postgres process user suffices to lock it down.  Refusing to start if the
file is writable, and at least one feature is disabled can be considered,
with a startup option to bypass that check if desired.
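
Concretely, something like (a sketch, assuming the file lives in the data
directory and the server runs as OS user postgres):

chown root:postgres "$PGDATA/optional_runtime_features.conf"
chmod 0640 "$PGDATA/optional_runtime_features.conf"   # root writes, postgres reads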

David J.


src/bin/pg_upgrade/t/004_subscription.pl test comment fix

2024-01-30 Thread Peter Smith
Hi,

PSA a small fix for a misleading comment found in the pg_upgrade test code.

==
Kind Regards,
Peter Smith.
Fujitsu Australia


v1-0001-Fix-test-comment.patch
Description: Binary data


Re: Apply the "LIMIT 1" optimization to partial DISTINCT

2024-01-30 Thread David Rowley
On Fri, 26 Jan 2024 at 21:14, David Rowley  wrote:
> However, having said that. Parallel plans are often picked when there
> is some highly selective qual as parallel_tuple_cost has to be applied
> to fewer tuples for such plans, so probably this is worth doing.

I was messing around with your test case and didn't manage to get any
plan that had any rows to use the partial path with the LIMIT.  I
ended up dropping the test that was checking the results were empty as
I didn't think it added much more value over the EXPLAIN output.

I pushed the result.

Thanks for working on this.

David




Re: Possibility to disable `ALTER SYSTEM`

2024-01-30 Thread Tom Lane
Magnus Hagander  writes:
> On Tue, Jan 30, 2024 at 10:48 PM Tom Lane  wrote:
>> I was imagining using selinux and/or sepgsql to directly prevent
>> writing postgresql.auto.conf from the Postgres account.

> Wouldn't a simple "chattr +i postgresql.auto.conf" work?

Hmm, I'm not too familiar with that file attribute, but it looks
like it'd work (on platforms that support it).

My larger point here is that trying to enforce restrictions on
superusers *within* Postgres is simply not a good plan, for
largely the same reasons that Robert questioned making the
GUC mechanism police itself.  It needs to be done outside,
either at the filesystem level or via some other kernel-level
security system.

regards, tom lane




Re: Fix some ubsan/asan related issues

2024-01-30 Thread Alexander Lakhin

Hello,

30.01.2024 18:57, Tristan Partin wrote:

Patch 1:

Passing NULL as a second argument to memcpy breaks ubsan, ...


Maybe you would like to fix also one more similar place, reached with:
create extension xml2;
select xslt_process('<xml/>',
$$<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
</xsl:template>
</xsl:stylesheet>$$);

varlena.c:201:26: runtime error: null pointer passed as argument 2, which is 
declared to never be null

There is also an issue with pg_bsd_indent, I stumble upon when doing
`make check-world` with the sanitizers enabled:
https://www.postgresql.org/message-id/591971ce-25c1-90f3-0526-5f54e3ebb32e%40gmail.com

31.01.2024 00:23, Andres Freund wrote:

The reason asan fails is that it uses a "shadow stack" to track stack variable
lifetimes. These confuse our stack depth check. CI doesn't have the issue
because the compiler doesn't yet enable the feature, locally I get around it
by using ASAN_OPTIONS=detect_stack_use_after_return=0:...


Even with detect_stack_use_after_return=0, clang-18's asan makes the test
012_subtransactions.pl fail:
2024-01-31 03:24:25.691 UTC [4112455] 012_subtransactions.pl LOG:  statement: SELECT hs_subxids(201);
2024-01-31 03:24:25.714 UTC [4112455] 012_subtransactions.pl ERROR:  stack depth limit exceeded
2024-01-31 03:24:25.714 UTC [4112455] 012_subtransactions.pl HINT:  Increase the configuration parameter max_stack_depth (currently 2048kB), after ensuring the platform's stack depth limit is adequate.


(All the other tests pass.)
Though the same test passes when I use clang-16.

Best regards,
Alexander

Re: Parallelize correlated subqueries that execute within each worker

2024-01-30 Thread Tom Lane
James Coleman  writes:
> I've finally had a chance to look at this, and I don't believe there's
> any real failure here, merely drift of how the planner works on master
> resulting in this query now being eligible for a different plan shape.

> I was a bit wary at first because the changing test query is one I'd
> previously referenced in [1] as likely exposing the bug I'd fixed
> where params where being used across worker boundaries. However
> looking at the diff in the patch at that point (v10) that particular
> test query formed a different plan shape (there were two gather nodes
> being created, and params crossing between them).

> But in the current revision of master with the current patch applied
> that's no longer true: we have a Gather node, and the Subplan using
> the param is properly under that Gather node, and the param should be
> being both generated and consumed within the same worker process.

Hmm ... so the question this raises for me is: was that test intended
to verify behavior of params being passed across workers?  If so,
haven't you broken the point of the test?  This doesn't mean that
your code change is wrong; but I think maybe you need to find a way
to modify that test case so that it still tests what it's meant to.
This is a common hazard when changing the planner's behavior.

regards, tom lane




Re: Synchronizing slots from primary to standby

2024-01-30 Thread Amit Kapila
On Wed, Jan 31, 2024 at 7:27 AM Peter Smith  wrote:
>
> I saw that v73-0001 was pushed, but it included some last-minute
> changes that I did not get a chance to check yesterday.
>
> Here are some review comments for the new parts of that patch.
>
> ==
> doc/src/sgml/ref/create_subscription.sgml
>
> 1.
> connect (boolean)
>
> Specifies whether the CREATE SUBSCRIPTION command should connect
> to the publisher at all. The default is true. Setting this to false
> will force the values of create_slot, enabled, copy_data, and failover
> to false. (You cannot combine setting connect to false with setting
> create_slot, enabled, copy_data, or failover to true.)
>
> ~
>
> I don't think the first part "Setting this to false will force the
> values ... failover to false." is strictly correct.
>
> I think it is correct to say all those *other* properties (create_slot,
> enabled, copy_data) are forced to false because those otherwise have
> default true values.
>

So, when connect=false, won't the user have to explicitly provide such
values (create_slot, enabled, etc.) as false? If so, is using 'force'
strictly correct?

> ~~~
>
> 2.
>   
>Since no connection is made when this option is
>false, no tables are subscribed. To initiate
>replication, you must manually create the replication slot, enable
> -  the subscription, and refresh the subscription. See
> +  the failover if required, enable the subscription, and refresh the
> +  subscription. See
> linkend="logical-replication-subscription-examples-deferred-slot"/>
>for examples.
>   
>
> IMO "see the failover if required" is very vague. See what failover?
>

AFAICS, the committed docs says: "To initiate replication, you must
manually create the replication slot, enable the failover if required,
...". I am not sure what you are referring to.

>
> ==
> src/bin/pg_upgrade/t/004_subscription.pl
>
> 5.
> -# The subscription's running status should be preserved. Old subscription
> -# regress_sub1 should be enabled and old subscription regress_sub2 should be
> -# disabled.
> +# The subscription's running status and failover option should be preserved.
> +# Old subscription regress_sub1 should have enabled and failover as true 
> while
> +# old subscription regress_sub2 should have enabled and failover as false.
>  $result =
>$new_sub->safe_psql('postgres',
> - "SELECT subname, subenabled FROM pg_subscription ORDER BY subname");
> -is( $result, qq(regress_sub1|t
> -regress_sub2|f),
> + "SELECT subname, subenabled, subfailover FROM pg_subscription ORDER
> BY subname");
> +is( $result, qq(regress_sub1|t|t
> +regress_sub2|f|f),
>   "check that the subscription's running status are preserved");
>
> ~
>
> Calling those "old subscriptions" seems misleading. Aren't these the
> new/upgraded subscriptions being checked here?
>

Again the quoted wording is not introduced by this patch. But, I see
your point and it is better if you can start a separate thread for it.

-- 
With Regards,
Amit Kapila.




Re: Parallelize correlated subqueries that execute within each worker

2024-01-30 Thread James Coleman
On Tue, Jan 30, 2024 at 11:54 AM Robert Haas  wrote:
>
> On Tue, Jan 30, 2024 at 11:17 AM Akshat Jaimini  wrote:
> > I think we should move this patch to the next CF as I believe that work is 
> > still going on resolving the last reported bug.
>
> We shouldn't just keep pushing this forward to the next CF. It's been
> idle since July. If it needs more work, mark it RwF and it can be
> reopened when there's something for a reviewer to do.

I don't follow the "Idle since July" since it just hasn't received
review since then, so there's been nothing to reply to.

That being said, Vignesh's note in January about a now-failing test is
relevant activity, and I've just today responded to that, so I'm
changing the status back from Waiting on Author to Needs Review.

Regards,
James Coleman




Re: POC, WIP: OR-clause support for indexes

2024-01-30 Thread jian he
+/*
+ * Hash function that's compatible with guc_name_compare
+ */
+static uint32
+orclause_hash(const void *data, Size keysize)
+{
+ OrClauseGroupKey   *key = (OrClauseGroupKey *) data;
+ uint64 hash;
+
+ (void) JumbleExpr(key->expr, &hash);
+ hash += ((uint64) key->opno + (uint64) key->exprtype) % UINT64_MAX;
+ return hash;
+}

This looks strange: `hash` is uint64, but here you return uint32.
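A minimal sketch of one way to fix that, reusing the patch's own
OrClauseGroupKey and JumbleExpr as quoted above (the final fold is my
assumption about the intent; note the original "% UINT64_MAX" is
effectively a no-op anyway):

static uint32
orclause_hash(const void *data, Size keysize)
{
	OrClauseGroupKey *key = (OrClauseGroupKey *) data;
	uint64		hash;

	(void) JumbleExpr(key->expr, &hash);
	hash += (uint64) key->opno + (uint64) key->exprtype;
	/* explicitly fold the 64-bit jumble down to the uint32 the API expects */
	return (uint32) hash ^ (uint32) (hash >> 32);
}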

Based on my understanding of
https://www.postgresql.org/docs/current/xoper-optimization.html#XOPER-COMMUTATOR
I think you need to move the commutator check to right after the
`if (get_op_rettype(opno) != BOOLOID)` branch.
+ opno = ((OpExpr *) orqual)->opno;
+ if (get_op_rettype(opno) != BOOLOID)
+ {
+ /* Only operator returning boolean suits OR -> ANY transformation */
+ or_list = lappend(or_list, orqual);
+ continue;
+ }
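For illustration, a hedged sketch of that reordering, using the existing
get_commutator() from lsyscache.h (the surrounding loop context is
assumed from the quoted patch):

	opno = ((OpExpr *) orqual)->opno;
	if (get_op_rettype(opno) != BOOLOID ||
		!OidIsValid(get_commutator(opno)))
	{
		/* not boolean-returning, or no commutator: leave the clause alone */
		or_list = lappend(or_list, orqual);
		continue;
	}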

select  po.oprname, po.oprkind, po.oprcanhash,
        po.oprleft::regtype, po.oprright, po.oprresult,
        po1.oprname
from    pg_operator po join pg_operator po1
        on po.oprcom = po1.oid
where   po.oprresult = 16;

I am wondering: are all of these operators suitable for this
transformation, as long as the return type is bool?




Re: Parallelize correlated subqueries that execute within each worker

2024-01-30 Thread James Coleman
On Tue, Jan 9, 2024 at 2:09 AM vignesh C  wrote:
> ...
> > Given all of that I settled on this approach:
> > 1. Modify is_parallel_safe() to by default ignore PARAM_EXEC params.
> > 2. Add is_parallel_safe_with_params() that checks for the existence of
> > such params.
> > 3. Store the required params in a bitmapset on each base rel.
> > 4. Union the bitmapset on join rels.
> > 5. Only insert a gather node if that bitmapset is empty.
> >
> > I have an intuition that there's some spot (e.g. joins) that we should
> > be removing params from this set (e.g., when we've satisfied them),
> > but I haven't been able to come up with such a scenario as yet.
> >
> > The attached v11 fixes the issue you reported.
>
> One of the tests has failed in CFBot at [1] with:
> +++ 
> /tmp/cirrus-ci-build/build/testrun/pg_upgrade/002_pg_upgrade/data/results/select_parallel.out
> 2023-12-20 20:08:42.480004000 +
> @@ -137,23 +137,24 @@
>  explain (costs off)
>   select (select max((select pa1.b from part_pa_test pa1 where pa1.a = 
> pa2.a)))
>   from part_pa_test pa2;
> -  QUERY PLAN
> ---
> - Aggregate
> + QUERY PLAN
> +
> + Finalize Aggregate
> ->  Gather
>   Workers Planned: 3
> - ->  Parallel Append
> -   ->  Parallel Seq Scan on part_pa_test_p1 pa2_1
> -   ->  Parallel Seq Scan on part_pa_test_p2 pa2_2
> + ->  Partial Aggregate
> +   ->  Parallel Append
> + ->  Parallel Seq Scan on part_pa_test_p1 pa2_1
> + ->  Parallel Seq Scan on part_pa_test_p2 pa2_2
> +   SubPlan 1
> + ->  Append
> +   ->  Seq Scan on part_pa_test_p1 pa1_1
> + Filter: (a = pa2.a)
> +   ->  Seq Scan on part_pa_test_p2 pa1_2
> + Filter: (a = pa2.a)
> SubPlan 2
>   ->  Result
> -   SubPlan 1
> - ->  Append
> -   ->  Seq Scan on part_pa_test_p1 pa1_1
> - Filter: (a = pa2.a)
> -   ->  Seq Scan on part_pa_test_p2 pa1_2
> - Filter: (a = pa2.a)
> -(14 rows)
> +(15 rows)
>
> More details of the failure is available at [2].
>
> [1] - https://cirrus-ci.com/task/5685696451575808
> [2] - 
> https://api.cirrus-ci.com/v1/artifact/task/5685696451575808/testrun/build/testrun/pg_upgrade/002_pg_upgrade/log/regress_log_002_pg_upgrade

Thanks for noting this here.

I've finally had a chance to look at this, and I don't believe there's
any real failure here, merely drift of how the planner works on master
resulting in this query now being eligible for a different plan shape.

I was a bit wary at first because the changing test query is one I'd
previously referenced in [1] as likely exposing the bug I'd fixed
where params where being used across worker boundaries. However
looking at the diff in the patch at that point (v10) that particular
test query formed a different plan shape (there were two gather nodes
being created, and params crossing between them).

But in the current revision of master with the current patch applied
that's no longer true: we have a Gather node, and the Subplan using
the param is properly under that Gather node, and the param should be
being both generated and consumed within the same worker process.

So I've updated the patch to show that plan change as part of the diff.

See attached v12

Regards,
James Coleman

1: 
https://www.postgresql.org/message-id/CAAaqYe-_TObm5KwmZLYXBJ3BJGh4cUZWM0v1mY1gWTMkRNQXDQ%40mail.gmail.com


v12-0002-Parallelize-correlated-subqueries.patch
Description: Binary data


v12-0001-Add-tests-before-change.patch
Description: Binary data


Re: Documentation: warn about two_phase when altering a subscription

2024-01-30 Thread Peter Smith
Hi, thanks for the patch. Here are some review comments for v1

==

(the text below does not show the links and other sgml rendering -- all of those LGTM)

BEFORE
When altering the slot_name, the failover and two_phase properties
values of the named slot may differ from their counterparts failover
and two_phase parameters specified in the subscription. When creating
the slot, ensure the slot failover and two_phase properties match
their counterparts parameters values of the subscription.

SUGGESTION
When altering the slot_name, the failover and two_phase property
values of the named slot may differ from the counterpart failover and
two_phase parameters specified by the subscription. When creating the
slot, ensure the slot properties failover and two_phase match their
counterpart parameters of the subscription.

~

BEFORE
Otherwise, the slot on the publisher may behave differently from what
subscription's failover and two_phase options say: for example, the
slot on the publisher could ...

SUGGESTION:
Otherwise, the slot on the publisher may behave differently from what
these subscription options say: for example, the slot on the publisher
could ...

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Synchronizing slots from primary to standby

2024-01-30 Thread Peter Smith
Hi,

I saw that v73-0001 was pushed, but it included some last-minute
changes that I did not get a chance to check yesterday.

Here are some review comments for the new parts of that patch.

==
doc/src/sgml/ref/create_subscription.sgml

1.
connect (boolean)

Specifies whether the CREATE SUBSCRIPTION command should connect
to the publisher at all. The default is true. Setting this to false
will force the values of create_slot, enabled, copy_data, and failover
to false. (You cannot combine setting connect to false with setting
create_slot, enabled, copy_data, or failover to true.)

~

I don't think the first part "Setting this to false will force the
values ... failover to false." is strictly correct.

I think it is correct to say all those *other* properties (create_slot,
enabled, copy_data) are forced to false because those otherwise have
default true values. But the 'failover' has default false, so it
cannot get force-changed at all because you can't set connect to false
when failover is true as the second part ("You cannot combine...")
explains.

IMO remove 'failover' from that first sentence.

~~~

2.
  
   Since no connection is made when this option is
   false, no tables are subscribed. To initiate
   replication, you must manually create the replication slot, enable
-  the subscription, and refresh the subscription. See
+  the failover if required, enable the subscription, and refresh the
+  subscription. See
   <xref linkend="logical-replication-subscription-examples-deferred-slot"/>
   for examples.
  

IMO "see the failover if required" is very vague. See what failover?
The slot property failover, or the subscription option failover? And
"see" it for what purpose?

I think the intention was probably to say something like "ensure the
manually created slot has the same matching failover property value as
the subscriber failover option", but that is not clear from the
current text.

==
doc/src/sgml/ref/pg_dump.sgml

3.
dump can be restored without requiring network access to the remote
servers.  It is then up to the user to reactivate the subscriptions in a
suitable way.  If the involved hosts have changed, the connection
-   information might have to be changed.  It might also be appropriate to
+   information might have to be changed.  If the subscription needs to
+   be enabled for
+   failover,
+   then same needs to be done by executing
+   
+   ALTER SUBSCRIPTION ... SET(failover = true)
+   after the slot has been created.  It might also be appropriate to

"then same needs to be done" (English?)

BEFORE
If the subscription needs to be enabled for failover, then same needs
to be done by executing ALTER SUBSCRIPTION ... SET(failover = true)
after the slot has been created.

SUGGESTION
If the subscription needs to be enabled for failover, execute ALTER
SUBSCRIPTION ... SET(failover = true) after the slot has been created.

==
src/backend/commands/subscriptioncmds.c

4.
 #define SUBOPT_RUN_AS_OWNER 0x1000
-#define SUBOPT_LSN 0x2000
-#define SUBOPT_ORIGIN 0x4000
+#define SUBOPT_FAILOVER 0x2000
+#define SUBOPT_LSN 0x4000
+#define SUBOPT_ORIGIN 0x8000
+

A spurious blank line was added.

==
src/bin/pg_upgrade/t/004_subscription.pl

5.
-# The subscription's running status should be preserved. Old subscription
-# regress_sub1 should be enabled and old subscription regress_sub2 should be
-# disabled.
+# The subscription's running status and failover option should be preserved.
+# Old subscription regress_sub1 should have enabled and failover as true while
+# old subscription regress_sub2 should have enabled and failover as false.
 $result =
   $new_sub->safe_psql('postgres',
- "SELECT subname, subenabled FROM pg_subscription ORDER BY subname");
-is( $result, qq(regress_sub1|t
-regress_sub2|f),
+ "SELECT subname, subenabled, subfailover FROM pg_subscription ORDER
BY subname");
+is( $result, qq(regress_sub1|t|t
+regress_sub2|f|f),
  "check that the subscription's running status are preserved");

~

Calling those "old subscriptions" seems misleading. Aren't these the
new/upgraded subscriptions being checked here?

Should the comment be more like:

# The subscription's running status and failover option should be preserved.
# Upgraded regress_sub1 should still have enabled and failover as true.
# Upgraded regress_sub2 should still have enabled and failover as false.

==
Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: 003_extrafiles.pl test fails on Windows with the newer Perl versions

2024-01-30 Thread Andrew Dunstan



On 2024-01-30 Tu 18:06, Tom Lane wrote:

Andrew Dunstan  writes:

Pushed to all live branches. Thanks for the patch.

v12 and v13 branches aren't looking good:

Global symbol "$test_primary_datadir" requires explicit package name (did you forget to 
declare "my $test_primary_datadir"?) at t/003_extrafiles.pl line 80.
Execution of t/003_extrafiles.pl aborted due to compilation errors.
# Looks like your test exited with 255 before it could output anything.
[17:19:57] t/003_extrafiles.pl ..
Dubious, test returned 255 (wstat 65280, 0xff00)
Failed 5/5 subtests





Will fix.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Flushing large data immediately in pqcomm

2024-01-30 Thread Jelte Fennema-Nio
On Tue, 30 Jan 2024 at 19:48, Robert Haas  wrote:
>
> On Tue, Jan 30, 2024 at 12:58 PM Melih Mutlu  wrote:
> > Sounds like it's difficult to come up with a heuristic that would work well 
> > enough for most cases.
> > One thing with sending data instead of copying it if the buffer is empty is 
> > that initially the buffer is empty. I believe it will stay empty forever if 
> > we do not copy anything when the buffer is empty. We can maybe simply set 
> > the threshold to the buffer size/2 (4kB) and hope that will work better. Or 
> > copy the data only if it fits into the remaining space in the buffer. What 
> > do you think?
> >
> > An additional note while I mentioned pq_putmessage_noblock(), I've been 
> > testing sending input data immediately in pq_putmessage_noblock() without 
> > blocking and copy the data into PqSendBuffer only if the socket would block 
> > and cannot send it. Unfortunately, I don't have strong numbers to 
> > demonstrate any improvement in perf or timing yet. But I still like to know 
> > what would you think about it?
>
> I think this is an area where it's very difficult to foresee on
> theoretical grounds what will be right in practice

I agree that it's hard to prove that such heuristics will always be
better in practice than the status quo. But I feel like we shouldn't
let perfect be the enemy of good here. I think one approach that is a
clear improvement over the status quo is:
1. If the buffer is empty AND the data we are trying to send is larger
than the buffer size, then don't use the buffer.
2. If not, fill up the buffer first (just like we do now), then send
that. And if the leftover data is then still larger than the buffer,
then the buffer is now empty, so rule 1 applies (see the sketch below).
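To make those two rules concrete, a rough sketch (PqSendBuffer,
PqSendPointer, PqSendStart, PqSendBufferSize, and internal_flush() are
the existing pqcomm.c statics; send_direct() is a hypothetical stand-in
for writing straight to the socket):

static int
internal_putbytes_sketch(const char *s, size_t len)
{
	while (len > 0)
	{
		size_t		avail;

		/* Rule 1: buffer empty and payload at least buffer-sized */
		if (PqSendPointer == PqSendStart && len >= (size_t) PqSendBufferSize)
			return send_direct(s, len); /* hypothetical direct send */

		/* Rule 2: copy what fits; flush when full, then loop */
		avail = (size_t) (PqSendBufferSize - PqSendPointer);
		if (avail > len)
			avail = len;
		memcpy(PqSendBuffer + PqSendPointer, s, avail);
		PqSendPointer += avail;
		s += avail;
		len -= avail;
		if (PqSendPointer >= PqSendBufferSize && internal_flush())
			return EOF;
	}
	return 0;
}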




Re: 003_extrafiles.pl test fails on Windows with the newer Perl versions

2024-01-30 Thread Tom Lane
Andrew Dunstan  writes:
> Pushed to all live branches. Thanks for the patch.

v12 and v13 branches aren't looking good:

Global symbol "$test_primary_datadir" requires explicit package name (did you 
forget to declare "my $test_primary_datadir"?) at t/003_extrafiles.pl line 80.
Execution of t/003_extrafiles.pl aborted due to compilation errors.
# Looks like your test exited with 255 before it could output anything.
[17:19:57] t/003_extrafiles.pl .. 
Dubious, test returned 255 (wstat 65280, 0xff00)
Failed 5/5 subtests 

regards, tom lane




Re: why there is not VACUUM FULL CONCURRENTLY?

2024-01-30 Thread Michael Paquier
On Tue, Jan 30, 2024 at 12:37:12PM +0100, Alvaro Herrera wrote:
> On 2024-Jan-30, Pavel Stehule wrote:
> 
> > some basic variant (without autovacuum support) can be good enough. We have
> > no autovacuum support for REINDEX CONCURRENTLY and I don't see a necessity
> > for it (sure, it can be limited by my perspective) . The necessity of
> > reducing table size is not too common (a lot of use cases are better
> > covered by using partitioning), but sometimes it is, and then a
> > built-in, readily available solution can be helpful.
> 
> That's my thinking as well.

Or, yes, I'd agree about that.  This can make for a much better user
experience.  I'm just not sure how that stuff would be shaped and how
much ground it would need to cover.
--
Michael


signature.asc
Description: PGP signature


Re: Incorrect cost for MergeAppend

2024-01-30 Thread David Rowley
On Wed, 31 Jan 2024 at 02:23, Alexander Kuzmenkov
 wrote:
>
> On Tue, Jan 30, 2024 at 1:20 PM David Rowley  wrote:
> > You should likely focus on trying to find a test that does not require
> > making 2 tables with 10k rows.
>
> Is 1k smallint OK? It should fit in one page. Still reproduces the
> error, and the entire test case runs in under 10 ms.

I had a go at making it a bit smaller without going dangerously close
to where the plan might change. The following seems to work.

create table ma0(a int primary key);
create table ma1() inherits (ma0);
insert into ma0 select generate_series(1, 400);
insert into ma1 select generate_series(1, 200);
analyze ma0;
analyze ma1;

explain (costs off) select * from ma0 where a < 100 order by a;

drop table ma0 cascade;

As for backpatching this.  It does risk plans changing in stable
versions of PostgreSQL, which we normally try to avoid.  However, it
seems quite similar to 1e731ed12a, albeit, far more long-standing.
I'm currently thinking we should backpatch this back to 12.

Does anyone disagree?

David




Re: [PATCH] Add native windows on arm64 support

2024-01-30 Thread Dave Cramer
On Tue, Jan 30, 2024 at 4:56 PM Andrew Dunstan  wrote:

>
> On 2024-01-30 Tu 09:50, Dave Cramer wrote:
>
>
>
> On Tue, 30 Jan 2024 at 08:38, Andrew Dunstan  wrote:
>
>>
>> On 2024-01-29 Mo 11:20, Dave Cramer wrote:
>>
>>
>> Dave Cramer
>> www.postgres.rocks
>>
>>
>> On Mon, 29 Jan 2024 at 11:16, Andrew Dunstan  wrote:
>>
>>>
>>> On 2024-01-26 Fr 09:18, Dave Cramer wrote:
>>>
>>>
>>>
>>> On Fri, 26 Jan 2024 at 07:36, Andrew Dunstan 
>>> wrote:
>>>

 On 2024-01-25 Th 20:32, Michael Paquier wrote:
 > On Thu, Jan 25, 2024 at 04:52:30PM -0500, Dave Cramer wrote:
 >> On Thu, 25 Jan 2024 at 16:32, Andrew Dunstan 
 wrote:
 >>> On 2024-01-25 Th 16:17, Dave Cramer wrote:
 >>> Yeah, I think the default Developer Command Prompt for VS2022 is
 set up
 >>> for x86 builds. AIUI you should start by executing "vcvarsall
 x64_arm64".
 >> Yup, now I'm in the same state you are
 > Wait a minute here.  Based on [1], x64_arm64 means you can use a x64
 > host and you'll be able to produce ARM64 builds, still these will not
 > be able to run on the host where they were built.  How much of the
 > patch posted upthread is required to produce such builds?  Basically
 > everything from it, I guess, so as build dependencies can be
 > satisfied?
 >
 > [1]:
 https://learn.microsoft.com/en-us/cpp/build/building-on-the-command-line?view=msvc-170


 If you look at the table here x86 and x64 are the only supported host
 architectures. But that's OK, the x64 binaries will run on arm64 (W11
 ARM64 has x64 emulation builtin). If that didn't work Dave and I would
 not have got as far as we have. But you want the x64_arm64 argument to
 vcvarsall so you will get ARM64 output.

>>>
>>> I've rebuilt it using  x64_arm64 and with the attached (very naive
>>> patch) and I still get an x64 binary :(
>>>
>>>
>>> With this patch I still get a build error, but it's different :-)
>>>
>>>
>>> [1406/2088] "link" @src/backend/postgres.exe.rsp
>>> FAILED: src/backend/postgres.exe src/backend/postgres.pdb
>>> "link" @src/backend/postgres.exe.rsp
>>>Creating library src\backend\postgres.exe.lib
>>>
>>> storage_lmgr_s_lock.c.obj : error LNK2019: unresolved external symbol
>>> spin_delay referenced in function perform_spin_delay
>>>
>>> src\backend\postgres.exe : fatal error LNK1120: 1 unresolved externals
>>>
>>
>> Did you add the latest lock.patch ?
>>
>>
>>
>>
>> I'm a bit confused about exactly what needs to be applied. Can you supply
>> a complete patch to be applied to a pristine checkout that will let me
>> build?
>>
>>
>> cheers
>>
>
> See attached.
>
>
>
> No, that is what is giving me the error shown above (just tried again to
> be certain). And it's not surprising, as patch 2 #ifdef's out the
> definition of spin_delay().
>
> If you can get a complete build with these patches then I suspect you're
> not doing a proper ARM64 build.
>
>
Okay I will look when I get home in a week

Dave

>
> cheers
>
>
> andrew
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>
>


Re: psql not responding to SIGINT upon db reconnection

2024-01-30 Thread Jelte Fennema-Nio
On Tue, 30 Jan 2024 at 23:20, Tristan Partin  wrote:
> Not next week, but here is a respin. I've exposed pqSocketPoll as
> PQsocketPoll and am just using that. You can see the diff is so much
> smaller, which is great!

The exports.txt change should be made part of patch 0001, also docs
are missing for the newly exposed function. PQsocketPoll does look
like quite a nice API to expose from libpq.




Re: psql not responding to SIGINT upon db reconnection

2024-01-30 Thread Tristan Partin

On Fri Jan 12, 2024 at 11:13 AM CST, Tristan Partin wrote:

On Fri Jan 12, 2024 at 10:45 AM CST, Robert Haas wrote:
> On Mon, Jan 8, 2024 at 1:03 AM Tristan Partin  wrote:
> > I think the way to go is to expose some variation of libpq's
> > pqSocketPoll(), which I would be happy to put together a patch for.
> > Making frontends, psql in this case, have to reimplement the polling
> > logic doesn't strike me as fruitful, which is essentially what I have
> > done.
>
> I encourage further exploration of this line of attack. I fear that if
> I were to commit something like what you've posted up until now,
> people would complain that that code was too ugly to live, and I'd
> have a hard time telling them that they're wrong.

Completely agree. Let me look into this. Perhaps I can get something up 
next week or the week after.


Not next week, but here is a respin. I've exposed pqSocketPoll as 
PQsocketPoll and am just using that. You can see the diff is so much 
smaller, which is great!


In order to fight the race condition, I am just using a 1 second timeout 
instead of trying to integrate pselect or ppoll. We could add 
a PQsocketPPoll() to support those use cases, but I am not sure how 
available pselect and ppoll are. I guess on Windows we don't have 
pselect. I don't think using the pipe trick that Heikki mentioned 
earlier is suitable to expose via an API in libpq, but someone else 
might have a different opinion.


Maybe this is good enough until someone complains? Most people would 
probably just chalk up any latency between keypress and cancellation 
to network latency rather than to a hardcoded 1 second.
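The loop I have in mind looks roughly like this (a sketch: it assumes
the PQsocketPoll from patch 0001, and CancelRequested stands in for
whatever flag the SIGINT handler sets; error handling elided):

	for (;;)
	{
		time_t		deadline = time(NULL) + 1;	/* wake at least once/second */
		int			rc = PQsocketPoll(sock, 1 /* forRead */ , 0, deadline);

		if (CancelRequested)	/* set by the SIGINT handler */
			break;
		if (rc > 0)
			break;				/* socket ready: drive PQconnectPoll() again */
		/* rc == 0 is a timeout: loop around and re-check the flag */
	}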


Thanks for your feedback Robert!

--
Tristan Partin
Neon (https://neon.tech)
From 4fa6db1900a355a171d3e16019d02f3a415764d0 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Tue, 30 Jan 2024 15:40:57 -0600
Subject: [PATCH v8 1/2] Expose PQsocketPoll for frontends

PQsocketPoll should help with state machine processing when connecting
to a database asynchronously via PQconnectStart().
---
 doc/src/sgml/libpq.sgml | 5 -
 src/interfaces/libpq/fe-misc.c  | 7 +++
 src/interfaces/libpq/libpq-fe.h | 4 
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index d0d5aefadc..aa26c2cc8d 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -358,7 +358,10 @@ PostgresPollingStatusType PQconnectPoll(PGconn *conn);
Loop thus: If PQconnectPoll(conn) last returned
PGRES_POLLING_READING, wait until the socket is ready to
read (as indicated by select(), poll(), or
-   similar system function).
+   similar system function).  Note that PQsocketPoll
+   can help reduce boilerplate by abstracting the setup of
+   select(), or poll() if it is
+   available on your system.
Then call PQconnectPoll(conn) again.
Conversely, if PQconnectPoll(conn) last returned
PGRES_POLLING_WRITING, wait until the socket is ready
diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c
index 47a28b0a3a..ee917d375d 100644
--- a/src/interfaces/libpq/fe-misc.c
+++ b/src/interfaces/libpq/fe-misc.c
@@ -55,7 +55,6 @@ static int	pqPutMsgBytes(const void *buf, size_t len, PGconn *conn);
 static int	pqSendSome(PGconn *conn, int len);
 static int	pqSocketCheck(PGconn *conn, int forRead, int forWrite,
 		  time_t end_time);
-static int	pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time);
 
 /*
  * PQlibVersion: return the libpq version number
@@ -1059,7 +1058,7 @@ pqSocketCheck(PGconn *conn, int forRead, int forWrite, time_t end_time)
 
 	/* We will retry as long as we get EINTR */
 	do
-		result = pqSocketPoll(conn->sock, forRead, forWrite, end_time);
+		result = PQsocketPoll(conn->sock, forRead, forWrite, end_time);
 	while (result < 0 && SOCK_ERRNO == EINTR);
 
 	if (result < 0)
@@ -1083,8 +1082,8 @@ pqSocketCheck(PGconn *conn, int forRead, int forWrite, time_t end_time)
  * Timeout is infinite if end_time is -1.  Timeout is immediate (no blocking)
  * if end_time is 0 (or indeed, any time before now).
  */
-static int
-pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time)
+int
+PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time)
 {
 	/* We use poll(2) if available, otherwise select(2) */
 #ifdef HAVE_POLL
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index defc415fa3..11a7fd32b6 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -21,6 +21,7 @@ extern "C"
 #endif
 
 #include <stdio.h>
+#include <time.h>
 
 /*
  * postgres_ext.h defines the backend's externally visible types,
@@ -644,6 +645,9 @@ extern int	lo_export(PGconn *conn, Oid lobjId, const char *filename);
 /* Get the version of the libpq library in use */
 extern int	PQlibVersion(void);
 
+/* Poll a file descriptor for reading and/or writing with a timeout */
extern int	PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time);

Re: 003_extrafiles.pl test fails on Windows with the newer Perl versions

2024-01-30 Thread Andrew Dunstan



On 2024-01-30 Tu 06:49, Andrew Dunstan wrote:


On 2024-01-30 Tu 06:21, Nazir Bilal Yavuz wrote:

Hi,

I was trying to install newer Perl versions to Windows CI images and
found that 003_extrafiles.pl test fails on Windows with:

(0.183s) not ok 2 - file lists match
(0.000s) #   Failed test 'file lists match'
#   at C:/cirrus/src/bin/pg_rewind/t/003_extrafiles.pl line 81.
(0.000s) # Structures begin differing at:
#  $got->[0] =
'C:/cirrus/build/testrun/pg_rewind/003_extrafiles/data/t_003_extrafiles_primary_local_data/pgdata/tst_both_dir' 


# $expected->[0] =
'C:\cirrus\build/testrun/pg_rewind/003_extrafiles\data/t_003_extrafiles_primary_local_data/pgdata/tst_both_dir' 



(0.263s) not ok 5 - file lists match
(0.000s) #   Failed test 'file lists match'
#   at C:/cirrus/src/bin/pg_rewind/t/003_extrafiles.pl line 81.
(0.000s) # Structures begin differing at:
#  $got->[0] =
'C:/cirrus/build/testrun/pg_rewind/003_extrafiles/data/t_003_extrafiles_primary_remote_data/pgdata/tst_both_dir' 


# $expected->[0] =
'C:\cirrus\build/testrun/pg_rewind/003_extrafiles\data/t_003_extrafiles_primary_remote_data/pgdata/tst_both_dir' 



It looks like File::Find converts backslashes to slashes in the newer
Perl versions. I tried to find the related commit and found this:
https://github.com/Perl/perl5/commit/414f14df98cb1c9a20f92c5c54948b67c09f072d 



To solve this, I did the same conversion for Windows before comparing
the paths. And to support all Perl versions, I decided to always
convert them on Windows regardless of the Perl's version. The fix is
attached.

I looked at other File::Find appearances in the code but they do not
compare the paths. So, I do not think there is any need to fix them.

Any kind of feedback would be appreciated.



Looks reasonable on the face of it. I'll see about pushing this today.



Pushed to all live branches. Thanks for the patch.


cheers


andrew

--

Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Possibility to disable `ALTER SYSTEM`

2024-01-30 Thread Magnus Hagander
On Tue, Jan 30, 2024 at 10:48 PM Tom Lane  wrote:
>
> Robert Haas  writes:
> > There's nothing wrong with that exactly, but what does it gain us over
> > my proposal of a sentinel file?
>
> I was imagining using selinux and/or sepgsql to directly prevent
> writing postgresql.auto.conf from the Postgres account.  Combine that
> with a non-Postgres-owned postgresql.conf (already supported) and you
> have something that seems actually bulletproof, rather than a hint.
> Admittedly, using that approach requires knowing something about a
> non-Postgres security mechanism.

Wouldn't a simple "chattr +i postgresql.auto.conf" work?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: Fix some ubsan/asan related issues

2024-01-30 Thread Andres Freund
Hi,

On 2024-01-30 09:59:25 -0600, Tristan Partin wrote:
> From 331cec1c9db6ff60dcc6d9ba62a9c8be4e5e95ed Mon Sep 17 00:00:00 2001
> From: Tristan Partin 
> Date: Mon, 29 Jan 2024 18:03:39 -0600
> Subject: [PATCH v1 1/3] Refuse to register message in LogLogicalMessage if
>  NULL

> If this occurs, the memcpy of rdata_data in CopyXLogRecordToWAL breaks
> the API contract of memcpy in glibc. The two pointer arguments are
> marked as nonnull, even in the event the amount to copy is 0 bytes.

It seems pretty odd to call LogLogicalMessage() with a NULL argument. Why is
that something useful?


> From dc9488f3fdee69b981b52c985fb77106d7d301ff Mon Sep 17 00:00:00 2001
> From: Tristan Partin 
> Date: Wed, 24 Jan 2024 17:07:01 -0600
> Subject: [PATCH v1 2/3] meson: Support compiling with -Db_sanitize=address
> 
> The ecpg parser is extremely leaky, so we need to silence leak
> detection.

This does stuff beyond ecpg...


> +if get_option('b_sanitize').contains('address')
> +  cdata.set('USE_ADDRESS_SANITIZER', 1)
> +endif
>  
>  ###
>  # NLS / Gettext
> diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
> index ac409b0006..e18e716d9c 100644
> --- a/src/bin/initdb/initdb.c
> +++ b/src/bin/initdb/initdb.c
> @@ -338,6 +338,17 @@ do { \
>   output_failed = true, output_errno = errno; \
>  } while (0)
>  
> +#ifdef USE_ADDRESS_SANITIZER

When asan is used, __SANITIZE_ADDRESS__ is defined, so we don't need to
implement this ourselves.


> +const char *__asan_default_options(void);
> +
> +const char *__asan_default_options(void)
> +{
> + return "detect_leaks=0";
> +}
> +
> +#endif

Wonder if we should move this into some static library and link it into all
binaries that don't want leak detection? It doesn't seem great to have to
adjust this in a bunch of files if we want to adjust the options...
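As a sketch, that could be as small as one translation unit in such a
library, keyed off the compiler-provided macro rather than a homegrown
one (the file name is hypothetical; the body is taken from the quoted
patch):

/* asan_options.c: linked into binaries that should skip leak detection */
#ifdef __SANITIZE_ADDRESS__

const char *__asan_default_options(void);

const char *
__asan_default_options(void)
{
	return "detect_leaks=0";
}

#endif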

Greetings,

Andres Freund




Re: [PATCH] Add native windows on arm64 support

2024-01-30 Thread Andrew Dunstan


On 2024-01-30 Tu 09:50, Dave Cramer wrote:



On Tue, 30 Jan 2024 at 08:38, Andrew Dunstan  wrote:


On 2024-01-29 Mo 11:20, Dave Cramer wrote:


Dave Cramer
www.postgres.rocks 


On Mon, 29 Jan 2024 at 11:16, Andrew Dunstan
 wrote:


On 2024-01-26 Fr 09:18, Dave Cramer wrote:



On Fri, 26 Jan 2024 at 07:36, Andrew Dunstan
 wrote:


On 2024-01-25 Th 20:32, Michael Paquier wrote:
> On Thu, Jan 25, 2024 at 04:52:30PM -0500, Dave Cramer
wrote:
>> On Thu, 25 Jan 2024 at 16:32, Andrew Dunstan
 wrote:
>>> On 2024-01-25 Th 16:17, Dave Cramer wrote:
>>> Yeah, I think the default Developer Command Prompt
for VS2022 is set up
>>> for x86 builds. AIUI you should start by executing
"vcvarsall x64_arm64".
>> Yup, now I'm in the same state you are
> Wait a minute here.  Based on [1], x64_arm64 means you
can use a x64
> host and you'll be able to produce ARM64 builds, still
these will not
> be able to run on the host where they were built.  How
much of the
> patch posted upthread is required to produce such
builds?  Basically
> everything from it, I guess, so as build dependencies
can be
> satisfied?
>
> [1]:

https://learn.microsoft.com/en-us/cpp/build/building-on-the-command-line?view=msvc-170


If you look at the table here x86 and x64 are the only
supported host
architectures. But that's OK, the x64 binaries will run
on arm64 (W11
ARM64 has x64 emulation builtin). If that didn't work
Dave and I would
not have got as far as we have. But you want the
x64_arm64 argument to
vcvarsall so you will get ARM64 output.


I've rebuilt it using  x64_arm64 and with the attached (very
naive patch) and I still get an x64 binary :(



With this patch I still get a build error, but it's different :-)


[1406/2088] "link" @src/backend/postgres.exe.rsp
FAILED: src/backend/postgres.exe src/backend/postgres.pdb
"link" @src/backend/postgres.exe.rsp
   Creating library src\backend\postgres.exe.lib

storage_lmgr_s_lock.c.obj : error LNK2019: unresolved
external symbol spin_delay referenced in function
perform_spin_delay

src\backend\postgres.exe : fatal error LNK1120: 1 unresolved
externals


Did you add the latest lock.patch ?





I'm a bit confused about exactly what needs to be applied. Can you
supply a complete patch to be applied to a pristine checkout that
will let me build?


cheers


See attached.




No, that is what is giving me the error shown above (just tried again to 
be certain). And it's not surprising, as patch 2 #ifdef's out the 
definition of spin_delay().


If you can get a complete build with these patches then I suspect you're 
not doing a proper ARM64 build.



cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: Possibility to disable `ALTER SYSTEM`

2024-01-30 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jan 30, 2024 at 2:20 PM Tom Lane  wrote:
>> Indeed.  I'd go so far as to say that we should reject not only this
>> proposal, but any future ones that intend to prevent superusers from
>> doing things that superusers normally could do (and, indeed, are
>> normally expected to do).

> Also in my opinion, there is a fair amount of nuance here. On the one
> hand, I and others put a lot of work into making it possible to not
> give people superuser and still be able to do a controlled subset of
> the things that a superuser can do.

Sure, and that is a line of thought that we should continue to pursue.
But we already have enough mechanism to let a non-superuser set only
the ALTER SYSTEM stuff she's authorized to.  There is no reason to
think that a non-superuser could break through that restriction at
all, let alone easily.  So that's an actual security feature, not
security theater.  I don't see how the feature proposed here isn't
security theater, or at least close enough to that.

>> Something like contrib/sepgsql would be a better mechanism, perhaps.

> There's nothing wrong with that exactly, but what does it gain us over
> my proposal of a sentinel file?

I was imagining using selinux and/or sepgsql to directly prevent
writing postgresql.auto.conf from the Postgres account.  Combine that
with a non-Postgres-owned postgresql.conf (already supported) and you
have something that seems actually bulletproof, rather than a hint.
Admittedly, using that approach requires knowing something about a
non-Postgres security mechanism.

regards, tom lane




Re: Possibility to disable `ALTER SYSTEM`

2024-01-30 Thread Robert Haas
On Tue, Jan 30, 2024 at 2:20 PM Tom Lane  wrote:
> Indeed.  I'd go so far as to say that we should reject not only this
> proposal, but any future ones that intend to prevent superusers from
> doing things that superusers normally could do (and, indeed, are
> normally expected to do).  That sort of thing is not part of our
> security model, never has been, and it's simply naive to believe that
> it won't have a boatload of easily-reachable holes in it.  Which we
> *will* get complaints about, if we claim that thus-and-such feature
> prevents it.  So why bother?  Don't give out superuser to people you
> don't trust to not do the things you wish they wouldn't.

In my opinion, we need to have the conversation, whereas you seem to
want to try to shut it down before it starts. If we take that
approach, people are going to get (more) frustrated.

Also in my opinion, there is a fair amount of nuance here. On the one
hand, I and others put a lot of work into making it possible to not
give people superuser and still be able to do a controlled subset of
the things that a superuser can do. For example, thanks to Mark
Dilger's work, you can make somebody not a superuser and still allow
them to set GUCs that can normally be set only by superusers, and you
can choose which GUCs you do and do not want them to be able to set.
And, thanks to my work, you can make someone a CREATEROLE user without
letting them escalate to superuser, and you can then allow them to
manage the users that they create almost exactly as if they were a
superuser, with only the limitations that seem necessary to maintain
system security. It is worth asking - and I would like to hear a real,
non-flip answer - why someone who wants to do what is proposed here
isn't using those mechanisms instead of handing out SUPERUSER and then
complaining that it grants too much power.

On the other hand, I don't see why it isn't legitimate to imagine a
scenario where there is no security boundary between the Kubernetes
administrator and the PostgreSQL DBA, and yet the PostgreSQL DBA
should still be pushed in the direction of doing things in a way that
doesn't break Kubernetes. It surprises me a little bit that Gabriele
and others want to build the system that way, though, because you
might expect that in a typical install the Kubernetes administrator
would want to FORCIBLY PREVENT the PostgreSQL administrator from
messing things up instead of doing what is proposed here, which
amounts to suggesting perhaps the PostgreSQL administrator would be
kind enough not to mess things up. Nonetheless, there's no law against
suggestions. When my wife puts the ground beef that I'm supposed to
use to cook dinner at the top of the freezer and the stuff I'm
supposed to not use at the bottom, nothing prevents me from digging
out the other ground beef and using it, but I don't, because I can
take a hint. And indeed, I benefit from that hint. This seems like it
could be construed as a very similar type of hint.

I don't think we should pretend like one of the two paragraphs above
is valid and the other is hot garbage. That's not solving anything. We
can't resolve the tension between those two things in either direction
by somebody hammering on the side of the argument that they believe to
be correct and ignoring the other one.

> Something like contrib/sepgsql would be a better mechanism, perhaps.

There's nothing wrong with that exactly, but what does it gain us over
my proposal of a sentinel file? I don't see much value in adding a
hook and then a module that uses that hook to return false or
unconditionally ereport.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Fix some ubsan/asan related issues

2024-01-30 Thread Andres Freund
Hi,

On 2024-01-30 22:05:28 +0200, Heikki Linnakangas wrote:
> On 30/01/2024 17:57, Tristan Partin wrote:
> > In my effort to try to see if the test suite would pass with asan
> > enabled, I ran into a max_stack_depth issue. I tried maxing it out
> > (hence, the patch), but that still didn't remedy my issue. I tried to
> > look on the list for any relevant emails, but nothing turned up. Maybe
> > others have not attempted this before? Seems doubtful.
> > 
> > Not entirely sure how to fix this issue. I personally find asan
> > extremely effective, even more than valgrind, so it would be great if
> > I could run Postgres with asan enabled to catch various stupid C issues
> > I might create. On my system, ulimit -a returns 8192 kbytes, so Postgres
> > just lets me set 7680 kbytes as the max. Whatever asan is doing doesn't
> > seem to leave enough stack space for Postgres.
> 
> I'm a bit confused by these. We already run with sanitizer in the cirrus CI.
> What does this enable that we're not already doing?

The reason asan fails is that it uses a "shadow stack" to track stack variable
lifetimes. These confuse our stack depth check. CI doesn't have the issue
because the compiler doesn't yet enable the feature, locally I get around it
by using ASAN_OPTIONS=detect_stack_use_after_return=0:...

The checks are actually quite useful, so making our stack depth check work
with asan would be worthwhile.

I discussed this in a bit more in
https://postgr.es/m/20231129193920.4vphw7dqxjzf5v5b%40awork3.anarazel.de

Greetings,

Andres Freund




Re: Some revises in adding sorting path

2024-01-30 Thread David Rowley
On Wed, 31 Jan 2024 at 00:44, Richard Guo  wrote:
> This patchset does not aim to introduce anything new; it simply
> refactors the existing code.  The newly added tests are used to show
> that the code that is touched here is not redundant, but rather
> essential for generating certain paths.  I remember the tests were added
> per your comment in [3].
>
> [3] 
> https://www.postgresql.org/message-id/CAApHDvo%2BFagxVSGmvt-LUrhLZQ0KViiLvX8dPaG3ZzWLNd-Zpg%40mail.gmail.com

OK.  I've pushed the patch based on it being a simplification of the
partial path generation.

David




Re: Oversight in reparameterize_path_by_child leading to executor crash

2024-01-30 Thread Tom Lane
Richard Guo  writes:
> On Wed, Jan 17, 2024 at 5:01 PM Richard Guo  wrote:
>> Sure, here it is:
>> v10-0001-Avoid-reparameterizing-Paths-when-it-s-not-suitable.patch

> I forgot to mention that this patch applies on v16 not master.

I spent some time looking at this patch (which seems more urgent than
the patch for master, given that we have back-branch releases coming
up).  There are two things I'm not persuaded about:

* Why is it okay to just use pull_varnos on a tablesample expression,
when contain_references_to() does something different?

* Is contain_references_to() doing the right thing by specifying
PVC_RECURSE_PLACEHOLDERS?  That causes it to totally ignore a
PlaceHolderVar's ph_eval_at, and I'm not convinced that's correct.

Ideally it seems to me that we want to reject a PlaceHolderVar
if either its ph_eval_at or ph_lateral overlap the other join
relation; if it was coded that way then we'd not need to recurse
into the PHV's contents.   pull_varnos isn't directly amenable
to this, but I think we could use pull_var_clause with
PVC_INCLUDE_PLACEHOLDERS and then iterate through the resulting
list manually.  (If this patch were meant for HEAD, I'd think
about extending the var.c code to support this usage more directly.
But as things stand, this is a one-off so I think we should just do
what we must in reparameterize_path_by_child.)
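A hedged sketch of that manual iteration (expr and other_relids are
placeholder names for the scan-level expression and the other join
relation's relids):

	List	   *vars = pull_var_clause(expr, PVC_INCLUDE_PLACEHOLDERS);
	ListCell   *lc;

	foreach(lc, vars)
	{
		Node	   *node = (Node *) lfirst(lc);

		if (IsA(node, PlaceHolderVar))
		{
			PlaceHolderInfo *phinfo =
				find_placeholder_info(root, (PlaceHolderVar *) node);

			if (bms_overlap(phinfo->ph_eval_at, other_relids) ||
				bms_overlap(phinfo->ph_lateral, other_relids))
				return false;	/* reject reparameterizing this path */
		}
		else if (bms_is_member(((Var *) node)->varno, other_relids))
			return false;
	}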

BTW, it shouldn't be necessary to write either PVC_RECURSE_AGGREGATES
or PVC_RECURSE_WINDOWFUNCS, because neither kind of node should ever
appear in a scan-level expression.  I'd leave those out so that we
get an error if something unexpected happens.

regards, tom lane




Re: [17] CREATE SUBSCRIPTION ... SERVER

2024-01-30 Thread Jeff Davis
On Tue, 2024-01-30 at 16:17 +0530, Ashutosh Bapat wrote:
> Converting a server and user mapping to
> conninfo should be delegated to the FDW being used since that FDW
> knows best how to use those options.

If I understand you correctly, you mean that there would be a new
optional function associated with an FDW (in addition to the HANDLER
and VALIDATOR) like "CONNECTION", which would be able to return the
conninfo from a server using that FDW. Is that right?

I like the idea -- it further decouples the logic from the core server.
I suspect it will make postgres_fdw the primary way (though not the
only possible way) to use this feature. There would be little need to
create a new builtin FDW to make this work.

To get the subscription invalidation right, we'd need to make the
(reasonable) assumption that the connection information is based only
on the FDW, server, and user mapping. A FDW wouldn't be able to use,
for example, some kind of configuration table or GUC to control how the
connection string gets created. That's easy enough to solve with
documentation.
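For concreteness, such a callback might look something like the sketch
below -- the name, signature, and naive option flattening are all
assumptions on my part (real code would need proper quoting and option
filtering):

static char *
postgres_fdw_get_conninfo(ForeignServer *server, UserMapping *user)
{
	StringInfoData buf;
	ListCell   *lc;

	initStringInfo(&buf);
	foreach(lc, list_concat_copy(server->options, user->options))
	{
		DefElem    *def = (DefElem *) lfirst(lc);

		appendStringInfo(&buf, "%s=%s ", def->defname, defGetString(def));
	}
	return buf.data;
}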

I'll work up a new patch for this.


Regards,
Jeff Davis





Re: Fix some ubsan/asan related issues

2024-01-30 Thread Heikki Linnakangas

On 30/01/2024 17:57, Tristan Partin wrote:

Patch 1:

Passing NULL as a second argument to memcpy breaks ubsan, and there
didn't seem to be anything preventing that in the LogLogicalMessage()
codepath. Here is a preventative measure in LogLogicalMessage() and an
Assert() in CopyXLogRecordToWAL().


For the audience: We ran into this one with the neon extension. The 
solution is to call LogLogicalMessage("", 0, ...) instead of 
LogLogicalMessage(NULL, 0, ...). But it's true that it's pointless for 
LogLogicalMessage to call XLogRegisterData() if the message is empty. If 
we do this, we should check for 'size == 0' rather than 'message == NULL'.
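That is, something like this inside LogLogicalMessage (a sketch based on
the existing registration code; the xlrec setup is unchanged and elided):

	XLogBeginInsert();
	XLogRegisterData((char *) &xlrec, SizeOfLogicalMessage);
	XLogRegisterData(unconstify(char *, prefix), xlrec.prefix_size);
	if (size > 0)		/* don't hand memcpy a NULL source for zero bytes */
		XLogRegisterData(unconstify(char *, message), size);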



Patch 2:

Support building with -Db_sanitize=address in Meson. Various executables
are leaky which can cause the builds to fail and tests to fail, when we
are fine leaking this memory.

Personally, I am a big stickler for always freeing my memory in
executables even if it gets released at program exit because it makes
the leak sanitizer much more effective. However (!), I am not here to
change the philosophy of memory management in one-off executables, so
I have just silenced memory leaks in various executables for now.

Patch 3:

THIS DOESN'T WORK. DO NOT COMMIT. PROPOSING FOR DISCUSSION!

In my effort to try to see if the test suite would pass with asan
enabled, I ran into a max_stack_depth issue. I tried maxing it out
(hence, the patch), but that still didn't remedy my issue. I tried to
look on the list for any relevant emails, but nothing turned up. Maybe
others have not attempted this before? Seems doubtful.

Not entirely sure how to fix this issue. I personally find asan
extremely effective, even more than valgrind, so it would be great if
I could run Postgres with asan enabled to catch various stupid C issues
I might create. On my system, ulimit -a returns 8192 kbytes, so Postgres
just lets me set 7680 kbytes as the max. Whatever asan is doing doesn't
seem to leave enough stack space for Postgres.


I'm a bit confused by these. We already run with sanitizer in the cirrus 
CI. What does this enable that we're not already doing?


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Schema variables - new implementation for Postgres 15

2024-01-30 Thread Pavel Stehule
út 30. 1. 2024 v 20:15 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
napsal:

> Yep, in this constellation the implementation holds much better (in
> terms of memory) in my create/let/drop testing.
>
> I've marked the CF item as ready for committer, but a note for anyone
> who would like to pick up it from here -- we're talking about first 5
> patches here, up to the memory cleaning after DROP VARIABLE. It doesn't
> mean the rest is somehow not worth it, but I believe it's a good first
> step.
>

Thank you very much

Pavel


Re: Possibility to disable `ALTER SYSTEM`

2024-01-30 Thread Tom Lane
Robert Haas  writes:
> I have to admit that I'm a little afraid that people will mistake this
> for an actual security feature and file bug reports or CVEs about the
> superuser being able to circumvent these restrictions. If we add this,
> we had better make sure that the documentation is extremely clear
> about what we are guaranteeing, or more to the point about what we are
> not guaranteeing.

> I understand that there's some frustration on the part of Gabriele and
> others that this proposal hasn't been enthusiastically adopted, but I
> would ask for a little bit of forbearance because those are also, by
> and large, not the people who will not have to cope with it when we
> start getting security researchers threatening to publish our evilness
> in the Register. Such conversations are no fun at all.

Indeed.  I'd go so far as to say that we should reject not only this
proposal, but any future ones that intend to prevent superusers from
doing things that superusers normally could do (and, indeed, are
normally expected to do).  That sort of thing is not part of our
security model, never has been, and it's simply naive to believe that
it won't have a boatload of easily-reachable holes in it.  Which we
*will* get complaints about, if we claim that thus-and-such feature
prevents it.  So why bother?  Don't give out superuser to people you
don't trust to not do the things you wish they wouldn't.

> I also think that using the GUC system to manage itself is a little
> bit suspect.

Something like contrib/sepgsql would be a better mechanism, perhaps.

regards, tom lane




Re: Schema variables - new implementation for Postgres 15

2024-01-30 Thread Dmitry Dolgov
Yep, in this constellation the implementation holds much better (in
terms of memory) in my create/let/drop testing.

I've marked the CF item as ready for committer, but a note for anyone
who would like to pick up it from here -- we're talking about first 5
patches here, up to the memory cleaning after DROP VARIABLE. It doesn't
mean the rest is somehow not worth it, but I believe it's a good first
step.




Re: Add LSN <-> time conversion functionality

2024-01-30 Thread Melanie Plageman
On Wed, Dec 27, 2023 at 5:16 PM Melanie Plageman
 wrote:
>
> Elsewhere [1] I required a way to estimate the time corresponding to a
> particular LSN in the past. I devised the attached LSNTimeline, a data
> structure mapping LSNs <-> timestamps with decreasing precision for
> older time, LSN pairs. This can be used to locate and translate a
> particular time to LSN or vice versa using linear interpolation.

Attached is a new version which fixes one overflow danger I noticed in
the original patch set.

I have also been doing some thinking about the LSNTimeline data
structure. Its array elements are combined before all elements have
been used. This sacrifices precision earlier than required. I tried
some alternative structures that would use the whole array. There are
a lot of options, though. Currently each element fits twice as many
members as the preceding element. To use the whole array, we'd have to
change the behavior from filling each element to its max capacity to
something that filled elements only partially. I'm not sure what the
best distribution would be.
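For readers skimming the attached (truncated) patch: once the timeline
yields two bracketing LSNTime points, the estimate itself is plain
linear interpolation, roughly like this sketch (not the patch's exact
code; bounds and overflow handling omitted):

static XLogRecPtr
interpolate_lsn(const LSNTime *p1, const LSNTime *p2, TimestampTz target)
{
	double		fraction;

	if (p2->time == p1->time)	/* degenerate interval */
		return p1->lsn;
	fraction = (double) (target - p1->time) / (double) (p2->time - p1->time);
	return p1->lsn + (XLogRecPtr) (fraction * (double) (p2->lsn - p1->lsn));
}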

> I've added an instance of the LSNTimeline to PgStat_WalStats and insert
> new values to it in background writer's main loop. This patch set also
> introduces some new pageinspect functions exposing LSN <-> time
> translations.

I was thinking that maybe it is silly to have the functions allowing
for translation between LSN and time in the pageinspect extension --
since they are not specifically related to pages (pages are just an
object that has an accessible LSN). I was thinking perhaps we add them
as system information functions. However, the closest related
functions I can think of are those to get the current LSN (like
pg_current_wal_lsn()). And those are listed as system administration
functions under backup control [1]. I don't think the LSN <-> time
functionality fits under backup control.

If I did put them in one of the system information function sections
[2], which one would work best?

- Melanie

[1] 
https://www.postgresql.org/docs/devel/functions-admin.html#FUNCTIONS-ADMIN-BACKUP
[2] https://www.postgresql.org/docs/devel/functions-info.html#FUNCTIONS-INFO
From 8590125d66ce366b35251e5aff14db1a858edda9 Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Wed, 27 Dec 2023 16:40:27 -0500
Subject: [PATCH v2 2/5] Add LSNTimeline for converting LSN <-> time

Add a new structure, LSNTimeline, consisting of LSNTimes -- each an LSN,
time pair. Each LSNTime can represent multiple logical LSN, time pairs,
referred to as members. LSN <-> time conversions can be done using
linear interpolation with two LSNTimes on the LSNTimeline.

This commit does not add a global instance of LSNTimeline. It adds the
structures and functions needed to maintain and access such a timeline.
---
 src/backend/utils/activity/pgstat_wal.c | 199 
 src/include/pgstat.h|  34 
 src/tools/pgindent/typedefs.list|   2 +
 3 files changed, 235 insertions(+)

diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c
index 1a3c0e1a669..e8d9660f82e 100644
--- a/src/backend/utils/activity/pgstat_wal.c
+++ b/src/backend/utils/activity/pgstat_wal.c
@@ -17,8 +17,11 @@
 
 #include "postgres.h"
 
+#include "access/xlog.h"
 #include "utils/pgstat_internal.h"
 #include "executor/instrument.h"
+#include "utils/builtins.h"
+#include "utils/timestamp.h"
 
 
 PgStat_PendingWalStats PendingWalStats = {0};
@@ -32,6 +35,12 @@ PgStat_PendingWalStats PendingWalStats = {0};
 static WalUsage prevWalUsage;
 
 
+static void lsntime_absorb(LSNTime *a, const LSNTime *b);
+void lsntime_insert(LSNTimeline *timeline, TimestampTz time, XLogRecPtr lsn);
+
+XLogRecPtr estimate_lsn_at_time(const LSNTimeline *timeline, TimestampTz time);
+TimestampTz estimate_time_at_lsn(const LSNTimeline *timeline, XLogRecPtr lsn);
+
 /*
  * Calculate how much WAL usage counters have increased and update
  * shared WAL and IO statistics.
@@ -184,3 +193,193 @@ pgstat_wal_snapshot_cb(void)
 		   sizeof(pgStatLocal.snapshot.wal));
	LWLockRelease(&stats_shmem->lock);
 }
+
+/*
+ * Set *a to be the earlier of *a or *b.
+ */
+static void
+lsntime_absorb(LSNTime *a, const LSNTime *b)
+{
+	LSNTime		result;
+	uint64		new_members = a->members + b->members;
+
+	if (a->time < b->time)
+		result = *a;
+	else if (b->time < a->time)
+		result = *b;
+	else if (a->lsn < b->lsn)
+		result = *a;
+	else if (b->lsn < a->lsn)
+		result = *b;
+	else
+		result = *a;
+
+	*a = result;
+	a->members = new_members;
+}
+
+/*
+ * Insert a new LSNTime into the LSNTimeline in the first element with spare
+ * capacity.
+ */
+void
+lsntime_insert(LSNTimeline *timeline, TimestampTz time,
+			   XLogRecPtr lsn)
+{
+	LSNTime		temp;
+	LSNTime		carry = {.lsn = lsn,.time = time,.members = 1};
+
+	for (int i = 0; i < timeline->length; i++)
+	{
+		bool		full;
		LSNTime    *cur = &timeline->data[i];
+
+		/*
+		 * An array element's capacity to represent members is 2 ^ its
+		 * 

Re: Flushing large data immediately in pqcomm

2024-01-30 Thread Robert Haas
On Tue, Jan 30, 2024 at 12:58 PM Melih Mutlu  wrote:
> Sounds like it's difficult to come up with a heuristic that would work well 
> enough for most cases.
> One thing with sending data instead of copying it if the buffer is empty is 
> that initially the buffer is empty. I believe it will stay empty forever if 
> we do not copy anything when the buffer is empty. We can maybe simply set the 
> threshold to the buffer size/2 (4kB) and hope that will work better. Or copy 
> the data only if it fits into the remaining space in the buffer. What do you 
> think?
>
> An additional note while I mentioned pq_putmessage_noblock(), I've been 
> testing sending input data immediately in pq_putmessage_noblock() without 
> blocking and copy the data into PqSendBuffer only if the socket would block 
> and cannot send it. Unfortunately, I don't have strong numbers to demonstrate 
> any improvement in perf or timing yet. But I still like to know what would 
> you think about it?

I think this is an area where it's very difficult to foresee on
theoretical grounds what will be right in practice. The problem is
that the best algorithm probably depends on what usage patterns are
common in practice. I think one common usage pattern will be a bunch
of roughly equal-sized messages in a row, like CopyData or DataRow
messages -- but those messages won't have a consistent width. It would
probably be worth testing what behavior you see in such cases -- start
with say a stream of 100 byte messages and then gradually increase and
see how the behavior evolves.
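A throwaway libpq probe along these lines could collect that data (a
sketch: it assumes an already-established connection and skips error
handling):

#include <stdio.h>
#include <time.h>
#include <libpq-fe.h>

static void
probe(PGconn *conn, int width, int nrows)
{
	char		sql[128];
	struct timespec t0, t1;
	PGresult   *res;

	/* stream nrows DataRow messages of roughly 'width' bytes each */
	snprintf(sql, sizeof(sql),
			 "SELECT repeat('x', %d) FROM generate_series(1, %d)",
			 width, nrows);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	res = PQexec(conn, sql);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	printf("width %7d: %.3fs\n", width,
		   (double) (t1.tv_sec - t0.tv_sec) +
		   (double) (t1.tv_nsec - t0.tv_nsec) / 1e9);
	PQclear(res);
}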

But you can also have other patterns, with messages of different sizes
interleaved. In the case of FE-to-BE traffic, the extended query
protocol might be a good example of that: the Parse message could be
quite long, or not, but the Bind Describe Execute Sync messages that
follow are probably all short. That case doesn't arise in this
direction, but I can't think exactly of what cases that do. It seems
like someone would need to play around and try some different cases
and maybe log the sizes of the secure_write() calls with various
algorithms, and then try to figure out what's best. For example, if
the alternating short-write, long-write behavior that Heikki mentioned
is happening, and I do think that particular thing is a very real
risk, then you haven't got it figured out yet...

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Bytea PL/Perl transform

2024-01-30 Thread Pavel Stehule
Hi

út 30. 1. 2024 v 18:35 odesílatel Pavel Stehule 
napsal:

>
>
> út 30. 1. 2024 v 18:26 odesílatel Dagfinn Ilmari Mannsåker <
> ilm...@ilmari.org> napsal:
>
>> Pavel Stehule  writes:
>>
>> > út 30. 1. 2024 v 17:46 odesílatel Dagfinn Ilmari Mannsåker <
>> > ilm...@ilmari.org> napsal:
>> >
>> >> Pavel Stehule  writes:
>> >>
>> >> > út 30. 1. 2024 v 17:18 odesílatel Dagfinn Ilmari Mannsåker <
>> >> > ilm...@ilmari.org> napsal:
>> >> >
>> >> >> Pavel Stehule  writes:
>> >> >>
>> >> >> > út 30. 1. 2024 v 16:43 odesílatel Dagfinn Ilmari Mannsåker <
>> >> >> > ilm...@ilmari.org> napsal:
>> >> >> >
>> >> >> >> Pavel Stehule  writes:
>> >> >> >>
>> >> >> >> > I inserted perl reference support - hstore_plperl and
>> json_plperl
>> >> does
>> >> >> >> it.
>> >> >> >> >
>> >> >> >> > +	/* Dereference references recursively. */
>> >> >> >> > +	while (SvROK(in))
>> >> >> >> > +		in = SvRV(in);
>> >> >> >>
>> >> >> >> That code in hstore_plperl and json_plperl is only relevant
>> because
>> >> they
>> >> >> >> deal with non-scalar values (hashes for hstore, and also arrays
>> for
>> >> >> >> json) which must be passed as references.  The recursive nature
>> of
>> >> the
>> >> >> >> dereferencing is questionable, and masked the bug fixed by commit
>> >> >> >> 1731e3741cbbf8e0b4481665d7d523bc55117f63.
>> >> >> >>
>> >> >> >> bytea_plperl only deals with scalars (specifically strings), so
>> >> should
>> >> >> >> not concern itself with references.  In fact, this code breaks
>> >> returning
>> >> >> >> objects with overloaded stringification, for example:
>> >> >> >>
>> >> >> >> CREATE FUNCTION plperlu_overload() RETURNS bytea LANGUAGE plperlu
>> >> >> >>   TRANSFORM FOR TYPE bytea
>> >> >> >>   AS $$
>> >> >> >> package StringOverload { use overload '""' => sub { "stuff"
>> }; }
>> >> >> >> return bless {}, "StringOverload";
>> >> >> >>   $$;
>> >> >> >>
>> >> >> >> This makes the server crash with an assertion failure from Perl
>> >> because
>> >> >> >> SvPVbyte() was passed a non-scalar value:
>> >> >> >>
>> >> >> >> postgres: ilmari regression_bytea_plperl [local] SELECT:
>> sv.c:2865:
>> >> >> >> Perl_sv_2pv_flags:
>> >> >> >> Assertion `SvTYPE(sv) != SVt_PVAV && SvTYPE(sv) != SVt_PVHV &&
>> >> >> SvTYPE(sv)
>> >> >> >> != SVt_PVFM' failed.
>> >> >> >>
>> >> >> >> If I remove the dereferencing loop it succeeds:
>> >> >> >>
>> >> >> >> SELECT encode(plperlu_overload(), 'escape') AS string;
>> >> >> >>  string
>> >> >> >> --------
>> >> >> >>  stuff
>> >> >> >> (1 row)
>> >> >> >>
>> >> >> >> Attached is a v2 patch which removes the dereferencing and
>> includes
>> >> the
>> >> >> >> above example as a test.
>> >> >> >>
>> >> >> >
>> >> >> > But without dereference it returns bad value.
>> >> >>
>> >> >> Where exactly does it return a bad value?  The existing tests pass,
>> and
>> >> >> the one I included shows that it does the right thing in that case
>> too.
>> >> >> If you pass it an unblessed reference it returns the stringified
>> version
>> >> >> of that, as expected.
>> >> >>
>> >> >
>> >> > ugly test code
>> >> >
>> >> > (2024-01-30 13:44:28) postgres=# CREATE or replace FUNCTION
>> >> > perl_inverse_bytes(bytea) RETURNS bytea
>> >> > TRANSFORM FOR TYPE bytea
>> >> > AS $$ my $bytes =  pack 'H*', '0123'; my $ref = \$bytes;
>> >>
>> >> You are returning a reference, not a string.
>> >>
>> >
>> > I know, but for this case, should not be raised an error?
>>
>> I don't think so, as I explained in my previous reply:
>>
>> > There's no reason to ban references, that would break every Perl
>> > programmer's expectations.
>>
>> To elaborate on this: when a function is defined to return a string
(which bytea effectively is, as far as Perl is concerned), I as a Perl
>> programmer would expect PL/Perl to just stringify whatever value I
>> returned, according to the usual Perl rules.
>>
>
> ok
>
> Pavel
>
>
>>
>> I also said:
>>
>> > If we really want to be strict, we should at least allow references to
>> > objects that overload stringification, as they are explicitly designed
>> > to be well-behaved as strings.  But that would be a lot of extra code
>> > for very little benefit over just letting Perl stringify everything.
>>
>
>> By "a lot of code", I mean everything `string_amg`-related in the
>> amagic_applies() function
>> (https://github.com/Perl/perl5/blob/v5.38.0/gv.c#L3401-L3545).  We can't
>> just call it: it's only available since Perl 5.38 (released last year),
>> and we support Perl versions all the way back to 5.14.
>>
>> - ilmari
>>
>
I marked this patch as ready for committer.

It is almost trivial; make check-world and make doc passed

Regards

Pavel


Re: UUID v7

2024-01-30 Thread Sergey Prokhorenko
typo:
being carried to time step

should be: being carried to timestamp

Sergey Prokhorenko sergeyprokhore...@yahoo.com.au 

On Tuesday, 30 January 2024 at 04:35:45 pm GMT+3, Andrey M. Borodin 
 wrote:  
 
 

> On 30 Jan 2024, at 15:33, Junwang Zhao  wrote:
> 
> It's always good to add a newline at the end of a source file, though
> this might be nitpicky.

Thanks, also fixed a warning found by CFBot.


Best regards, Andrey Borodin.
  

Re: Flushing large data immediately in pqcomm

2024-01-30 Thread Melih Mutlu
Hi Robert,

On Mon, Jan 29, 2024 at 20:48, Robert Haas wrote:

> > If there's already some data in PqSendBuffer, I wonder if it would be
> > better to fill it up with data, flush it, and then send the rest of the
> > data directly. Instead of flushing the partial data first. I'm afraid
> > that you'll make a tiny call to secure_write(), followed by a large one,
> > then a tiny one again, and so forth. Especially when socket_putmessage
> > itself writes the msgtype and len, which are tiny, before the payload.
> >
> > Perhaps we should invent a new pq_putmessage() function that would take
> > an input buffer with 5 bytes of space reserved before the payload.
> > pq_putmessage() could then fill in the msgtype and len bytes in the
> > input buffer and send that directly. (Not wedded to that particular API,
> > but something that would have the same effect)
>
> I share the concern; I'm not sure about the best solution. I wonder if
> it would be useful to have pq_putmessagev() in the style of writev()
> et al. Or maybe what we need is secure_writev().
>

I thought about using writev() for not only pq_putmessage() but
pq_putmessage_noblock() too. Currently, pq_putmessage_noblock()
repallocs PqSendBuffer
and copies the input buffer, which can easily be larger than 8kB, into
PqSendBuffer. I
also discussed it with Thomas off-list. The thing is that I believe we
would need secure_writev() with SSL/GSS cases handled properly. I'm just
not sure if the effort would be worthwhile considering what we gain from it.


> I also wonder if the threshold for sending data directly should be
> smaller than the buffer size, and/or whether it should depend on the
> buffer being empty.


You might be right. I'm not sure what the ideal threshold would be.


> If we have an 8kB buffer that currently has
> nothing in it, and somebody writes 2kB, I suspect it might be wrong to
> copy that into the buffer. If the same buffer had 5kB used and 3kB
> free, copying sounds a lot more likely to work out. The goal here is
> probably to condense sequences of short messages into a single
> transmission while sending long messages individually. I'm just not
> quite sure what heuristic would do that most effectively.
>

Sounds like it's difficult to come up with a heuristic that would work well
enough for most cases.
One catch with sending data directly whenever the buffer is empty is
that the buffer starts out empty, so it would stay empty forever if we
never copy anything into it while it is empty. We can maybe simply set
the threshold to the buffer size/2 (4kB) and hope that will work better. Or
copy the data only if it fits into the remaining space in the buffer. What
do you think?
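
To make the two options concrete, the decision could be captured in one
place like this (a sketch; the function name is made up, but PqSendBuffer,
PqSendPointer and PqSendBufferSize are the existing statics in pqcomm.c):

static bool
should_copy_to_send_buffer(size_t len)
{
	size_t		avail = (size_t) (PqSendBufferSize - PqSendPointer);

	if (len <= avail)
		return true;			/* fits in the remaining space: copy */
	if (len >= (size_t) PqSendBufferSize / 2)
		return false;			/* "large" message: send it directly */
	return true;				/* small but doesn't fit: flush, then copy */
}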


An additional note, since I mentioned pq_putmessage_noblock(): I've been
testing sending input data immediately in pq_putmessage_noblock() without
blocking, copying the data into PqSendBuffer only if the socket would block
and cannot send it. Unfortunately, I don't have strong numbers to
demonstrate any improvement in perf or timing yet. But I'd still like to
know what you think about it.

Thanks,
-- 
Melih Mutlu
Microsoft


Re: Possibility to disable `ALTER SYSTEM`

2024-01-30 Thread Robert Haas
On Tue, Sep 12, 2023 at 10:39 AM Martín Marqués
 wrote:
> The outcome looked for is that the system GUCs that require a restart
> or reload are not modified unless it's through some orchestration or
> someone with physical access to the configuration files (yeah, we
> still have the COPY PROGRAM).

If I understand this correctly, you're saying it's not a security
vulnerability if someone finds a way to use COPY PROGRAM or some other
mechanism to bypass the ALTER SYSTEM restriction, because the point of
the constraint isn't to make it impossible for the superuser to modify
the configuration in a way that they shouldn't, but rather to make it
inconvenient for them to do so.

I have to admit that I'm a little afraid that people will mistake this
for an actual security feature and file bug reports or CVEs about the
superuser being able to circumvent these restrictions. If we add this,
we had better make sure that the documentation is extremely clear
about what we are guaranteeing, or more to the point about what we are
not guaranteeing.

I understand that there's some frustration on the part of Gabriele and
others that this proposal hasn't been enthusiastically adopted, but I
would ask for a little bit of forbearance because those are also, by
and large, not the people who will not have to cope with it when we
start getting security researchers threatening to publish our evilness
in the Register. Such conversations are no fun at all. Explaining that
we're not actually evil doesn't tend to work, because the security
researchers are just as convinced that they are right as anyone
arguing for this feature is. Statements like "we don't actually intend
to guarantee X" tend to fall on deaf ears.

In fact, I would go so far as to argue that many of our security
problems (and non-problems) are widely misunderstood even within our
own community, and that far from being something anyone should dismiss
as pedantry, it's actually a critical issue for the project to solve
and something we really need to address in order to be able to move
forward. From that point of view, this feature seems bound to make an
already-annoying problem worse. I don't necessarily expect the people
who are in favor of this feature to accept that as a reason not to do
this, but I do hope to be taken seriously when I say there's a real
issue there. Something can be a serious problem even if it's not YOUR
problem, and in this case, that apparently goes both ways.

I also think that using the GUC system to manage itself is a little
bit suspect. I wonder if it would be better to do this some other way,
e.g. a sentinel file in the data directory. For example, suppose we
refuse ALTER SYSTEM if $PGDATA/disable_alter_system exists, or
something like that. It seems like it would be very easy for an
external management solution (k8s or whatever) to drop that file in
place if desired, and then it would be crystal clear that there's no
way of bypassing the restriction from within the GUC system itself
(though you could still bypass it via filesystem access).
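
A minimal sketch of that check, assuming it sat at the top of
AlterSystemSetConfigFile() in guc.c (the file name and error wording are
hypothetical; the relative path works because the server's working
directory is the data directory):

	struct stat st;

	if (stat("disable_alter_system", &st) == 0)
		ereport(ERROR,
				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
				 errmsg("ALTER SYSTEM is disabled"),
				 errhint("Remove \"disable_alter_system\" from the data directory to re-enable it.")));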

I agree with those who have said that this shouldn't disable
postgresql.auto.conf, but only the ability of ALTER SYSTEM to modify
it. Right now, third-party tools external to the server can count on
being able to add things to postgresql.auto.conf with the reasonable
expectations that they'll take effect. I'd rather not break that
property.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Flushing large data immediately in pqcomm

2024-01-30 Thread Melih Mutlu
Hi Heikki,

On Mon, Jan 29, 2024 at 19:12, Heikki Linnakangas wrote:

> > Proposed change modifies socket_putmessage to send any data larger than
> > 8K immediately without copying it into the send buffer. Assuming that
> > the send buffer would be flushed anyway due to reaching its limit, the
> > patch just gets rid of the copy part which seems unnecessary and sends
> > data without waiting.
>
> If there's already some data in PqSendBuffer, I wonder if it would be
> better to fill it up with data, flush it, and then send the rest of the
> data directly. Instead of flushing the partial data first. I'm afraid
> that you'll make a tiny call to secure_write(), followed by a large one,
> then a tiny one again, and so forth. Especially when socket_putmessage
> itself writes the msgtype and len, which are tiny, before the payload.
>

I agree that I could do better there, without flushing twice for both
PqSendBuffer and the input data. PqSendBuffer always has some data, even
if it's tiny, since the msgtype and len are added first.


> Perhaps we should invent a new pq_putmessage() function that would take
> an input buffer with 5 bytes of space reserved before the payload.
> pq_putmessage() could then fill in the msgtype and len bytes in the
> input buffer and send that directly. (Not wedded to that particular API,
> but something that would have the same effect)
>

I thought about doing this. The reason why I didn't was because I think
that such a change would require adjusting all input buffers wherever
pq_putmessage is called, and I did not want to touch that many different
places. These places where we need pq_putmessage might not be that many
though, I'm not sure.
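
For the record, one possible shape for the reserved-header API (names are
hypothetical, including the helper; the point is just that the caller
leaves 5 bytes -- 1 for the msgtype, 4 for the length word -- in front of
the payload so the whole message can go out in a single write):

int
pq_putmessage_reserved(char msgtype, char *buf, size_t payload_len)
{
	uint32		n32 = pg_hton32((uint32) (payload_len + 4));

	buf[0] = msgtype;			/* caller reserved buf[0..4] for the header */
	memcpy(buf + 1, &n32, 4);
	/* hypothetical helper: write header + payload with one call */
	return internal_send_direct(buf, payload_len + 5);
}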


>
> > This change affects places where pq_putmessage is used such as
> > pg_basebackup, COPY TO, walsender etc.
> >
> > I did some experiments to see how the patch performs.
> > Firstly, I loaded ~5GB data into a table [1], then ran "COPY test TO
> > STDOUT". Here are perf results of both the patch and HEAD > ...
> > The patch brings a ~5% gain in socket_putmessage.
> >
> > [1]
> > CREATE TABLE test(id int, name text, time TIMESTAMP);
> > INSERT INTO test (id, name, time) SELECT i AS id, repeat('dummy', 100)
> > AS name, NOW() AS time FROM generate_series(1, 1) AS i;
>
> I'm surprised by these results, because each row in that table is < 600
> bytes. PqSendBufferSize is 8kB, so the optimization shouldn't kick in in
> that test. Am I missing something?
>

You're absolutely right. I made a silly mistake there. I also think that
the way I did perf analysis does not make much sense, even if one row of
the table is greater than 8kB.
Here are some quick timing results after being sure that it triggers this
patch's optimization. I need to think more on how to profile this with
perf. I hope to share proper results soon.

I just added a bit more zeros [1] and ran [2] (hopefully measured the
correct thing)

HEAD:
real    2m48,938s
user    0m9,226s
sys     1m35,342s

Patch:
real    2m40,690s
user    0m8,492s
sys     1m31,001s

[1]
 INSERT INTO test (id, name, time) SELECT i AS id, repeat('dummy', 1)
AS name, NOW() AS time FROM generate_series(1, 100) AS i;

[2]
 rm /tmp/dummy && echo 3 | sudo tee /proc/sys/vm/drop_caches && time psql
-d postgres -c "COPY test TO STDOUT;" > /tmp/dummy

Thanks,
-- 
Melih Mutlu
Microsoft


Re: Bytea PL/Perl transform

2024-01-30 Thread Pavel Stehule
On Tue, Jan 30, 2024 at 18:26, Dagfinn Ilmari Mannsåker <
ilm...@ilmari.org> wrote:

> Pavel Stehule  writes:
>
> > On Tue, Jan 30, 2024 at 17:46, Dagfinn Ilmari Mannsåker <
> > ilm...@ilmari.org> wrote:
> >
> >> Pavel Stehule  writes:
> >>
> >> > On Tue, Jan 30, 2024 at 17:18, Dagfinn Ilmari Mannsåker <
> >> > ilm...@ilmari.org> wrote:
> >> >
> >> >> Pavel Stehule  writes:
> >> >>
> >> >> > On Tue, Jan 30, 2024 at 16:43, Dagfinn Ilmari Mannsåker <
> >> >> > ilm...@ilmari.org> wrote:
> >> >> >
> >> >> >> Pavel Stehule  writes:
> >> >> >>
> >> >> >> > I inserted perl reference support - hstore_plperl and
> json_plperl
> >> does
> >> >> >> it.
> >> >> >> >
> >> >> >> > +<->/* Dereference references recursively. */
> >> >> >> > +<->while (SvROK(in))
> >> >> >> > +<-><-->in = SvRV(in);
> >> >> >>
> >> >> >> That code in hstore_plperl and json_plperl is only relevant
> because
> >> they
> >> >> >> deal with non-scalar values (hashes for hstore, and also arrays
> for
> >> >> >> json) which must be passed as references.  The recursive nature of
> >> the
> >> >> >> dereferencing is questionable, and masked the bug fixed by commit
> >> >> >> 1731e3741cbbf8e0b4481665d7d523bc55117f63.
> >> >> >>
> >> >> >> bytea_plperl only deals with scalars (specifically strings), so
> >> should
> >> >> >> not concern itself with references.  In fact, this code breaks
> >> returning
> >> >> >> objects with overloaded stringification, for example:
> >> >> >>
> >> >> >> CREATE FUNCTION plperlu_overload() RETURNS bytea LANGUAGE plperlu
> >> >> >>   TRANSFORM FOR TYPE bytea
> >> >> >>   AS $$
> >> >> >> package StringOverload { use overload '""' => sub { "stuff"
> }; }
> >> >> >> return bless {}, "StringOverload";
> >> >> >>   $$;
> >> >> >>
> >> >> >> This makes the server crash with an assertion failure from Perl
> >> because
> >> >> >> SvPVbyte() was passed a non-scalar value:
> >> >> >>
> >> >> >> postgres: ilmari regression_bytea_plperl [local] SELECT:
> sv.c:2865:
> >> >> >> Perl_sv_2pv_flags:
> >> >> >> Assertion `SvTYPE(sv) != SVt_PVAV && SvTYPE(sv) != SVt_PVHV &&
> >> >> SvTYPE(sv)
> >> >> >> != SVt_PVFM' failed.
> >> >> >>
> >> >> >> If I remove the dereferencing loop it succeeds:
> >> >> >>
> >> >> >> SELECT encode(plperlu_overload(), 'escape') AS string;
> >> >> >>  string
> >> >> >> --------
> >> >> >>  stuff
> >> >> >> (1 row)
> >> >> >>
> >> >> >> Attached is a v2 patch which removes the dereferencing and
> includes
> >> the
> >> >> >> above example as a test.
> >> >> >>
> >> >> >
> >> >> > But without dereference it returns bad value.
> >> >>
> >> >> Where exactly does it return a bad value?  The existing tests pass,
> and
> >> >> the one I included shows that it does the right thing in that case
> too.
> >> >> If you pass it an unblessed reference it returns the stringified
> version
> >> >> of that, as expected.
> >> >>
> >> >
> >> > ugly test code
> >> >
> >> > (2024-01-30 13:44:28) postgres=# CREATE or replace FUNCTION
> >> > perl_inverse_bytes(bytea) RETURNS bytea
> >> > TRANSFORM FOR TYPE bytea
> >> > AS $$ my $bytes =  pack 'H*', '0123'; my $ref = \$bytes;
> >>
> >> You are returning a reference, not a string.
> >>
> >
> > I know, but for this case, should not be raised an error?
>
> I don't think so, as I explained in my previous reply:
>
> > There's no reason to ban references, that would break every Perl
> > programmer's expectations.
>
> To elaborate on this: when a function is defined to return a string
> (which bytea effectively is, as far as Perl is concerned), I as a Perl
> programmer would expect PL/Perl to just stringify whatever value I
> returned, according to the usual Perl rules.
>

ok

Pavel


>
> I also said:
>
> > If we really want to be strict, we should at least allow references to
> > objects that overload stringification, as they are explicitly designed
> > to be well-behaved as strings.  But that would be a lot of extra code
> > for very little benefit over just letting Perl stringify everything.
>

> By "a lot of code", I mean everything `string_amg`-related in the
> amagic_applies() function
> (https://github.com/Perl/perl5/blob/v5.38.0/gv.c#L3401-L3545).  We can't
> just call it: it's only available since Perl 5.38 (released last year),
> and we support Perl versions all the way back to 5.14.
>
> - ilmari
>


Re: Improve WALRead() to suck data directly from WAL buffers when possible

2024-01-30 Thread Alvaro Herrera
Hmm, this looks quite nice and simple.  My only comment is that a
sequence like this

   /* Read from WAL buffers, if available. */
   rbytes = XLogReadFromBuffers(&output_message.data[output_message.len],
startptr, nbytes, xlogreader->seg.ws_tli);
   output_message.len += rbytes;
   startptr += rbytes;
   nbytes -= rbytes;

   if (!WALRead(xlogreader,
&output_message.data[output_message.len],
startptr,

leaves you wondering if WALRead() should be called at all or not, in the
case when all bytes were read by XLogReadFromBuffers.  I think in many
cases what's going to happen is that nbytes is going to be zero, and
then WALRead is going to return having done nothing in its inner loop.
I think this warrants a comment somewhere.  Alternatively, we could
short-circuit the 'if' expression so that WALRead() is not called in
that case (but I'm not sure it's worth the loss of code clarity).
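
For concreteness, the short-circuited version would look something like
this (a sketch against the snippet above, keeping walsender.c's existing
WALRead() error handling):

	if (nbytes > 0 &&
		!WALRead(xlogreader,
				 &output_message.data[output_message.len],
				 startptr, nbytes, xlogreader->seg.ws_tli,
				 &errinfo))
		WALReadRaiseError(&errinfo);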

Also, but this is really quite minor, it seems sad to add more functions
with the prefix XLog, when we have renamed things to use the prefix WAL,
and we have kept the old names only to avoid backpatchability issues.
I mean, if we have WALRead() already, wouldn't it make perfect sense to
name the new routine WALReadFromBuffers?

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Tiene valor aquel que admite que es un cobarde" (Fernandel)




Re: Bytea PL/Perl transform

2024-01-30 Thread Dagfinn Ilmari Mannsåker
Pavel Stehule  writes:

> On Tue, Jan 30, 2024 at 17:46, Dagfinn Ilmari Mannsåker <
> ilm...@ilmari.org> wrote:
>
>> Pavel Stehule  writes:
>>
>> > On Tue, Jan 30, 2024 at 17:18, Dagfinn Ilmari Mannsåker <
>> > ilm...@ilmari.org> wrote:
>> >
>> >> Pavel Stehule  writes:
>> >>
>> >> > On Tue, Jan 30, 2024 at 16:43, Dagfinn Ilmari Mannsåker <
>> >> > ilm...@ilmari.org> wrote:
>> >> >
>> >> >> Pavel Stehule  writes:
>> >> >>
>> >> >> > I inserted perl reference support - hstore_plperl and json_plperl
>> does
>> >> >> it.
>> >> >> >
>> >> >> > +<->/* Dereference references recursively. */
>> >> >> > +<->while (SvROK(in))
>> >> >> > +<-><-->in = SvRV(in);
>> >> >>
>> >> >> That code in hstore_plperl and json_plperl is only relevant because
>> they
>> >> >> deal with non-scalar values (hashes for hstore, and also arrays for
>> >> >> json) which must be passed as references.  The recursive nature of
>> the
>> >> >> dereferencing is questionable, and masked the bug fixed by commit
>> >> >> 1731e3741cbbf8e0b4481665d7d523bc55117f63.
>> >> >>
>> >> >> bytea_plperl only deals with scalars (specifically strings), so
>> should
>> >> >> not concern itself with references.  In fact, this code breaks
>> returning
>> >> >> objects with overloaded stringification, for example:
>> >> >>
>> >> >> CREATE FUNCTION plperlu_overload() RETURNS bytea LANGUAGE plperlu
>> >> >>   TRANSFORM FOR TYPE bytea
>> >> >>   AS $$
>> >> >> package StringOverload { use overload '""' => sub { "stuff" }; }
>> >> >> return bless {}, "StringOverload";
>> >> >>   $$;
>> >> >>
>> >> >> This makes the server crash with an assertion failure from Perl
>> because
>> >> >> SvPVbyte() was passed a non-scalar value:
>> >> >>
>> >> >> postgres: ilmari regression_bytea_plperl [local] SELECT: sv.c:2865:
>> >> >> Perl_sv_2pv_flags:
>> >> >> Assertion `SvTYPE(sv) != SVt_PVAV && SvTYPE(sv) != SVt_PVHV &&
>> >> SvTYPE(sv)
>> >> >> != SVt_PVFM' failed.
>> >> >>
>> >> >> If I remove the dereferencing loop it succeeds:
>> >> >>
>> >> >> SELECT encode(plperlu_overload(), 'escape') AS string;
>> >> >>  string
>> >> >> --------
>> >> >>  stuff
>> >> >> (1 row)
>> >> >>
>> >> >> Attached is a v2 patch which removes the dereferencing and includes
>> the
>> >> >> above example as a test.
>> >> >>
>> >> >
>> >> > But without dereference it returns bad value.
>> >>
>> >> Where exactly does it return a bad value?  The existing tests pass, and
>> >> the one I included shows that it does the right thing in that case too.
>> >> If you pass it an unblessed reference it returns the stringified version
>> >> of that, as expected.
>> >>
>> >
>> > ugly test code
>> >
>> > (2024-01-30 13:44:28) postgres=# CREATE or replace FUNCTION
>> > perl_inverse_bytes(bytea) RETURNS bytea
>> > TRANSFORM FOR TYPE bytea
>> > AS $$ my $bytes =  pack 'H*', '0123'; my $ref = \$bytes;
>>
>> You are returning a reference, not a string.
>>
>
> I know, but for this case, should not be raised an error?

I don't think so, as I explained in my previous reply:

> There's no reason to ban references, that would break every Perl
> programmer's expectations.

To elaborate on this: when a function is defined to return a string
(which bytea effectively is, as far as Perl is concerned), I as a Perl
programmer would expect PL/Perl to just stringify whatever value I
returned, according to the usual Perl rules.

I also said:

> If we really want to be strict, we should at least allow references to
> objects that overload stringification, as they are explicitly designed
> to be well-behaved as strings.  But that would be a lot of extra code
> for very little benefit over just letting Perl stringify everything.

By "a lot of code", I mean everything `string_amg`-related in the
amagic_applies() function
(https://github.com/Perl/perl5/blob/v5.38.0/gv.c#L3401-L3545).  We can't
just call it: it's only available since Perl 5.38 (released last year),
and we support Perl versions all the way back to 5.14.

- ilmari




Re: separating use of SerialSLRULock

2024-01-30 Thread Alvaro Herrera
On 2024-Jan-29, Alvaro Herrera wrote:

> It's terrifying that SerialAdd() doesn't seem to be covered by any
> tests, though.

I realized that there's some coverage when compiling with
TEST_SUMMARIZE_SERIAL, so I tried that and it looks OK.

One other change I made was in the comment that explains the locking
order.  I had put the new lock at the top, but when I tested adding some
asserts to verify that the other locks are not held, they turned out to
fire soon enough ... and the conflicting lock is the last one of that
list.  So I added the new lock below it, and the SLRU lock further down,
because SerialAdd does it that way.

I pushed it now.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Possibility to disable `ALTER SYSTEM`

2024-01-30 Thread Gabriele Bartolini
Hi,

I am sending an updated patch, and submitting this to the next commit fest,
as I still believe this could be very useful.

Thanks,
Gabriele

On Thu, 7 Sept 2023 at 21:51, Gabriele Bartolini <
gabriele.bartol...@enterprisedb.com> wrote:

> Hi everyone,
>
> I would like to propose a patch that allows administrators to disable
> `ALTER SYSTEM` via either a run-time option to pass to the Postgres server
> process at startup (e.g. `--disable-alter-system=true`, false by default)
> or a new GUC (or even both), without changing the current default method of
> the server.
>
> The main reason is that this would help improve the “security by default”
> posture of Postgres in a Kubernetes/Cloud Native environment - and, in
> general, in any environment on VMs/bare metal behind a configuration
> management system in which changes should only be made in a declarative way
> and versioned like Ansible Tower, to cite one.
>
> Below you find some background information and the longer story behind
> this proposal.
>
> Sticking to the Kubernetes use case, I am primarily speaking on behalf of
> the CloudNativePG open source operator (cloudnative-pg.io, of which I am
> one of the maintainers). However, I am sure that this option could benefit
> any operator for Postgres - an operator is the most common and recommended
> way to run a complex application like a PostgreSQL database management
> system inside Kubernetes.
>
> In this case, the state of a PostgreSQL cluster (for example its number of
> replicas, configuration, storage, etc.) is defined in a Custom Resource
> Definition in the form of configuration, typically YAML, and the operator
> works with Kubernetes to ensure that, at any moment, the requested Postgres
> cluster matches the observed one. This is a very basic example in
> CloudNativePG:
> https://cloudnative-pg.io/documentation/current/samples/cluster-example.yaml
>
> As I was mentioning above, in a Cloud Native environment it is expected
> that workloads are secure by default. Without going into much detail, many
> decisions have been made in that direction by operators for Postgres,
> including CloudNativePG. The goal of this proposal is to provide a way to
> ensure that changes to the PostgreSQL configuration in a Kubernetes
> controlled Postgres cluster are allowed only through the Kubernetes API.
>
> Basically, if you want to change an option for PostgreSQL, you need to
> change the desired state in the Kubernetes resource, then Kubernetes will
> converge (through the operator). In simple words, it’s like empowering the
> operator to impersonate the PostgreSQL superuser.
>
> However, given that we cannot force this use case, there could be roles
> with the login+superuser privileges connecting to the PostgreSQL instance
> and potentially “interfering” with the requested state defined in the
> configuration by imperatively running “ALTER SYSTEM” commands.
>
> For example: CloudNativePG has a fixed value for some GUCs in order to
> manage a full HA cluster, including SSL, log, some WAL and replication
> settings. While the operator eventually reconciles those settings, even the
> temporary change of those settings in a cluster might be harmful. Think for
> example of a user that, through `ALTER SYSTEM`, tries to change WAL level
> to minimal, or change the setting of the log (we require CSV), potentially
> creating issues to the underlying instance and cluster (potentially leaving
> it in an unrecoverable state in the case of other more invasive GUCS).
>
> At the moment, a possible workaround is that `ALTER SYSTEM` can be blocked
> by making the postgresql.auto.conf read only, but the returned message is
> misleading and that’s certainly bad user experience (which is very
> important in a cloud native environment):
>
> ```
> postgres=# ALTER SYSTEM SET wal_level TO minimal;
> ERROR:  could not open file "postgresql.auto.conf": Permission denied
> ```
>
> For this reason, I would like to propose the option to be given to the
> postgres process at startup, in order to be as less invasive as possible
> (the operator could then start Postgres requesting `ALTER SYSTEM` to be
> disabled). That’d be my preference at the moment, if possible.
>
> Alternatively, or in addition, the introduction of a GUC to disable `ALTER
> SYSTEM` altogether. This enables tuning this setting through configuration
> at the Kubernetes level, only if the operators require it - without
> damaging the rest of the users.
>
> Before I start writing any lines of code and propose a patch, I would like
> first to understand if there’s room for it.
>
> Thanks for your attention and … looking forward to your feedback!
>
> Ciao,
> Gabriele
> --
> Gabriele Bartolini
> Vice President, Cloud Native at EDB
> enterprisedb.com
>


-- 
Gabriele Bartolini
Vice President, Cloud Native at EDB
enterprisedb.com


0001-Add-enable_alter_system-GUC.patch
Description: Binary data
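
For context, the core of such a patch presumably boils down to registering
a boolean GUC along these lines (a sketch of a ConfigureNamesBool entry
for guc_tables.c; the context, group, and wording here are guesses, not
necessarily what the attached patch does):

static bool enable_alter_system = true;

	{
		{"enable_alter_system", PGC_POSTMASTER, COMPAT_OPTIONS_PREVIOUS,
			gettext_noop("Allows ALTER SYSTEM to be used."),
			NULL
		},
		&enable_alter_system,
		true,
		NULL, NULL, NULL
	},

plus an ereport(ERROR) near the top of AlterSystemSetConfigFile() when the
flag is off.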


Re: Parallelize correlated subqueries that execute within each worker

2024-01-30 Thread Robert Haas
On Tue, Jan 30, 2024 at 11:17 AM Akshat Jaimini  wrote:
> I think we should move this patch to the next CF as I believe that work is 
> still ongoing to resolve the last reported bug.

We shouldn't just keep pushing this forward to the next CF. It's been
idle since July. If it needs more work, mark it RwF and it can be
reopened when there's something for a reviewer to do.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: cleanup patches for incremental backup

2024-01-30 Thread Robert Haas
On Tue, Jan 30, 2024 at 10:51 AM Robert Haas  wrote:
> I think the solution here is to find a better way to wait for the
> inserts to be summarized, one that actually does wait for that to
> happen.

Here's a patch for that. I now think
a7097ca630a25dd2248229f21ebce4968d85d10a was actually misguided, and
served only to mask some of the failures caused by waiting for the WAL
summary file.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v1-0001-Revise-pg_walsummary-s-002_blocks-test-to-avoid-s.patch
Description: Binary data


Re: Bytea PL/Perl transform

2024-01-30 Thread Pavel Stehule
On Tue, Jan 30, 2024 at 17:46, Dagfinn Ilmari Mannsåker <
ilm...@ilmari.org> wrote:

> Pavel Stehule  writes:
>
> > On Tue, Jan 30, 2024 at 17:18, Dagfinn Ilmari Mannsåker <
> > ilm...@ilmari.org> wrote:
> >
> >> Pavel Stehule  writes:
> >>
> >> > On Tue, Jan 30, 2024 at 16:43, Dagfinn Ilmari Mannsåker <
> >> > ilm...@ilmari.org> wrote:
> >> >
> >> >> Pavel Stehule  writes:
> >> >>
> >> >> > I inserted perl reference support - hstore_plperl and json_plperl
> does
> >> >> it.
> >> >> >
> >> >> > +<->/* Dereference references recursively. */
> >> >> > +<->while (SvROK(in))
> >> >> > +<-><-->in = SvRV(in);
> >> >>
> >> >> That code in hstore_plperl and json_plperl is only relevant because
> they
> >> >> deal with non-scalar values (hashes for hstore, and also arrays for
> >> >> json) which must be passed as references.  The recursive nature of
> the
> >> >> dereferencing is questionable, and masked the bug fixed by commit
> >> >> 1731e3741cbbf8e0b4481665d7d523bc55117f63.
> >> >>
> >> >> bytea_plperl only deals with scalars (specifically strings), so
> should
> >> >> not concern itself with references.  In fact, this code breaks
> returning
> >> >> objects with overloaded stringification, for example:
> >> >>
> >> >> CREATE FUNCTION plperlu_overload() RETURNS bytea LANGUAGE plperlu
> >> >>   TRANSFORM FOR TYPE bytea
> >> >>   AS $$
> >> >> package StringOverload { use overload '""' => sub { "stuff" }; }
> >> >> return bless {}, "StringOverload";
> >> >>   $$;
> >> >>
> >> >> This makes the server crash with an assertion failure from Perl
> because
> >> >> SvPVbyte() was passed a non-scalar value:
> >> >>
> >> >> postgres: ilmari regression_bytea_plperl [local] SELECT: sv.c:2865:
> >> >> Perl_sv_2pv_flags:
> >> >> Assertion `SvTYPE(sv) != SVt_PVAV && SvTYPE(sv) != SVt_PVHV &&
> >> SvTYPE(sv)
> >> >> != SVt_PVFM' failed.
> >> >>
> >> >> If I remove the dereferencing loop it succeeds:
> >> >>
> >> >> SELECT encode(plperlu_overload(), 'escape') AS string;
> >> >>  string
> >> >> --------
> >> >>  stuff
> >> >> (1 row)
> >> >>
> >> >> Attached is a v2 patch which removes the dereferencing and includes
> the
> >> >> above example as a test.
> >> >>
> >> >
> >> > But without dereference it returns bad value.
> >>
> >> Where exactly does it return a bad value?  The existing tests pass, and
> >> the one I included shows that it does the right thing in that case too.
> >> If you pass it an unblessed reference it returns the stringified version
> >> of that, as expected.
> >>
> >
> > ugly test code
> >
> > (2024-01-30 13:44:28) postgres=# CREATE or replace FUNCTION
> > perl_inverse_bytes(bytea) RETURNS bytea
> > TRANSFORM FOR TYPE bytea
> > AS $$ my $bytes =  pack 'H*', '0123'; my $ref = \$bytes;
>
> You are returning a reference, not a string.
>

I know, but for this case, should not be raised an error?


>
> > return $ref;
> > $$ LANGUAGE plperlu;
> > CREATE FUNCTION
> > (2024-01-30 13:44:33) postgres=# select perl_inverse_bytes(''), ' '::bytea;
> > ┌──────────────────────────────────────┬───────┐
> > │          perl_inverse_bytes          │ bytea │
> > ╞══════════════════════════════════════╪═══════╡
> > │ \x5343414c41522830783130656134333829 │ \x20  │
> > └──────────────────────────────────────┴───────┘
> > (1 row)
>
> ~=# select encode('\x5343414c41522830783130656134333829', 'escape');
> ┌───────────────────┐
> │      encode       │
> ├───────────────────┤
> │ SCALAR(0x10ea438) │
> └───────────────────┘
>
> This is how Perl stringifies references in the absence of overloading.
> Return the byte string directly from your function and it will do the
> right thing:
>
> CREATE FUNCTION plperlu_bytes() RETURNS bytea LANGUAGE plperlu
>  TRANSFORM FOR TYPE bytea
>  AS $$ return pack  'H*', '0123'; $$;
>
> SELECT plperlu_bytes();
>  plperlu_bytes
> ---
>  \x0123
> (1 row)
>
>
> - ilmari
>


Re: Bytea PL/Perl transform

2024-01-30 Thread Dagfinn Ilmari Mannsåker
Pavel Stehule  writes:

> On Tue, Jan 30, 2024 at 17:18, Dagfinn Ilmari Mannsåker <
> ilm...@ilmari.org> wrote:
>
>> Pavel Stehule  writes:
>>
>> > On Tue, Jan 30, 2024 at 16:43, Dagfinn Ilmari Mannsåker <
>> > ilm...@ilmari.org> wrote:
>> >
>> >> Pavel Stehule  writes:
>> >>
>> >> > I inserted perl reference support - hstore_plperl and json_plperl does
>> >> it.
>> >> >
>> >> > +<->/* Dereference references recursively. */
>> >> > +<->while (SvROK(in))
>> >> > +<-><-->in = SvRV(in);
>> >>
>> >> That code in hstore_plperl and json_plperl is only relevant because they
>> >> deal with non-scalar values (hashes for hstore, and also arrays for
>> >> json) which must be passed as references.  The recursive nature of the
>> >> dereferencing is questionable, and masked the bug fixed by commit
>> >> 1731e3741cbbf8e0b4481665d7d523bc55117f63.
>> >>
>> >> bytea_plperl only deals with scalars (specifically strings), so should
>> >> not concern itself with references.  In fact, this code breaks returning
>> >> objects with overloaded stringification, for example:
>> >>
>> >> CREATE FUNCTION plperlu_overload() RETURNS bytea LANGUAGE plperlu
>> >>   TRANSFORM FOR TYPE bytea
>> >>   AS $$
>> >> package StringOverload { use overload '""' => sub { "stuff" }; }
>> >> return bless {}, "StringOverload";
>> >>   $$;
>> >>
>> >> This makes the server crash with an assertion failure from Perl because
>> >> SvPVbyte() was passed a non-scalar value:
>> >>
>> >> postgres: ilmari regression_bytea_plperl [local] SELECT: sv.c:2865:
>> >> Perl_sv_2pv_flags:
>> >> Assertion `SvTYPE(sv) != SVt_PVAV && SvTYPE(sv) != SVt_PVHV &&
>> SvTYPE(sv)
>> >> != SVt_PVFM' failed.
>> >>
>> >> If I remove the dereferencing loop it succeeds:
>> >>
>> >> SELECT encode(plperlu_overload(), 'escape') AS string;
>> >>  string
>> >> --------
>> >>  stuff
>> >> (1 row)
>> >>
>> >> Attached is a v2 patch which removes the dereferencing and includes the
>> >> above example as a test.
>> >>
>> >
>> > But without dereference it returns bad value.
>>
>> Where exactly does it return a bad value?  The existing tests pass, and
>> the one I included shows that it does the right thing in that case too.
>> If you pass it an unblessed reference it returns the stringified version
>> of that, as expected.
>>
>
> ugly test code
>
> (2024-01-30 13:44:28) postgres=# CREATE or replace FUNCTION
> perl_inverse_bytes(bytea) RETURNS bytea
> TRANSFORM FOR TYPE bytea
> AS $$ my $bytes =  pack 'H*', '0123'; my $ref = \$bytes;

You are returning a reference, not a string.

> return $ref;
> $$ LANGUAGE plperlu;
> CREATE FUNCTION
> (2024-01-30 13:44:33) postgres=# select perl_inverse_bytes(''), ' '::bytea;
> ┌──────────────────────────────────────┬───────┐
> │          perl_inverse_bytes          │ bytea │
> ╞══════════════════════════════════════╪═══════╡
> │ \x5343414c41522830783130656134333829 │ \x20  │
> └──────────────────────────────────────┴───────┘
> (1 row)

~=# select encode('\x5343414c41522830783130656134333829', 'escape');
┌───────────────────┐
│      encode       │
├───────────────────┤
│ SCALAR(0x10ea438) │
└───────────────────┘

This is how Perl stringifies references in the absence of overloading.
Return the byte string directly from your function and it will do the
right thing:

CREATE FUNCTION plperlu_bytes() RETURNS bytea LANGUAGE plperlu
 TRANSFORM FOR TYPE bytea
 AS $$ return pack  'H*', '0123'; $$;

SELECT plperlu_bytes();
 plperlu_bytes
---
 \x0123
(1 row)


- ilmari




Re: Extend pgbench partitioning to pgbench_history

2024-01-30 Thread Gabriele Bartolini
Hi Abhijit,

Thanks for your input. Please accept my updated patch.

Ciao,
Gabriele

On Tue, 16 Jan 2024 at 12:53, Abhijit Menon-Sen  wrote:

> At 2023-11-30 11:29:15 +0100, gabriele.bartol...@enterprisedb.com wrote:
> >
> > With the attached patch, I extend the partitioning capability to the
> > pgbench_history table too.
> >
> > I have been thinking of adding an option to control this, but I preferred
> > to ask in this list whether it really makes sense or not (I struggle
> indeed
> > to see use cases where accounts is partitioned and history is not).
>
> I don't have a strong opinion about this, but I also can't think of a
> reason to want to create partitions for pgbench_accounts but leave out
> pgbench_history.
>
> > From ba8f507b126a9c5bd22dd40bb8ce0c1f0c43ac59 Mon Sep 17 00:00:00 2001
> > From: Gabriele Bartolini 
> > Date: Thu, 30 Nov 2023 11:02:39 +0100
> > Subject: [PATCH] Include pgbench_history in partitioning method for
> pgbench
> >
> > In case partitioning, make sure that pgbench_history is also partitioned
> with
> > the same criteria.
>
> I think "If partitioning" or "If we're creating partitions" would read
> better here. Also, same criteria as what? Maybe you could just add "as
> pgbench_accounts" to the end.
>
> > diff --git a/doc/src/sgml/ref/pgbench.sgml
> b/doc/src/sgml/ref/pgbench.sgml
> > index 05d3f81619..4c02d2a61d 100644
> > --- a/doc/src/sgml/ref/pgbench.sgml
> > +++ b/doc/src/sgml/ref/pgbench.sgml
> > […]
> > @@ -378,9 +378,9 @@ pgbench 
> options  d
> >
> --partitions=NUM
> >
> > 
> > -Create a partitioned pgbench_accounts table
> with
> > -NUM partitions of nearly equal size
> for
> > -the scaled number of accounts.
> > +Create partitioned pgbench_accounts and
> pgbench_history
> > +tables with NUM partitions of nearly
> equal size for
> > +the scaled number of accounts - and future history records.
> >  Default is 0, meaning no partitioning.
> > 
>
> I would just leave out the "-" and write "number of accounts and future
> history records".
>
> > diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
> > index 2e1650d0ad..87adaf4d8f 100644
> > --- a/src/bin/pgbench/pgbench.c
> > +++ b/src/bin/pgbench/pgbench.c
> > […]
> > @@ -889,8 +889,10 @@ usage(void)
> >  "  --index-tablespace=TABLESPACE\n"
> >  "   create indexes in the
> specified tablespace\n"
> >  "  --partition-method=(range|hash)\n"
> > -"   partition pgbench_accounts
> with this method (default: range)\n"
> > -"  --partitions=NUM partition pgbench_accounts
> into NUM parts (default: 0)\n"
> > +"   partition pgbench_accounts
> and pgbench_history with this method"
> > +"   (default: range)."
> > +"  --partitions=NUM partition pgbench_accounts
> and pgbench_history into NUM parts"
> > +"   (default: 0)\n"
> >  "  --tablespace=TABLESPACE  create tables in the
> specified tablespace\n"
> >  "  --unlogged-tablescreate tables as unlogged
> tables\n"
> >  "\nOptions to select what to run:\n"
>
> There's a missing newline after "(default: range).".
>
> I read through the rest of the patch closely. It looks fine to me. It
> applies, builds, and does create the partitions as intended.
>
> -- Abhijit
>


-- 
Gabriele Bartolini
Vice President, Cloud Native at EDB
enterprisedb.com


v2-0001-Include-pgbench_history-in-partitioning-method-fo.patch
Description: Binary data


Re: Bytea PL/Perl transform

2024-01-30 Thread Pavel Stehule
On Tue, Jan 30, 2024 at 17:18, Dagfinn Ilmari Mannsåker <
ilm...@ilmari.org> wrote:

> Pavel Stehule  writes:
>
> > On Tue, Jan 30, 2024 at 16:43, Dagfinn Ilmari Mannsåker <
> > ilm...@ilmari.org> wrote:
> >
> >> Pavel Stehule  writes:
> >>
> >> > I inserted perl reference support - hstore_plperl and json_plperl does
> >> it.
> >> >
> >> > +<->/* Dereference references recursively. */
> >> > +<->while (SvROK(in))
> >> > +<-><-->in = SvRV(in);
> >>
> >> That code in hstore_plperl and json_plperl is only relevant because they
> >> deal with non-scalar values (hashes for hstore, and also arrays for
> >> json) which must be passed as references.  The recursive nature of the
> >> dereferencing is questionable, and masked the bug fixed by commit
> >> 1731e3741cbbf8e0b4481665d7d523bc55117f63.
> >>
> >> bytea_plperl only deals with scalars (specifically strings), so should
> >> not concern itself with references.  In fact, this code breaks returning
> >> objects with overloaded stringification, for example:
> >>
> >> CREATE FUNCTION plperlu_overload() RETURNS bytea LANGUAGE plperlu
> >>   TRANSFORM FOR TYPE bytea
> >>   AS $$
> >> package StringOverload { use overload '""' => sub { "stuff" }; }
> >> return bless {}, "StringOverload";
> >>   $$;
> >>
> >> This makes the server crash with an assertion failure from Perl because
> >> SvPVbyte() was passed a non-scalar value:
> >>
> >> postgres: ilmari regression_bytea_plperl [local] SELECT: sv.c:2865:
> >> Perl_sv_2pv_flags:
> >> Assertion `SvTYPE(sv) != SVt_PVAV && SvTYPE(sv) != SVt_PVHV &&
> SvTYPE(sv)
> >> != SVt_PVFM' failed.
> >>
> >> If I remove the dereferencing loop it succeeds:
> >>
> >> SELECT encode(plperlu_overload(), 'escape') AS string;
> >>  string
> >> --------
> >>  stuff
> >> (1 row)
> >>
> >> Attached is a v2 patch which removes the dereferencing and includes the
> >> above example as a test.
> >>
> >
> > But without dereference it returns bad value.
>
> Where exactly does it return a bad value?  The existing tests pass, and
> the one I included shows that it does the right thing in that case too.
> If you pass it an unblessed reference it returns the stringified version
> of that, as expected.
>

ugly test code

(2024-01-30 13:44:28) postgres=# CREATE or replace FUNCTION
perl_inverse_bytes(bytea) RETURNS bytea
TRANSFORM FOR TYPE bytea
AS $$ my $bytes =  pack 'H*', '0123'; my $ref = \$bytes;
return $ref;
$$ LANGUAGE plperlu;
CREATE FUNCTION
(2024-01-30 13:44:33) postgres=# select perl_inverse_bytes(''), ' '::bytea;
┌──────────────────────────────────────┬───────┐
│          perl_inverse_bytes          │ bytea │
╞══════════════════════════════════════╪═══════╡
│ \x5343414c41522830783130656134333829 │ \x20  │
└──────────────────────────────────────┴───────┘
(1 row)

expected

(2024-01-30 13:46:58) postgres=# select perl_inverse_bytes(''), ' '::bytea;
┌────────────────────┬───────┐
│ perl_inverse_bytes │ bytea │
╞════════════════════╪═══════╡
│ \x0123             │ \x20  │
└────────────────────┴───────┘
(1 row)


>
> CREATE FUNCTION plperl_reference() RETURNS bytea LANGUAGE plperl
>  TRANSFORM FOR TYPE bytea
>  AS $$ return []; $$;
>
> SELECT encode(plperl_reference(), 'escape') string;
>         string
> -----------------------
>  ARRAY(0x559a3109f0a8)
> (1 row)
>
> This would also crash if the dereferencing loop was left in place.
>
> > Maybe there should be a check so references cannot be returned? Probably
> is
> > not safe pass pointers between Perl and Postgres.
>
> There's no reason to ban references, that would break every Perl
> programmer's expectations.  And there are no pointers being passed,
> SvPVbyte() returns the stringified form of whatever's passed in, which
> is well-behaved for both blessed and unblessed references.
>
> If we really want to be strict, we should at least allow references to
> objects that overload stringification, as they are explicitly designed
> to be well-behaved as strings.  But that would be a lot of extra code
> for very little benefit over just letting Perl stringify everything.
>
> - ilmari
>


Re: Bytea PL/Perl transform

2024-01-30 Thread Dagfinn Ilmari Mannsåker
Pavel Stehule  writes:

> On Tue, Jan 30, 2024 at 16:43, Dagfinn Ilmari Mannsåker <
> ilm...@ilmari.org> wrote:
>
>> Pavel Stehule  writes:
>>
>> > I inserted perl reference support - hstore_plperl and json_plperl does
>> it.
>> >
>> > +<->/* Dereference references recursively. */
>> > +<->while (SvROK(in))
>> > +<-><-->in = SvRV(in);
>>
>> That code in hstore_plperl and json_plperl is only relevant because they
>> deal with non-scalar values (hashes for hstore, and also arrays for
>> json) which must be passed as references.  The recursive nature of the
>> dereferencing is questionable, and masked the bug fixed by commit
>> 1731e3741cbbf8e0b4481665d7d523bc55117f63.
>>
>> bytea_plperl only deals with scalars (specifically strings), so should
>> not concern itself with references.  In fact, this code breaks returning
>> objects with overloaded stringification, for example:
>>
>> CREATE FUNCTION plperlu_overload() RETURNS bytea LANGUAGE plperlu
>>   TRANSFORM FOR TYPE bytea
>>   AS $$
>> package StringOverload { use overload '""' => sub { "stuff" }; }
>> return bless {}, "StringOverload";
>>   $$;
>>
>> This makes the server crash with an assertion failure from Perl because
>> SvPVbyte() was passed a non-scalar value:
>>
>> postgres: ilmari regression_bytea_plperl [local] SELECT: sv.c:2865:
>> Perl_sv_2pv_flags:
>> Assertion `SvTYPE(sv) != SVt_PVAV && SvTYPE(sv) != SVt_PVHV && SvTYPE(sv)
>> != SVt_PVFM' failed.
>>
>> If I remove the dereferencing loop it succeeds:
>>
>> SELECT encode(plperlu_overload(), 'escape') AS string;
>>  string
>> --------
>>  stuff
>> (1 row)
>>
>> Attached is a v2 patch which removes the dereferencing and includes the
>> above example as a test.
>>
>
> But without dereference it returns bad value.

Where exactly does it return a bad value?  The existing tests pass, and
the one I included shows that it does the right thing in that case too.
If you pass it an unblessed reference it returns the stringified version
of that, as expected.

CREATE FUNCTION plperl_reference() RETURNS bytea LANGUAGE plperl
 TRANSFORM FOR TYPE bytea
 AS $$ return []; $$;

SELECT encode(plperl_reference(), 'escape') string;
        string
-----------------------
 ARRAY(0x559a3109f0a8)
(1 row)

This would also crash if the dereferencing loop was left in place.

> Maybe there should be a check so references cannot be returned? Probably is
> not safe pass pointers between Perl and Postgres.

There's no reason to ban references, that would break every Perl
programmer's expectations.  And there are no pointers being passed,
SvPVbyte() returns the stringified form of whatever's passed in, which
is well-behaved for both blessed and unblessed references.
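
In other words, the transform's output path boils down to letting
SvPVbyte() do the stringification, with no dereferencing loop in front of
it (a sketch with illustrative names, not the patch verbatim):

static bytea *
sv_to_bytea(SV *in)
{
	char	   *str;
	STRLEN		len;
	bytea	   *result;

	str = SvPVbyte(in, len);	/* stringifies refs, honours '""' overloads */
	result = (bytea *) palloc(len + VARHDRSZ);
	SET_VARSIZE(result, len + VARHDRSZ);
	memcpy(VARDATA(result), str, len);
	return result;
}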

If we really want to be strict, we should at least allow references to
objects that overload stringification, as they are explicitly designed
to be well-behaved as strings.  But that would be a lot of extra code
for very little benefit over just letting Perl stringify everything.

- ilmari




Re: Parallelize correlated subqueries that execute within each worker

2024-01-30 Thread Akshat Jaimini
I think we should move this patch to the next CF as I believe that work is 
still ongoing to resolve the last reported bug.

Re: Bytea PL/Perl transform

2024-01-30 Thread Pavel Stehule
On Tue, Jan 30, 2024 at 16:43, Dagfinn Ilmari Mannsåker <
ilm...@ilmari.org> wrote:

> Pavel Stehule  writes:
>
> > I inserted perl reference support - hstore_plperl and json_plperl does
> it.
> >
> > +<->/* Dereference references recursively. */
> > +<->while (SvROK(in))
> > +<-><-->in = SvRV(in);
>
> That code in hstore_plperl and json_plperl is only relevant because they
> deal with non-scalar values (hashes for hstore, and also arrays for
> json) which must be passed as references.  The recursive nature of the
> dereferencing is questionable, and masked the bug fixed by commit
> 1731e3741cbbf8e0b4481665d7d523bc55117f63.
>
> bytea_plperl only deals with scalars (specifically strings), so should
> not concern itself with references.  In fact, this code breaks returning
> objects with overloaded stringification, for example:
>
> CREATE FUNCTION plperlu_overload() RETURNS bytea LANGUAGE plperlu
>   TRANSFORM FOR TYPE bytea
>   AS $$
> package StringOverload { use overload '""' => sub { "stuff" }; }
> return bless {}, "StringOverload";
>   $$;
>
> This makes the server crash with an assertion failure from Perl because
> SvPVbyte() was passed a non-scalar value:
>
> postgres: ilmari regression_bytea_plperl [local] SELECT: sv.c:2865:
> Perl_sv_2pv_flags:
> Assertion `SvTYPE(sv) != SVt_PVAV && SvTYPE(sv) != SVt_PVHV && SvTYPE(sv)
> != SVt_PVFM' failed.
>
> If I remove the dereferencing loop it succeeds:
>
> SELECT encode(plperlu_overload(), 'escape') AS string;
>  string
> --------
>  stuff
> (1 row)
>
> Attached is a v2 patch which removes the dereferencing and includes the
> above example as a test.
>

But without dereference it returns bad value.

Maybe there should be a check so references cannot be returned? Probably is
not safe pass pointers between Perl and Postgres.



>
> - ilmari
>
>


Re: Fix some ubsan/asan related issues

2024-01-30 Thread Tristan Partin
Spent so much time writing out the email, I once again forgot the
attachments... UGH.


--
Tristan Partin
Neon (https://neon.tech)
From 331cec1c9db6ff60dcc6d9ba62a9c8be4e5e95ed Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Mon, 29 Jan 2024 18:03:39 -0600
Subject: [PATCH v1 1/3] Refuse to register message in LogLogicalMessage if
 NULL

If this occurs, the memcpy of rdata_data in CopyXLogRecordToWAL breaks
the API contract of memcpy in glibc. The two pointer arguments are
marked as nonnull, even in the event the amount to copy is 0 bytes.
---
 src/backend/access/transam/xlog.c | 1 +
 src/backend/replication/logical/message.c | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 478377c4a2..929888beb5 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1288,6 +1288,7 @@ CopyXLogRecordToWAL(int write_len, bool isLogSwitch, XLogRecData *rdata,
 		}
 
 		Assert(CurrPos % XLOG_BLCKSZ >= SizeOfXLogShortPHD || rdata_len == 0);
+		Assert(rdata_data != NULL);
 		memcpy(currpos, rdata_data, rdata_len);
 		currpos += rdata_len;
 		CurrPos += rdata_len;
diff --git a/src/backend/replication/logical/message.c b/src/backend/replication/logical/message.c
index 2ac34e7781..126c57ef6e 100644
--- a/src/backend/replication/logical/message.c
+++ b/src/backend/replication/logical/message.c
@@ -67,7 +67,8 @@ LogLogicalMessage(const char *prefix, const char *message, size_t size,
 	XLogBeginInsert();
	XLogRegisterData((char *) &xlrec, SizeOfLogicalMessage);
 	XLogRegisterData(unconstify(char *, prefix), xlrec.prefix_size);
-	XLogRegisterData(unconstify(char *, message), size);
+	if (message)
+		XLogRegisterData(unconstify(char *, message), size);
 
 	/* allow origin filtering */
 	XLogSetRecordFlags(XLOG_INCLUDE_ORIGIN);
-- 
Tristan Partin
Neon (https://neon.tech)

From dc9488f3fdee69b981b52c985fb77106d7d301ff Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Wed, 24 Jan 2024 17:07:01 -0600
Subject: [PATCH v1 2/3] meson: Support compiling with -Db_sanitize=address

The ecpg parser is extremely leaky, so we need to silence leak
detection.
---
 meson.build|  3 +++
 src/bin/initdb/initdb.c| 11 +++
 src/bin/pg_config/pg_config.c  | 10 ++
 src/bin/pg_resetwal/pg_resetwal.c  | 10 ++
 src/include/pg_config.h.in |  5 +
 src/interfaces/ecpg/preproc/ecpg.c | 11 +++
 6 files changed, 50 insertions(+)

diff --git a/meson.build b/meson.build
index 8ed51b6aae..d8c524d6f6 100644
--- a/meson.build
+++ b/meson.build
@@ -2530,6 +2530,9 @@ cdata.set_quoted('PG_VERSION_STR',
   )
 )
 
+if get_option('b_sanitize').contains('address')
+  cdata.set('USE_ADDRESS_SANITIZER', 1)
+endif
 
 ###
 # NLS / Gettext
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index ac409b0006..e18e716d9c 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -338,6 +338,17 @@ do { \
 		output_failed = true, output_errno = errno; \
 } while (0)
 
+#ifdef USE_ADDRESS_SANITIZER
+
+const char *__asan_default_options(void);
+
+const char *__asan_default_options(void)
+{
+	return "detect_leaks=0";
+}
+
+#endif
+
 /*
  * Escape single quotes and backslashes, suitably for insertions into
  * configuration files or SQL E'' strings.
diff --git a/src/bin/pg_config/pg_config.c b/src/bin/pg_config/pg_config.c
index 77d09ccfc4..26d0b2f62b 100644
--- a/src/bin/pg_config/pg_config.c
+++ b/src/bin/pg_config/pg_config.c
@@ -67,6 +67,16 @@ static const InfoItem info_items[] = {
 	{NULL, NULL}
 };
 
+#ifdef USE_ADDRESS_SANITIZER
+
+const char *__asan_default_options(void);
+
+const char *__asan_default_options(void)
+{
+	return "detect_leaks=0";
+}
+
+#endif
 
 static void
 help(void)
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index e9dcb5a6d8..54f1ce5e44 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -89,6 +89,16 @@ static void KillExistingWALSummaries(void);
 static void WriteEmptyXLOG(void);
 static void usage(void);
 
+#ifdef USE_ADDRESS_SANITIZER
+
+const char *__asan_default_options(void);
+
+const char *__asan_default_options(void)
+{
+	return "detect_leaks=0";
+}
+
+#endif
 
 int
 main(int argc, char *argv[])
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 07e73567dc..ce0c700b6d 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -668,6 +668,11 @@
 /* Define to 1 if strerror_r() returns int. */
 #undef STRERROR_R_INT
 
+/* Define to 1 if using the address sanitizer. Typically this can be detected
+ * with __has_feature(address_sanitizer), but GCC doesn't support it with C99.
+ * Remove it when the standard is bumped. */
+#undef USE_ADDRESS_SANITIZER
+
 /* Define to 1 to use ARMv8 CRC Extension. */
 #undef USE_ARMV8_CRC32C
 
diff --git 

Fix some ubsan/asan related issues

2024-01-30 Thread Tristan Partin

Patch 1:

Passing NULL as a second argument to memcpy breaks ubsan, and there 
didn't seem to be anything preventing that in the LogLogicalMessage() 
codepath. Here is a preventative measure in LogLogicalMessage() and an 
Assert() in CopyXLogRecordToWAL().


Patch 2:

Support building with -Db_sanitize=address in Meson. Various executables 
are leaky, which can cause builds and tests to fail, even though we are 
fine with leaking this memory.


Personally, I am a big stickler for always freeing my memory in 
executables even if it gets released at program exit because it makes 
the leak sanitizer much more effective. However (!), I am not here to 
change the philosophy of memory management in one-off executables, so 
I have just silenced memory leaks in various executables for now.


Patch 3:

THIS DOESN'T WORK. DO NOT COMMIT. PROPOSING FOR DISCUSSION!

In my effort to try to see if the test suite would pass with asan 
enabled, I ran into a max_stack_depth issue. I tried maxing it out 
(hence, the patch), but that still didn't remedy my issue. I tried to 
look on the list for any relevant emails, but nothing turned up. Maybe 
others have not attempted this before? Seems doubtful.


Not entirely sure how to fix this issue. I personally find asan 
extremely effective, even more than valgrind, so it would be great if 
I could run Postgres with asan enabled to catch various stupid C issues 
I might create. On my system, ulimit -a returns 8192 kbytes, so Postgres 
just lets me set 7680 kbytes as the max. Whatever asan is doing doesn't 
seem to leave enough stack space for Postgres.
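
One knob that might be worth trying, in the same style as the patch's
__asan_default_options() hooks: asan's fake-stack machinery (used for
detect_stack_use_after_return) consumes a lot of extra stack and can
confuse stack-depth accounting, so disabling it is a plausible experiment
(an untested guess on my part, not a verified fix):

const char *__asan_default_options(void);

const char *__asan_default_options(void)
{
	/* also turn off fake stacks, which inflate stack usage */
	return "detect_leaks=0:detect_stack_use_after_return=0";
}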


---

I would like to see patch 1 reviewed and committed. Patch 2 honestly 
doesn't matter unless asan support can be fixed. I can also add a patch 
that errors out the Meson build if asan support is requested. That way 
others don't spend time heading down a dead end.


--
Tristan Partin
Neon (https://neon.tech)




Re: cleanup patches for incremental backup

2024-01-30 Thread Robert Haas
On Mon, Jan 29, 2024 at 4:13 PM Nathan Bossart  wrote:
> On Mon, Jan 29, 2024 at 03:18:50PM -0500, Robert Haas wrote:
> > I'm wondering if what we need to do is run pg_walsummary on both
> > summary files in that case. If we just pick one or the other, how do
> > we know which one to pick?
>
> Even if we do that, isn't it possible that none of the summaries will
> include the change?  Presently, we get the latest summarized LSN, make a
> change, and then wait for the next summary file with a greater LSN than
> what we saw before the change.  But AFAICT there's no guarantee that means
> the change has been summarized yet, although the chances of that happening
> in a test are probably pretty small.
>
> Could we get the LSN before and after making the change and then inspect
> all summaries that include that LSN range?

The trick here is that each WAL summary file covers one checkpoint
cycle. The intent of the test is to load data into the table,
checkpoint, see what summaries exist, then update a row, checkpoint
again, and see what summaries now exist. We expect one new summary
because there's been one new checkpoint. When I was thinking about
this yesterday, I was imagining that we were somehow getting an extra
checkpoint in some cases. But it looks like it's actually an
off-by-one situation. In
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae=2024-01-29%2018%3A09%3A10
the new files that show up between "after insert" and "after new
summary" are:

00010152FAE0015AAAC8.summary (LSN distance ~500k)
00010152F7A80152FAE0.summary (LSN distance 824 bytes)

The checkpoint after the inserts says:

LOG:  checkpoint complete: wrote 14 buffers (10.9%); 0 WAL file(s)
added, 0 removed, 0 recycled; write=0.956 s, sync=0.929 s, total=3.059
s; sync files=39, longest=0.373 s, average=0.024 s; distance=491 kB,
estimate=491 kB; lsn=0/15AAB20, redo lsn=0/15AAAC8

And the checkpoint after the single-row update says:

LOG:  checkpoint complete: wrote 4 buffers (3.1%); 0 WAL file(s)
added, 0 removed, 0 recycled; write=0.648 s, sync=0.355 s, total=2.798
s; sync files=3, longest=0.348 s, average=0.119 s; distance=11 kB,
estimate=443 kB; lsn=0/15AD770, redo lsn=0/15AD718

So both of the new WAL summary files that are appearing here are from
checkpoints that happened before the single-row update. The larger
file is the one covering the 400 inserts, and the smaller one is the
checkpoint before that. Which means that the "Wait for a new summary
to show up." code isn't actually waiting long enough, and then the
whole thing goes haywire. The problem is, I think, that this code
naively thinks it can just wait for summarized_lsn and everything will
be fine ... but that assumes we were caught up when we first measured
the summarized_lsn, and that need not be so, because it takes some
short but non-zero amount of time for the summarizer to catch up with
the WAL generated during initdb.

I think the solution here is to find a better way to wait for the
inserts to be summarized, one that actually does wait for that to
happen.
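
One possible shape for that (a sketch of the idea, not the eventual fix;
it assumes the summarizer state is queryable with
pg_get_wal_summarizer_state()):

    # Capture the WAL position of the change itself, then poll until
    # the summarizer has advanced past it.
    my $lsn = $node->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
    $node->poll_query_until('postgres', qq{
        SELECT summarized_lsn >= '$lsn'::pg_lsn
        FROM pg_get_wal_summarizer_state()
    });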

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Synchronizing slots from primary to standby

2024-01-30 Thread Bertrand Drouvot
Hi,

On Mon, Jan 29, 2024 at 09:15:57AM +, Bertrand Drouvot wrote:
> Hi,
> 
> On Mon, Jan 29, 2024 at 02:35:52PM +0530, Amit Kapila wrote:
> > I think it is better to create a separate patch for two_phase after
> > this patch gets committed.
> 
> Yeah, makes sense, will do, thanks!

It's done in [1].

[1]: 
https://www.postgresql.org/message-id/ZbkYrLPhH%2BRxpZlW%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Bytea PL/Perl transform

2024-01-30 Thread Dagfinn Ilmari Mannsåker
Pavel Stehule  writes:

> I inserted perl reference support - hstore_plperl and json_plperl does it.
>
> +	/* Dereference references recursively. */
> +	while (SvROK(in))
> +		in = SvRV(in);

That code in hstore_plperl and json_plperl is only relevant because they
deal with non-scalar values (hashes for hstore, and also arrays for
json) which must be passed as references.  The recursive nature of the
dereferencing is questionable, and masked the bug fixed by commit
1731e3741cbbf8e0b4481665d7d523bc55117f63.

bytea_plperl only deals with scalars (specifically strings), so should
not concern itself with references.  In fact, this code breaks returning
objects with overloaded stringification, for example:

CREATE FUNCTION plperlu_overload() RETURNS bytea LANGUAGE plperlu
  TRANSFORM FOR TYPE bytea
  AS $$
package StringOverload { use overload '""' => sub { "stuff" }; }
return bless {}, "StringOverload";
  $$;

This makes the server crash with an assertion failure from Perl because
SvPVbyte() was passed a non-scalar value:

postgres: ilmari regression_bytea_plperl [local] SELECT: sv.c:2865: 
Perl_sv_2pv_flags:
Assertion `SvTYPE(sv) != SVt_PVAV && SvTYPE(sv) != SVt_PVHV && SvTYPE(sv) != 
SVt_PVFM' failed.

If I remove the dereferencing loop it succeeds:

SELECT encode(plperlu_overload(), 'escape') AS string;
 string
--------
 stuff
(1 row)

Attached is a v2 patch which removes the dereferencing and includes the
above example as a test.

- ilmari

>From aabaf4f5932f59de2fed48bbba7339807a1f04bd Mon Sep 17 00:00:00 2001
From: "ok...@github.com" 
Date: Tue, 30 Jan 2024 10:31:00 +0100
Subject: [PATCH v2] Add bytea transformation for plperl

---
 contrib/Makefile  |  4 +-
 contrib/bytea_plperl/.gitignore   |  4 ++
 contrib/bytea_plperl/Makefile | 39 ++
 contrib/bytea_plperl/bytea_plperl--1.0.sql| 19 +++
 contrib/bytea_plperl/bytea_plperl.c   | 44 
 contrib/bytea_plperl/bytea_plperl.control |  7 +++
 contrib/bytea_plperl/bytea_plperlu--1.0.sql   | 19 +++
 contrib/bytea_plperl/bytea_plperlu.control|  6 +++
 .../bytea_plperl/expected/bytea_plperl.out| 49 ++
 .../bytea_plperl/expected/bytea_plperlu.out   | 49 ++
 contrib/bytea_plperl/meson.build  | 51 +++
 contrib/bytea_plperl/sql/bytea_plperl.sql | 31 +++
 contrib/bytea_plperl/sql/bytea_plperlu.sql| 31 +++
 contrib/meson.build   |  1 +
 doc/src/sgml/plperl.sgml  | 16 ++
 15 files changed, 368 insertions(+), 2 deletions(-)
 create mode 100644 contrib/bytea_plperl/.gitignore
 create mode 100644 contrib/bytea_plperl/Makefile
 create mode 100644 contrib/bytea_plperl/bytea_plperl--1.0.sql
 create mode 100644 contrib/bytea_plperl/bytea_plperl.c
 create mode 100644 contrib/bytea_plperl/bytea_plperl.control
 create mode 100644 contrib/bytea_plperl/bytea_plperlu--1.0.sql
 create mode 100644 contrib/bytea_plperl/bytea_plperlu.control
 create mode 100644 contrib/bytea_plperl/expected/bytea_plperl.out
 create mode 100644 contrib/bytea_plperl/expected/bytea_plperlu.out
 create mode 100644 contrib/bytea_plperl/meson.build
 create mode 100644 contrib/bytea_plperl/sql/bytea_plperl.sql
 create mode 100644 contrib/bytea_plperl/sql/bytea_plperlu.sql

diff --git a/contrib/Makefile b/contrib/Makefile
index da4e2316a3..56c628df00 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -77,9 +77,9 @@ ALWAYS_SUBDIRS += sepgsql
 endif
 
 ifeq ($(with_perl),yes)
-SUBDIRS += bool_plperl hstore_plperl jsonb_plperl
+SUBDIRS += bool_plperl bytea_plperl hstore_plperl jsonb_plperl
 else
-ALWAYS_SUBDIRS += bool_plperl hstore_plperl jsonb_plperl
+ALWAYS_SUBDIRS += bool_plperl bytea_plperl hstore_plperl jsonb_plperl
 endif
 
 ifeq ($(with_python),yes)
diff --git a/contrib/bytea_plperl/.gitignore b/contrib/bytea_plperl/.gitignore
new file mode 100644
index 00..5dcb3ff972
--- /dev/null
+++ b/contrib/bytea_plperl/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/contrib/bytea_plperl/Makefile b/contrib/bytea_plperl/Makefile
new file mode 100644
index 00..1814d2f418
--- /dev/null
+++ b/contrib/bytea_plperl/Makefile
@@ -0,0 +1,39 @@
+# contrib/bytea_plperl/Makefile
+
+MODULE_big = bytea_plperl
+OBJS = \
+	$(WIN32RES) \
+	bytea_plperl.o
+PGFILEDESC = "bytea_plperl - bytea transform for plperl"
+
+PG_CPPFLAGS = -I$(top_srcdir)/src/pl/plperl
+
+EXTENSION = bytea_plperlu bytea_plperl
+DATA = bytea_plperlu--1.0.sql bytea_plperl--1.0.sql
+
+REGRESS = bytea_plperl bytea_plperlu
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/bytea_plperl
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
+
+# We must link libperl explicitly
+ifeq ($(PORTNAME), win32)

Documentation: warn about two_phase when altering a subscription

2024-01-30 Thread Bertrand Drouvot
Hi hackers,

776621a5e4 added a "warning" in the documentation to alter a subscription (to
ensure the slot's failover property matches the subscription's one).

The same remark applies to the two_phase option. This patch is an attempt
to add such a warning for it.
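
As an aside, the mismatch is easy to inspect by hand. A sketch, assuming
the subfailover column added by this patch series:

-- on the publisher: properties of the slot itself
SELECT slot_name, two_phase, failover FROM pg_replication_slots;
-- on the subscriber: what the subscription was created with
SELECT subname, subtwophasestate, subfailover FROM pg_subscription;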

Looking forward to your feedback,

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
>From 756886e59afddd09fa6f87ab95af7292ebca3e76 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot 
Date: Tue, 30 Jan 2024 14:48:16 +
Subject: [PATCH v1] Documentation: warn about two_phase when altering a
 subscription's slot name.

776621a5e4 added a warning about the newly failover option. Doing the same
for the already existing two_phase one.
---
 doc/src/sgml/ref/alter_subscription.sgml | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)
 100.0% doc/src/sgml/ref/

diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index e9e6d9d74a..cd553f6312 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -235,15 +235,17 @@ ALTER SUBSCRIPTION name RENAME TO <
  
   When altering the
   slot_name,
-  the failover property value of the named slot may differ from the
+  the failover and two_phase properties
+  values of the named slot may differ from their counterparts
   failover
-  parameter specified in the subscription. When creating the slot,
-  ensure the slot failover property matches the
+  and two_phase
+  parameters specified in the subscription. When creating the slot, ensure
+  the slot failover and two_phase
+  properties match the corresponding parameter values of the subscription.
+  Otherwise, the slot on the publisher may behave differently from what the subscription's
   failover
-  parameter value of the subscription. Otherwise, the slot on the
-  publisher may behave differently from what subscription's
-  failover
-  option says. The slot on the publisher could either be
+  and two_phase
+  options say: for example, the slot on the publisher could either be
   synced to the standbys even when the subscription's
   failover
   option is disabled or could be disabled for sync
-- 
2.34.1



Re: scram_iterations is undocumented GUC_REPORT

2024-01-30 Thread Daniel Gustafsson
> On 30 Jan 2024, at 15:36, Tom Lane  wrote:
> 
> Alvaro Herrera  writes:
>> I noticed while answering a question that commit b577743000cd added the
>> GUC scram_iterations and marked it GUC_REPORT, but failed to add it to
>> the PQparameterStatus documentation.
> 
> Why is it GUC_REPORT at all?  I don't see a strong need for that.
> (In particular, the report would be delivered too late to be of any
> use in client authentication.)

It's needed by PQencryptPasswordConn.
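
For illustration (user and password here are made up):

    const char *iters = PQparameterStatus(conn, "scram_iterations");
    char *verifier = PQencryptPasswordConn(conn, "new-password", "alice",
                                           "scram-sha-256");

PQparameterStatus is how a client reads the reported value, and
PQencryptPasswordConn relies on it when computing a scram-sha-256 verifier
locally, without an extra round trip.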

--
Daniel Gustafsson





Re: [PATCH] Add native windows on arm64 support

2024-01-30 Thread Dave Cramer
On Tue, 30 Jan 2024 at 08:38, Andrew Dunstan  wrote:

>
> On 2024-01-29 Mo 11:20, Dave Cramer wrote:
>
>
> Dave Cramer
> www.postgres.rocks
>
>
> On Mon, 29 Jan 2024 at 11:16, Andrew Dunstan  wrote:
>
>>
>> On 2024-01-26 Fr 09:18, Dave Cramer wrote:
>>
>>
>>
>> On Fri, 26 Jan 2024 at 07:36, Andrew Dunstan  wrote:
>>
>>>
>>> On 2024-01-25 Th 20:32, Michael Paquier wrote:
>>> > On Thu, Jan 25, 2024 at 04:52:30PM -0500, Dave Cramer wrote:
>>> >> On Thu, 25 Jan 2024 at 16:32, Andrew Dunstan 
>>> wrote:
>>> >>> On 2024-01-25 Th 16:17, Dave Cramer wrote:
>>> >>> Yeah, I think the default Developer Command Prompt for VS2022 is set
>>> up
>>> >>> for x86 builds. AIUI you should start by executing "vcvarsall
>>> x64_arm64".
>>> >> Yup, now I'm in the same state you are
>>> > Wait a minute here.  Based on [1], x64_arm64 means you can use a x64
>>> > host and you'll be able to produce ARM64 builds, still these will not
>>> > be able to run on the host where they were built.  How much of the
>>> > patch posted upthread is required to produce such builds?  Basically
>>> > everything from it, I guess, so as build dependencies can be
>>> > satisfied?
>>> >
>>> > [1]:
>>> https://learn.microsoft.com/en-us/cpp/build/building-on-the-command-line?view=msvc-170
>>>
>>>
>>> If you look at the table here x86 and x64 are the only supported host
>>> architectures. But that's OK, the x64 binaries will run on arm64 (W11
>>> ARM64 has x64 emulation builtin). If that didn't work Dave and I would
>>> not have got as far as we have. But you want the x64_arm64 argument to
>>> vcvarsall so you will get ARM64 output.
>>>
>>
>> I've rebuilt it using  x64_arm64 and with the attached (very naive patch)
>> and I still get an x64 binary :(
>>
>>
>> With this patch I still get a build error, but it's different :-)
>>
>>
>> [1406/2088] "link" @src/backend/postgres.exe.rsp
>> FAILED: src/backend/postgres.exe src/backend/postgres.pdb
>> "link" @src/backend/postgres.exe.rsp
>>Creating library src\backend\postgres.exe.lib
>>
>> storage_lmgr_s_lock.c.obj : error LNK2019: unresolved external symbol
>> spin_delay referenced in function perform_spin_delay
>>
>> src\backend\postgres.exe : fatal error LNK1120: 1 unresolved externals
>>
>
> Did you add the latest lock.patch ?
>
>
>
>
> I'm a bit confused about exactly what needs to be applied. Can you supply
> a complete patch to be applied to a pristine checkout that will let me
> build?
>
>
> cheers
>

See attached.

Dave


0001-fix-build-for-arm.patch
Description: Binary data


0002-naive-patch-to-fix-locking-for-arm64.patch
Description: Binary data


Re: scram_iterations is undocumented GUC_REPORT

2024-01-30 Thread Alvaro Herrera
On 2024-Jan-30, Jelte Fennema-Nio wrote:

> On Tue, 30 Jan 2024 at 13:37, Alvaro Herrera  wrote:
> >
> > I noticed while answering a question that commit b577743000cd added the
> > GUC scram_iterations and marked it GUC_REPORT, but failed to add it to
> > the PQparameterStatus documentation.
> 
> +1 the improvements your suggesting (although 3 I don't know enough
> about to be sure)
> 
> One important note though is that this list is tracked in two
> different places, so both of these places should be updated:
> - doc/src/sgml/protocol.sgml
> - doc/src/sgml/libpq.sgml

Ooh, you're right.  I propose to turn the list into a <simplelist>,
which looks _much_ nicer to read, as in the attached screenshot of the
PDF.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
 really, I see PHP as like a strange amalgamation of C, Perl, Shell
 inflex: you know that "amalgam" means "mixture with mercury",
   more or less, right?
 i.e., "deadly poison"
>From 3f7009c6d37c890081bd9511d6a50cee17cee0e5 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 30 Jan 2024 15:27:47 +0100
Subject: [PATCH v2] Update PQparameterStatus and ParameterStatus docs

Cover scram_iterations, which was missed in 16.

Also turn the list into a <simplelist> with 2 columns, which is much
nicer to read.
---
 doc/src/sgml/libpq.sgml| 43 ++
 doc/src/sgml/protocol.sgml | 43 ++
 2 files changed, 40 insertions(+), 46 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index d0d5aefadc..1d8998efb2 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -2509,30 +2509,27 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
   
 
   
-   Parameters reported as of the current release include
-   server_version,
-   server_encoding,
-   client_encoding,
-   application_name,
-   default_transaction_read_only,
-   in_hot_standby,
-   is_superuser,
-   session_authorization,
-   DateStyle,
-   IntervalStyle,
-   TimeZone,
-   integer_datetimes, and
-   standard_conforming_strings.
-   (server_encoding, TimeZone, and
-   integer_datetimes were not reported by releases before 8.0;
-   standard_conforming_strings was not reported by releases
-   before 8.1;
-   IntervalStyle was not reported by releases before 8.4;
-   application_name was not reported by releases before
-   9.0;
-   default_transaction_read_only and
+   Parameters reported as of the current release include:
+   
+application_name
+client_encoding
+DateStyle
+default_transaction_read_only
+in_hot_standby
+integer_datetimes
+IntervalStyle
+is_superuser
+scram_iterations
+server_encoding
+server_version
+session_authorization
+standard_conforming_strings
+TimeZone
+   
+   (default_transaction_read_only and
in_hot_standby were not reported by releases before
-   14.)
+   14; scram_iterations was not reported by releases
+   before 16.)
Note that
server_version,
server_encoding and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index bb4fef1f51..ed1d62f5f8 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1314,30 +1314,27 @@ SELCT 1/0;
 

 At present there is a hard-wired set of parameters for which
-ParameterStatus will be generated: they are
-server_version,
-server_encoding,
-client_encoding,
-application_name,
-default_transaction_read_only,
-in_hot_standby,
-is_superuser,
-session_authorization,
-DateStyle,
-IntervalStyle,
-TimeZone,
-integer_datetimes, and
-standard_conforming_strings.
-(server_encoding, TimeZone, and
-integer_datetimes were not reported by releases before 8.0;
-standard_conforming_strings was not reported by releases
-before 8.1;
-IntervalStyle was not reported by releases before 8.4;
-application_name was not reported by releases before
-9.0;
-default_transaction_read_only and
+ParameterStatus will be generated.  They are:
+
+ application_name
+ client_encoding
+ DateStyle
+ default_transaction_read_only
+ in_hot_standby
+ integer_datetimes
+ IntervalStyle
+ is_superuser
+ scram_iterations
+ server_encoding
+ server_version
+ session_authorization
+ standard_conforming_strings
+ TimeZone
+
+(default_transaction_read_only and
 in_hot_standby were not reported by releases before
-14.)
+14; scram_iterations was not reported by releases
+before 16.)
 Note that
 server_version,
 server_encoding and
-- 
2.39.2



Re: scram_iterations is undocumented GUC_REPORT

2024-01-30 Thread Tom Lane
Alvaro Herrera  writes:
> I noticed while answering a question that commit b577743000cd added the
> GUC scram_iterations and marked it GUC_REPORT, but failed to add it to
> the PQparameterStatus documentation.

Why is it GUC_REPORT at all?  I don't see a strong need for that.
(In particular, the report would be delivered too late to be of any
use in client authentication.)

regards, tom lane




Re: psql's FETCH_COUNT (cursor) is not being respected for CTEs

2024-01-30 Thread Daniel Verite
vignesh C wrote:

> patching file src/interfaces/libpq/exports.txt
> Hunk #1 FAILED at 191.
> 1 out of 1 hunk FAILED -- saving rejects to file
> src/interfaces/libpq/exports.txt.rej
> 
> Please post an updated version for the same.

PFA a rebased version.


Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite
From 8cfb82b1e36e96996637948a231ae35b9af1e074 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Daniel=20V=C3=A9rit=C3=A9?= 
Date: Tue, 30 Jan 2024 14:38:21 +0100
Subject: [PATCH v6 1/2] Implement retrieval of results in chunks with libpq.

This mode is similar to the single-row mode except that chunks
of results contain up to N rows instead of a single row.
It is meant to reduce the overhead of the row-by-row allocations
for large result sets.
The mode is selected with PQsetChunkedRowsMode(int maxRows) and results
have the new status code PGRES_TUPLES_CHUNK.
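
For illustration, retrieving a large result in chunks would look roughly
like this (a sketch against the API proposed here; process_chunk and the
error handling are placeholders, not part of the patch):

    PGresult   *res;

    if (!PQsendQuery(conn, "SELECT * FROM big_table"))
        handle_error(conn);             /* placeholder */
    PQsetChunkedRowsMode(conn, 1000);   /* up to 1000 rows per PGresult */
    while ((res = PQgetResult(conn)) != NULL)
    {
        if (PQresultStatus(res) == PGRES_TUPLES_CHUNK)
            process_chunk(res);         /* placeholder for per-chunk work */
        PQclear(res);
    }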
---
 doc/src/sgml/libpq.sgml   |  96 ++
 .../libpqwalreceiver/libpqwalreceiver.c   |   1 +
 src/bin/pg_amcheck/pg_amcheck.c   |   1 +
 src/interfaces/libpq/exports.txt  |   1 +
 src/interfaces/libpq/fe-exec.c| 117 +++---
 src/interfaces/libpq/libpq-fe.h   |   4 +-
 src/interfaces/libpq/libpq-int.h  |   7 +-
 7 files changed, 184 insertions(+), 43 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index d0d5aefadc..f7f5a04df6 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -3537,7 +3537,20 @@ ExecStatusType PQresultStatus(const PGresult *res);
 The PGresult contains a single result 
tuple
 from the current command.  This status occurs only when
 single-row mode has been selected for the query
-(see ).
+(see ).
+   
+  
+ 
+
+ 
+  PGRES_TUPLES_CHUNK
+  
+   
+The PGresult contains several tuples
+from the current command. The count of tuples cannot exceed
+the maximum passed to .
+This status occurs only when the chunked mode has been selected
+for the query (see ).

   
  
@@ -5189,8 +5202,8 @@ PGresult *PQgetResult(PGconn *conn);
   
Another frequently-desired feature that can be obtained with
 and 
-   is retrieving large query results a row at a time.  This is discussed
-   in .
+   is retrieving large query results a limited number of rows at a time.  This 
is discussed
+   in .
   
 
   
@@ -5554,12 +5567,13 @@ int PQflush(PGconn *conn);
 
 
 
- To enter single-row mode, call PQsetSingleRowMode
- before retrieving results with PQgetResult.
- This mode selection is effective only for the query currently
- being processed. For more information on the use of
- PQsetSingleRowMode,
- refer to .
+ To enter single-row or chunked modes, call
+ respectively PQsetSingleRowMode
+ or PQsetChunkedRowsMode before retrieving results
+ with PQgetResult.  This mode selection is effective
+ only for the query currently being processed. For more information on the
+ use of these functions refer
+ to .
 
 
 
@@ -5926,10 +5940,10 @@ UPDATE mytable SET x = x + 1 WHERE id = 42;
   
  
 
- 
-  Retrieving Query Results Row-by-Row
+ 
+  Retrieving Query Results by chunks
 
-  
+  
libpq
single-row mode
   
@@ -5940,13 +5954,15 @@ UPDATE mytable SET x = x + 1 WHERE id = 42;
PGresult.  This can be unworkable for commands
that return a large number of rows.  For such cases, applications can use
 and  
in
-   single-row mode.  In this mode, the result row(s) are
-   returned to the application one at a time, as they are received from the
-   server.
+   single-row mode or chunked 
mode.
+   In these modes, the result row(s) are returned to the application one at a
+   time for the single-row mode and by chunks for the chunked mode, as they
+   are received from the server.
   
 
   
-   To enter single-row mode, call 
+   To enter these modes, call 
+or 
immediately after a successful call of 
(or a sibling function).  This mode selection is effective only for the
currently executing query.  Then call 
@@ -5954,7 +5970,8 @@ UPDATE mytable SET x = x + 1 WHERE id = 42;
linkend="libpq-async"/>.  If the query returns any rows, they are returned
as individual PGresult objects, which look like
normal query results except for having status code
-   PGRES_SINGLE_TUPLE instead of
+   PGRES_SINGLE_TUPLE for the single-row mode and
+   PGRES_TUPLES_CHUNK for the chunked mode, instead of
PGRES_TUPLES_OK.  After the last row, or immediately if
the query returns zero rows, a zero-row object with status
PGRES_TUPLES_OK is returned; this is the signal that no
@@ -5967,9 +5984,9 @@ UPDATE mytable SET x = x + 1 WHERE id = 42;
   
 
   
-   When using pipeline mode, 

Re: POC, WIP: OR-clause support for indexes

2024-01-30 Thread jian he
On Tue, Dec 5, 2023 at 6:55 PM Andrei Lepikhov
 wrote:
>
> Here is fresh version with the pg_dump.pl regex fixed. Now it must pass
> buildfarm.

+JumbleState *
+JumbleExpr(Expr *expr, uint64 *queryId)
+{
+ JumbleState *jstate = NULL;
+
+ Assert(queryId != NULL);
+
+ jstate = (JumbleState *) palloc(sizeof(JumbleState));
+
+ /* Set up workspace for query jumbling */
+ jstate->jumble = (unsigned char *) palloc(JUMBLE_SIZE);
+ jstate->jumble_len = 0;
+ jstate->clocations_buf_size = 32;
+ jstate->clocations = (LocationLen *)
+ palloc(jstate->clocations_buf_size * sizeof(LocationLen));
+ jstate->clocations_count = 0;
+ jstate->highest_extern_param_id = 0;
+
+ /* Compute query ID */
+ _jumbleNode(jstate, (Node *) expr);
+ *queryId = DatumGetUInt64(hash_any_extended(jstate->jumble,
+ jstate->jumble_len,
+ 0));
+
+ if (*queryId == UINT64CONST(0))
+ *queryId = UINT64CONST(1);
+
+ return jstate;
+}

+/*
+ * Hash function that's compatible with guc_name_compare
+ */
+static uint32
+orclause_hash(const void *data, Size keysize)
+{
+ OrClauseGroupKey   *key = (OrClauseGroupKey *) data;
+ uint64 hash;
+
+ (void) JumbleExpr(key->expr, &hash);
+ hash += ((uint64) key->opno + (uint64) key->exprtype) % UINT64_MAX;
+ return hash;
+}

correct me if i am wrong:
in orclause_hash, you just want to return a uint32, then why does the
JumbleExpr function return struct JumbleState.
here JumbleExpr, we just simply hash part of a Query struct,
so JumbleExpr's queryId would be confused with JumbleQuery function's queryId.

not sure the purpose of the following:
+ if (*queryId == UINT64CONST(0))
+ *queryId = UINT64CONST(1);

even if  *queryId is 0
`hash += ((uint64) key->opno + (uint64) key->exprtype) % UINT64_MAX;`
will make the hash return non-zero?

+ MemSet(&info, 0, sizeof(info));
i am not sure this is necessary.

Some comments on OrClauseGroupEntry would be great.

seems there is no doc.

create or replace function retint(int) returns int as
$func$
begin return $1 + round(10 * random()); end
$func$ LANGUAGE plpgsql;

set enable_or_transformation to on;
EXPLAIN (COSTS OFF)
SELECT count(*) FROM tenk1
WHERE thousand = 42 AND (tenthous * retint(1) = NULL OR tenthous *
retint(1) = 3) OR thousand = 41;

returns:
QUERY PLAN
---
 Aggregate
   ->  Seq Scan on tenk1
 Filter: (((thousand = 42) AND ((tenthous * retint(1)) = ANY
('{NULL,3}'::integer[]))) OR (thousand = 41))
(3 rows)

Based on the query plan, retint is executed once, but shouldn't it be
executed twice here?
maybe we need to use contain_volatile_functions to check through the
other part of the operator expression.
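
The guard could look something like this (a sketch; variable names follow
the quoted patch):

+ /* Skip the transformation when the non-const side is volatile,
+  * since merging branches would change how often it is evaluated. */
+ if (contain_volatile_functions((Node *) nconst_expr))
+ {
+ or_list = lappend(or_list, orqual);
+ continue;
+ }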

+ if (IsA(leftop, Const))
+ {
+ opno = get_commutator(opno);
+
+ if (!OidIsValid(opno))
+ {
+ /* Commuter doesn't exist, we can't reverse the order */
+ or_list = lappend(or_list, orqual);
+ continue;
+ }
+
+ nconst_expr = get_rightop(orqual);
+ const_expr = get_leftop(orqual);
+ }
+ else if (IsA(rightop, Const))
+ {
+ const_expr = get_rightop(orqual);
+ nconst_expr = get_leftop(orqual);
+ }
+ else
+ {
+ or_list = lappend(or_list, orqual);
+ continue;
+ }
do we need to skip this transformation for the const type is anyarray?




Re: Functions to return random numbers in a given range

2024-01-30 Thread Dean Rasheed
On Tue, 30 Jan 2024 at 12:47, Aleksander Alekseev
 wrote:
>
> Maybe I'm missing something but I'm not sure if I understand what this
> test tests particularly:
>
> ```
> -- There should be no triple duplicates in 1000 full-range 32-bit random()
> -- values.  (Each of the C(1000, 3) choices of triplets from the 1000 values
> -- has a probability of 1/(2^32)^2 of being a triple duplicate, so the
> -- average number of triple duplicates is 1000 * 999 * 998 / 6 / 2^64, which
> -- is roughly 9e-12.)
> SELECT r, count(*)
> FROM (SELECT random(-2147483648, 2147483647) r
>   FROM generate_series(1, 1000)) ss
> GROUP BY r HAVING count(*) > 2;
> ```
>
> The intent seems to be to check the fact that random numbers are
> distributed evenly. If this is the case I think the test is wrong. The
> sequence of numbers 100, 100, 100, 100, 100 is as random as 99, 8, 4,
> 12, 45 and every particular sequence has low probability. All in all
> personally I would argue that this is a meaningless test that just
> fails with a low probability. Same for the tests that follow below.
>

I'm following the same approach used to test the existing random
functions, and the idea is the same. For example, this existing test:

-- There should be no duplicates in 1000 random() values.
-- (Assuming 52 random bits in the float8 results, we could
-- take as many as 3000 values and still have less than 1e-9 chance
-- of failure, per https://en.wikipedia.org/wiki/Birthday_problem)
SELECT r, count(*)
FROM (SELECT random() r FROM generate_series(1, 1000)) ss
GROUP BY r HAVING count(*) > 1;

If the underlying PRNG were non-uniform, or the method of reduction to
the required range was flawed in some way that reduced the number of
actual possible return values, then the probability of duplicates
would be increased. A non-uniform distribution would probably be
caught by the KS tests, but uniform gaps in the possible outputs might
not be, so I think this test still has value.

> The proper way of testing PRNG would be to call setseed() and compare
> return values with expected ones. I don't mind testing the proposed
> invariants but they should do this after calling setseed(). Currently
> the patch places the tests right before the call.
>

There are also new tests of that nature, following the call to
setseed(0.5). They're useful for a quick visual check of the results,
and confirming the expected number of digits after the decimal point
in the numeric case. However, I think those tests are insufficient on
their own.

Regards,
Dean




Re: compiling postgres on windows arm using meson

2024-01-30 Thread Nazir Bilal Yavuz
Hi,

On Thu, 18 Jan 2024 at 05:07, Dave Cramer  wrote:
>
> Greetings,
> Getting the following error
>
> [1146/2086] Generating src/backend/postgres.def with a custom command 
> (wrapped by meson to set PATH)
> FAILED: src/backend/postgres.def
> "C:\Program Files\Meson\meson.exe" "--internal" "exe" "--unpickle" 
> "C:\Users\davec\projects\postgresql\build\meson-private\meson_exe_perl.EXE_53b41ebc2e76cfc92dd6a2af212140770543faae.dat"
> while executing ['c:\\perl\\bin\\perl.EXE', 
> '../src/backend/../tools/msvc_gendef.pl', '--arch', 'aarch64', '--tempdir', 
> 'src/backend/postgres.def.p', '--deffile', 'src/backend/postgres.def', 
> 'src/backend/postgres_lib.a', 'src/common/libpgcommon_srv.a', 
> 'src/port/libpgport_srv.a']
> --- stdout ---
>
> --- stderr ---
> Usage: msvc_gendef.pl --arch  --deffile  --tempdir  
> files-or-directories
> arch: x86 | x86_64
> deffile: path of the generated file
> tempdir: directory for temporary files
> files or directories: object files or directory containing object files
>
> log attached

From the docs [1]: PostgreSQL will only build for the x64 architecture
on 64-bit Windows.

So, I think that is expected.

[1] 
https://www.postgresql.org/docs/current/install-windows-full.html#INSTALL-WINDOWS-FULL-64-BIT

-- 
Regards,
Nazir Bilal Yavuz
Microsoft




Re: [PATCH] Add native windows on arm64 support

2024-01-30 Thread Andrew Dunstan


On 2024-01-29 Mo 11:20, Dave Cramer wrote:


Dave Cramer
www.postgres.rocks


On Mon, 29 Jan 2024 at 11:16, Andrew Dunstan  wrote:


On 2024-01-26 Fr 09:18, Dave Cramer wrote:



On Fri, 26 Jan 2024 at 07:36, Andrew Dunstan
 wrote:


On 2024-01-25 Th 20:32, Michael Paquier wrote:
> On Thu, Jan 25, 2024 at 04:52:30PM -0500, Dave Cramer wrote:
>> On Thu, 25 Jan 2024 at 16:32, Andrew Dunstan
 wrote:
>>> On 2024-01-25 Th 16:17, Dave Cramer wrote:
>>> Yeah, I think the default Developer Command Prompt for
VS2022 is set up
>>> for x86 builds. AIUI you should start by executing
"vcvarsall x64_arm64".
>> Yup, now I'm in the same state you are
> Wait a minute here.  Based on [1], x64_arm64 means you can
use a x64
> host and you'll be able to produce ARM64 builds, still
these will not
> be able to run on the host where they were built.  How much
of the
> patch posted upthread is required to produce such builds? 
Basically
> everything from it, I guess, so as build dependencies can be
> satisfied?
>
> [1]:

https://learn.microsoft.com/en-us/cpp/build/building-on-the-command-line?view=msvc-170


If you look at the table here x86 and x64 are the only
supported host
architectures. But that's OK, the x64 binaries will run on
arm64 (W11
ARM64 has x64 emulation builtin). If that didn't work Dave
and I would
not have got as far as we have. But you want the x64_arm64
argument to
vcvarsall so you will get ARM64 output.


I've rebuilt it using  x64_arm64 and with the attached (very
naive patch) and I still get an x64 binary :(



With this patch I still get a build error, but it's different :-)


[1406/2088] "link" @src/backend/postgres.exe.rsp
FAILED: src/backend/postgres.exe src/backend/postgres.pdb
"link" @src/backend/postgres.exe.rsp
   Creating library src\backend\postgres.exe.lib

storage_lmgr_s_lock.c.obj : error LNK2019: unresolved external
symbol spin_delay referenced in function perform_spin_delay

src\backend\postgres.exe : fatal error LNK1120: 1 unresolved externals


Did you add the latest lock.patch ?





I'm a bit confused about exactly what needs to be applied. Can you 
supply a complete patch to be applied to a pristine checkout that will 
let me build?



cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: Optmize bitmapword macros calc (src/backend/nodes/bitmapset.c)

2024-01-30 Thread Ranier Vilela
Em seg., 29 de jan. de 2024 às 19:40, Nathan Bossart <
nathandboss...@gmail.com> escreveu:

> On Tue, Jan 30, 2024 at 11:23:57AM +1300, David Rowley wrote:
> > On Tue, 30 Jan 2024 at 08:32, Nathan Bossart 
> wrote:
> >> I'm currently +0.1 for this change.  I don't see any huge problem with
> >> trimming a few instructions, but I'm dubious there's any measurable
> impact.
> >> However, a cycle saved is a cycle earned...
> >
> > FWIW, In [1] and subsequent replies, there are several examples of
> > benchmarks where various bitmapset functions are sitting high in the
> > profiles. So I wouldn't be too surprised if such a small change to the
> > WORDNUM and BITNUM macros made a noticeable difference.
>
> Good to know, thanks.  If there is indeed demonstrable improvement, I'd
> readily adjust my +0.1 to +1.
>
Following the suggestions, I did a quick test with one of the scripts.

Ubuntu 64 bits
gcc 12.3 64 bits

create table t1 (a int) partition by list(a);
select 'create table t1_'||x||' partition of t1 for values
in('||x||');' from generate_series(0,9)x;

test1.sql
select * from t1 where a > 1 and a < 3;

pgbench -U postgres -n -f test1.sql -T 15 postgres

head:

tps = 27983.182940
tps = 28916.903038
tps = 29051.878855

patched:

tps = 27517.301246
tps = 27848.684133
tps = 28669.367300


create table t2 (a int) partition by list(a);
select 'create table t2_'||x||' partition of t2 for values
in('||x||');' from generate_series(0,)x;

test2.sql
select * from t2 where a > 1 and a < 3;

pgbench -U postgres -n -f test2.sql -T 15 postgres

head:

tps = 27144.044463
tps = 28932.948620
tps = 29299.016248

patched:

tps = 27363.364039
tps = 28588.141586
tps = 28669.367300

To my complete surprise, the change is slower.
I can't explain how, with fewer instructions, gcc produces a worse binary.
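
For context, the macros in question are currently plain signed division
and modulo, and the variant under test (sketched here from the thread's
idea, not quoted verbatim from the patch) makes the operand unsigned so
the compiler can emit a shift and mask:

/* Current definitions in src/backend/nodes/bitmapset.c; x is a signed
 * int, so / and % need extra instructions to round toward zero. */
#define WORDNUM(x)	((x) / BITS_PER_BITMAPWORD)
#define BITNUM(x)	((x) % BITS_PER_BITMAPWORD)

/* Sketched variant: safe because bitmapset members are non-negative. */
#define WORDNUM(x)	((int) ((uint32) (x) / BITS_PER_BITMAPWORD))
#define BITNUM(x)	((int) ((uint32) (x) % BITS_PER_BITMAPWORD))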

best regards,
Ranier Vilela


Re: UUID v7

2024-01-30 Thread Andrey M. Borodin


> On 30 Jan 2024, at 15:33, Junwang Zhao  wrote:
> 
> It's always good to add a newline at the end of a  source file, though
> this might be nitpicky.

Thanks, also fixed warning found by CFBot.


Best regards, Andrey Borodin.


v17-0001-Implement-UUID-v7.patch
Description: Binary data


Re: Improvement discussion of custom and generic plans

2024-01-30 Thread Quan Zongliang



On 2023/11/3 15:27, Quan Zongliang wrote:

Hi

We have one such problem. A table field has skewed data. Statistics:
n_distinct | -0.4481973
most_common_vals   | {5f006ca25b52ed78e457b150ee95a30c}
most_common_freqs  | {0.5518474}


Data generation:

CREATE TABLE s_user (
  user_id varchar(32) NOT NULL,
  corp_id varchar(32),
  status int NOT NULL
  );

insert into s_user
select md5('user_id ' || a), md5('corp_id ' || a),
  case random()<0.877675 when true then 1 else -1 end
   FROM generate_series(1,10031) a;

insert into s_user
select md5('user_id ' || a), md5('corp_id 10032'),
  case random()<0.877675 when true then 1 else -1 end
   FROM generate_series(10031,22383) a;

CREATE INDEX s_user_corp_id_idx ON s_user USING btree (corp_id);

analyze s_user;


1. First, define a PREPARE statement
prepare stmt as select count(*) from s_user where status=1 and corp_id = 
$1;


2. Run it five times. Choose the custom plan.
explain (analyze,buffers) execute stmt('5f006ca25b52ed78e457b150ee95a30c');

Here's the plan:
  Aggregate  (cost=639.84..639.85 rows=1 width=8) (actual time=4.653..4.654 rows=1 loops=1)
    Buffers: shared hit=277
    ->  Seq Scan on s_user  (cost=0.00..612.76 rows=10830 width=0) (actual time=1.402..3.747 rows=10836 loops=1)
          Filter: ((status = 1) AND ((corp_id)::text = '5f006ca25b52ed78e457b150ee95a30c'::text))
          Rows Removed by Filter: 11548
          Buffers: shared hit=277
  Planning Time: 0.100 ms
  Execution Time: 4.674 ms
(8 rows)

3.From the sixth time. Choose generic plan.
We can see that there is a huge deviation between the estimate and the 
actual value:
  Aggregate  (cost=11.83..11.84 rows=1 width=8) (actual time=4.424..4.425 rows=1 loops=1)
    Buffers: shared hit=154 read=13
    ->  Bitmap Heap Scan on s_user  (cost=4.30..11.82 rows=2 width=0) (actual time=0.664..3.371 rows=10836 loops=1)
          Recheck Cond: ((corp_id)::text = $1)
          Filter: (status = 1)
          Rows Removed by Filter: 1517
          Heap Blocks: exact=154
          Buffers: shared hit=154 read=13
          ->  Bitmap Index Scan on s_user_corp_id_idx  (cost=0.00..4.30 rows=2 width=0) (actual time=0.635..0.635 rows=12353 loops=1)
                Index Cond: ((corp_id)::text = $1)
                Buffers: shared read=13
  Planning Time: 0.246 ms
  Execution Time: 4.490 ms
(13 rows)

This is because in the choose_custom_plan function, the generic plan is 
attempted after executing the custom plan five times.


    if (plansource->num_custom_plans < 5)
        return true;

The generic plan uses var_eq_non_const to estimate the average selectivity.

These are facts that many people already know. So a brief introduction.


Our users actually use such parameter conditions in very complex PREPARE
statements. Once the generic plan kicks in on the sixth execution, the
execution time jumps from 5 milliseconds to 5 minutes.
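
(As an aside, since v12 a session can sidestep the heuristic entirely
with the plan_cache_mode GUC, at the cost of replanning every execution:

SET plan_cache_mode = force_custom_plan;
EXECUTE stmt('5f006ca25b52ed78e457b150ee95a30c');

That avoids the cliff but gives up generic plans wholesale instead of
detecting skew, which is what this proposal is about.)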



To improve this problem. The following approaches can be considered:

1. Determine whether data skew exists in the PREPARE statement parameter 
conditions based on the statistics.

However, there is no way to know if the user will use the skewed parameter.

2.When comparing the cost of the generic plan with the average cost of 
the custom plan(function choose_custom_plan). Consider whether the 
maximum cost of a custom plan executed is an order of magnitude 
different from the cost of a generic plan.
If the first five use a small selectivity condition. And after the sixth 
use a high selectivity condition. Problems will still arise.


3.Trace the execution time of the PREPARE statement. When an execution 
time is found to be much longer than the average execution time, the 
custom plan is forced to run.



Is there any better idea?

I tried to do a demo. Add a member paramid to Const. When Const is 
generated by Param, the Const is identified as coming from Param. Then 
check in var_eq_const to see if the field in the condition using this 
parameter is skewed. If so, choose_custom_plan returns true every time, 
forcing custom_plan to be used.
Only conditional expressions such as var eq param or param eq var can be 
supported.
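
In outline, the change looks like this (a sketch; paramid is the proposed
new Const member, and the makeConst call is abbreviated from
eval_const_expressions_mutator):

    Const  *con = makeConst(ptype, -1, pcollid, (int) typLen,
                            pval, prm->isnull, typByVal);
    con->paramid = param->paramid;	/* remember which Param this came from */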

If this makes sense, I will continue to improve this patch.


--
Quan Zongliang

diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 94eb56a1e7..3384520dc1 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -2489,6 +2489,8 @@ eval_const_expressions_mutator(Node *node,
 														   pval,
 														   prm->isnull,
 														   typByVal);
+							if (paramLI->paramFetch == NULL)
+							

Re: Incorrect cost for MergeAppend

2024-01-30 Thread Alexander Kuzmenkov
On Tue, Jan 30, 2024 at 1:20 PM David Rowley  wrote:
> You should likely focus on trying to find a test that does not require
> making 2 tables with 10k rows.

Is 1k smallint OK? It should fit in one page. Still reproduces the
error, and the entire test case runs in under 10 ms.
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 8dbf790e89..2e1ec41a54 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -1470,7 +1470,7 @@ create_merge_append_path(PlannerInfo *root,
 	  root,
 	  pathkeys,
 	  subpath->total_cost,
-	  subpath->parent->tuples,
+	  subpath->rows,
 	  subpath->pathtarget->width,
 	  0.0,
 	  work_mem,
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index 56a40d50f9..780499 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1716,6 +1716,28 @@ order by t1.b limit 10;
 reset enable_nestloop;
 drop table matest0 cascade;
 NOTICE:  drop cascades to table matest1
+-- Check that merge append is chosen in presence of filters.
+create table ma0(a smallint primary key);
+create table ma1() inherits (ma0);
+insert into ma0 select generate_series(1, 1000);
+insert into ma1 select generate_series(1, 1000);
+analyze ma0;
+analyze ma1;
+explain (costs off) select * from ma0 where a < 100 order by a;
+QUERY PLAN 
+---
+ Merge Append
+   Sort Key: ma0.a
+   ->  Index Only Scan using ma0_pkey on ma0 ma0_1
+ Index Cond: (a < 100)
+   ->  Sort
+ Sort Key: ma0_2.a
+ ->  Seq Scan on ma1 ma0_2
+   Filter: (a < 100)
+(8 rows)
+
+drop table ma0 cascade;
+NOTICE:  drop cascades to table ma1
 --
 -- Test merge-append for UNION ALL append relations
 --
diff --git a/src/test/regress/sql/inherit.sql b/src/test/regress/sql/inherit.sql
index 971aac54fd..dae8cc7dbe 100644
--- a/src/test/regress/sql/inherit.sql
+++ b/src/test/regress/sql/inherit.sql
@@ -624,6 +624,19 @@ reset enable_nestloop;
 
 drop table matest0 cascade;
 
+-- Check that merge append is chosen in presence of filters.
+create table ma0(a smallint primary key);
+create table ma1() inherits (ma0);
+insert into ma0 select generate_series(1, 1000);
+insert into ma1 select generate_series(1, 1000);
+analyze ma0;
+analyze ma1;
+
+explain (costs off) select * from ma0 where a < 100 order by a;
+
+drop table ma0 cascade;
+
+
 --
 -- Test merge-append for UNION ALL append relations
 --


pg_stat_advisor extension

2024-01-30 Thread Илья Евдокимов
Hello PostgreSQL Hackers,

I'm thrilled to share with you a new PostgreSQL extension I've developed,
called 'pg_stat_advisor'. The genesis of this extension traces back to a
conversation in this thread:
https://www.postgresql.org/message-id/e2512fd5-77a4-825b-e456-c0586e37f293%40enterprisedb.com

The 'pg_stat_advisor' extension is designed to optimize query plans. It
suggests when to create extended statistics, particularly for queries where
the current selectivity estimates fall short. This is driven by the GUC
parameter 'pg_stat_advisor.suggest_statistics_threshold', which compares
the ratio of total tuples to planned rows, identifying scenarios where the
planner's estimates could be improved. The key strength of
'pg_stat_advisor' lies in its ability to recommend extended statistics
that improve query planning.

You can enable the extension with:

LOAD 'pg_stat_advisor';
SET pg_stat_advisor.suggest_statistics_threshold = 1.0;

Example:

CREATE TABLE t (i INT, j INT);
INSERT INTO t SELECT i/10, i/100 FROM generate_series(1, 1000000) i;
ANALYZE t;

EXPLAIN ANALYZE SELECT * FROM t WHERE i = 100 AND j = 10;
NOTICE:  pg_stat_advisor suggestion: CREATE STATISTICS t_i_j ON i, j FROM t
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..11675.10 rows=1 width=8) (actual time=0.400..59.292 rows=10 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on t  (cost=0.00..10675.00 rows=1 width=8) (actual time=35.614..54.291 rows=3 loops=3)
         Filter: ((i = 100) AND (j = 10))
         Rows Removed by Filter: 333330
 Planning Time: 0.081 ms
 Execution Time: 59.413 ms
(8 rows)

After the EXPLAIN ANALYZE command you can see the suggestion to create
statistics named 't_i_j' on the 'i' and 'j' columns of table 't'.

I look forward to your thoughts, feedback, and any suggestions you might have.

Best regards.
Ilia Evdokimov,
Tantor Labs LLC.

>From 9485605416030e79843feabf9c101a88703b9779 Mon Sep 17 00:00:00 2001
From: Ilia Evdokimov 
Date: Tue, 30 Jan 2024 13:49:07 +0300
Subject: [PATCH] 'pg_stat_advisor' extension.

This serves as a hook into the executor for determining total rows and planned rows.
The process starts by checking the `pg_stat_advisor.suggest_statistics_threshold` GUC parameter.
If it's set to 0.0, the extension does not proceed further. When the parameter is greater than 0.0,
the extension evaluates the accuracy of planned rows. A suggestion for creating statistics is made
if the ratio of total rows to planned rows is greater than or equal to this threshold.

Only then does it extract the relation and columns from the query.
The extension checks pg_statistic_ext for existing relevant statistics. If no statistics are found,
it prints a notice suggesting the creation of statistics, using the naming format 'relationName_columns'.

Author: Ilia Evdokimov
---
 contrib/Makefile  |   1 +
 contrib/meson.build   |   1 +
 contrib/pg_stat_advisor/.gitignore|   3 +
 contrib/pg_stat_advisor/Makefile  |  20 +
 contrib/pg_stat_advisor/README.md |  85 
 .../expected/pg_stat_advisor.out  |  96 
 contrib/pg_stat_advisor/meson.build   |  30 ++
 contrib/pg_stat_advisor/pg_stat_advisor.c | 477 ++
 .../pg_stat_advisor/sql/pg_stat_advisor.sql   |  50 ++
 9 files changed, 763 insertions(+)
 create mode 100644 contrib/pg_stat_advisor/.gitignore
 create mode 100644 contrib/pg_stat_advisor/Makefile
 create mode 100644 contrib/pg_stat_advisor/README.md
 create mode 100644 contrib/pg_stat_advisor/expected/pg_stat_advisor.out
 create mode 100644 contrib/pg_stat_advisor/meson.build
 create mode 100644 contrib/pg_stat_advisor/pg_stat_advisor.c
 create mode 100644 contrib/pg_stat_advisor/sql/pg_stat_advisor.sql

diff --git a/contrib/Makefile b/contrib/Makefile
index da4e2316a3..da9a4ceeaa 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -34,6 +34,7 @@ SUBDIRS = \
 		pg_buffercache	\
 		pg_freespacemap \
 		pg_prewarm	\
+		pg_stat_advisor \
 		pg_stat_statements \
 		pg_surgery	\
 		pg_trgm		\
diff --git a/contrib/meson.build b/contrib/meson.build
index c12dc906ca..a20d99443b 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -49,6 +49,7 @@ subdir('pgcrypto')
 subdir('pg_freespacemap')
 subdir('pg_prewarm')
 subdir('pgrowlocks')
+subdir('pg_stat_advisor')
 subdir('pg_stat_statements')
 subdir('pgstattuple')
 subdir('pg_surgery')
diff --git a/contrib/pg_stat_advisor/.gitignore b/contrib/pg_stat_advisor/.gitignore
new file mode 100644
index 00..913175ff6e
--- /dev/null
+++ b/contrib/pg_stat_advisor/.gitignore
@@ -0,0 +1,3 @@
+/log/
+/results/
+/tmp_check/
diff --git a/contrib/pg_stat_advisor/Makefile b/contrib/pg_stat_advisor/Makefile
new file mode 100644
index 
