Re: [PATCH] Add function to_oct

2023-08-14 Thread Vik Fearing

On 8/15/23 06:11, Nathan Bossart wrote:

On Tue, Jul 25, 2023 at 08:29:17PM -0700, Nathan Bossart wrote:

Here's a new version of the patch with the silly mistakes fixed.


If there are no objections, I'd like to commit this patch soon.


I just took a look at this (and the rest of the thread).  I am a little 
bit disappointed that we don't have a generic base conversion function, 
but even if we did I think these specialized functions would still be 
useful.


No objection from me.
--
Vik Fearing





Re: ECPG Semantic Analysis

2023-08-14 Thread Michael Meskes
Hi,

> I have a modified version of ECPG, to which I gave the ability to
> do semantic analysis of SQL statements. Where can I share it, or with
> whom can I discuss it?

Feel free to send it my way. 

Thanks,
Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De
Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org




Re: proposal: jsonb_populate_array

2023-08-14 Thread Pavel Stehule
On Tue, Aug 15, 2023 at 7:48 AM, Vik Fearing
wrote:

> On 8/14/23 15:37, Pavel Stehule wrote:
> > On Mon, Aug 14, 2023 at 3:09 PM, Erik Rijkers wrote:
> >
> >> On 8/14/23 14:51, Pavel Stehule wrote: > On Mon, Aug 14, 2023 at
> >> 11:32 AM, Alvaro Herrera
> >>   > with proposed function I can write
> >>   >
> >>   > select jsonb_populate_array(null::date[],
> >>   > '["2023-07-13","2023-07-14"]'::jsonb)
> >>   >
> >> Not yet committed, but outstanding
> >> SQL/JSON patches (v11) will let you do:
> >>
> >> select json_query(
> >>   '["2023-07-13", "2023-07-14"]'::jsonb
> >> , '$' returning date[]
> >> );
> >>  json_query
> >> -
> >>{2023-07-13,2023-07-14}
> >> (1 row)
> >>
> >> That's (more or less) what you want, no?
> >>
> >
> > Yes, the functionality is exactly the same, but for completeness a
> > function json_populate_array could still be nice.
> >
> > The old API covers transformations between JSON and row/record types
> > well, but for arrays only the array->json direction is covered.
>
> I don't think we should be extending the old API when there are Standard
> ways of doing the same thing.  In fact, I would like to see the old way
> slowly be deprecated.
>
> > I think this could be roughly 40 lines of C code
>
> It seems to me like a good candidate for an extension.
>

Unfortunately, small extensions like this have essentially no chance of
being available to users on managed cloud Postgres services.



> --
> Vik Fearing
>
>


Re: proposal: jsonb_populate_array

2023-08-14 Thread Vik Fearing

On 8/14/23 15:37, Pavel Stehule wrote:

On Mon, Aug 14, 2023 at 3:09 PM, Erik Rijkers wrote:


On 8/14/23 14:51, Pavel Stehule wrote: > On Mon, Aug 14, 2023 at 11:32 AM,
Alvaro Herrera
  > with proposed function I can write
  >
  > select jsonb_populate_array(null::date[],
  > '["2023-07-13","2023-07-14"]'::jsonb)
  >
Not yet committed, but outstanding
SQL/JSON patches (v11) will let you do:

select json_query(
  '["2023-07-13", "2023-07-14"]'::jsonb
, '$' returning date[]
);
 json_query
-
   {2023-07-13,2023-07-14}
(1 row)

That's (more or less) what you want, no?



Yes, the functionality is exactly the same, but for completeness a
function json_populate_array could still be nice.

The old API covers transformations between JSON and row/record types
well, but for arrays only the array->json direction is covered.


I don't think we should be extending the old API when there are Standard 
ways of doing the same thing.  In fact, I would like to see the old way 
slowly be deprecated.



I think this could be roughly 40 lines of C code


It seems to me like a good candidate for an extension.
--
Vik Fearing





Re: Using defines for protocol characters

2023-08-14 Thread Tatsuo Ishii
> I tried to address all the feedback in v5 of the patch, which is attached.
> I limited the patch to only the characters that have names in the "Message
> Formats" section of protocol.sgml instead of trying to invent names for
> unnamed characters.
> 
> I'm aware of one inconsistency.  While I grouped all the authentication
> request messages ('R') into PqMsg_AuthenticationRequest, I added separate
> macros for all the different meanings of 'p', i.e., GSSResponse,
> PasswordMessage, SASLInitialResponse, and SASLResponse.  I wanted to list
> everything in protocol.sgml (even the duplicates) to ease greppability, but
> the code is structured such that adding macros for all the different
> authentication messages would be kind of pointless.  Plus, there is a
> separate set of authentication request codes just below the added macros.
> So, this is where I landed...

Is it possible to put the new define stuff into protocol.h and have
pqcomm.h include protocol.h? That would make it easier for Pgpool-II and
other middleware/drivers written in C to use the defines, since they
would only need to include protocol.h.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp




Re: proposal: jsonb_populate_array

2023-08-14 Thread Vik Fearing

On 8/14/23 15:47, Chapman Flack wrote:

On 2023-08-14 09:11, Erik Rijkers wrote:

  , '$' returning date[]


I certainly like that syntax better.

It's not that the "here's a null to tell you the type I want"
is terribly unclear, but it seems not to be an idiom I have
seen a lot of in PostgreSQL before now. Are there other places
it's currently used that I've overlooked?


It has been used since forever in polymorphic aggregate final functions. 
 I don't mind it there, but I do not like it in general user-facing 
functions.


https://www.postgresql.org/docs/current/sql-createaggregate.html
--
Vik Fearing





Re: Extract numeric field in JSONB more effectively

2023-08-14 Thread Pavel Stehule
On Tue, Aug 15, 2023 at 7:23 AM, Pavel Stehule
wrote:

> Hi
>
> On Tue, Aug 15, 2023 at 5:24 AM, Andy Fan
> wrote:
>
>>
>>>  jsonb_extract_xx_type just cares about the argtype, but
>>> 'explain select xx'  will still access the const->constvalue.
>>> const->constvalue is 0 which is set by makeNullConst currently,
>>> and it is ok for the current supported type.
>>>
>>
>> The exception is the numeric data type, whose constvalue can't be 0,
>> so I hack it with the lines below.  Maybe not good enough, but I
>> have no better solution now.
>>
>> +   Const   *target = makeNullConst(fexpr->funcresulttype,
>> +                                   -1,
>> +                                   InvalidOid);
>> +
>> +   /*
>> +    * Since all the above functions are strict, we can't input
>> +    * a NULL value.
>> +    */
>> +   target->constisnull = false;
>> +
>> +   Assert(target->constbyval || target->consttype == NUMERICOID);
>> +
>> +   /* Mock a valid datum for !constbyval type. */
>> +   if (fexpr->funcresulttype == NUMERICOID)
>> +       target->constvalue = DirectFunctionCall1(numeric_in,
>> +                                                CStringGetDatum("0"));
>>
>>
> Personally I think this workaround is too dirty; it would be better to use
> a strict function (I believe the overhead for NULL values is acceptable)
> or to introduce a different mechanism.
>
> Your design is workable, and I think acceptable, but I don't think it is
> an ideal or final solution. It is not really generic: it doesn't help with
> XML or hstore. You need to touch cast functions, which I think is not
> best, because cast functions should not participate in optimizing the
> execution of another function.
>
> My ideal solution would be to allow the "any" pseudotype as a return type,
> with the possibility of setting a default return type. Now, "any" is
> allowed only for arguments. The planner could set the expected type when
> it knows it, or fall back to the default type.
>
> So for extraction of a jsonb field we could have FUNCTION
> jsonb_extract_field(jsonb, text) RETURNS "any" DEFAULT jsonb.
>
> If we call SELECT jsonb_extract_field(..., 'x'), then it returns jsonb;
> if we use SELECT jsonb_extract_field('...', 'x')::date, then it returns
> date.
>
> With this possibility we don't need to touch cast functions, and we can
> easily implement similar functions for other non-atomic types.
>

This syntax could be used instead of the NULL::type trick,

like

SELECT jsonb_populate_record('{...}')::pg_class;

instead of

SELECT jsonb_populate_record(NULL::pg_class, '{...}')



>
>
>
> --
>> Best Regards
>> Andy Fan
>>
>


Re: Extract numeric field in JSONB more effectively

2023-08-14 Thread Pavel Stehule
Hi

On Tue, Aug 15, 2023 at 5:24 AM, Andy Fan wrote:

>
>>  jsonb_extract_xx_type just cares about the argtype, but
>> 'explain select xx'  will still access the const->constvalue.
>> const->constvalue is 0 which is set by makeNullConst currently,
>> and it is ok for the current supported type.
>>
>
> The exception is the numeric data type, whose constvalue can't be 0,
> so I hack it with the lines below.  Maybe not good enough, but I
> have no better solution now.
>
> +   Const   *target = makeNullConst(fexpr->funcresulttype,
> +                                   -1,
> +                                   InvalidOid);
> +
> +   /*
> +    * Since all the above functions are strict, we can't input
> +    * a NULL value.
> +    */
> +   target->constisnull = false;
> +
> +   Assert(target->constbyval || target->consttype == NUMERICOID);
> +
> +   /* Mock a valid datum for !constbyval type. */
> +   if (fexpr->funcresulttype == NUMERICOID)
> +       target->constvalue = DirectFunctionCall1(numeric_in,
> +                                                CStringGetDatum("0"));
>
>
Personally I think this workaround is too dirty; it would be better to use a
strict function (I believe the overhead for NULL values is acceptable) or to
introduce a different mechanism.

Your design is workable, and I think acceptable, but I don't think it is an
ideal or final solution. It is not really generic: it doesn't help with XML
or hstore. You need to touch cast functions, which I think is not best,
because cast functions should not participate in optimizing the execution of
another function.

My ideal solution would be to allow the "any" pseudotype as a return type,
with the possibility of setting a default return type. Now, "any" is allowed
only for arguments. The planner could set the expected type when it knows
it, or fall back to the default type.

So for extraction of a jsonb field we could have FUNCTION
jsonb_extract_field(jsonb, text) RETURNS "any" DEFAULT jsonb.

If we call SELECT jsonb_extract_field(..., 'x'), then it returns jsonb;
if we use SELECT jsonb_extract_field('...', 'x')::date, then it returns date.

With this possibility we don't need to touch cast functions, and we can
easily implement similar functions for other non-atomic types.



-- 
> Best Regards
> Andy Fan
>


Re: Fix pg_stat_reset_single_table_counters function

2023-08-14 Thread Masahiro Ikeda

On 2023-08-15 11:48, Masahiko Sawada wrote:
On Mon, Aug 14, 2023 at 5:12 PM Masahiro Ikeda 
 wrote:

I changed the table used to check the stats from pg_database to
pg_shdescription, because its stats can be updated via the SQL-level
COMMENT command.


It seems to work well.

+COMMENT ON DATABASE :current_database IS 'This is a test comment';
-- insert or update in 'pg_shdescription'

I think current_database should be quoted (see other examples using
current_database(), e.g. collate.linux.utf8.sql). Also, it would be
better to reset the comment after the test.


Thanks! I fixed the issues in the v4 patch.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

From bca4c1844994be8ed80a29b8cb760e2eb865dca9 Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda 
Date: Mon, 14 Aug 2023 16:48:30 +0900
Subject: [PATCH] Fix pg_stat_reset_single_table_counters function.

This commit revives the ability to reset to zero the statistics for a
single relation shared across all databases in the cluster, which was
implemented by the following commit:
* Enhance pg_stat_reset_single_table_counters function (e04267844)

The following commit accidentally removed the feature:
* pgstat: store statistics in shared memory (5891c7a8e)

This needs to be backpatched to 15.

Reported-by: Mitsuru Hinata
---
 src/backend/utils/adt/pgstatfuncs.c |  9 +--
 src/test/regress/expected/stats.out | 42 +
 src/test/regress/sql/stats.sql  | 26 ++
 3 files changed, 75 insertions(+), 2 deletions(-)

diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 2a4c8ef87f..2b9742ad21 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -17,6 +17,7 @@
 #include "access/htup_details.h"
 #include "access/xlog.h"
 #include "access/xlogprefetcher.h"
+#include "catalog/catalog.h"
 #include "catalog/pg_authid.h"
 #include "catalog/pg_type.h"
 #include "common/ip.h"
@@ -1776,13 +1777,17 @@ pg_stat_reset_shared(PG_FUNCTION_ARGS)
 	PG_RETURN_VOID();
 }
 
-/* Reset a single counter in the current database */
+/*
+ * Reset statistics for a single object, which may belong to the
+ * current database or be shared across all databases in the cluster.
+ */
 Datum
 pg_stat_reset_single_table_counters(PG_FUNCTION_ARGS)
 {
 	Oid			taboid = PG_GETARG_OID(0);
+	Oid			dboid = (IsSharedRelation(taboid) ? InvalidOid : MyDatabaseId);
 
-	pgstat_reset(PGSTAT_KIND_RELATION, MyDatabaseId, taboid);
+	pgstat_reset(PGSTAT_KIND_RELATION, dboid, taboid);
 
 	PG_RETURN_VOID();
 }
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 319164a5e9..eb24a02147 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -764,6 +764,48 @@ FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
 2 | t  |3 | t
 (1 row)
 
+-
+-- Test to reset stats for a table shared across all databases (ex. pg_shdescription)
+-
+-- store the old comment to reset
+SELECT shobj_description(d.oid, 'pg_database') as description_before
+FROM pg_database d WHERE datname = current_database() \gset
+-- update the stats in pg_shdescription
+BEGIN;
+SELECT current_database() as current_database \gset
+COMMENT ON DATABASE :"current_database" IS 'This is a test comment';
+SELECT pg_stat_force_next_flush();
+ pg_stat_force_next_flush 
+--
+ 
+(1 row)
+
+COMMIT;
+-- check to reset the stats
+SELECT n_tup_ins + n_tup_upd > 0 FROM pg_stat_all_tables WHERE relid = 'pg_shdescription'::regclass;
+ ?column? 
+--
+ t
+(1 row)
+
+SELECT pg_stat_reset_single_table_counters('pg_shdescription'::regclass);
+ pg_stat_reset_single_table_counters 
+-
+ 
+(1 row)
+
+SELECT n_tup_ins + n_tup_upd FROM pg_stat_all_tables WHERE relid = 'pg_shdescription'::regclass;
+ ?column? 
+--
+0
+(1 row)
+
+-- cleanup the comment
+\if :{?description_before}
+  COMMENT ON DATABASE :"current_database" IS :'description_before';
+\else
+  COMMENT ON DATABASE :"current_database" IS NULL;
+\endif
 -
 -- Test that various stats views are being properly populated
 -
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index 9a16df1c49..735118c452 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -376,6 +376,32 @@ COMMIT;
 SELECT seq_scan, :'test_last_seq' = last_seq_scan AS seq_ok, idx_scan, :'test_last_idx' < last_idx_scan AS idx_ok
 FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
 
+-
+-- Test to reset stats for a table shared across all databases (ex. pg_shdescription)
+-
+
+-- store the old comment to reset
+SELECT shobj_description(d.oid, 'pg_database') as description_before
+FROM pg_database d WHERE datname = current_database() \gset
+
+-- update the stats in pg_shdescription
+BEGIN;
+SELECT current_database() as current_database \gset
+COMMENT ON DATABASE :"current_database" IS 

RE: [PoC] pg_upgrade: allow to upgrade publisher node

2023-08-14 Thread Zhijie Hou (Fujitsu)
On Tuesday, August 15, 2023 11:06 AM Amit Kapila  
wrote:
> 
> On Tue, Aug 15, 2023 at 7:51 AM Masahiko Sawada 
> wrote:
> >
> > On Mon, Aug 14, 2023 at 2:07 PM Amit Kapila 
> wrote:
> > >
> > > On Mon, Aug 14, 2023 at 7:57 AM Masahiko Sawada
>  wrote:
> > > > Another idea (which might have already been discussed, though) is that we
> check if the latest shutdown checkpoint LSN in the control file matches the
> confirmed_flush_lsn in pg_replication_slots view. That way, we can ensure that
> the slot has consumed all WAL records before the last shutdown. We don't
> need to worry about WAL records generated after starting the old cluster
> during the upgrade, at least for logical replication slots.
> > > >
> > >
> > > Right, this is somewhat closer to what Patch is already doing. But
> > > remember in this case we need to remember and use the latest
> > > checkpoint from the control file before the old cluster is started
> > > because otherwise the latest checkpoint location could be even
> > > updated during the upgrade. So, instead of reading from WAL, we need
> > > to change so that we rely on the control file's latest LSN.
> >
> > Yes, I was thinking the same idea.
> >
> > But it works for only replication slots for logical replication. Do we
> > want to check if no meaningful WAL records are generated after the
> > latest shutdown checkpoint, for manually created slots (or non-logical
> > replication slots)? If so, we would need to have something reading WAL
> > records in the end.
> >
> 
> > > I would prefer this
> > > idea than to invent a new API/tool like pg_replslotdata.
> >
> > +1

Changed the check to compare the latest checkpoint LSN from pg_controldata
with the confirmed_flush_lsn in the pg_replication_slots view.

> >
> > >
> > > The other point you and Bruce seem to be favoring is that instead of
> > > dumping/restoring slots via pg_dump, we remember the required
> > > information of slots retrieved during their validation in pg_upgrade
> > > itself and use that to create the slots in the new cluster. Though I
> > > am not aware of doing similar treatment for other objects we restore
> > > in this case it seems reasonable especially because slots are not
> > > stored in the catalog and we anyway already need to retrieve the
> > > required information to validate them, so trying to again retrieve
> > > it via pg_dump doesn't seem useful unless I am missing something.
> > > Does this match your understanding?
> >
> > If there are use cases for --logical-replication-slots-only option
> > other than pg_upgrade, it would be good to have it in pg_dump. I was
> > just not sure of other use cases.
> >
> 
> It was primarily for upgrade purposes only. So, as we can't see a good reason 
> to
> go via pg_dump let's do it in upgrade unless someone thinks otherwise.

Removed the new option from pg_dump and modified pg_upgrade to use the
slot info directly to restore the slots in the new cluster.

> 
> > >
> > > Yet another thing I am trying to consider is whether we can allow to
> > > upgrade slots from 16 or 15 to later versions. As of now, the patch
> > > has the following check:
> > > getLogicalReplicationSlots()
> > > {
> > > ...
> > > + /* Check whether we should dump or not */ if (fout->remoteVersion
> > > + < 17) return;
> > > ...
> > > }
> > >
> > > If we decide to use the existing view pg_replication_slots then can
> > > we consider upgrading slots from the prior version to 17? Now, if we
> > > want to invent any new API similar to pg_replslotdata then we can't
> > > do this because it won't exist in prior versions but OTOH using
> > > existing view pg_replication_slots can allow us to fetch slot info
> > > from older versions as well. So, I think it is worth considering.
> >
> > I think that without 0001 patch the replication slots will not be able
> > to pass the confirmed_flush_lsn check.
> >
> 
> Right, but we can think of backpatching the same. Anyway, we can do that as a
> separate work by starting a new thread to see if there is a broader agreement
> for backpatching such a change. For now, we can focus on >=v17.
> 

Here is the new version patch which addresses the above points.
It also removes the --exclude-logical-replication-slots option per a
recent comment.
Thanks to Kuroda-san for addressing most of the points.

Best Regards,
Hou zj


v20-0003-pg_upgrade-Add-check-function-for-logical-replic.patch
Description:  v20-0003-pg_upgrade-Add-check-function-for-logical-replic.patch


v20-0001-Always-persist-to-disk-logical-slots-during-a-sh.patch
Description:  v20-0001-Always-persist-to-disk-logical-slots-during-a-sh.patch


v20-0002-pg_upgrade-Allow-to-replicate-logical-replicatio.patch
Description:  v20-0002-pg_upgrade-Allow-to-replicate-logical-replicatio.patch


Re: [PATCH] Add function to_oct

2023-08-14 Thread Nathan Bossart
On Tue, Jul 25, 2023 at 08:29:17PM -0700, Nathan Bossart wrote:
> Here's a new version of the patch with the silly mistakes fixed.

If there are no objections, I'd like to commit this patch soon.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Using defines for protocol characters

2023-08-14 Thread Nathan Bossart
I tried to address all the feedback in v5 of the patch, which is attached.
I limited the patch to only the characters that have names in the "Message
Formats" section of protocol.sgml instead of trying to invent names for
unnamed characters.

I'm aware of one inconsistency.  While I grouped all the authentication
request messages ('R') into PqMsg_AuthenticationRequest, I added separate
macros for all the different meanings of 'p', i.e., GSSResponse,
PasswordMessage, SASLInitialResponse, and SASLResponse.  I wanted to list
everything in protocol.sgml (even the duplicates) to ease greppability, but
the code is structured such that adding macros for all the different
authentication messages would be kind of pointless.  Plus, there is a
separate set of authentication request codes just below the added macros.
So, this is where I landed...

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
From 78c2a110a8c940690d1ff363e545acb48adf966e Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Mon, 14 Aug 2023 14:09:30 -0700
Subject: [PATCH v5 1/1] Introduce macros for protocol characters.

Author: Dave Cramer
Reviewed-by: Alvaro Herrera, Tatsuo Ishii, Peter Smith, Robert Haas, Tom Lane, Peter Eisentraut
Discussion: https://postgr.es/m/CADK3HHKbBmK-PKf1bPNFoMC%2BoBt%2BpD9PH8h5nvmBQskEHm-Ehw%40mail.gmail.com
---
 src/backend/access/common/printsimple.c |  5 +-
 src/backend/access/transam/parallel.c   | 14 ++---
 src/backend/backup/basebackup_copy.c| 16 +++---
 src/backend/commands/async.c|  2 +-
 src/backend/commands/copyfromparse.c| 22 
 src/backend/commands/copyto.c   |  6 +--
 src/backend/libpq/auth-sasl.c   |  2 +-
 src/backend/libpq/auth.c|  8 +--
 src/backend/postmaster/postmaster.c |  2 +-
 src/backend/replication/walsender.c | 18 +++
 src/backend/tcop/dest.c |  8 +--
 src/backend/tcop/fastpath.c |  2 +-
 src/backend/tcop/postgres.c | 68 
 src/backend/utils/error/elog.c  |  5 +-
 src/backend/utils/misc/guc.c|  2 +-
 src/include/libpq/pqcomm.h  | 51 ++
 src/interfaces/libpq/fe-auth.c  |  2 +-
 src/interfaces/libpq/fe-connect.c   | 19 ---
 src/interfaces/libpq/fe-exec.c  | 50 +-
 src/interfaces/libpq/fe-protocol3.c | 70 +
 src/interfaces/libpq/fe-trace.c | 70 ++---
 21 files changed, 258 insertions(+), 184 deletions(-)

diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index ef818228ac..62de95e1ba 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -20,6 +20,7 @@
 
 #include "access/printsimple.h"
 #include "catalog/pg_type.h"
+#include "libpq/pqcomm.h"
 #include "libpq/pqformat.h"
 #include "utils/builtins.h"
 
@@ -32,7 +33,7 @@ printsimple_startup(DestReceiver *self, int operation, TupleDesc tupdesc)
 	StringInfoData buf;
 	int			i;
 
-	pq_beginmessage(&buf, 'T'); /* RowDescription */
+	pq_beginmessage(&buf, PqMsg_RowDescription);
 	pq_sendint16(&buf, tupdesc->natts);
 
 	for (i = 0; i < tupdesc->natts; ++i)
@@ -65,7 +66,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 	slot_getallattrs(slot);
 
 	/* Prepare and send message */
-	pq_beginmessage(&buf, 'D');
+	pq_beginmessage(&buf, PqMsg_DataRow);
 	pq_sendint16(&buf, tupdesc->natts);
 
 	for (i = 0; i < tupdesc->natts; ++i)
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 1738aecf1f..194a1207be 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -1127,7 +1127,7 @@ HandleParallelMessage(ParallelContext *pcxt, int i, StringInfo msg)
 
 	switch (msgtype)
 	{
-		case 'K':/* BackendKeyData */
+		case PqMsg_BackendKeyData:
 			{
 int32		pid = pq_getmsgint(msg, 4);
 
@@ -1137,8 +1137,8 @@ HandleParallelMessage(ParallelContext *pcxt, int i, StringInfo msg)
 break;
 			}
 
-		case 'E':/* ErrorResponse */
-		case 'N':/* NoticeResponse */
+		case PqMsg_ErrorResponse:
+		case PqMsg_NoticeResponse:
 			{
 ErrorData	edata;
 ErrorContextCallback *save_error_context_stack;
@@ -1183,7 +1183,7 @@ HandleParallelMessage(ParallelContext *pcxt, int i, StringInfo msg)
 break;
 			}
 
-		case 'A':/* NotifyResponse */
+		case PqMsg_NotificationResponse:
 			{
 /* Propagate NotifyResponse. */
 int32		pid;
@@ -1217,7 +1217,7 @@ HandleParallelMessage(ParallelContext *pcxt, int i, StringInfo msg)
 break;
 			}
 
-		case 'X':/* Terminate, indicating clean exit */
+		case PqMsg_Terminate:
 			{
 shm_mq_detach(pcxt->worker[i].error_mqh);
 pcxt->worker[i].error_mqh = NULL;
@@ -1372,7 +1372,7 @@ ParallelWorkerMain(Datum main_arg)
 	 * protocol message is defined, but it won't actually be used for anything
 	 * in this case.
 	 */
-	pq_beginmessage(, 'K');
+	pq_beginmessage(, 

Re: Extract numeric field in JSONB more effectively

2023-08-14 Thread Andy Fan
>
>
>  jsonb_extract_xx_type just cares about the argtype, but
> 'explain select xx'  will still access the const->constvalue.
> const->constvalue is 0 which is set by makeNullConst currently,
> and it is ok for the current supported type.
>

The exception is the numeric data type, whose constvalue can't be 0,
so I hack it with the lines below.  Maybe not good enough, but I
have no better solution now.

+   Const   *target = makeNullConst(fexpr->funcresulttype,
+                                   -1,
+                                   InvalidOid);
+
+   /*
+    * Since all the above functions are strict, we can't input
+    * a NULL value.
+    */
+   target->constisnull = false;
+
+   Assert(target->constbyval || target->consttype == NUMERICOID);
+
+   /* Mock a valid datum for !constbyval type. */
+   if (fexpr->funcresulttype == NUMERICOID)
+       target->constvalue = DirectFunctionCall1(numeric_in,
+                                                CStringGetDatum("0"));

-- 
Best Regards
Andy Fan


v7-0001-optimize-casting-jsonb-to-a-given-type.patch
Description: Binary data


Re: proposal: jsonb_populate_array

2023-08-14 Thread jian he
\df jsonb_populate_record
                                  List of functions
   Schema   |         Name          | Result data type | Argument data types | Type
------------+-----------------------+------------------+---------------------+------
 pg_catalog | jsonb_populate_record | anyelement       | anyelement, jsonb   | func
(1 row)

manual:
> anyelement  Indicates that a function accepts any data type.
> For the “simple” family of polymorphic types, the matching and deduction 
> rules work like this:
> Each position (either argument or return value) declared as anyelement is 
> allowed to have any specific actual data type, but in any given call they 
> must all be the same actual type.

So the jsonb_populate_record signature can already handle cases like
jsonb_populate_record(anyarray, jsonb)? Obviously this is a cast, and it
may fail. Also, if the input is anyarray, the output anyarray will have
the same base type as the input anyarray.




Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-08-14 Thread Amit Kapila
On Tue, Aug 15, 2023 at 7:51 AM Masahiko Sawada  wrote:
>
> On Mon, Aug 14, 2023 at 2:07 PM Amit Kapila  wrote:
> >
> > On Mon, Aug 14, 2023 at 7:57 AM Masahiko Sawada  
> > wrote:
> > > Another idea (which might have already been discussed, though) is that we check 
> > > if the latest shutdown checkpoint LSN in the control file matches the 
> > > confirmed_flush_lsn in pg_replication_slots view. That way, we can ensure 
> > > that the slot has consumed all WAL records before the last shutdown. We 
> > > don't need to worry about WAL records generated after starting the old 
> > > cluster during the upgrade, at least for logical replication slots.
> > >
> >
> > Right, this is somewhat closer to what Patch is already doing. But
> > remember in this case we need to remember and use the latest
> > checkpoint from the control file before the old cluster is started
> > because otherwise the latest checkpoint location could be even updated
> > during the upgrade. So, instead of reading from WAL, we need to change
> > so that we rely on the control file's latest LSN.
>
> Yes, I was thinking the same idea.
>
> But it works for only replication slots for logical replication. Do we
> want to check if no meaningful WAL records are generated after the
> latest shutdown checkpoint, for manually created slots (or non-logical
> replication slots)? If so, we would need to have something reading WAL
> records in the end.
>

This feature only targets logical replication slots. I don't see a
reason to treat manually created logical replication slots differently.
Is there something particular that you think we could be missing?

> > I would prefer this
> > idea than to invent a new API/tool like pg_replslotdata.
>
> +1
>
> >
> > The other point you and Bruce seem to be favoring is that instead of
> > dumping/restoring slots via pg_dump, we remember the required
> > information of slots retrieved during their validation in pg_upgrade
> > itself and use that to create the slots in the new cluster. Though I
> > am not aware of doing similar treatment for other objects we restore
> > in this case it seems reasonable especially because slots are not
> > stored in the catalog and we anyway already need to retrieve the
> > required information to validate them, so trying to again retrieve it
> > via pg_dump doesn't seem useful unless I am missing something. Does
> > this match your understanding?
>
> If there are use cases for --logical-replication-slots-only option
> other than pg_upgrade, it would be good to have it in pg_dump. I was
> just not sure of other use cases.
>

It was primarily for upgrade purposes only. So, as we can't see a good
reason to go via pg_dump let's do it in upgrade unless someone thinks
otherwise.

> >
> > Yet another thing I am trying to consider is whether we can allow to
> > upgrade slots from 16 or 15 to later versions. As of now, the patch
> > has the following check:
> > getLogicalReplicationSlots()
> > {
> > ...
> > + /* Check whether we should dump or not */
> > + if (fout->remoteVersion < 17)
> > + return;
> > ...
> > }
> >
> > If we decide to use the existing view pg_replication_slots then can we
> > consider upgrading slots from the prior version to 17? Now, if we want
> > to invent any new API similar to pg_replslotdata then we can't do this
> > because it won't exist in prior versions but OTOH using existing view
> > pg_replication_slots can allow us to fetch slot info from older
> > versions as well. So, I think it is worth considering.
>
> I think that without 0001 patch the replication slots will not be able
> to pass the confirmed_flush_lsn check.
>

Right, but we can think of backpatching the same. Anyway, we can do
that as a separate work by starting a new thread to see if there is a
broader agreement for backpatching such a change. For now, we can
focus on >=v17.

-- 
With Regards,
Amit Kapila.




Re: Fix pg_stat_reset_single_table_counters function

2023-08-14 Thread Masahiko Sawada
Hi,

On Mon, Aug 14, 2023 at 5:12 PM Masahiro Ikeda  wrote:
>
> Hi,
>
> On 2023-08-13 04:12, Andres Freund wrote:
> > On 2023-08-10 17:48:10 +0900, Masahiko Sawada wrote:
> >> Good catch! I've confirmed that the issue has been fixed by your
> >> patch.
> >
> > Indeed.
>
> Thanks for your responses!
>
> >> However, I'm not sure the added regression tests are stable since
> >> autovacuum workers may scan the pg_database and increment the
> >> statistics after resetting the stats.
> >
> > What about updating the table and checking the update count is reset?
> > That'd
> > not be reset by autovacuum.
>
> Yes. I confirmed that the stats are incremented by autovacuum as you
> said.
>
> I updated the patch to v3.
> * remove the code to bump the CATALOG_VERSION_NO because I misunderstood
> * change the test logic to check the update count instead of scan count

Thank you for updating the patch!

>
> I changed the table to check the stats from pg_database to
> pg_shdescription, because the stats can be updated via the COMMENT SQL
> command.

It seems to work well.

+COMMENT ON DATABASE :current_database IS 'This is a test comment';
-- insert or update in 'pg_shdescription'

I think the current_database should be quoted (see other examples that
use current_database(), e.g. collate.linux.utf8.sql). Also it would be
better to reset the comment after the test.
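For illustration, the quoting and the cleanup could look something like
this in the test file (a sketch using psql's \gset; this is not the
patch's actual wording):

```sql
SELECT current_database() AS datname \gset
COMMENT ON DATABASE :"datname" IS 'This is a test comment';
-- ... check n_tup_upd for 'pg_shdescription' in pg_stat_all_tables,
-- call pg_stat_reset_single_table_counters('pg_shdescription'::regclass),
-- and check again ...
COMMENT ON DATABASE :"datname" IS NULL;  -- reset the comment afterwards
```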

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: logicalrep_worker_launch -- counting/checking the worker limits

2023-08-14 Thread Peter Smith
A rebase was needed due to a recent push [1].

PSA v3.

--
[1] 
https://github.com/postgres/postgres/commit/2a8b40e3681921943a2989fd4ec6cdbf8766566c

Kind Regards,
Peter Smith.
Fujitsu Australia


v3-0001-logicalrep_worker_launch-limit-checks.patch
Description: Binary data


Re: Do we want a hashset type?

2023-08-14 Thread jian he
https://github.com/tvondra/hashset

On Mon, Aug 14, 2023 at 11:23 PM Florents Tselai
 wrote:
>
> Has anyone put this in a git repo / extension package or similar ?
>
> I’d like to try it out outside the core pg tree.
>
> > On 1 Jul 2023, at 12:04 PM, Joel Jacobson  wrote:
> >
> > On Fri, Jun 30, 2023, at 06:50, jian he wrote:
> >> more like a C questions
> >> in this context does
> >> #define HASHSET_GET_VALUES(set) ((int32 *) ((set)->data +
> >> CEIL_DIV((set)->capacity, 8)))
> >> define first, then define struct int4hashset_t. Is this normally ok?
> >
> > Yes, it's fine. Macros are just text substitutions done pre-compilation.
> >
> >> Also does
> >> #define HASHSET_GET_VALUES(set) ((int32 *) ((set)->data +
> >> CEIL_DIV((set)->capacity, 8)))
> >>
> >> remove (int32 *) will make it generic? then when you use it, you can
> >> cast whatever type you like?
> >
> > Maybe, but it might be less error-prone and more descriptive with
> > different macros for each type, e.g. INT4HASHSET_GET_VALUES,
> > similar to the existing PG_GETARG_INT4HASHSET
> >
> > Curious to hear what everybody thinks about the interface, documentation,
> > semantics and implementation?
> >
> > Is there anything missing or something that you think should be 
> > changed/improved?
> >
> > /Joel
> >
> >
>


-- 
 I recommend David Deutsch's <>

  Jian




Re: [PoC] Improve dead tuple storage for lazy vacuum

2023-08-14 Thread Masahiko Sawada
Hi,

On Mon, Aug 14, 2023 at 8:05 PM John Naylor
 wrote:
>
> On Thu, Jul 13, 2023 at 3:09 PM Masahiko Sawada  wrote:
> >
> > 0007, 0008, 0010, and 0011 are straightforward and agree to merge them.

Thank you for updating the patch!

>
> [Part 1 - clear the deck of earlier performance work etc]
>
> Thanks for taking a look! I've merged 0007 and 0008. The others need a 
> performance test to justify them -- an eyeball check is not enough. I've now 
> made the time to do that.
>
>  sparse loads
>
> v38 0001-0006 (still using node3 for this test only):
>
> select avg(load_ms) from generate_series(1,100) x(x), lateral (select * from 
> bench_load_random_int(100 * 1000 * (1+x-x))) a;
>  avg
> -
>  27.1000
>
> select avg(load_ms) from generate_series(1,30) x(x), lateral (select * from 
> bench_load_random_int(500 * 1000 * (1+x-x))) a;
>  avg
> --
>  165.6333
>
> v38-0007-Optimize-RT_EXTEND_DOWN.patch
>
> select avg(load_ms) from generate_series(1,100) x(x), lateral (select * from 
> bench_load_random_int(100 * 1000 * (1+x-x))) a;
>  avg
> -
>  25.0900
>
> select avg(load_ms) from generate_series(1,30) x(x), lateral (select * from 
> bench_load_random_int(500 * 1000 * (1+x-x))) a;
>  avg
> --
>  157.3667
>
> That seems worth doing.
>
> v38-0008-Use-4-children-for-node-4-also-attempt-portable-.patch
>
> This combines two things because I messed up a rebase: Use fanout of 4, and 
> try some macros for shmem sizes, both 32- and 64-bit. Looking at this much, I 
> no longer have a goal to have a separate set of size-classes for non-SIMD 
> platforms, because that would cause global maintenance problems -- it's 
> probably better to reduce worst-case search time where necessary. That would 
> be much more localized.
>
> > I have some questions on 0009 patch:
>
> > According to the comment, this optimization is for only gcc?
>
> No, not at all. That tells me the comment is misleading.
>
> > I think this change reduces
> > readability and maintainability.
>
> Well, that much is obvious. What is not obvious is how much it gains us over 
> the alternatives. I do have a simpler idea, though...
>
>  load mostly node4
>
> select * from bench_search_random_nodes(250*1000, '0xFF');
> n4 = 42626, n16 = 21492, n32 = 0, n64 = 0, n256 = 257
>  mem_allocated | load_ms | search_ms
> ---+-+---
>7352384 |  25 | 0
>
> v38-0009-TEMP-take-out-search-time-from-bench.patch
>
> This is just to allow LATERAL queries for better measurements.
>
> select avg(load_ms) from generate_series(1,100) x(x), lateral (select * from 
> bench_search_random_nodes(250*1000 * (1+x-x), '0xFF')) a;
>
>  avg
> -
>  24.8333

0007, 0008, and 0009 look good to me.

>
> v38-0010-Try-a-simpler-way-to-avoid-memmove.patch
>
> This slightly rewrites the standard loop so that gcc doesn't turn it into a 
> memmove(). Unlike the patch you didn't like, this *is* gcc-specific. (needs a 
> comment, which I forgot)
>
>  avg
> -
>  21.9600
>
> So, that's not a trivial difference. I wasn't a big fan of Andres' __asm("") 
> workaround, but that may be just my ignorance about it. We need something 
> like either of the two.
>
> v38-0011-Optimize-add_child_4-take-2.patch
>  avg
> -
>  21.3500
>
> This is possibly faster than v38-0010, but looking like not worth the 
> complexity, assuming the other way avoids the bug going forward.

I prefer 0010 but is it worth testing with other compilers such as clang?

>
> > According to the bugzilla ticket
> > referred to in the comment, it's realized as a bug in the community,
> > so once the gcc bug fixes, we might no longer need this trick, no?
>
> No comment in two years...
>
> v38-0013-Use-constant-for-initial-copy-of-chunks-and-chil.patch
>
> This is the same as v37-0011. I wasn't quite satisfied with it since it still 
> has two memcpy() calls, but it actually seems to regress:
>
>  avg
> -
>  22.0900
>
> v38-0012-Use-branch-free-coding-to-skip-new-element-index.patch
>
> This patch uses a single loop for the copy.
>
>  avg
> -
>  21.0300
>
> Within noise level of v38-0011, but it's small and simple, so I like it, at 
> least for small arrays.

Agreed.

>
> v38-0014-node48-Remove-need-for-RIGHTMOST_ONE-in-radix-tr.patch
> v38-0015-node48-Remove-dead-code-by-using-loop-local-var.patch
>
> Just small cleanups.
>
> v38-0016-Use-memcpy-for-children-when-growing-into-node48.patch
>
> Makes sense, but untested.

Agreed.

BTW cfbot reported that some regression tests failed due to OOM. I've
attached the patch to fix it.

>
> ===
> [Part 2]
>
> Per off-list discussion with Masahiko, it makes sense to take some of the 

Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-08-14 Thread Masahiko Sawada
On Mon, Aug 14, 2023 at 2:07 PM Amit Kapila  wrote:
>
> On Mon, Aug 14, 2023 at 7:57 AM Masahiko Sawada  wrote:
> >
> > On Sat, Aug 12, 2023, 15:20 Amit Kapila  wrote:
> >>
> >> I don't think we need the complexity of version-specific checks if we
> >> do what we do in get_control_data(). Basically, invoke
> >> version-specific pg_replslotdata to get version-specific slot
> >> information. There has been a proposal for a tool like that [1]. Do
> >> you have something better in mind? If so, can you please explain the
> >> same a bit more?
> >
> >
> > Yeah, we need something like pg_replslotdata. If there are other useful 
> > use cases for this tool, it would be good to have it. But I'm not sure 
> > there are use cases other than pg_upgrade.
> >
> > Another idea is (which might have already discussed thoguh) that we check 
> > if the latest shutdown checkpoint LSN in the control file matches the 
> > confirmed_flush_lsn in pg_replication_slots view. That way, we can ensure 
> > that the slot has consumed all WAL records before the last shutdown. We 
> > don't need to worry about WAL records generated after starting the old 
> > cluster during the upgrade, at least for logical replication slots.
> >
>
> Right, this is somewhat closer to what Patch is already doing. But
> remember in this case we need to remember and use the latest
> checkpoint from the control file before the old cluster is started
> because otherwise the latest checkpoint location could be even updated
> during the upgrade. So, instead of reading from WAL, we need to change
> so that we rely on the control file's latest LSN.

Yes, I was thinking the same idea.

But it works for only replication slots for logical replication. Do we
want to check if no meaningful WAL records are generated after the
latest shutdown checkpoint, for manually created slots (or non-logical
replication slots)? If so, we would need to have something reading WAL
records in the end.
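For concreteness, the check being discussed could take roughly this
shape (a hypothetical sketch; the LSN literal stands in for the "Latest
checkpoint location" that pg_upgrade would read from the old cluster's
control file *before* starting it):

```sql
-- Any logical slot returned here has not consumed all WAL up to the
-- final shutdown checkpoint and would fail the upgrade check.
SELECT slot_name
FROM pg_catalog.pg_replication_slots
WHERE slot_type = 'logical'
  AND confirmed_flush_lsn <> '0/1549F20'::pg_lsn;
```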

> I would prefer this
> idea than to invent a new API/tool like pg_replslotdata.

+1

>
> The other point you and Bruce seem to be favoring is that instead of
> dumping/restoring slots via pg_dump, we remember the required
> information of slots retrieved during their validation in pg_upgrade
> itself and use that to create the slots in the new cluster. Though I
> am not aware of doing similar treatment for other objects we restore
> in this case it seems reasonable especially because slots are not
> stored in the catalog and we anyway already need to retrieve the
> required information to validate them, so trying to again retrieve it
> via pg_dump doesn't seem useful unless I am missing something. Does
> this match your understanding?

If there are use cases for --logical-replication-slots-only option
other than pg_upgrade, it would be good to have it in pg_dump. I was
just not sure of other use cases.

>
> Yet another thing I am trying to consider is whether we can allow to
> upgrade slots from 16 or 15 to later versions. As of now, the patch
> has the following check:
> getLogicalReplicationSlots()
> {
> ...
> + /* Check whether we should dump or not */
> + if (fout->remoteVersion < 17)
> + return;
> ...
> }
>
> If we decide to use the existing view pg_replication_slots then can we
> consider upgrading slots from the prior version to 17? Now, if we want
> to invent any new API similar to pg_replslotdata then we can't do this
> because it won't exist in prior versions but OTOH using existing view
> pg_replication_slots can allow us to fetch slot info from older
> versions as well. So, I think it is worth considering.

I think that without 0001 patch the replication slots will not be able
to pass the confirmed_flush_lsn check.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: [RFC] Clang plugin for catching suspicious typedef casting

2023-08-14 Thread Xing Guo
Hi,

On 8/10/23, Dmitry Dolgov <9erthali...@gmail.com> wrote:
>> On Wed, Aug 09, 2023 at 03:23:32PM +0200, Peter Eisentraut wrote:
>> On 03.08.23 18:56, Dmitry Dolgov wrote:
>> > I would like to get your opinion, folks. Does it sound interesting
>> > enough for the community, is it worth it to pursue the idea?
>>
>> I think it's interesting, and doesn't look too invasive.

+1.

>> Maybe we can come up with three or four interesting use cases and try to
>> implement them.  BlockNumber vs. Buffer type checking is obviously a bit
>> marginal to get anyone super-excited, but it's a reasonable demo.

Several months ago, I implemented a clang plugin[1] for checking
suspicious return/goto/break/continue in PG_TRY/PG_CATCH blocks and it
found unsafe codes in Postgres[2]. It's implemented using clang's AST
matcher API.

I would like to contribute it well if this RFC can be accepted.

[1] 
https://github.com/higuoxing/clang-plugins/blob/main/lib/ReturnInPgTryBlockChecker.cpp
[2] 
https://www.postgresql.org/message-id/flat/CACpMh%2BCMsGMRKFzFMm3bYTzQmMU5nfEEoEDU2apJcc4hid36AQ%40mail.gmail.com

-- 
Best Regards,
Xing




Re: Regression test collate.icu.utf8 failed on REL_14_STABLE

2023-08-14 Thread Japin Li


On Tue, 15 Aug 2023 at 08:54, Japin Li  wrote:
> On Tue, 15 Aug 2023 at 08:49, Andy Fan  wrote:
>>>
>>>
>>>
>>>  DROP SCHEMA test_schema;
>>> +ERROR:  cannot drop schema test_schema because other objects depend on it
>>> +DETAIL:  collation test_schema.test11 depends on schema test_schema
>>> +HINT:  Use DROP ... CASCADE to drop the dependent objects too.
>>>  DROP ROLE regress_test_role;
>>> +ERROR:  role "regress_test_role" cannot be dropped because some objects
>>> depend on it
>>> +DETAIL:  owner of collation test_schema.test11
>>> +ERROR:  role "regress_test_role" cannot be dropped because some objects
>>> depend on it
>>> +DETAIL:  owner of collation test_schema.test11
>>>
>>> +ERROR:  role "regress_test_role" already exists
>>>
>>>
>>>
>>>
>> Did you run 'make installcheck' rather than 'make check' and there
>> was a failure before this round of test? This looks to me that there
>> are some objects are not cleaned well before this run.  you can try
>> 'make installcheck' with a pretty clean setup or run 'make check'
>> directly to verify this.
>

Thanks Andy, I think I found the root cause. In my environment, LANG has
a different setting from the other locale variables.

$ locale
LANG=C.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8

Then I set LANG to en_US.UTF-8 and all tests passed.  What I'm curious
about is why PG 15 passes.

-- 
Regards,
Japin Li




Re: Regression test collate.icu.utf8 failed on REL_14_STABLE

2023-08-14 Thread Japin Li


On Tue, 15 Aug 2023 at 08:49, Andy Fan  wrote:
>>
>>
>>
>>  DROP SCHEMA test_schema;
>> +ERROR:  cannot drop schema test_schema because other objects depend on it
>> +DETAIL:  collation test_schema.test11 depends on schema test_schema
>> +HINT:  Use DROP ... CASCADE to drop the dependent objects too.
>>  DROP ROLE regress_test_role;
>> +ERROR:  role "regress_test_role" cannot be dropped because some objects
>> depend on it
>> +DETAIL:  owner of collation test_schema.test11
>> +ERROR:  role "regress_test_role" cannot be dropped because some objects
>> depend on it
>> +DETAIL:  owner of collation test_schema.test11
>>
>> +ERROR:  role "regress_test_role" already exists
>>
>>
>>
>>
> Did you run 'make installcheck' rather than 'make check' and there
> was a failure before this round of test? This looks to me that there
> are some objects are not cleaned well before this run.  you can try
> 'make installcheck' with a pretty clean setup or run 'make check'
> directly to verify this.

I used `make check` and cleanup the entire build directory.  Here is my
compile & build script.

$ cat compile.sh
#!/bin/bash

set -e
rm -rf $(ls -I '*.sh')

../configure \
--prefix=$PWD/pg \
--enable-tap-tests \
--enable-debug \
--enable-cassert \
--enable-depend \
--enable-dtrace \
--with-icu \
--with-llvm \
--with-openssl \
--with-python \
--with-libxml \
--with-libxslt \
--with-lz4 \
--with-pam \
CFLAGS='-O0 -Wmissing-prototypes -Wincompatible-pointer-types' \
>configure.log 2>&1
make -j $(nproc) -s && make install -s
(cd contrib/ && make -j $(nproc) -s && make install -s)

-- 
Regards,
Japin Li




Re: Regression test collate.icu.utf8 failed on REL_14_STABLE

2023-08-14 Thread Andy Fan
>
>
>
>  DROP SCHEMA test_schema;
> +ERROR:  cannot drop schema test_schema because other objects depend on it
> +DETAIL:  collation test_schema.test11 depends on schema test_schema
> +HINT:  Use DROP ... CASCADE to drop the dependent objects too.
>  DROP ROLE regress_test_role;
> +ERROR:  role "regress_test_role" cannot be dropped because some objects
> depend on it
> +DETAIL:  owner of collation test_schema.test11
> +ERROR:  role "regress_test_role" cannot be dropped because some objects
> depend on it
> +DETAIL:  owner of collation test_schema.test11
>
> +ERROR:  role "regress_test_role" already exists
>
>
>
>
Did you run 'make installcheck' rather than 'make check' and there
was a failure before this round of test? This looks to me that there
are some objects are not cleaned well before this run.  you can try
'make installcheck' with a pretty clean setup or run 'make check'
directly to verify this.

-- 
Best Regards
Andy Fan


Re: Support to define custom wait events for extensions

2023-08-14 Thread Michael Paquier
On Tue, Aug 15, 2023 at 09:14:15AM +0900, Masahiro Ikeda wrote:
> OK. I'll make a new patch and start a new thread.

Cool, thanks!
--
Michael


signature.asc
Description: PGP signature


Re: Support to define custom wait events for extensions

2023-08-14 Thread Masahiro Ikeda

On 2023-08-14 18:26, Michael Paquier wrote:
> On Mon, Aug 14, 2023 at 05:55:42PM +0900, Masahiro Ikeda wrote:
>> I'm thinking a name like "contrib name + description summary" would
>> be nice. The "contrib name" is namespace-like and the "description
>> summary" is the same as the name of the wait event name in core.
>> For example, "DblinkConnect" for dblink. In the same way as the core
>> ones, I thought the name should be in camel case.
>
> Or you could use something more in line with the other in-core wait
> events formatted as camel-case, like DblinkConnect, etc.
>
>> BTW, is it better to discuss this in a new thread because other
>> developers might be interested in user-facing wait event names? I also
>> would like to add documentation on the wait events for each module, as
>> they are not mentioned now.
>
> Saying that, having some documentation on the page of each extension
> is mandatory once these can be customized, in my opinion.  All that
> should be discussed on a new, separate thread, to attract the correct
> audience.

OK. I'll make a new patch and start a new thread.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION




Re: Ignore 2PC transaction GIDs in query jumbling

2023-08-14 Thread Michael Paquier
On Mon, Aug 14, 2023 at 12:11:13PM +0100, Dagfinn Ilmari Mannsåker wrote:
> As far as I could tell the only thing missing was removing
> DeallocateStmt from the list of unhandled utility statement types (and
> updating comments to match).  Updated patch attached.

Hmm.  One issue with the patch is that we finish by considering
DEALLOCATE ALL and DEALLOCATE $1 as the same things, compiling the
same query IDs.  The difference is made in the Nodes by assigning NULL
to the name but we would now ignore it.  Wouldn't it be better to add
an extra field to DeallocateStmt to track separately the named
deallocate queries and ALL in monitoring?
--
Michael


signature.asc
Description: PGP signature


Re: PG 16 draft release notes ready

2023-08-14 Thread Bruce Momjian
On Thu, Aug 10, 2023 at 07:56:12AM +0300, Pavel Luzanov wrote:
> On 09.08.2023 21:06, Bruce Momjian wrote:
> > On Sun, Jul 23, 2023 at 02:09:17PM +0300, Pavel Luzanov wrote:
> > > Please consider to add item to the psql section:
> > > 
> > > Add psql \drg command to display role grants and remove the "Member of"
> > > column from \du & \dg altogether (d65ddaca)
> > The release notes are only current as of 2023-06-26 and I will consider
> > this when I updated them next week, thanks.
> 
> This item is a part of Beta 3 scheduled for August 10, 2023 (today). [1]
> It might be worth updating the release notes before the release.
> But I'm not familiar with the release process in detail, so I could be
> wrong.
> 
> 1. 
> https://www.postgresql.org/message-id/93c00ac3-08b3-33ba-5d77-6ceb6ab20254%40postgresql.org

The new text is:

	Allow psql's access privilege commands to show system objects
	(Nathan Bossart, Pavel Luzanov)

-->	The options are \dpS, \zS, and \drg.

The current release notes are at:

https://momjian.us/pgsql_docs/release-16.html

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: walsender "wakeup storm" on PG16, likely because of bc971f4025c (Optimize walsender wake up logic using condition variables)

2023-08-14 Thread Thomas Munro
On Tue, Aug 15, 2023 at 2:23 AM Tomas Vondra
 wrote:
> I'm not familiar with the condition variable code enough to have an
> opinion, but the patch seems to resolve the issue for me - I can no
> longer reproduce the high CPU usage.

Thanks, pushed.




ECPG Semantic Analysis

2023-08-14 Thread Juan Rodrigo Alejandro Burgos Mella
Hi
I have a modified version of ECPG, to which I gave the ability to do
semantic analysis of SQL statements. Where can I share it, or with whom
can I discuss it?

Atte.
JRBM


Re: Regarding Contributions

2023-08-14 Thread Bruce Momjian
On Tue, Aug 15, 2023 at 01:24:44AM +0530, Satwik Sharma wrote:
> Hello, my name is Satwik Sharma, and I'm an enthusiast in the fields of data
> science and new to open source development. I presently use Python, SQL,
> MongoDB, PowerBI, Tableau, and am currently studying Scala. I came across the
> repository maintained by your company, and I want to contribute. It would be
> really useful if you could steer me in the appropriate path (which projects/
> repos to choose) and suggest some good first issues. In order to contribute
> more actively, what technologies should I learn as well?

You might want to look here:

https://www.postgresql.org/developer/

especially the developers FAQ.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Regarding Contributions

2023-08-14 Thread Satwik Sharma
Hello, my name is Satwik Sharma, and I'm an enthusiast in the fields of
data science and new to open source development. I presently use Python,
SQL, MongoDB, PowerBI, Tableau, and am currently studying Scala. I came
across the repository maintained by your company, and I want to contribute.
It would be really useful if you could steer me in the appropriate path
(which projects/repos to choose) and suggest some good first issues. In
order to contribute more actively, what technologies should I learn as well?


Re: run pgindent on a regular basis / scripted manner

2023-08-14 Thread Andrew Dunstan


On 2023-08-14 Mo 10:04, Peter Eisentraut wrote:
> On 12.08.23 23:14, Andres Freund wrote:
>> It's a somewhat annoying task though, find all the typedefs, add them
>> to the right place in the file (we have an out of order entry right
>> now). I think a script that *adds* (but doesn't remove) local typedefs
>> would make this less painful.
>
> I was puzzled once that there does not appear to be such a script
> available.  Whatever the buildfarm does (before it merges it all
> together) should be available locally.  Then the workflow could be
>
> type type type
> compile
> update typedefs
> pgindent
> commit

It's a bit more complicated :-)

You can see what the buildfarm does at 

It's been somewhat fragile over the years, which most people other than
Tom and I have probably not noticed.

On most platforms it needs postgres to have been installed before
looking for the typedefs.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com


Re: In-placre persistance change of a relation

2023-08-14 Thread Nathan Bossart
I think there are some good ideas here.  I started to take a look at the
patches, and I've attached a rebased version of the patch set.  Apologies
if I am repeating any discussions from upthread.

First, I tested the time difference in ALTER TABLE SET UNLOGGED/LOGGED with
the patch applied, and the results looked pretty impressive.

before:
postgres=# alter table test set unlogged;
ALTER TABLE
Time: 5108.071 ms (00:05.108)
postgres=# alter table test set logged;
ALTER TABLE
Time: 6747.648 ms (00:06.748)

after:
postgres=# alter table test set unlogged;
ALTER TABLE
Time: 25.609 ms
postgres=# alter table test set logged;
ALTER TABLE
Time: 1241.800 ms (00:01.242)

My first question is whether 0001 is a prerequisite to 0002.  I'm assuming
it is, but the reason wasn't immediately obvious to me.  If it's just
nice-to-have, perhaps we could simplify the patch set a bit.  I see that
Heikki had some general concerns with the marker file approach [0], so
perhaps it is at least worth brainstorming some alternatives if we _do_
need it.

[0] https://postgr.es/m/9827ebd3-de2e-fd52-4091-a568387b1fc2%40iki.fi

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 5a4fb063a8b5e8a731373c4d06e51ad7fbeebebd Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Thu, 2 Mar 2023 17:25:12 +0900
Subject: [PATCH v29 1/3] Introduce storage mark files

In specific scenarios, certain operations followed by a crash-restart
may generate orphaned storage files that cannot be removed through
standard procedures or cause the server to fail during restart. This
commit introduces 'mark files' to convey information about the storage
file. In particular, an "UNCOMMITTED" mark file is implemented to
identify uncommitted files for removal during the subsequent startup.
---
 src/backend/access/rmgrdesc/smgrdesc.c|  37 +++
 src/backend/access/transam/README |  10 +
 src/backend/access/transam/xact.c |   7 +
 src/backend/access/transam/xlogrecovery.c |  18 ++
 src/backend/backup/basebackup.c   |   9 +-
 src/backend/catalog/storage.c | 268 +++-
 src/backend/storage/file/fd.c |   4 +-
 src/backend/storage/file/reinit.c | 287 +++---
 src/backend/storage/smgr/md.c |  94 ++-
 src/backend/storage/smgr/smgr.c   |  32 +++
 src/backend/storage/sync/sync.c   |  26 +-
 src/bin/pg_rewind/parsexlog.c |  16 ++
 src/common/relpath.c  |  47 ++--
 src/include/catalog/storage.h |   3 +
 src/include/catalog/storage_xlog.h|  35 ++-
 src/include/common/relpath.h  |   9 +-
 src/include/storage/fd.h  |   1 +
 src/include/storage/md.h  |   8 +-
 src/include/storage/reinit.h  |   8 +-
 src/include/storage/smgr.h|  17 ++
 src/test/recovery/t/013_crash_restart.pl  |  21 ++
 src/tools/pgindent/typedefs.list  |   6 +
 22 files changed, 834 insertions(+), 129 deletions(-)

diff --git a/src/backend/access/rmgrdesc/smgrdesc.c b/src/backend/access/rmgrdesc/smgrdesc.c
index bd841b96e8..f8187385c4 100644
--- a/src/backend/access/rmgrdesc/smgrdesc.c
+++ b/src/backend/access/rmgrdesc/smgrdesc.c
@@ -40,6 +40,37 @@ smgr_desc(StringInfo buf, XLogReaderState *record)
 		 xlrec->blkno, xlrec->flags);
 		pfree(path);
 	}
+	else if (info == XLOG_SMGR_UNLINK)
+	{
+		xl_smgr_unlink *xlrec = (xl_smgr_unlink *) rec;
+		char	   *path = relpathperm(xlrec->rlocator, xlrec->forkNum);
+
+		appendStringInfoString(buf, path);
+		pfree(path);
+	}
+	else if (info == XLOG_SMGR_MARK)
+	{
+		xl_smgr_mark *xlrec = (xl_smgr_mark *) rec;
+		char	   *path = GetRelationPath(xlrec->rlocator.dbOid,
+		   xlrec->rlocator.spcOid,
+		   xlrec->rlocator.relNumber,
+		   InvalidBackendId,
+		   xlrec->forkNum, xlrec->mark);
+		char	   *action = "";
+
+		switch (xlrec->action)
+		{
+			case XLOG_SMGR_MARK_CREATE:
+action = "CREATE";
+break;
+			case XLOG_SMGR_MARK_UNLINK:
+action = "DELETE";
+break;
+		}
+
+		appendStringInfo(buf, "%s %s", action, path);
+		pfree(path);
+	}
 }
 
 const char *
@@ -55,6 +86,12 @@ smgr_identify(uint8 info)
 		case XLOG_SMGR_TRUNCATE:
 			id = "TRUNCATE";
 			break;
+		case XLOG_SMGR_UNLINK:
+			id = "UNLINK";
+			break;
+		case XLOG_SMGR_MARK:
+			id = "MARK";
+			break;
 	}
 
 	return id;
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 22c8ae9755..e10f6af0e3 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -741,6 +741,16 @@ we must panic and abort recovery.  The DBA will have to manually clean up
 then restart recovery.  This is part of the reason for not writing a WAL
 entry until we've successfully done the original action.
 
+
+Smgr MARK files

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

2023-08-14 Thread Jeff Davis
On Sat, 2023-08-12 at 11:25 -0700, Andres Freund wrote:
> 
> I'm not sure that anything based, directly or indirectly, on
> search_path
> really is a realistic way to get there.

Can you explain a little more? I see what you mean generally, that
search_path is an imprecise thing, and that it leaves room for
ambiguity and mistakes.

But I also think we can do a lot better than we're doing today and
still retain the basic concept of search_path, which is good because
it's deeply integrated into postgres, and it's not clear that we're
going to get away from it any time soon.

> 
> 
> I think that'd be pretty painful from a UX perspective. Having to
> write
> e.g. operators as operator(schema, op) just sucks as an experience.

I'm not suggesting that the user fully-qualify everything; I'm
suggesting that the include a "SET search_path" clause if they depend
on anything other than pg_catalog.

> And with extensions plenty of operators will live outside of
> pg_catalog, so there are plenty of things that will need qualifying.

In my proposal, that would still involve a "SET search_path TO
myextension, pg_catalog, pg_temp".

The main reason that's bad is that adding pg_temp at the end is painful
UX -- just something that the user needs to remember to do with little
obvious reason or observable impact; but it has important security
implications. Perhaps we should just not implicitly include pg_temp for
a function's search_path (at least for the case of CREATE FUNCTION ...
SEARCH SYSTEM)?
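The clause being discussed is the existing CREATE FUNCTION option; for
example (names here are purely illustrative):

```sql
CREATE FUNCTION myextension.f() RETURNS int
LANGUAGE sql
SET search_path = myextension, pg_catalog, pg_temp
AS 'SELECT 1';
```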

>   And because of things like type coercion search, which prefers
> "better fitting" coercions over search path order, you can't just put
> "less important" things later in the search path.

I understand this introduces some ambiguity, but you just can't include
schemas in the search_path that you don't trust, for similar reasons as
$PATH. If you have a few objects you'd like to access in another user's
schema, fully-qualify them.
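For reference, the explicit spellings look like this (standard
PostgreSQL syntax for qualifying an operator and a function):

```sql
SELECT 1 OPERATOR(pg_catalog.+) 2;
SELECT pg_catalog.upper('abc');
```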

> We can't just store the oids at the time, because that'd end up very
> fragile -
> tables/functions/... might be dropped and recreated etc and thus
> change their
> oid.

Robert suggested something along those lines[1]. I won't rule it out,
but I agree that there are quite a few things left to figure out.

>  But we could change the core PLs to rewrite all the queries (*) so
> that
> they schema qualify absolutely everything, including operators and
> implicit
> type casts.

So not quite like "SET search_path FROM CURRENT": you resolve it to a
specific "schemaname.objectname", but stop just short of resolving to a
specific OID?

An interesting compromise, but I'm not sure what the benefit is vs. SET
search_path FROM CURRENT (or some defined search_path).

> That way objects referenced by functions can still be replaced, but
> search
> path can't be used to "inject" objects in different schemas.
> Obviously it
> could lead to errors on some schema changes - e.g. changing a column
> type
> might mean that a relevant cast lives in a different place than with
> the old
> type - but I think that'll be quite rare. Perhaps we could offer a
> ALTER
> FUNCTION ... REFRESH REFERENCES; or such?

Hmm. I feel like that's making things more complicated. I'd find it
more straightforward to use something like Robert's approach of fully
parsing something, and then have the REFRESH command reparse it when
something needs updating. Or perhaps just create all of the dependency
entries more like a view query and then auto-refresh.

> (*) Obviously the one thing that doesn't work for is use of EXECUTE
> in plpgsql
> and similar constructs elsewhere. I'm not sure there's much that can
> be done
> to make that safe, but it's worth thinking about more.

I think it would be really nice to have some better control over the
search_path regardless, because it still helps with cases like this. A
lot of C functions build queries, and I don't think it's reasonable to
constantly worry about the ambiguity of the schema for "=".

Regards,
Jeff Davis

[1]
https://www.postgresql.org/message-id/CA%2BTgmobd%3DeFRGWHhfG4mG2cA%2BdsVuA4jpBvD8N1NS%3DVc9eHFQg%40mail.gmail.com





Re: [PATCH] psql: Add tab-complete for optional view parameters

2023-08-14 Thread David Zhang




[..]

For the changes below,

  else if (TailMatches("CREATE", "VIEW", MatchAny, "AS") ||
-             TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny,
"AS"))
+             TailMatches("CREATE", "VIEW", MatchAny, "WITH", "(*)", "AS")
||
+             TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny, "AS")
||
+             TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny,
"WITH", "(*)", "AS"))

it would be great to switch the order of the 3rd and the 4th line to make a
better match for "CREATE" and "CREATE OR REPLACE" .

I don't see how it would affect matching in any way - or am I
overlooking something here?


It won't affect the SQL matching. What I was trying to say is that using 
'CREATE OR REPLACE ...' after 'CREATE ...' can enhance code structure, 
making it more readable. For instance,


/* Complete CREATE [ OR REPLACE ] VIEW  WITH ( ... ) with "AS" */
else if (TailMatches("CREATE", "VIEW", MatchAny, "WITH", "(*)") ||
 TailMatches("CREATE", "OR", "REPLACE", "VIEW", 
MatchAny, "WITH", "(*)"))

    COMPLETE_WITH("AS");

"CREATE", "OR", "REPLACE", "VIEW", MatchAny, "WITH", "(*)" follows 
"CREATE", "VIEW", MatchAny, "WITH", "(*)") immediately.


best regards,

David





Re: Extract numeric filed in JSONB more effectively

2023-08-14 Thread Andy Fan
On Mon, Aug 14, 2023 at 10:10 PM Tom Lane  wrote:

> Chapman Flack  writes:
> > Providing a function with return type declared internal but
> > with no parameter of that type is not good,
>
> Not so much "not good" as "absolutely, positively WILL NOT HAPPEN".


Chap is pretty nice to others:).


>
>
> because then a
> > user could, in principle, call it and obtain a value of
> > 'internal' type, and so get around the typing rules that
> > prevent calling other internal functions.
>
> Right --- it'd completely break the system's type-safety for
> other internal-using functions.
>
>
I do see something bad in opr_sanity.sql.  Pavel suggested
get_fn_expr_argtype which can resolve this issue pretty well, so
I have changed

jsonb_extract_xx_type(..,  Oid target_oid) -> anyelement.
to
jsonb_extract_xx_type(..,  anyelement) -> anyelement.

The only bad smell left is that since I want to define jsonb_extract_xx_type
as strict, I can't use jsonb_extract_xx_type(.., NULL::a-type),
since it would be evaluated to NULL directly.  So I hacked it with

/* mock the type. */
Const   *target =  makeNullConst(fexpr->funcresulttype,
 -1,
 InvalidOid);

/* hack the NULL attribute */
/*
 * Since all the above functions are strict, we can't input
 * a NULL value.
 */
target->constisnull = false;

jsonb_extract_xx_type just cares about the argtype, but
'EXPLAIN SELECT xx' will still access const->constvalue.
const->constvalue is 0, as set by makeNullConst currently,
and that is OK for the currently supported types, but I'm not sure
about the future or whether we have a better solution.

v6 is attached.  any feedback is welcome!

-- 
Best Regards
Andy Fan


v6-0001-optimize-casting-jsonb-to-a-given-type.patch
Description: Binary data


Re: Replace known_assigned_xids_lck by memory barrier

2023-08-14 Thread Nathan Bossart
On Sun, Mar 19, 2023 at 12:43:43PM +0300, Michail Nikolaev wrote:
> In a nutshell: KnownAssignedXids as well as the head/tail pointers are
> modified only by the startup process, so spinlock is used to ensure
> that updates of the array and head/tail pointers are seen in a correct
> order. It is enough to pass the barrier after writing to the array
> (but before updating the pointers) to achieve the same result.

What sort of benefits do you see from this patch?  It might be worthwhile
in itself to remove spinlocks when possible, but IME it's much easier to
justify such changes when there is a tangible benefit we can point to.

/*
-* Now update the head pointer.  We use a spinlock to protect this
+* Now update the head pointer.  We use a memory barrier to protect this
 * pointer, not because the update is likely to be non-atomic, but to
 * ensure that other processors see the above array updates before they
 * see the head pointer change.
 *
 * If we're holding ProcArrayLock exclusively, there's no need to take 
the
-* spinlock.
+* barrier.
 */

Are the assignments in question guaranteed to be atomic?  IIUC we assume
that aligned 4-byte loads/stores are atomic, so we should be okay as long
as we aren't handling anything larger.

-	SpinLockAcquire(&pArray->known_assigned_xids_lck);
+	pg_write_barrier();
 	pArray->headKnownAssignedXids = head;
-	SpinLockRelease(&pArray->known_assigned_xids_lck);

This use of pg_write_barrier() looks correct to me, but don't we need
corresponding read barriers wherever we obtain the pointers?  FWIW I tend
to review src/backend/storage/lmgr/README.barrier in its entirety whenever
I deal with this stuff.
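The publish protocol under discussion - fill the array entries, then advance the head pointer, with a write barrier in between and a matching read barrier on the consumer side - can be sketched in standalone C11. Here atomic_thread_fence() stands in for pg_write_barrier()/pg_read_barrier(), and all names are illustrative, not PostgreSQL's:

```c
#include <assert.h>
#include <stdatomic.h>

static int xids[4];
static int head;

/* Writer: publish a new entry, then advance the head pointer. */
static void
writer_add(int xid)
{
	xids[head] = xid;		/* fill the array slot first */
	atomic_thread_fence(memory_order_release);	/* like pg_write_barrier() */
	head = head + 1;		/* only now may readers see the slot */
}

/* Reader: snapshot the head pointer, then read entries up to it. */
static int
reader_sum(void)
{
	int		h = head;
	int		sum = 0;

	atomic_thread_fence(memory_order_acquire);	/* like pg_read_barrier() */
	for (int i = 0; i < h; i++)
		sum += xids[i];
	return sum;
}
```

The point of the pairing: without the acquire fence on the read side, the reader could observe the new head value but stale array contents, which is exactly the concern about missing read barriers above.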

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Regression test collate.icu.utf8 failed on REL_14_STABLE

2023-08-14 Thread Japin Li


Hi, hackers

I find that when I compile PG 14 with --with-icu, the collate.icu.utf8 and
foreign_data regression tests fail.  However, they pass on REL_15_STABLE and
master.  I also tested this on REL_13_STABLE, and it failed there too.  Here
are the regression diffs.

diff -U3 
/home/japin/Codes/postgres/build/../src/test/regress/expected/collate.icu.utf8.out
 /home/japin/Codes/postgres/build/src/test/regress/results/collate.icu.utf8.out
--- 
/home/japin/Codes/postgres/build/../src/test/regress/expected/collate.icu.utf8.out
  2023-08-14 17:37:31.960448245 +0800
+++ 
/home/japin/Codes/postgres/build/src/test/regress/results/collate.icu.utf8.out  
2023-08-14 21:30:44.335214886 +0800
@@ -1035,6 +1035,9 @@
   quote_literal(current_setting('lc_ctype')) || ');';
 END
 $$;
+ERROR:  collations with different collate and ctype values are not supported 
by ICU
+CONTEXT:  SQL statement "CREATE COLLATION test1 (provider = icu, lc_collate = 
'C.UTF-8', lc_ctype = 'en_US.UTF-8');"
+PL/pgSQL function inline_code_block line 3 at EXECUTE
 CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, 
need lc_ctype
 ERROR:  parameter "lc_ctype" must be specified
 CREATE COLLATION testx (provider = icu, locale = 'nonsense'); /* never fails 
with ICU */  DROP COLLATION testx;
@@ -1045,13 +1048,12 @@
  collname 
 --
  test0
- test1
  test5
-(3 rows)
+(2 rows)
 
 ALTER COLLATION test1 RENAME TO test11;
+ERROR:  collation "test1" for encoding "UTF8" does not exist
 ALTER COLLATION test0 RENAME TO test11; -- fail
-ERROR:  collation "test11" already exists in schema "collate_tests"
 ALTER COLLATION test1 RENAME TO test22; -- fail
 ERROR:  collation "test1" for encoding "UTF8" does not exist
 ALTER COLLATION test11 OWNER TO regress_test_role;
@@ -1059,18 +1061,19 @@
 ERROR:  role "nonsense" does not exist
 ALTER COLLATION test11 SET SCHEMA test_schema;
 COMMENT ON COLLATION test0 IS 'US English';
+ERROR:  collation "test0" for encoding "UTF8" does not exist
 SELECT collname, nspname, obj_description(pg_collation.oid, 'pg_collation')
 FROM pg_collation JOIN pg_namespace ON (collnamespace = pg_namespace.oid)
 WHERE collname LIKE 'test%'
 ORDER BY 1;
  collname |nspname| obj_description 
 --+---+-
- test0| collate_tests | US English
  test11   | test_schema   | 
  test5| collate_tests | 
-(3 rows)
+(2 rows)
 
 DROP COLLATION test0, test_schema.test11, test5;
+ERROR:  collation "test0" for encoding "UTF8" does not exist
 DROP COLLATION test0; -- fail
 ERROR:  collation "test0" for encoding "UTF8" does not exist
 DROP COLLATION IF EXISTS test0;
@@ -1078,10 +1081,17 @@
 SELECT collname FROM pg_collation WHERE collname LIKE 'test%';
  collname 
 --
-(0 rows)
+ test11
+ test5
+(2 rows)
 
 DROP SCHEMA test_schema;
+ERROR:  cannot drop schema test_schema because other objects depend on it
+DETAIL:  collation test_schema.test11 depends on schema test_schema
+HINT:  Use DROP ... CASCADE to drop the dependent objects too.
 DROP ROLE regress_test_role;
+ERROR:  role "regress_test_role" cannot be dropped because some objects depend 
on it
+DETAIL:  owner of collation test_schema.test11
 -- ALTER
 ALTER COLLATION "en-x-icu" REFRESH VERSION;
 NOTICE:  version has not changed
diff -U3 
/home/japin/Codes/postgres/build/../src/test/regress/expected/foreign_data.out 
/home/japin/Codes/postgres/build/src/test/regress/results/foreign_data.out
--- 
/home/japin/Codes/postgres/build/../src/test/regress/expected/foreign_data.out  
2023-08-14 17:37:31.964448260 +0800
+++ /home/japin/Codes/postgres/build/src/test/regress/results/foreign_data.out  
2023-08-14 21:30:55.571170376 +0800
@@ -5,10 +5,13 @@
 -- Suppress NOTICE messages when roles don't exist
 SET client_min_messages TO 'warning';
 DROP ROLE IF EXISTS regress_foreign_data_user, regress_test_role, 
regress_test_role2, regress_test_role_super, regress_test_indirect, 
regress_unprivileged_role;
+ERROR:  role "regress_test_role" cannot be dropped because some objects depend 
on it
+DETAIL:  owner of collation test_schema.test11
 RESET client_min_messages;
 CREATE ROLE regress_foreign_data_user LOGIN SUPERUSER;
 SET SESSION AUTHORIZATION 'regress_foreign_data_user';
 CREATE ROLE regress_test_role;
+ERROR:  role "regress_test_role" already exists
 CREATE ROLE regress_test_role2;
 CREATE ROLE regress_test_role_super SUPERUSER;
 CREATE ROLE regress_test_indirect;

Is it a bug that was fixed in REL_15_STABLE? If yes, why not backpatch?


-- 
Regrads,
Japin Li




Re: Do we want a hashset type?

2023-08-14 Thread Florents Tselai
Has anyone put this in a git repo / extension package or similar?

I’d like to try it out outside the core pg tree. 

> On 1 Jul 2023, at 12:04 PM, Joel Jacobson  wrote:
> 
> On Fri, Jun 30, 2023, at 06:50, jian he wrote:
>> more like a C questions
>> in this context does
>> #define HASHSET_GET_VALUES(set) ((int32 *) ((set)->data +
>> CEIL_DIV((set)->capacity, 8)))
>> define first, then define struct int4hashset_t. Is this normally ok?
> 
> Yes, it's fine. Macros are just text substitutions done pre-compilation.
> 
>> Also does
>> #define HASHSET_GET_VALUES(set) ((int32 *) ((set)->data +
>> CEIL_DIV((set)->capacity, 8)))
>> 
>> remove (int32 *) will make it generic? then when you use it, you can
>> cast whatever type you like?
> 
> Maybe, but it might be less error-prone and more descriptive with different
> macros for each type, e.g. INT4HASHSET_GET_VALUES,
> similar to the existing PG_GETARG_INT4HASHSET
> 
> Curious to hear what everybody thinks about the interface, documentation,
> semantics and implementation?
> 
> Is there anything missing or something that you think should be 
> changed/improved?
> 
> /Joel
> 
> 
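For anyone who wants to poke at this outside the tree: the layout the quoted HASHSET_GET_VALUES macro assumes - a bitmap of capacity bits followed by the int32 values - can be sketched standalone. The struct and helper below are simplified guesses for illustration, not the patch's exact definitions:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CEIL_DIV(a, b) (((a) + (b) - 1) / (b))

/* Simplified layout: a used-slot bitmap of capacity bits, then int32 values. */
typedef struct int4hashset_t
{
	int32_t		capacity;
	char		data[];			/* bitmap followed by values */
} int4hashset_t;

#define HASHSET_GET_VALUES(set) \
	((int32_t *) ((set)->data + CEIL_DIV((set)->capacity, 8)))

static int4hashset_t *
hashset_alloc(int32_t capacity)
{
	size_t		bitmap_len = CEIL_DIV(capacity, 8);
	int4hashset_t *set;

	/* note: a real implementation must keep the values array aligned */
	set = calloc(1, sizeof(int4hashset_t)
				 + bitmap_len + capacity * sizeof(int32_t));
	set->capacity = capacity;
	return set;
}
```

This also shows why defining the macro before the struct is fine, as discussed above: the macro is pure text substitution and is only type-checked where it is used.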





Re: pgbench - adding pl/pgsql versions of tests

2023-08-14 Thread Nathan Bossart
On Wed, Jan 04, 2023 at 07:06:44PM +0100, Hannu Krosing wrote:
> 1. so I don't have to create the script and function manually each
> time I want to test mainly the database (instead of the
> client-database system)
> 
> 2. so that new users of PostgreSQL can easily see how much better OLTP
> workloads perform when packaged up as a server-side function

I'm not sure we should add micro-optimized versions of the existing scripts
to pgbench.  Your point about demonstrating the benefits of server-side
functions seems reasonable, but it also feels a bit like artificially
improving pgbench numbers.  I think I'd rather see some more variety in the
built-in scripts so that folks can more easily test a wider range of common
workloads.  Perhaps this could include a test that is focused on
server-side functions.

In any case, it looks like there is unaddressed feedback for this patch, so
I'm marking it as "waiting on author."

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] psql: \dn+ to show size of each schema (and \dA+ for AMs)

2023-08-14 Thread Justin Pryzby
On Thu, Dec 15, 2022 at 10:13:23AM -0600, Justin Pryzby wrote:
> This patch record was "closed for lack of interest", but I think what's
> actually needed is committer review of which approach to take.

On Tue, Aug 01, 2023 at 09:54:34AM +0200, Daniel Gustafsson wrote:
> > On 24 May 2023, at 23:05, Justin Pryzby  wrote:
> 
> > I'm planning to set this patch as ready
> 
> This is marked RfC so I'm moving this to the next CF, but the patch no longer
> applies so it needs a rebase.

I was still hoping to receive some feedback on which patches to squish.

-- 
Justin
From 2b97b0fe27199a343f00f31aaf3fcd79fd1228f1 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 13 Jul 2021 21:25:48 -0500
Subject: [PATCH 1/4] Add pg_am_size(), pg_namespace_size() ..

See also: 358a897fa, 528ac10c7
---
 doc/src/sgml/func.sgml  |  39 ++
 src/backend/utils/adt/dbsize.c  | 132 
 src/include/catalog/pg_proc.dat |  19 +
 3 files changed, 190 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index be2f54c9141..8a1c8226c48 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -27814,6 +27814,45 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset

   
 
+  
+   
+
+ pg_namespace_size
+
+pg_namespace_size ( name )
+bigint
+   
+   
+pg_namespace_size ( oid )
+bigint
+   
+   
+Computes the total disk space used by relations in the namespace (schema)
+with the specified name or OID. To use this function, you must
+have CREATE privilege on the specified namespace
+or have privileges of the pg_read_all_stats role,
+unless it is the default namespace for the current database.
+   
+  
+
+  
+   
+
+ pg_am_size
+
+pg_am_size ( name )
+bigint
+   
+   
+pg_am_size ( oid )
+bigint
+   
+   
+Computes the total disk space used by relations using the access method
+with the specified name or OID.
+   
+  
+
   

 
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index e5c0f1c45b6..af0955d1790 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -13,19 +13,25 @@
 
 #include 
 
+#include "access/genam.h"
 #include "access/htup_details.h"
 #include "access/relation.h"
+#include "access/table.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
 #include "catalog/pg_authid.h"
 #include "catalog/pg_database.h"
+#include "catalog/pg_namespace.h"
 #include "catalog/pg_tablespace.h"
 #include "commands/dbcommands.h"
+#include "commands/defrem.h"
 #include "commands/tablespace.h"
 #include "miscadmin.h"
 #include "storage/fd.h"
 #include "utils/acl.h"
 #include "utils/builtins.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
 #include "utils/numeric.h"
 #include "utils/rel.h"
 #include "utils/relfilenumbermap.h"
@@ -858,6 +864,132 @@ pg_size_bytes(PG_FUNCTION_ARGS)
 	PG_RETURN_INT64(result);
 }
 
+/*
+ * Return the sum of size of relations for which the given attribute of
+ * pg_class matches the specified OID value.
+ */
+static int64
+calculate_size_attvalue(AttrNumber attnum, Oid attval)
+{
+	int64		totalsize = 0;
+	ScanKeyData skey;
+	Relation	pg_class;
+	SysScanDesc scan;
+	HeapTuple	tuple;
+
+	ScanKeyInit(&skey, attnum,
+BTEqualStrategyNumber, F_OIDEQ, attval);
+
+	pg_class = table_open(RelationRelationId, AccessShareLock);
+	scan = systable_beginscan(pg_class, InvalidOid, false, NULL, 1, &skey);
+	while (HeapTupleIsValid(tuple = systable_getnext(scan)))
+	{
+		Form_pg_class classtuple = (Form_pg_class) GETSTRUCT(tuple);
+		Relation	rel;
+
+		rel = try_relation_open(classtuple->oid, AccessShareLock);
+		if (!rel)
+			continue;
+
+		for (ForkNumber forkNum = 0; forkNum <= MAX_FORKNUM; forkNum++)
+			totalsize += calculate_relation_size(&(rel->rd_locator), rel->rd_backend, forkNum);
+
+		relation_close(rel, AccessShareLock);
+	}
+
+	systable_endscan(scan);
+	table_close(pg_class, AccessShareLock);
+	return totalsize;
+}
+
+/* Compute the size of relations in a schema (namespace) */
+static int64
+calculate_namespace_size(Oid nspOid)
+{
+	/*
+	 * User must be a member of pg_read_all_stats or have CREATE privilege for
+	 * target namespace.
+	 */
+	if (!is_member_of_role(GetUserId(), ROLE_PG_READ_ALL_STATS))
+	{
+		AclResult	aclresult;
+
+		aclresult = object_aclcheck(NamespaceRelationId, nspOid, GetUserId(), ACL_CREATE);
+		if (aclresult != ACLCHECK_OK)
+			aclcheck_error(aclresult, OBJECT_SCHEMA,
+		   get_namespace_name(nspOid));
+	}
+
+	return calculate_size_attvalue(Anum_pg_class_relnamespace, nspOid);
+}
+
+Datum
+pg_namespace_size_oid(PG_FUNCTION_ARGS)
+{
+	Oid			nspOid = PG_GETARG_OID(0);
+	int64		size;
+
+	size = calculate_namespace_size(nspOid);
+
+	if (size < 0)
+		

Re: AssertLog instead of Assert in some places

2023-08-14 Thread Ashutosh Bapat
On Sat, Aug 12, 2023 at 12:56 AM Andres Freund  wrote:
> On 2023-08-11 11:56:27 -0700, Peter Geoghegan wrote:
> > On Fri, Aug 11, 2023 at 11:23 AM Andres Freund  wrote:
> > > > Couldn't you say the same thing about defensive "can't happen" ERRORs?
> > > > They are essentially a form of assertion that isn't limited to
> > > > assert-enabled builds.
> > >
> > > Yes. A lot of them I hate them with the passion of a thousand suns ;). 
> > > "Oh,
> > > our transaction state machinery is confused. Yes, let's just continue 
> > > going
> > > through the same machinery again, that'll resolve it.".
> >
> > I am not unsympathetic to Ashutosh's point about conventional ERRORs
> > being easier to deal with when debugging your own code, during initial
> > development work.
>
> Oh, I am as well - I just don't think it's a good idea to introduce "log + 
> error"
> assertions to core postgres, because it seems very likely that they'll end up
> getting used a lot.
>
>

I am open to ideas which allow the same backend to recover after
meeting an easily recoverable but "can't happen" condition rather than
losing that backend and starting all over with a new backend. Not all
Assert'ed conditions are recoverable so a blanket GUC or compile time
option won't help. Those might make things worse. We need two separate
incantations for non-recoverable and recoverable Asserts respectively.

I like Peter's idea of having a new elevel; however, it still requires
adding a conditional on USE_ASSERT, an if testing the condition, and then
writing an error message. AssertLog() in the patch uses just a few
more letters.

It won't help to expand the scope of the problem since that will
reduce the chances of getting anything done.
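As a rough standalone illustration of the kind of recoverable assertion being discussed - this is a hypothetical macro for the sake of the example, not the patch's actual AssertLog() definition:

```c
#include <stdio.h>

/*
 * Hypothetical sketch: an assertion that reports the failed condition
 * but lets the caller recover, instead of aborting the whole process
 * the way plain assert()/Assert() does.
 */
#define ASSERT_LOG(cond) \
	do { \
		if (!(cond)) \
			fprintf(stderr, "assertion failed: %s (%s:%d)\n", \
					#cond, __FILE__, __LINE__); \
	} while (0)

static int
checked_div(int a, int b)
{
	ASSERT_LOG(b != 0);
	if (b == 0)
		return 0;				/* recover instead of crashing the backend */
	return a / b;
}
```

The trade-off raised upthread applies directly: because execution continues past the failed check, a test run can sail past the report unless something scans the logs for it.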

-- 
Best Wishes,
Ashutosh Bapat




Re: [RFC] Add jit deform_counter

2023-08-14 Thread Dmitry Dolgov
> On Wed, Jul 19, 2023 at 05:18:29PM +0200, Dmitry Dolgov wrote:
> > On Tue, Jul 18, 2023, 3:32 PM Daniel Gustafsson  wrote
> >> Here is the patch with the proposed variation.
> >
> > This version still leaves non-text EXPLAIN formats with timing which
> doesn't
> > add up.  Below are JSON and XML examples:
>
> Good point. For the structured formats it should be represented via a nested
> level. I'll try to do this and other proposed changes as soon as I'll get
> back.

And here is it. The json version of EXPLAIN now looks like this:

 "JIT": {
   [...]
   "Timing": {
 "Generation": {
   "Deform": 0.000,
   "Total": 0.205
 },
 "Inlining": 0.065,
 "Optimization": 2.465,
 "Emission": 2.337,
 "Total": 5.072
   }
 },

From 692ec7fb6c8132626e3597b753510dd1648ebeed Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Sat, 11 Jun 2022 11:54:40 +0200
Subject: [PATCH v6 1/2] Add deform_counter

At the moment generation_counter includes time spent on both JITing
expressions and tuple deforming. Those are configured independently via
corresponding options (jit_expressions and jit_tuple_deforming), which
means there is no way to find out what fraction of time tuple deforming
alone is taking.

Add deform_counter dedicated to tuple deforming, which allows seeing
more directly the influence jit_tuple_deforming is having on the query.
---
 doc/src/sgml/jit.sgml   |  2 +-
 src/backend/commands/explain.c  | 14 +++---
 src/backend/jit/jit.c   |  1 +
 src/backend/jit/llvm/llvmjit_expr.c |  6 ++
 src/include/jit/jit.h   |  3 +++
 5 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/jit.sgml b/doc/src/sgml/jit.sgml
index 998c972e8ba..1921557cb82 100644
--- a/doc/src/sgml/jit.sgml
+++ b/doc/src/sgml/jit.sgml
@@ -170,7 +170,7 @@ SET
  JIT:
Functions: 3
Options: Inlining false, Optimization false, Expressions true, Deforming true
-   Timing: Generation 1.259 ms, Inlining 0.000 ms, Optimization 0.797 ms, Emission 5.048 ms, Total 7.104 ms
+   Timing: Generation 1.259 ms (Deform 0.000 ms), Inlining 0.000 ms, Optimization 0.797 ms, Emission 5.048 ms, Total 7.104 ms
  Execution Time: 7.416 ms
 
As visible here, JIT was used, but inlining and
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 8570b14f621..78cced83931 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -893,6 +893,7 @@ ExplainPrintJIT(ExplainState *es, int jit_flags, JitInstrumentation *ji)
 
 	/* calculate total time */
 	INSTR_TIME_SET_ZERO(total_time);
+	/* don't add deform_counter, it's included into generation_counter */
 	INSTR_TIME_ADD(total_time, ji->generation_counter);
 	INSTR_TIME_ADD(total_time, ji->inlining_counter);
 	INSTR_TIME_ADD(total_time, ji->optimization_counter);
@@ -914,14 +915,15 @@ ExplainPrintJIT(ExplainState *es, int jit_flags, JitInstrumentation *ji)
 		 "Inlining", jit_flags & PGJIT_INLINE ? "true" : "false",
 		 "Optimization", jit_flags & PGJIT_OPT3 ? "true" : "false",
 		 "Expressions", jit_flags & PGJIT_EXPR ? "true" : "false",
-		 "Deforming", jit_flags & PGJIT_DEFORM ? "true" : "false");
+		 "Deform", jit_flags & PGJIT_DEFORM ? "true" : "false");
 
 		if (es->analyze && es->timing)
 		{
 			ExplainIndentText(es);
 			appendStringInfo(es->str,
-			 "Timing: %s %.3f ms, %s %.3f ms, %s %.3f ms, %s %.3f ms, %s %.3f ms\n",
+			 "Timing: %s %.3f ms (%s %.3f ms), %s %.3f ms, %s %.3f ms, %s %.3f ms, %s %.3f ms\n",
 			 "Generation", 1000.0 * INSTR_TIME_GET_DOUBLE(ji->generation_counter),
+			 "Deform", 1000.0 * INSTR_TIME_GET_DOUBLE(ji->deform_counter),
 			 "Inlining", 1000.0 * INSTR_TIME_GET_DOUBLE(ji->inlining_counter),
 			 "Optimization", 1000.0 * INSTR_TIME_GET_DOUBLE(ji->optimization_counter),
 			 "Emission", 1000.0 * INSTR_TIME_GET_DOUBLE(ji->emission_counter),
@@ -945,9 +947,15 @@ ExplainPrintJIT(ExplainState *es, int jit_flags, JitInstrumentation *ji)
 		{
 			ExplainOpenGroup("Timing", "Timing", true, es);
 
-			ExplainPropertyFloat("Generation", "ms",
+			ExplainOpenGroup("Generation", "Generation", true, es);
+			ExplainPropertyFloat("Deform", "ms",
+ 1000.0 * INSTR_TIME_GET_DOUBLE(ji->deform_counter),
+ 3, es);
+			ExplainPropertyFloat("Total", "ms",
  1000.0 * INSTR_TIME_GET_DOUBLE(ji->generation_counter),
  3, es);
+			ExplainCloseGroup("Generation", "Generation", true, es);
+
 			ExplainPropertyFloat("Inlining", "ms",
  1000.0 * INSTR_TIME_GET_DOUBLE(ji->inlining_counter),
  3, es);
diff --git a/src/backend/jit/jit.c b/src/backend/jit/jit.c
index fd1cf184c8e..4da8fee20b4 100644
--- a/src/backend/jit/jit.c
+++ b/src/backend/jit/jit.c
@@ -185,6 +185,7 @@ InstrJitAgg(JitInstrumentation *dst, JitInstrumentation *add)
 {
 	dst->created_functions += add->created_functions;
 	

Re: AssertLog instead of Assert in some places

2023-08-14 Thread Ashutosh Bapat
On Fri, Aug 11, 2023 at 11:27 PM Andres Freund  wrote:
> >
> > Attached patch combines Assert and elog(ERROR, ) so that when an
> > Assert is triggered in assert-enabled binary, it will throw an error
> > while keeping the backend intact. Thus it does not affect gdb session
> > or psql session. These elogs do not make their way to non-assert
> > binary so do not make it to production like Assert.
>
> I am quite strongly against this. This will lead to assertions being hit in
> tests without that being noticed, e.g. because they happen in a background
> process that just restarts.

Fair point. Our regression tests don't check the server error logs for
unwanted errors. How about restricting it to only client backends? I
don't know how to distinguish those from others, but there must be a way.

--
Best Wishes,
Ashutosh Bapat




Re: Make psql's qeury canceling test simple by using signal() routine of IPC::Run

2023-08-14 Thread Yugo NAGATA
On Mon, 14 Aug 2023 08:29:25 +0900
Michael Paquier  wrote:

> On Sun, Aug 13, 2023 at 11:22:33AM +0200, Fabien COELHO wrote:
> > Test run is ok on my Ubuntu laptop.
> 
> I have a few comments about this patch.
> 
> On HEAD and even after this patch, we still have the following:
> SKIP: 
>   
>{  
>   
>   skip "cancel test requires a Unix shell", 2 
> if $windows_os;
> 
> Could the SKIP be removed for $windows_os?  If not, this had better be
> documented because the reason for the skip becomes incorrect.
> 
> The comment at the top of the SKIP block still states the following:
> # There is, as of this writing, no documented way to get the PID of
> # the process from IPC::Run.  As a workaround, we have psql print its
> # own PID (which is the parent of the shell launched by psql) to a
> # file.
> 
> This is also incorrect.

Thank you for your comments

I will check whether the test works on Windows and remove the SKIP if possible.
Also, I'll fix the comment in either case.

Regards,
Yugo Nagata

-- 
Yugo NAGATA 




Re: pgbench: allow to exit immediately when any client is aborted

2023-08-14 Thread Yugo NAGATA
On Sun, 13 Aug 2023 11:27:55 +0200 (CEST)
Fabien COELHO  wrote:

> 
> Hello Yugo-san,
> 
> >> I attached the updated patch v3 including changes above, a test,
> >> and fix of the typo you pointed out.
> >
> > I'm sorry but the test in the previous patch was incorrect.
> > I attached the correct one.
> 
> About pgbench exit on abort v3:
> 
> Patch does not "git apply", but is ok with "patch" although there are some
> minor warnings.

In my environment, the patch can be applied to the master branch without
any warnings...

> 
> Compiles. Code is ok. Tests are ok.
> 
> About Test:
> 
> The code is simple to get an error quickly but after a few transactions, 
> good. I'll do a minimal "-c 2 -j 2 -t 10" instead of "-c 4 -j 4 -T 10".

I fixed the test and attached the updated patch, v4.

Regards,
Yugo Nagata

> 
> -- 
> Fabien.


-- 
Yugo NAGATA 
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 6c5c8afa6d..83fe01c33f 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -787,6 +787,19 @@ pgbench  options  d
   
  
 
+ 
+  --exit-on-abort
+  
+   
+Exit immediately when any client is aborted due to some error. Without
+this option, even when a client is aborted, other clients could continue
+their run as specified by -t or -T option,
+and pgbench will print an incomplete results
+in this case.
+   
+  
+ 
+
  
   --log-prefix=prefix
   
@@ -985,7 +998,8 @@ pgbench  options  d
benchmark such as initial connection failures also exit with status 1.
Errors during the run such as database errors or problems in the script
will result in exit status 2. In the latter case,
-   pgbench will print partial results.
+   pgbench will print partial results if
+   --exit-on-abort option is not specified.
   
  
 
@@ -2801,14 +2815,17 @@ statement latencies in milliseconds, failures and retries:
  start a connection to the database server / the socket for connecting
  the client to the database server has become invalid). In such cases
  all clients of this thread stop while other threads continue to work.
+ However, if --exit-on-abort is specified, all of the
+ threads stop immediately in this case.

  
  

  Direct client errors. They lead to immediate exit from
  pgbench with the corresponding error message
- only in the case of an internal pgbench
- error (which are supposed to never occur...). Otherwise in the worst
+ in the case of an internal pgbench
+ error (which are supposed to never occur...) or when
+ --exit-on-abort is specified. Otherwise in the worst
  case they only lead to the abortion of the failed client while other
  clients continue their run (but some client errors are handled without
  an abortion of the client and reported separately, see below). Later in
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 763c4b946a..4e64a60a63 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -765,6 +765,8 @@ static int64 total_weight = 0;
 
 static bool verbose_errors = false; /* print verbose messages of all errors */
 
+static bool exit_on_abort = false;	/* exit when any client is aborted */
+
 /* Builtin test scripts */
 typedef struct BuiltinScript
 {
@@ -911,6 +913,7 @@ usage(void)
 		   "  -T, --time=NUM   duration of benchmark test in seconds\n"
 		   "  -v, --vacuum-all vacuum all four standard tables before tests\n"
 		   "  --aggregate-interval=NUM aggregate data over NUM seconds\n"
+		   "  --exit-on-abort  exit when any client is aborted\n"
 		   "  --failures-detailed  report the failures grouped by basic types\n"
 		   "  --log-prefix=PREFIX  prefix for transaction time log file\n"
 		   "   (default: \"pgbench_log\")\n"
@@ -6617,6 +6620,7 @@ main(int argc, char **argv)
 		{"failures-detailed", no_argument, NULL, 13},
 		{"max-tries", required_argument, NULL, 14},
 		{"verbose-errors", no_argument, NULL, 15},
+		{"exit-on-abort", no_argument, NULL, 16},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -6950,6 +6954,10 @@ main(int argc, char **argv)
 benchmarking_option_set = true;
 verbose_errors = true;
 break;
+			case 16:			/* exit-on-abort */
+benchmarking_option_set = true;
+exit_on_abort = true;
+break;
 			default:
 /* getopt_long already emitted a complaint */
 pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@@ -7558,11 +7566,17 @@ threadRun(void *arg)
 
			advanceConnectionState(thread, st, &aggs);
 
+			/*
+			 * If --exit-on-abort is used, the program is going to exit
+			 * when any client is aborted.
+			 */
+			if (exit_on_abort && st->state == CSTATE_ABORTED)
+goto done;
 			/*
 			 * If advanceConnectionState changed client to finished state,
 			 * 

Re: proposal: jsonb_populate_array

2023-08-14 Thread Pavel Stehule
On Mon, Aug 14, 2023 at 15:47 Chapman Flack wrote:

> On 2023-08-14 09:11, Erik Rijkers wrote:
> >   , '$' returning date[]
>
> I certainly like that syntax better.
>
> It's not that the "here's a null to tell you the type I want"
> is terribly unclear, but it seems not to be an idiom I have
> seen a lot of in PostgreSQL before now. Are there other places
> it's currently used that I've overlooked?
>

It is used only for the hstore, json, and jsonb functions, if I remember
correctly.

I dislike this idiom too, but SQL cannot use a type as a parameter. I proposed
an anytype polymorphic pseudotype, so that instead of

fx(null::int, ...) you could write (theoretically) fx('int', ...), but it
doesn't look much better. For composite results we can dynamically
specify the structure as SELECT FROM fx(...) AS (a int, b int), but that cannot
be used for scalar functions and cannot be used for static composite types.

I can imagine some special syntax of CAST that can push a type inside a
function, allowing us to write functions like fx(non-polymorphic
types) RETURNS any.

For the proposed functionality it could look like SELECT
CAST(json_populate_array('[]'::jsonb) AS date[])




> Regards,
> -Chap
>


Re: walsender "wakeup storm" on PG16, likely because of bc971f4025c (Optimize walsender wake up logic using condition variables)

2023-08-14 Thread Tomas Vondra



On 8/11/23 21:51, Thomas Munro wrote:
> On Sat, Aug 12, 2023 at 5:51 AM Andres Freund  wrote:
>> On 2023-08-11 15:31:43 +0200, Tomas Vondra wrote:
>>> It seems to me the issue is in WalSndWait, which was reworked to use
>>> ConditionVariableCancelSleep() in bc971f4025c. The walsenders end up
>>> waking each other in a busy loop, until the timing changes just enough
>>> to break the cycle.
>>
>> IMO ConditionVariableCancelSleep()'s behaviour of waking up additional
>> processes can nearly be considered a bug, at least when combined with
>> ConditionVariableBroadcast(). In that case the wakeup is never needed, and it
>> can cause situations like this, where condition variables basically
>> deteriorate to a busy loop.
>>
>> I hit this with AIO as well. I've been "solving" it by adding a
>> ConditionVariableCancelSleepEx(), which has a only_broadcasts argument.
>>
>> I'm inclined to think that any code that needs the forwarding
>> behaviour is pretty much buggy.
> 
> Oh, I see what's happening.  Maybe commit b91dd9de wasn't the best
> idea, but bc971f4025c broke an assumption, since it doesn't use
> ConditionVariableSleep().  That is confusing the signal forwarding
> logic which expects to find our entry in the wait list in the common
> case.
> 
> What do you think about this patch?

I'm not familiar with the condition variable code enough to have an
opinion, but the patch seems to resolve the issue for me - I can no
longer reproduce the high CPU usage.
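To see why signal forwarding in ConditionVariableCancelSleep() can degenerate into a busy loop when combined with a broadcast, here is a toy Python model of the behaviour (an illustration only: the names, the round structure, and the scheduling are simplifying assumptions, not the actual walsender code):

```python
from collections import deque

def simulate(n_walsenders, rounds):
    """Toy model: a broadcast wakes every waiter; each waiter then
    re-registers for its next wait and cancels the old (never entered)
    sleep.  A signaled canceller forwards the signal to the next
    process still on the wait list, producing a chain of spurious
    wakeups.  Returns the number of spurious wakeups."""
    wait_list = deque()
    signaled = [False] * n_walsenders
    spurious = 0
    for _ in range(rounds):
        # every walsender prepares to sleep
        for i in range(n_walsenders):
            wait_list.append(i)
        # broadcast: remove everyone from the list and signal them
        while wait_list:
            signaled[wait_list.popleft()] = True
        # each walsender re-registers for its next wait before the
        # earlier cancels have all run...
        for i in range(n_walsenders):
            wait_list.append(i)
        # ...then cancels the old sleep; a canceller that was already
        # signaled forwards the signal onward
        for i in range(n_walsenders):
            if i in wait_list:
                wait_list.remove(i)
            if signaled[i]:
                signaled[i] = False
                if wait_list:
                    nxt = wait_list.popleft()
                    signaled[nxt] = True   # spurious wakeup
                    spurious += 1
    return spurious
```

In this model every broadcast round wakes each process twice apart from one, i.e. n-1 spurious wakeups per round; a fix that skips forwarding for broadcast-originated wakeups would bring that to zero.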

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Extract numeric filed in JSONB more effectively

2023-08-14 Thread Tom Lane
Chapman Flack  writes:
> Providing a function with return type declared internal but
> with no parameter of that type is not good,

Not so much "not good" as "absolutely, positively WILL NOT HAPPEN".

> because then a
> user could, in principle, call it and obtain a value of
> 'internal' type, and so get around the typing rules that
> prevent calling other internal functions.

Right --- it'd completely break the system's type-safety for
other internal-using functions.

You could argue that we should never have abused "internal"
to this extent in the first place, compared to inventing a
plethora of internal-ish types to correspond to each of the
things "internal" is used for.  But here we are so we'd
better be darn careful with it.

regards, tom lane




Re: run pgindent on a regular basis / scripted manner

2023-08-14 Thread Peter Eisentraut

On 12.08.23 02:11, Tom Lane wrote:

Andres Freund  writes:

On 2023-08-11 18:30:02 -0400, Tom Lane wrote:

+1 for including this in CI tests



I didn't even mean CI - I meant 'make check-world' / 'meson test'. Which of
course would include CI automatically.


Hmm.  I'm allergic to anything that significantly increases the cost
of check-world, and this seems like it'd do that.

Maybe we could automate it, but not as part of check-world per se?


Also, during development, the code in progress is not always perfectly 
formatted, but I do want to keep running the test suites.






Re: run pgindent on a regular basis / scripted manner

2023-08-14 Thread Peter Eisentraut

On 12.08.23 23:14, Andres Freund wrote:

It's a somewhat annoying task though, find all the typedefs, add them to the
right place in the file (we have an out of order entry right now). I think a
script that *adds* (but doesn't remove) local typedefs would make this less
painful.


I was puzzled once that there does not appear to be such a script 
available.  Whatever the buildfarm does (before it merges it all 
together) should be available locally.  Then the workflow could be


type type type
compile
update typedefs
pgindent
commit




Re: Extract numeric filed in JSONB more effectively

2023-08-14 Thread Chapman Flack

On 2023-08-14 03:06, Andy Fan wrote:

We'd still have functions like jsonb_field_as_numeric() under the
hood, but there's not an expectation that users call them explicitly.


To avoid lots of functions like jsonb_field_as_int2/int4, I ended up
defining Datum jsonb_object_field_type(.., Oid target_oid), so the
function must return "internal" or "anyelement".
...
I'm not sure how to fix that, or whether it deserves
a fix. Or shall I define jsonb_object_field_int2/int8 to avoid this?


As far as I'm concerned, if the intent is for this to be a function
that is swapped in by SupportRequestSimplify and not necessarily to
be called by users directly, I don't mind if users can't call it
directly. As long as there is a nice familiar jsonb function the user
can call in a nice familiar way and knows it will be handled
efficiently behind the curtain, that seems to be good enough for
the user--better, even, than having a new oddball function to
remember.

However, I believe the rule is that a function declared to return
internal must also declare at least one parameter as internal.
That way, a user won't be shown errors about displaying its
returned value, because the user won't be able to call it
in the first place, having no values of type 'internal' lying
around to pass to it. It could simply have that trailing oid
parameter declared as internal, and there you have a strictly
internal-use function.

Providing a function with return type declared internal but
with no parameter of that type is not good, because then a
user could, in principle, call it and obtain a value of
'internal' type, and so get around the typing rules that
prevent calling other internal functions.

Regards,
-Chap




Re: proposal: jsonb_populate_array

2023-08-14 Thread Chapman Flack

On 2023-08-14 09:11, Erik Rijkers wrote:

  , '$' returning date[]


I certainly like that syntax better.

It's not that the "here's a null to tell you the type I want"
is terribly unclear, but it seems not to be an idiom I have
seen a lot of in PostgreSQL before now. Are there other places
it's currently used that I've overlooked?

Regards,
-Chap




Re: proposal: jsonb_populate_array

2023-08-14 Thread Pavel Stehule
On Mon, 14 Aug 2023 at 15:09, Erik Rijkers  wrote:

> On 8/14/23 at 14:51 Pavel Stehule wrote: > On Mon, 14 Aug 2023 at 11:32,
> Alvaro Herrera  wrote:
>  > with proposed function I can write
>  >
>  > select jsonb_populate_array(null::date[],
>  > '["2023-07-13","2023-07-14"]'::jsonb)
>  >
> Not yet committed, but outstanding
> SQL/JSON patches (v11) will let you do:
>
> select json_query(
>  '["2023-07-13", "2023-07-14"]'::jsonb
>, '$' returning date[]
> );
> json_query
> -
>   {2023-07-13,2023-07-14}
> (1 row)
>
> That's (more or less) what you want, no?
>

Yes, the functionality is exactly the same, but maybe for completeness
the function json_populate_array would still be nice.

In the old API the transformations between json and row/record types are well
covered, but for arrays only the array->json direction is covered.

I think this can be +/- 40 lines of C code





> Let's hope it gets submitted 17-ish, anyway
>
> Erik
>
>
>
>
>
>


Re: proposal: jsonb_populate_array

2023-08-14 Thread Erik Rijkers
On 8/14/23 at 14:51 Pavel Stehule wrote: > On Mon, 14 Aug 2023 at 11:32,
Alvaro Herrera  wrote:

> with proposed function I can write
>
> select jsonb_populate_array(null::date[],
> '["2023-07-13","2023-07-14"]'::jsonb)
>
Not yet committed, but outstanding
SQL/JSON patches (v11) will let you do:

select json_query(
'["2023-07-13", "2023-07-14"]'::jsonb
  , '$' returning date[]
);
   json_query
-
 {2023-07-13,2023-07-14}
(1 row)

That's (more or less) what you want, no?

Let's hope it gets submitted 17-ish, anyway

Erik









Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-08-14 Thread Tomas Vondra
Hi,

I haven't looked at the patch, but please add the patch to the next
commit fest (2023-09), so that we don't lose track of it.

See https://commitfest.postgresql.org


regards

Tomas

On 8/14/23 13:12, Jian Guo wrote:
> Hi hackers,
> 
> I have written a patch to add stats info for Vars in CTEs. With this
> patch, the join size estimation above CTE scans becomes more
> accurate.
> 
> The function |selfuncs.c:eqjoinsel| uses the number of
> distinct values of the two join variables to estimate the join size, and
> the function |selfuncs.c:get_variable_numdistinct| returns a default
> value |DEFAULT_NUM_DISTINCT| (200 in Postgres and 1000 in Greenplum);
> with the default value, you can never expect a good plan.
> 
> Thanks if anyone could give a review.
> 
> Regards,
> Jian
> 
> 
> *From:* Hans Buschmann 
> *Sent:* Wednesday, February 8, 2023 21:55
> *To:* pgsql-hackers@lists.postgresql.org
> 
> *Subject:* Wrong rows estimations with joins of CTEs slows queries by
> more than factor 500
>  
>   
> !! External Email
> 
> During data refactoring of our Application I encountered $subject when
> joining 4 CTEs with left join or inner join.
> 
> 
> 1. Background
> 
> PG 15.1 on Windows x64 (the OS seems to have no meaning here)
> 
> 
> I try to collect data from 4 (analyzed) tables (up,li,in,ou) by grouping
> certain data (4 CTEs qup,qli,qin,qou)
> 
> The grouping of the data in the CTEs gives estimated row counts of about
> 1000 (one tenth of the real value). This is OK for estimation.
> 
> 
> These 4 CTEs are then used to combine the data by joining them.
> 
> 
> 2. Problem
> 
> The 4 CTEs are joined by left joins as shown below:
> 
> 
> from qup
> left join qli on (qli.curr_season=qup.curr_season and
> qli.curr_code=qup.curr_code and qli.ibitmask>0 and
> cardinality(qli.mat_arr) <=8)
> left join qin on (qin.curr_season=qup.curr_season and
> qin.curr_code=qup.curr_code and qin.ibitmask>0 and
> cardinality(qin.mat_arr) <=8)
> left join qou on (qou.curr_season=qup.curr_season and
> qou.curr_code=qup.curr_code and qou.ibitmask>0 and
> cardinality(qou.mat_arr) <=11)
> where qup.ibitmask>0 and cardinality(qup.mat_arr) <=21
> 
> The plan first retrieves qup and qli, taking the estimated row counts of
> 1163 and 1147 respectively
> 
> 
> BUT the result is then hashed and the row count is estimated as 33!
> 
> 
> In a left join the row count always stays the same as that of the left
> table (here qup with 1163 rows).
> 
> 
> The same algorithm which reduces the row estimation from 1163 to 33 is
> used in the next step to give an estimate of 1 row.
> 
> This is totally wrong.
> 
> 
> Here is the execution plan of the query:
> 
> (search the plan for rows=33)
> 
> 
>                                                                    
> QUERY PLAN
> --
>  Append  (cost=13673.81..17463.30 rows=5734 width=104) (actual
> time=168.307..222.670 rows=9963 loops=1)
>    CTE qup
>      ->  GroupAggregate  (cost=5231.22..6303.78 rows=10320 width=80)
> (actual time=35.466..68.131 rows=10735 loops=1)
>            Group Key: sa_upper.sup_season, sa_upper.sup_sa_code
>            ->  Sort  (cost=5231.22..5358.64 rows=50969 width=18) (actual
> time=35.454..36.819 rows=50969 loops=1)
>                  Sort Key: sa_upper.sup_season, sa_upper.sup_sa_code
> COLLATE "C"
>                  Sort Method: quicksort  Memory: 4722kB
>                  ->  Hash Left Join  (cost=41.71..1246.13 rows=50969
> width=18) (actual time=0.148..10.687 rows=50969 loops=1)
>                        Hash Cond: ((sa_upper.sup_mat_code)::text =
> upper_target.up_mat_code)
>                        ->  Seq Scan on sa_upper  (cost=0.00..884.69
> rows=50969 width=16) (actual time=0.005..1.972 rows=50969 loops=1)
>                        ->  Hash  (cost=35.53..35.53 rows=495 width=6)
> (actual time=0.140..0.140 rows=495 loops=1)
>                              Buckets: 1024  Batches: 1  Memory Usage: 27kB
>                              ->  Seq Scan on upper_target 
> (cost=0.00..35.53 rows=495 width=6) (actual time=0.007..0.103 rows=495
> loops=1)
>                                    Filter: (id_up <= 495)
>                                    Rows Removed by Filter: 1467
>    CTE qli
>      ->  GroupAggregate  (cost=1097.31..1486.56 rows=10469 width=80)
> (actual time=9.446..27.388 rows=10469 loops=1)
>            Group Key: sa_lining.sli_season, sa_lining.sli_sa_code
>            ->  Sort  (cost=1097.31..1126.74 rows=11774 width=18) (actual
> time=9.440..9.811 rows=11774 loops=1)
>                  Sort Key: sa_lining.sli_season, sa_lining.sli_sa_code
> COLLATE "C"
>                  Sort Method: quicksort  Memory: 1120kB
>                  ->  Hash Left Join  (cost=7.34..301.19 rows=11774
> width=18) (actual time=0.045..2.438 

Re: Extract numeric filed in JSONB more effectively

2023-08-14 Thread Pavel Stehule
On Mon, 14 Aug 2023 at 11:17, Andy Fan  wrote:

>
>> you cannot use a type as a parameter. There should be some typed value,
>> like
>>
>> jsonb_object_field('{"a":10}', 'a', NULL::int)
>>
>> and return type should be anyelement.
>>
>>
> So could we get the input type in the body of jsonb_object_field?
> I guess not.  IIUC, our goal will still be missed this way.
>

Why not? You can easily build a null constant of any type.

>
> --
> Best Regards
> Andy Fan
>


Re: proposal: jsonb_populate_array

2023-08-14 Thread Pavel Stehule
On Mon, 14 Aug 2023 at 11:32, Alvaro Herrera  wrote:

> On 2023-Aug-14, Pavel Stehule wrote:
>
> > jsonb_populate_array(anyarray, jsonb) returns anyarray
> >
> > Usage:
> >
> > select jsonb_populate_array(null::text[],
> '["cust_full_name","cust_email"]')
>
> I don't understand what this does.  Can you be more explicit?
>

example

'["2023-07-13","2023-07-14"]'::jsonb -->  {2023-07-13,2023-07-14}::date[]

Now, I have to transform to a table, cast, and transform back to an
array, and I cannot write a generic function. I can only run this "slow" query

select array_agg(value::date) from
jsonb_array_elements_text('["2023-07-13","2023-07-14"]'::jsonb);

with proposed function I can write

select jsonb_populate_array(null::date[],
'["2023-07-13","2023-07-14"]'::jsonb)
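For illustration, the intended semantics of the proposed function can be modelled in a few lines of Python (a sketch only, not the C implementation; in SQL the target type is conveyed by the null-typed first argument, while here an element cast function stands in for it):

```python
import json
from datetime import date

def jsonb_populate_array(cast, js):
    """Model of jsonb_populate_array(anyarray, jsonb): parse a JSON
    array and cast every element to the array's element type."""
    values = json.loads(js)
    if not isinstance(values, list):
        raise ValueError("input is not a JSON array")
    return [cast(v) for v in values]

# '["2023-07-13","2023-07-14"]'::jsonb -> date[]
dates = jsonb_populate_array(date.fromisoformat,
                             '["2023-07-13","2023-07-14"]')
```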

Regards

Pavel



> --
> Álvaro Herrera   48°01'N 7°57'E  —
> https://www.EnterpriseDB.com/
> Maybe there's lots of data loss but the records of data loss are also lost.
> (Lincoln Yeoh)
>


Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

2023-08-14 Thread vignesh C
On Thu, 10 Aug 2023 at 10:16, Amit Kapila  wrote:
>
> On Wed, Aug 9, 2023 at 8:28 AM Zhijie Hou (Fujitsu)
>  wrote:
> >
> > On Thursday, August 3, 2023 7:30 PM Melih Mutlu   
> > wrote:
> >
> > > Right. I attached the v26 as you asked.
> >
> > Thanks for posting the patches.
> >
> > While reviewing the patch, I noticed one rare case where it's possible
> > that there are two table sync workers for the same table at the same time.
> >
> > The patch relies on LogicalRepWorkerLock to prevent concurrent access, but 
> > the
> > apply worker will start a new worker after releasing the lock. So, at the 
> > point[1]
> > where the lock is released and the new table sync worker has not been 
> > started,
> > it seems possible that another old table sync worker will be reused for the
> > same table.
> >
> > /* Now safe to release the LWLock */
> > LWLockRelease(LogicalRepWorkerLock);
> > *[1]
> > /*
> >  * If there are free sync worker slot(s), 
> > start a new sync
> >  * worker for the table.
> >  */
> > if (nsyncworkers < 
> > max_sync_workers_per_subscription)
> > ...
> > 
> > logicalrep_worker_launch(MyLogicalRepWorker->dbid,
> >
>
> Yeah, this is a problem. I think one idea to solve this is by
> extending the lock duration till we launch the tablesync worker but we
> should also consider changing this locking scheme such that there is a
> better way to indicate that for a particular rel, tablesync is in
> progress. Currently, the code in TablesyncWorkerMain() also acquires
> the lock in exclusive mode even though the tablesync for a rel is in
> progress, which I guess could easily hurt us for larger values of
> max_logical_replication_workers. So, that could be another motivation
> to think for a different locking scheme.

There are a couple of ways in which this issue can be solved:
Approach #1) Check that the reused worker has not already picked up this
table for table sync from logicalrep_worker_launch while holding a lock on
LogicalRepWorkerLock; if the reused worker has already picked it up for
processing, simply ignore it and return, as nothing has to be done by the
launcher in this case.
Approach #2)
a) The apply worker creates a shared memory list of all the relations that
need to be synced.
b) A tablesync worker takes a lock on this shared memory and picks the
next table to be processed (so tablesync workers need not fetch the
subscription relations again and again).
c) The tablesync worker updates the status in shared memory for the
relation (since the lock is held, there will be no concurrency issues) and
also marks the start time in shared memory; this helps avoid restarting a
failed table before wal_retrieve_retry_interval has expired.
d) The tablesync worker syncs the table.
e) The subscription relation is marked as ready and the tablesync worker
removes the entry from shared memory.
f) The apply worker periodically synchronizes the shared memory relations
to keep them in sync with the fetched subscription relation tables.
g) When the apply worker exits, the shared memory is cleared.

Approach #2) will also help in solving the other issue reported by Amit at [1].
I feel we can use Approach #2 to solve the problem, as it solves both
reported issues; there is also the added advantage that the reused table
sync worker need not scan pg_subscription_rel to find the non-ready tables
on every run, and can instead use the list prepared by the apply worker.
Thoughts?

[1] - 
https://www.postgresql.org/message-id/CAA4eK1KyHfVOVeio28p8CHDnuyKuej78cj_7U9mHAB4ictVQwQ%40mail.gmail.com
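A minimal sketch of the claiming idea in Approach #2 (hypothetical names; a Python lock and list stand in for the shared-memory lock and relation list, and the worker processes are elided):

```python
import threading

class TableSyncQueue:
    """Toy model of Approach #2: the apply worker publishes the set of
    not-ready relations; tablesync workers claim the next one under a
    lock, so two workers can never pick the same table."""

    def __init__(self, relations):
        self._lock = threading.Lock()
        self._pending = list(relations)   # filled by the apply worker
        self._in_progress = set()

    def claim_next(self):
        """Called by a tablesync worker; returns a relation to sync,
        or None when nothing is pending."""
        with self._lock:
            if not self._pending:
                return None
            rel = self._pending.pop(0)
            self._in_progress.add(rel)
            return rel

    def mark_ready(self, rel):
        """Called once the relation reaches the ready state."""
        with self._lock:
            self._in_progress.discard(rel)

q = TableSyncQueue(["t1", "t2"])
```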

Regards,
Vignesh




Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-08-14 Thread Jian Guo
Hi hackers,

I have written a patch to add stats info for Vars in CTEs. With this patch, the 
join size estimation above CTE scans becomes more accurate.

The function selfuncs.c:eqjoinsel uses the number of distinct values of the 
two join variables to estimate the join size, and the function 
selfuncs.c:get_variable_numdistinct returns a default value DEFAULT_NUM_DISTINCT 
(200 in Postgres and 1000 in Greenplum); with the default value, you can never 
expect a good plan.
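For reference, the no-MCV path of eqjoinsel essentially estimates selectivity as 1/max(nd1, nd2); a toy version of that formula shows how far off the estimate can be when the default ndistinct is used (the numbers below are illustrative, not taken from the reported plan):

```python
def eqjoinsel_rows(n1, n2, nd1, nd2):
    """Simplified equijoin size estimate (ignoring MCV lists and null
    fractions): selectivity ~ 1 / max(ndistinct1, ndistinct2)."""
    return n1 * n2 / max(nd1, nd2)

# with real statistics: a 100k x 100k join on a key with 100k distinct values
good = eqjoinsel_rows(100_000, 100_000, 100_000, 100_000)   # 100000.0 rows
# without usable stats, both sides fall back to DEFAULT_NUM_DISTINCT = 200
bad = eqjoinsel_rows(100_000, 100_000, 200, 200)            # 50000000.0 rows
```

The two estimates differ by a factor of 500, which is why a default ndistinct on CTE outputs can produce plans that are off by orders of magnitude.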

Thanks if anyone could give a review.

Regards,
Jian


From: Hans Buschmann 
Sent: Wednesday, February 8, 2023 21:55
To: pgsql-hackers@lists.postgresql.org 
Subject: Wrong rows estimations with joins of CTEs slows queries by more than 
factor 500

!! External Email

During data refactoring of our Application I encountered $subject when joining 
4 CTEs with left join or inner join.


1. Background

PG 15.1 on Windows x64 (the OS seems to have no meaning here)


I try to collect data from 4 (analyzed) tables (up,li,in,ou) by grouping 
certain data (4 CTEs qup,qli,qin,qou)

The grouping of the data in the CTEs gives estimated row counts of about 1000 
(one tenth of the real value). This is OK for estimation.


These 4 CTEs are then used to combine the data by joining them.


2. Problem

The 4 CTEs are joined by left joins as shown below:


from qup
left join qli on (qli.curr_season=qup.curr_season and 
qli.curr_code=qup.curr_code and qli.ibitmask>0 and cardinality(qli.mat_arr) <=8)
left join qin on (qin.curr_season=qup.curr_season and 
qin.curr_code=qup.curr_code and qin.ibitmask>0 and cardinality(qin.mat_arr) <=8)
left join qou on (qou.curr_season=qup.curr_season and 
qou.curr_code=qup.curr_code and qou.ibitmask>0 and cardinality(qou.mat_arr) 
<=11)
where qup.ibitmask>0 and cardinality(qup.mat_arr) <=21

The plan first retrieves qup and qli, taking the estimated row counts of 1163 
and 1147 respectively


BUT the result is then hashed and the row count is estimated as 33!


In a left join the row count always stays the same as that of the left table 
(here qup with 1163 rows).


The same algorithm which reduces the row estimation from 1163 to 33 is used in 
the next step to give an estimate of 1 row.

This is totally wrong.


Here is the execution plan of the query:

(search the plan for rows=33)


QUERY PLAN
--
 Append  (cost=13673.81..17463.30 rows=5734 width=104) (actual 
time=168.307..222.670 rows=9963 loops=1)
   CTE qup
 ->  GroupAggregate  (cost=5231.22..6303.78 rows=10320 width=80) (actual 
time=35.466..68.131 rows=10735 loops=1)
   Group Key: sa_upper.sup_season, sa_upper.sup_sa_code
   ->  Sort  (cost=5231.22..5358.64 rows=50969 width=18) (actual 
time=35.454..36.819 rows=50969 loops=1)
 Sort Key: sa_upper.sup_season, sa_upper.sup_sa_code COLLATE "C"
 Sort Method: quicksort  Memory: 4722kB
 ->  Hash Left Join  (cost=41.71..1246.13 rows=50969 width=18) 
(actual time=0.148..10.687 rows=50969 loops=1)
   Hash Cond: ((sa_upper.sup_mat_code)::text = 
upper_target.up_mat_code)
   ->  Seq Scan on sa_upper  (cost=0.00..884.69 rows=50969 
width=16) (actual time=0.005..1.972 rows=50969 loops=1)
   ->  Hash  (cost=35.53..35.53 rows=495 width=6) (actual 
time=0.140..0.140 rows=495 loops=1)
 Buckets: 1024  Batches: 1  Memory Usage: 27kB
 ->  Seq Scan on upper_target  (cost=0.00..35.53 
rows=495 width=6) (actual time=0.007..0.103 rows=495 loops=1)
   Filter: (id_up <= 495)
   Rows Removed by Filter: 1467
   CTE qli
 ->  GroupAggregate  (cost=1097.31..1486.56 rows=10469 width=80) (actual 
time=9.446..27.388 rows=10469 loops=1)
   Group Key: sa_lining.sli_season, sa_lining.sli_sa_code
   ->  Sort  (cost=1097.31..1126.74 rows=11774 width=18) (actual 
time=9.440..9.811 rows=11774 loops=1)
 Sort Key: sa_lining.sli_season, sa_lining.sli_sa_code COLLATE 
"C"
 Sort Method: quicksort  Memory: 1120kB
 ->  Hash Left Join  (cost=7.34..301.19 rows=11774 width=18) 
(actual time=0.045..2.438 rows=11774 loops=1)
   Hash Cond: ((sa_lining.sli_mat_code)::text = 
lining_target.li_mat_code)
   ->  Seq Scan on sa_lining  (cost=0.00..204.74 rows=11774 
width=16) (actual time=0.008..0.470 rows=11774 loops=1)
   ->  Hash  (cost=5.86..5.86 rows=118 width=6) (actual 
time=0.034..0.034 rows=119 loops=1)
 Buckets: 1024  Batches: 1  Memory Usage: 13kB
 ->  Seq Scan on lining_target  

Re: Ignore 2PC transaction GIDs in query jumbling

2023-08-14 Thread Dagfinn Ilmari Mannsåker
Michael Paquier  writes:

> On Sun, Aug 13, 2023 at 02:48:22PM +0800, Julien Rouhaud wrote:
>> On Sun, Aug 13, 2023 at 03:25:33PM +0900, Michael Paquier wrote:
>>> Perhaps not as much, actually, because I was just reminded that
>>> DEALLOCATE is something that pg_stat_statements ignores.  So this
>>> makes harder the introduction of tests.
>> 
>> Maybe it's time to revisit that?  According to [1] the reason why
>> pg_stat_statements currently ignores DEALLOCATE is because it indeed bloated
>> the entries but also because at that time the suggestion for jumbling only 
>> this
>> one was really hackish.
>
> Good point.  The argument of the other thread does not really hold
> much these days now that query jumbling can happen for all the utility
> nodes.
>
>> Now that we do have a sensible approach to jumble utility statements, I think
>> it would be beneficial to be able to track those, for instance to easily
>> diagnose a driver that doesn't rely on the extended protocol.
>
> Fine by me.  Would you like to write a patch?  I began typing an
> embryo of a patch a few days ago, and have not yet looked at the rest.
> Please see the attached.

As far as I could tell the only thing missing was removing
DeallocateStmt from the list of unhandled utility statement types (and
updating comments to match).  Updated patch attached.
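To illustrate what treating the statement name as a constant buys, here is a toy Python jumbler (an illustration of the idea only, not pg_stat_statements' actual tree-based algorithm); as in the patch's expected output, every DEALLOCATE variant collapses into a single "DEALLOCATE $1" entry:

```python
import hashlib
import re

def jumble(sql):
    """Toy query jumble: normalize DEALLOCATE [PREPARE] <name|ALL> to
    'DEALLOCATE $1' so all variants share one statistics entry, then
    hash the normalized text to get the entry key."""
    m = re.fullmatch(r"DEALLOCATE\s+(?:PREPARE\s+)?(\w+)", sql.strip(),
                     re.IGNORECASE)
    normalized = "DEALLOCATE $1" if m else sql.strip()
    return hashlib.sha256(normalized.encode()).hexdigest()[:12]
```

With this, `DEALLOCATE stat_select`, `DEALLOCATE PREPARE stat_select`, and `DEALLOCATE ALL` all map to one key instead of bloating the hash table with one entry per statement name.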

- ilmari

>From 7f11e362ef8c097a78463435a4bd1ab1031ea233 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Mon, 14 Aug 2023 11:59:44 +0100
Subject: [PATCH] Track DEALLOCATE statements in pg_stat_activity

Treat the statement name as a constant when jumbling.
---
 .../pg_stat_statements/expected/utility.out   | 40 +++
 .../pg_stat_statements/pg_stat_statements.c   |  8 +---
 contrib/pg_stat_statements/sql/utility.sql| 13 ++
 src/backend/parser/gram.y |  4 ++
 src/include/nodes/parsenodes.h|  6 ++-
 5 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/contrib/pg_stat_statements/expected/utility.out b/contrib/pg_stat_statements/expected/utility.out
index 93735d5d85..3681374869 100644
--- a/contrib/pg_stat_statements/expected/utility.out
+++ b/contrib/pg_stat_statements/expected/utility.out
@@ -472,6 +472,46 @@ SELECT pg_stat_statements_reset();
  
 (1 row)
 
+-- Execution statements
+SELECT 1 as a;
+ a 
+---
+ 1
+(1 row)
+
+PREPARE stat_select AS SELECT $1 AS a;
+EXECUTE stat_select (1);
+ a 
+---
+ 1
+(1 row)
+
+DEALLOCATE stat_select;
+PREPARE stat_select AS SELECT $1 AS a;
+EXECUTE stat_select (2);
+ a 
+---
+ 2
+(1 row)
+
+DEALLOCATE PREPARE stat_select;
+DEALLOCATE ALL;
+DEALLOCATE PREPARE ALL;
+SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+ calls | rows | query 
+---+--+---
+ 4 |0 | DEALLOCATE $1
+ 2 |2 | PREPARE stat_select AS SELECT $1 AS a
+ 1 |1 | SELECT $1 as a
+ 1 |1 | SELECT pg_stat_statements_reset()
+(4 rows)
+
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--
+ 
+(1 row)
+
 -- SET statements.
 -- These use two different strings, still they count as one entry.
 SET work_mem = '1MB';
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 55b957d251..06b65aeef5 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -104,8 +104,7 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
  * ignores.
  */
 #define PGSS_HANDLED_UTILITY(n)		(!IsA(n, ExecuteStmt) && \
-	!IsA(n, PrepareStmt) && \
-	!IsA(n, DeallocateStmt))
+	!IsA(n, PrepareStmt))
 
 /*
  * Extension version number, for supporting older extension versions' objects
@@ -830,8 +829,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 
 	/*
 	 * Clear queryId for prepared statements related utility, as those will
-	 * inherit from the underlying statement's one (except DEALLOCATE which is
-	 * entirely untracked).
+	 * inherit from the underlying statement's one.
 	 */
 	if (query->utilityStmt)
 	{
@@ -1116,8 +1114,6 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	 * calculated from the query tree) would be used to accumulate costs of
 	 * ensuing EXECUTEs.  This would be confusing, and inconsistent with other
 	 * cases where planning time is not included at all.
-	 *
-	 * Likewise, we don't track execution of DEALLOCATE.
 	 */
 	if (pgss_track_utility && pgss_enabled(exec_nested_level) &&
 		PGSS_HANDLED_UTILITY(parsetree))
diff --git a/contrib/pg_stat_statements/sql/utility.sql b/contrib/pg_stat_statements/sql/utility.sql
index 87666d9135..5f7d4a467f 100644
--- a/contrib/pg_stat_statements/sql/utility.sql
+++ b/contrib/pg_stat_statements/sql/utility.sql
@@ -237,6 +237,19 @@ DROP DOMAIN domain_stats;
 SELECT calls, rows, query 

Re: [PoC] Improve dead tuple storage for lazy vacuum

2023-08-14 Thread John Naylor
On Thu, Jul 13, 2023 at 3:09 PM Masahiko Sawada 
wrote:
>
> 0007, 0008, 0010, and 0011 are straightforward and agree to merge them.

[Part 1 - clear the deck of earlier performance work etc]

Thanks for taking a look! I've merged 0007 and 0008. The others need a
performance test to justify them -- an eyeball check is not enough. I've
now made the time to do that.

 sparse loads

v38 0001-0006 (still using node3 for this test only):

select avg(load_ms) from generate_series(1,100) x(x), lateral (select *
from bench_load_random_int(100 * 1000 * (1+x-x))) a;
 avg
-
 27.1000

select avg(load_ms) from generate_series(1,30) x(x), lateral (select * from
bench_load_random_int(500 * 1000 * (1+x-x))) a;
 avg
--
 165.6333

v38-0007-Optimize-RT_EXTEND_DOWN.patch

select avg(load_ms) from generate_series(1,100) x(x), lateral (select *
from bench_load_random_int(100 * 1000 * (1+x-x))) a;
 avg
-
 25.0900

select avg(load_ms) from generate_series(1,30) x(x), lateral (select * from
bench_load_random_int(500 * 1000 * (1+x-x))) a;
 avg
--
 157.3667

That seems worth doing.

v38-0008-Use-4-children-for-node-4-also-attempt-portable-.patch

This combines two things because I messed up a rebase: Use fanout of 4, and
try some macros for shmem sizes, both 32- and 64-bit. Looking at this much,
I no longer have a goal to have a separate set of size-classes for non-SIMD
platforms, because that would cause global maintenance problems -- it's
probably better to reduce worst-case search time where necessary. That
would be much more localized.

> I have some questions on 0009 patch:

> According to the comment, this optimization is for only gcc?

No, not at all. That tells me the comment is misleading.

> I think this change reduces
> readability and maintainability.

Well, that much is obvious. What is not obvious is how much it gains us
over the alternatives. I do have a simpler idea, though...

 load mostly node4

select * from bench_search_random_nodes(250*1000, '0xFF');
n4 = 42626, n16 = 21492, n32 = 0, n64 = 0, n256 = 257
 mem_allocated | load_ms | search_ms
---+-+---
   7352384 |  25 | 0

v38-0009-TEMP-take-out-search-time-from-bench.patch

This is just to allow LATERAL queries for better measurements.

select avg(load_ms) from generate_series(1,100) x(x), lateral (select *
from bench_search_random_nodes(250*1000 * (1+x-x), '0xFF')) a;

 avg
-
 24.8333

v38-0010-Try-a-simpler-way-to-avoid-memmove.patch

This slightly rewrites the standard loop so that gcc doesn't turn it into a
memmove(). Unlike the patch you didn't like, this *is* gcc-specific. (needs
a comment, which I forgot)

 avg
-
 21.9600

So, that's not a trivial difference. I wasn't a big fan of Andres'
__asm("") workaround, but that may be just my ignorance about it. We need
something like either of the two.

v38-0011-Optimize-add_child_4-take-2.patch
 avg
-
 21.3500

This is possibly faster than v38-0010, but it looks like it's not worth the
complexity, assuming the other way avoids the bug going forward.

> According to the bugzilla ticket
> referred to in the comment, it's realized as a bug in the community,
> so once the gcc bug fixes, we might no longer need this trick, no?

No comment in two years...

v38-0013-Use-constant-for-initial-copy-of-chunks-and-chil.patch

This is the same as v37-0011. I wasn't quite satisfied with it since it
still has two memcpy() calls, but it actually seems to regress:

 avg
-
 22.0900

v38-0012-Use-branch-free-coding-to-skip-new-element-index.patch

This patch uses a single loop for the copy.

 avg
-
 21.0300

Within noise level of v38-0011, but it's small and simple, so I like it, at
least for small arrays.

v38-0014-node48-Remove-need-for-RIGHTMOST_ONE-in-radix-tr.patch
v38-0015-node48-Remove-dead-code-by-using-loop-local-var.patch

Just small cleanups.

v38-0016-Use-memcpy-for-children-when-growing-into-node48.patch

Makes sense, but untested.

===
[Part 2]

Per off-list discussion with Masahiko, it makes sense to take some of the
ideas I've used locally on tidbitmap, and start incorporating them into
earlier vacuum work to get that out the door faster. With that in mind...

v38-0017-Make-tidstore-more-similar-to-tidbitmap.patch

This uses a simplified PagetableEntry (unimaginatively called
BlocktableEntry just to avoid confusion), to be replaced with the real
thing at a later date. This is still fixed size, to be replaced with a
varlen type.

Looking at the tidstore tests again after some months, I'm not particularly
pleased with the amount of code required for how little it seems to be
testing, nor the output 

Re: pgbench with libevent?

2023-08-14 Thread Tatsuo Ishii
> It could be refactored to support a different subset of event types --
> maybe just sockets, no latches and obviously no 'postmaster death'.

Ok.

> But figuring out how to make latches work between threads might also
> be interesting for future projects...

Maybe. Some people are working on threading PostgreSQL. They may
already know...

> Maybe Fabien has completion-based I/O in mind (not just "readiness").
> That's something that some of those libraries can do, IIUC.  For
> example, when your thread wakes up, it tells you "your socket read is
> finished, the data is already in your target buffer".  As opposed to
> "you can now call recv() without blocking", so you avoid another trip
> into the kernel.  But that's also something we'll eventually want to
> figure out in the server.

Agreed.
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Re: Synchronizing slots from primary to standby

2023-08-14 Thread shveta malik
On Mon, Aug 14, 2023 at 3:22 PM shveta malik  wrote:
>
> On Tue, Aug 8, 2023 at 11:11 AM Drouvot, Bertrand
>  wrote:
> >
> > Hi,
> >
> > On 8/8/23 7:01 AM, shveta malik wrote:
> > > On Mon, Aug 7, 2023 at 3:17 PM Drouvot, Bertrand
> > >  wrote:
> > >>
> > >> Hi,
> > >>
> > >> On 8/4/23 1:32 PM, shveta malik wrote:
> > >>> On Fri, Aug 4, 2023 at 2:44 PM Drouvot, Bertrand
> > >>>  wrote:
> >  On 7/28/23 4:39 PM, Bharath Rupireddy wrote:
> > >>
> > >
> > > Agreed. That is why in v10,v11 patches, we have different infra for
> > > sync-slot worker i.e. it is not relying on "logical replication
> > > background worker" anymore.
> >
> > yeah saw that, looks like the right way to go to me.
> >
> > >> Maybe we should start some tests/benchmark with only one sync worker to 
> > >> get numbers
> > >> and start from there?
> > >
> > > Yes, we can do that performance testing to figure out the difference
> > > between the two modes. I will try to get some statistics on this.
> > >
> >
> > Great, thanks!
> >
>
> We (myself and Ajin) performed the tests to compute the lag in standby
> slots as compared to primary slots with different number of slot-sync
> workers configured.
>
> 3 DBs were created, each with 30 tables and each table having one
> logical-pub/sub configured. So this made a total of 90 logical
> replication slots to be synced. Then the workload was run for approx 10
> mins. During this workload, at regular intervals, primary and standby
> slots' lsns were captured (from pg_replication_slots) and compared. At
> each capture, the intent was to know how much each standby's slot is
> lagging behind the corresponding primary's slot, by taking the distance
> between confirmed_flush_lsn of primary and standby slot. Then we took
> the average (integer value) of this distance over the span of 10 min
> workload and this is what we got:
>

I have attached the scripts for schema-setup, running workload and
capturing lag. Please go through Readme for details.


thanks
Shveta


Re: pgbench with libevent?

2023-08-14 Thread Fabien COELHO




4. libevent development seems sluggish; the last bugfix was published 3 years ago
   and version 2.2 has been baking for years, but the development seems lively
   (+100 contributors).


Ugh, I would stay away from something like that.  Would we become
hostage to an undelivering group?  No thanks.


Ok.


Or maybe libuv (used by nodejs?).



Note: libev had no updates in 8 years.


libev or libuv?  No updates in 8 years => dead.  No way.


Sorry, it was not a typo, but the information was not very explicit.
I have looked at 3 libraries: libevent, libuv and libev.

libuv is quite lively, last updated 2023-06-30.

libev is an often cited library, which indeed seems quite dead, so I was 
"noting" that I had discarded it, but it looked like a typo.


Reworking based on wait events as proposed downthread sounds more 
promising.


The wait event postgres backend implementation would require a lot of work 
to be usable in a client context.


My current investigation is that libuv could be the reasonable target, if 
any, especially as it seems to provide a portable thread pool as well.


--
Fabien.




Re: pgbench with libevent?

2023-08-14 Thread Fabien COELHO




Interesting. In my understanding this also needs to make Latch
frontend-friendly?


It could be refactored to support a different subset of event types --
maybe just sockets, no latches and obviously no 'postmaster death'.
But figuring out how to make latches work between threads might also
be interesting for future projects...

Maybe Fabien has completion-based I/O in mind (not just "readiness").


Pgbench is really a primitive client on top of libpq. ISTM that 
completion-based I/O would require enhancing libpq's asynchronicity, not 
just exposing its underlying fd to allow asynchronous implementations.

Currently pgbench only actually "waits" for results from the server,
testing PQisBusy to check whether they are there.


That's something that some of those libraries can do, IIUC.  For
example, when your thread wakes up, it tells you "your socket read is
finished, the data is already in your target buffer".


Indeed, libevent has a higher level "buffer" oriented API.

As opposed to "you can now call recv() without blocking", so you avoid 
another trip into the kernel.  But that's also something we'll 
eventually want to figure out in the server.


--
Fabien.




Re: pgbench with libevent?

2023-08-14 Thread Fabien COELHO


Hello Thomas,


Pgbench is managing clients I/Os manually with select or poll. Much of this
could be managed by libevent.


Or maybe libuv (used by nodejs?).

From preliminary testing, libevent seems not too good at the fine-grained
time management used for throttling, whereas libuv advertises that it is
good at it, although how well it does is yet to be seen.


Do you think our WaitEventSet stuff could be good here, if made
frontend-friendly?


Interesting question.

AFAICS, the answer is that it could indeed probably fit the task, but it 
would require significant work to make it thread-compatible, and to 
untangle it from IsUnderPostmaster/postmaster death, memory context, 
elog/ereport, and other backend-specific stuff.


If you remove all that with a clean abstraction (quite a task), then once 
done the question could be why not use libevent/libuv/… in the backend 
instead of maintaining more or less the same thing inside postgres?


So ISTM that as far as pgbench is concerned it would be much simpler to 
use libevent/libuv/… directly, if the pros are enough and the cons not 
prohibitive, and provided that the needed detailed features are really 
there.


--
Fabien.

Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

2023-08-14 Thread Amit Kapila
On Thu, Aug 10, 2023 at 10:15 AM Amit Kapila  wrote:
>
> On Wed, Aug 9, 2023 at 8:28 AM Zhijie Hou (Fujitsu)
>  wrote:
> >
> > On Thursday, August 3, 2023 7:30 PM Melih Mutlu   
> > wrote:
> >
> > > Right. I attached the v26 as you asked.
> >
> > Thanks for posting the patches.
> >
> > While reviewing the patch, I noticed one rare case where it's possible
> > that there are two table sync workers for the same table at the same time.
> >
> > The patch relies on LogicalRepWorkerLock to prevent concurrent access, but 
> > the
> > apply worker will start a new worker after releasing the lock. So, at the 
> > point[1]
> > where the lock is released and the new table sync worker has not been 
> > started,
> > it seems possible that another old table sync worker will be reused for the
> > same table.
> >
> > /* Now safe to release the LWLock */
> > LWLockRelease(LogicalRepWorkerLock);
> > *[1]
> > /*
> >  * If there are free sync worker slot(s), 
> > start a new sync
> >  * worker for the table.
> >  */
> > if (nsyncworkers < 
> > max_sync_workers_per_subscription)
> > ...
> > 
> > logicalrep_worker_launch(MyLogicalRepWorker->dbid,
> >
>
> Yeah, this is a problem. I think one idea to solve this is by
> extending the lock duration till we launch the tablesync worker but we
> should also consider changing this locking scheme such that there is a
> better way to indicate that for a particular rel, tablesync is in
> progress. Currently, the code in TablesyncWorkerMain() also acquires
> the lock in exclusive mode even though the tablesync for a rel is in
> progress, which I guess could easily hurt us for larger values of
> max_logical_replication_workers. So, that could be another motivation
> to think for a different locking scheme.
>

Yet another problem is that currently apply worker maintains a hash
table for 'last_start_times' to avoid restarting the tablesync worker
immediately upon error. The same functionality is missing while
reusing the table sync worker. One possibility is to use a shared hash
table to remember start times but I think it may depend on what we
decide to solve the previous problem reported by Hou-San.
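The idea behind 'last_start_times' can be sketched in a few lines; this is a minimal Python illustration only (the class name, method name, and monotonic-clock choice are inventions of this sketch, not PostgreSQL's actual code, which keys the hash table by relation OID and throttles by wal_retrieve_retry_interval):

```python
import time

# Per-relation restart throttle, in the spirit of the apply worker's
# 'last_start_times' hash table: refuse to restart a tablesync worker
# for a relation that errored out only moments ago.
class RestartThrottle:
    def __init__(self, min_interval_s):
        self.min_interval_s = min_interval_s
        self.last_start = {}  # relid -> timestamp of last worker start

    def may_start(self, relid, now=None):
        """Return True (and record the start) unless the previous start
        for this relation was less than min_interval_s ago."""
        if now is None:
            now = time.monotonic()
        last = self.last_start.get(relid)
        if last is not None and now - last < self.min_interval_s:
            return False  # restarted too recently after an error: back off
        self.last_start[relid] = now
        return True
```

A shared hash table, as suggested above, would let reused workers consult the same state across processes.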

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2023-08-14 Thread shveta malik
On Tue, Aug 8, 2023 at 11:11 AM Drouvot, Bertrand
 wrote:
>
> Hi,
>
> On 8/8/23 7:01 AM, shveta malik wrote:
> > On Mon, Aug 7, 2023 at 3:17 PM Drouvot, Bertrand
> >  wrote:
> >>
> >> Hi,
> >>
> >> On 8/4/23 1:32 PM, shveta malik wrote:
> >>> On Fri, Aug 4, 2023 at 2:44 PM Drouvot, Bertrand
> >>>  wrote:
>  On 7/28/23 4:39 PM, Bharath Rupireddy wrote:
> >>
> >
> > Agreed. That is why in v10,v11 patches, we have different infra for
> > sync-slot worker i.e. it is not relying on "logical replication
> > background worker" anymore.
>
> yeah saw that, looks like the right way to go to me.
>
> >> Maybe we should start some tests/benchmark with only one sync worker to 
> >> get numbers
> >> and start from there?
> >
> > Yes, we can do that performance testing to figure out the difference
> > between the two modes. I will try to get some statistics on this.
> >
>
> Great, thanks!
>

We (myself and Ajin) performed the tests to compute the lag in standby
slots as compared to primary slots with different number of slot-sync
workers configured.

3 DBs were created, each with 30 tables and each table having one
logical-pub/sub configured. So this made a total of 90 logical
replication slots to be synced. Then the workload was run for approx 10
mins. During this workload, at regular intervals, primary and standby
slots' lsns were captured (from pg_replication_slots) and compared. At
each capture, the intent was to know how much each standby's slot is
lagging behind the corresponding primary's slot, by taking the distance
between confirmed_flush_lsn of primary and standby slot. Then we took
the average (integer value) of this distance over the span of 10 min
workload and this is what we got:

With max_slot_sync_workers=1, average-lag =  42290.3563
With max_slot_sync_workers=2, average-lag =  24585.1421
With max_slot_sync_workers=3, average-lag =  14964.9215

This shows that more workers have better chances to keep logical
replication slots in sync for this case.
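The lag arithmetic described above can be reproduced with a short script. This is only a sketch of the computation (a pg_lsn value is a 64-bit position printed as two hex halves separated by '/'), not the test harness attached upthread:

```python
def parse_lsn(text):
    """Convert a pg_lsn value such as '0/16B3748' to a 64-bit integer."""
    hi, lo = text.split('/')
    return (int(hi, 16) << 32) | int(lo, 16)

def average_lag(samples):
    """samples: (primary confirmed_flush_lsn, standby confirmed_flush_lsn)
    text pairs captured at regular intervals.  Returns the average distance
    in bytes, truncated to an integer as in the figures quoted above."""
    lags = [parse_lsn(p) - parse_lsn(s) for p, s in samples]
    return sum(lags) // len(lags)
```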

Another statistic, if it interests you: we ran a frequency test as well
(by changing code, unit-test style) to figure out the total number of
times synchronization was done with different numbers of slot-sync
workers configured. Same 3-DB setup with each DB having 30 logical
replication slots. With 'max_slot_sync_workers' set at 1, 2 and 3, the
total number of times synchronization was done was 15874, 20205 and
23414 respectively. Note: this is not on the same machine where we
captured the lsn-gap data; it is on a slightly less efficient machine,
but it gives almost the same picture.

Next we are planning to capture this data for smaller numbers of slots,
like 10, 30, 50, etc. It may happen that the benefit of multiple workers
over a single worker is smaller in such cases, but let's have the data
to verify that.

Thanks Ajin for jointly working on this.

thanks
Shveta




Re: Add PG CI to older PG releases

2023-08-14 Thread Nazir Bilal Yavuz
Hi,

On Fri, 11 Aug 2023 at 02:00, Andres Freund  wrote:
> At the very least this would need to be combined with
>
> commit 950e64fa46b164df87b5eb7c6e15213ab9880f87
> Author: Andres Freund 
> Date:   2022-07-18 17:06:34 -0700
>
> Use STDOUT/STDERR_FILENO in most of syslogger.

950e64fa46b164df87b5eb7c6e15213ab9880f87 needs to be combined with
92f478657c5544eba560047c39eba8a030ddb83e. So, I combined
59751ae47fd43add30350a4258773537e98d4063,
950e64fa46b164df87b5eb7c6e15213ab9880f87 and
92f478657c5544eba560047c39eba8a030ddb83e in a single commit [1].

v2-0001-windows-Fix-windows-logging-problems.patch is a combination of
[1] and the rest are the same.

Regards,
Nazir Bilal Yavuz
Microsoft
From 213b049f4f62d22e9b87fbaa572f9e35cf518d63 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Thu, 10 Aug 2023 18:40:35 +0300
Subject: [PATCH v2 2/3] Make PG_TEST_USE_UNIX_SOCKETS work for tap tests on
 windows.

This commit is backpatched version of
45f52709d86ceaaf282a440f6311c51fc526340b for PG 14.
---
 src/bin/pg_ctl/t/001_start_stop.pl |  1 +
 src/test/perl/PostgresNode.pm  | 13 -
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_ctl/t/001_start_stop.pl b/src/bin/pg_ctl/t/001_start_stop.pl
index d656a7fe183..110926e40e3 100644
--- a/src/bin/pg_ctl/t/001_start_stop.pl
+++ b/src/bin/pg_ctl/t/001_start_stop.pl
@@ -35,6 +35,7 @@ print $conf TestLib::slurp_file($ENV{TEMP_CONFIG})
 if ($use_unix_sockets)
 {
 	print $conf "listen_addresses = ''\n";
+	$tempdir_short =~ s!\\!/!g if $TestLib::windows_os;
 	print $conf "unix_socket_directories = '$tempdir_short'\n";
 }
 else
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index aae02b18d69..ca2902e6430 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -126,7 +126,18 @@ INIT
 	$use_tcp= !$TestLib::use_unix_sockets;
 	$test_localhost = "127.0.0.1";
 	$last_host_assigned = 1;
-	$test_pghost= $use_tcp ? $test_localhost : TestLib::tempdir_short;
+	if ($use_tcp)
+	{
+		$test_pghost = $test_localhost;
+	}
+	else
+	{
+		# On windows, replace windows-style \ path separators with / when
+		# putting socket directories either in postgresql.conf or libpq
+		# connection strings, otherwise they are interpreted as escapes.
+		$test_pghost = TestLib::tempdir_short;
+		$test_pghost =~ s!\\!/!g if $TestLib::windows_os;
+	}
 	$ENV{PGHOST}= $test_pghost;
 	$ENV{PGDATABASE}= 'postgres';
 
-- 
2.40.1

From 01606cfec52fe4af7c61cdda32d1dcf15d4268cb Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Tue, 7 Sep 2021 11:56:13 -0700
Subject: [PATCH v2 1/3] windows: Fix windows logging problems

This commit is a combination of
54213cb30968c6679050c005ebd1363b77d797c5,
950e64fa46b164df87b5eb7c6e15213ab9880f87 and
92f478657c5544eba560047c39eba8a030ddb83e. These three commits need to be
backpatched together.

54213cb30968c6679050c005ebd1363b77d797c5:
Previously pgwin32_is_service() would falsely return true when postgres is
started from somewhere within a service, but not as a service. That is
e.g. always the case with windows docker containers, which some CI services
use to run windows tests in.

When postgres falsely thinks it's running as a service, no messages are
written to stdout/stderr. That can be very confusing and causes a few tests
to fail.

To fix this, additionally check whether stderr is invalid in pgwin32_is_service(). For
that to work in backend processes, pg_ctl is changed to pass down handles so
that postgres can do the same check (otherwise "default" handles are created).

While this problem exists in all branches, there have been no reports by
users, the prospective CI usage currently is only for master, and I am not a
windows expert. So doing the change in only master for now seems the sanest
approach.

Author: Andres Freund 
Reviewed-By: Magnus Hagander 
Discussion: https://postgr.es/m/20210305185752.3up5eq2eanb7o...@alap3.anarazel.de

950e64fa46b164df87b5eb7c6e15213ab9880f87:
Use STDOUT/STDERR_FILENO in most of syslogger.

This fixes problems on windows when logging collector is used in a service,
failing with:
FATAL:  could not redirect stderr: Bad file descriptor

This is triggered by 76e38b37a5. The problem is that STDOUT/STDERR_FILENO
aren't defined on windows, which led us to use _fileno(stdout) etc., but that
doesn't work if stdout/stderr are closed.

Author: Andres Freund 
Reported-By: Sandeep Thakkar 
Message-Id: 20220520164558.ozb7lm6unakqz...@alap3.anarazel.de (on pgsql-packagers)
Backpatch: 15-, where 76e38b37a5 came in

92f478657c5544eba560047c39eba8a030ddb83e:
windows: msvc: Define STDIN/OUT/ERR_FILENO.

Because they are not available we've used _fileno(stdin) in some places, but
that doesn't reliably work, because stdin might be closed. This is the
prerequisite of the subsequent commit, fixing a failure introduced in
76e38b37a5.

Author: Andres Freund 
Reported-By: Sandeep Thakkar 
Message-Id: 

Re: pgbench with libevent?

2023-08-14 Thread Alvaro Herrera
On 2023-Aug-13, Fabien COELHO wrote:

> 4. libevent development seems sluggish; the last bugfix was published 3 years
>ago and version 2.2 has been baking for years, but the development seems
>lively (+100 contributors).

Ugh, I would stay away from something like that.  Would we become
hostage to an undelivering group?  No thanks.

On 2023-Aug-14, Fabien COELHO wrote:

> Or maybe libuv (used by nodejs?).

> Note: libev had no updates in 8 years.

libev or libuv?  No updates in 8 years => dead.  No way.


Reworking based on wait events as proposed downthread sounds more
promising.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Siempre hay que alimentar a los dioses, aunque la tierra esté seca" (Orual)




Re: proposal: jsonb_populate_array

2023-08-14 Thread Alvaro Herrera
On 2023-Aug-14, Pavel Stehule wrote:

> jsonb_populate_array(anyarray, jsonb) returns anyarray
> 
> Usage:
> 
> select jsonb_populate_array(null::text[], '["cust_full_name","cust_email"]')

I don't understand what this does.  Can you be more explicit?

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
Maybe there's lots of data loss but the records of data loss are also lost.
(Lincoln Yeoh)
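(For readers following along: per the proposal upthread, the typed NULL in the first argument only supplies the target element type, as in `jsonb_populate_array(null::date[], '["2023-07-13","2023-07-14"]'::jsonb)`. A rough Python analogue, where an explicit converter function stands in for the type argument — a simplification of this sketch, not the proposed C implementation:)

```python
import datetime
import json

# Rough analogue of the proposed jsonb_populate_array(anyarray, jsonb):
# parse the JSON array and coerce each element to the target type.
def populate_array(convert, jsonb_text):
    return [convert(v) for v in json.loads(jsonb_text)]

dates = populate_array(datetime.date.fromisoformat,
                       '["2023-07-13", "2023-07-14"]')
```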




Re: Support to define custom wait events for extensions

2023-08-14 Thread Michael Paquier
On Mon, Aug 14, 2023 at 05:55:42PM +0900, Masahiro Ikeda wrote:
> I'm thinking a name like "contrib name + description summary" would
> be nice. The "contrib name"  is namespace-like and the "description summary"
> is the same as the name of a wait event in core. For example,
> "DblinkConnect" for dblink. As with the core ones, I thought the name
> should be camel case.

Or you could use something more in line with the other in-core wait
events formatted as camel-case, like DblinkConnect, etc.

> BTW, is it better to discuss this in a new thread because other developers
> might be interested in user-facing wait event names? I also would like to
> add documentation on the wait events for each modules, as they are not 
> mentioned
> now.

That said, having some documentation on the page of each extension
is mandatory once these can be customized, in my opinion.  All that
should be discussed on a new, separate thread, to attract the right
audience.
--
Michael


signature.asc
Description: PGP signature


Re: POC: Extension for adding distributed tracing - pg_tracing

2023-08-14 Thread Nikita Malakhov
Hi!

Storing spans is a tricky question. Monitoring systems often use a pull
model, with probes or a SQL API for pulling data, so from this point of
view it is much more convenient to keep spans in a separate table. But
then we come to another issue: how do we flush this data into the table?
Automatic flushing on a full buffer would significantly affect query
performance at points that look random to the user. I'd rather add a
GUC, off by default, to store spans in the table instead of the buffer.
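To make the trade-off concrete, here is a toy model of the two policies. This is purely illustrative Python; pg_tracing's real buffer lives in shared memory, and the `store_spans_in_table` toggle is a made-up GUC name used only for this sketch:

```python
class SpanBuffer:
    """Toy model of span storage: a bounded in-memory buffer with either
    a drop-when-full policy (keeps queries fast) or an automatic flush to
    a stand-in 'table' list."""

    def __init__(self, capacity, store_spans_in_table=False):
        self.capacity = capacity
        self.store_spans_in_table = store_spans_in_table
        self.buf = []
        self.table = []      # stand-in for a real spans table
        self.dropped = 0

    def add(self, span):
        if len(self.buf) < self.capacity:
            self.buf.append(span)
        elif self.store_spans_in_table:
            # flush-on-full: the cost lands on whichever query fills it
            self.table.extend(self.buf)
            self.buf = [span]
        else:
            self.dropped += 1  # off by default: never pay the flush cost
```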

>3) I'm testing more complex queries. Most of my previous tests were using
simple query protocol but extended protocol introduces
>differences that break some assumptions I did. For example, with multi
statement transaction like
>BEGIN;
>SELECT 1;
>SELECT 2;
>The parse of SELECT 2 will happen before the ExecutorEnd (and the
end_tracing) of SELECT 1. For now, I'm skipping the post parse
>hook if we still have an ongoing tracing.

I've checked this behavior before and haven't noticed the case you
mentioned, but for
loops like
for i in 1..2 loop
StartTime := clock_timestamp();
insert into t2 values (i, a_data);
EndTime := clock_timestamp();
Delta := 1000 * ( extract(epoch from EndTime) - extract(epoch from
StartTime) );
end loop;

I've got the following call sequence:
psql:/home/postgres/tests/trace.sql:52: NOTICE:
 pg_tracing_post_parse_analyze hook <1>
psql:/home/postgres/tests/trace.sql:52: NOTICE:  pg_tracing_planner_hook <1>
psql:/home/postgres/tests/trace.sql:52: NOTICE:
 pg_tracing_post_parse_analyze hook <2>
psql:/home/postgres/tests/trace.sql:52: NOTICE:  pg_tracing_planner_hook <2>
psql:/home/postgres/tests/trace.sql:52: NOTICE:
 pg_tracing_post_parse_analyze hook 
psql:/home/postgres/tests/trace.sql:52: NOTICE:  pg_tracing_planner_hook

psql:/home/postgres/tests/trace.sql:52: NOTICE:
 pg_tracing_post_parse_analyze hook 
psql:/home/postgres/tests/trace.sql:52: NOTICE:  pg_tracing_planner_hook

psql:/home/postgres/tests/trace.sql:52: NOTICE:  pg_tracing_ExecutorStart

psql:/home/postgres/tests/trace.sql:52: NOTICE:  pg_tracing_ExecutorRun
 
psql:/home/postgres/tests/trace.sql:52: NOTICE:  pg_tracing_ExecutorFinish

psql:/home/postgres/tests/trace.sql:52: NOTICE:  pg_tracing_ExecutorEnd

psql:/home/postgres/tests/trace.sql:52: NOTICE:
 pg_tracing_post_parse_analyze hook 
psql:/home/postgres/tests/trace.sql:52: NOTICE:  pg_tracing_planner_hook

psql:/home/postgres/tests/trace.sql:52: NOTICE:
 pg_tracing_post_parse_analyze hook 
psql:/home/postgres/tests/trace.sql:52: NOTICE:  pg_tracing_planner_hook

psql:/home/postgres/tests/trace.sql:52: NOTICE:  pg_tracing_planner_hook

psql:/home/postgres/tests/trace.sql:52: NOTICE:  pg_tracing_ExecutorStart

psql:/home/postgres/tests/trace.sql:52: NOTICE:  pg_tracing_ExecutorRun
 
psql:/home/postgres/tests/trace.sql:52: NOTICE:  pg_tracing_ExecutorFinish

psql:/home/postgres/tests/trace.sql:52: NOTICE:  pg_tracing_ExecutorEnd


-- 
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company
https://postgrespro.ru/


Re: Extract numeric filed in JSONB more effectively

2023-08-14 Thread Andy Fan
>
>
> you cannot to use type as parameter. There should be some typed value -
> like
>
> jsonb_object_field, '{"a":10}', 'a', NULL::int)
>
> and return type should be anyelement.
>
>
So could we get the input type in the body of jsonb_object_field?
I guess not.  IIUC, our goal would still be missed that way.

-- 
Best Regards
Andy Fan


Re: Support to define custom wait events for extensions

2023-08-14 Thread Masahiro Ikeda

On 2023-08-14 15:28, Michael Paquier wrote:

On Mon, Aug 14, 2023 at 12:31:05PM +0900, Masahiro Ikeda wrote:

Thanks! I confirmed the changes, and all tests passed.


Okay, cool.  I got some extra time today and applied that, with a few
more tweaks.


Thanks for applying master branch!


This looks correct, but perhaps we need to think harder about the
custom event names and define a convention when more of this stuff is
added to the core modules.


I checked the source code to see how many functions use WAIT_EVENT_EXTENSION.
There are 3 contrib modules and a test module that use WAIT_EVENT_EXTENSION,
and there are 8 places where it is passed as an argument.

* dblink
 - dblink_get_conn(): the wait event is set until the connection 
establishment succeeded

 - dblink_connect(): same as above

* autoprewarm
 - autoprewarm_main(): the wait event is set until shutdown request is 
received

 - autoprewarm_main(): the wait event is set until the next dump time

* postgres_fdw
 - connect_pg_server(): the wait event is set until connection 
establishment succeeded
 - pgfdw_get_result(): the wait event is set until the results are 
received

 - pgfdw_get_cleanup_result(): same as above except for abort cleanup

* test_shm_mq
 - wait_for_workers_to_become_ready(): the wait event is set until the
workers become ready


I'm thinking a name like "contrib name + description summary" would be
nice. The "contrib name" is namespace-like and the "description summary"
is the same as the name of a wait event in core. For example,
"DblinkConnect" for dblink. As with the core ones, I thought the name
should be camel case.

BTW, is it better to discuss this in a new thread, because other
developers might be interested in user-facing wait event names? I would
also like to add documentation on the wait events for each module, as
they are not mentioned now.


Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION




Re: pgsql: Allow tailoring of ICU locales with custom rules

2023-08-14 Thread Peter Eisentraut

On 24.07.23 04:46, Amit Kapila wrote:

On Fri, Mar 10, 2023 at 3:24 PM Peter Eisentraut
 wrote:


On 08.03.23 21:57, Jeff Davis wrote:


* It appears rules IS NULL behaves differently from rules=''. Is that
desired? For instance:
create collation c1(provider=icu,
  locale='und-u-ka-shifted-ks-level1',
  deterministic=false);
create collation c2(provider=icu,
  locale='und-u-ka-shifted-ks-level1',
  rules='',
  deterministic=false);
select 'a b' collate c1 = 'ab' collate c1; -- true
select 'a b' collate c2 = 'ab' collate c2; -- false


I'm puzzled by this.  The general behavior is, extract the rules of the
original locale, append the custom rules, use that.  If the custom rules
are the empty string, that should match using the original rules
untouched.  Needs further investigation.


* Can you document the interaction between locale keywords
("@colStrength=primary") and a rule like '[strength 2]'?


I'll look into that.


This thread is listed on PostgreSQL 16 Open Items list. This is a
gentle reminder to see if there is a plan to move forward with respect
to open points.


I have investigated this.  My assessment is that how PostgreSQL 
interfaces with ICU is correct.  Whether what ICU does is correct might 
be debatable.  I have filed a bug with ICU about this: 
https://unicode-org.atlassian.net/browse/ICU-22456 , but there is no 
response yet.


You can work around this by including the desired attributes in the 
rules string, for example


create collation c3 (provider=icu,
  locale='und-u-ka-shifted-ks-level1',
  rules='[alternate shifted][strength 1]',
  deterministic=false);

So I don't think there is anything we need to do here for PostgreSQL 16.





Re: Fix pg_stat_reset_single_table_counters function

2023-08-14 Thread Masahiro Ikeda

Hi,

On 2023-08-13 04:12, Andres Freund wrote:

On 2023-08-10 17:48:10 +0900, Masahiko Sawada wrote:
Good catch! I've confirmed that the issue has been fixed by your 
patch.


Indeed.


Thanks for your responses!


However, I'm not sure the added regression tests are stable since
autovacuum workers may scan the pg_database and increment the
statistics after resetting the stats.


What about updating the table and checking that the update count is
reset? That'd not be reset by autovacuum.


Yes. I confirmed that the stats are incremented by autovacuum as you 
said.


I updated the patch to v3.
* remove the code to bump the CATALOG_VERSION_NO because I misunderstood
* change the test logic to check the update count instead of scan count

I changed the table used to check the stats from pg_database to
pg_shdescription, because its stats can be updated via the SQL-level
COMMENT command.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATIONFrom 215ef8ef68af30753cfcd4336b1f6bd9203ac014 Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda 
Date: Mon, 14 Aug 2023 16:48:30 +0900
Subject: [PATCH] Fix pg_stat_reset_single_table_counters function.

This commit revives the feature to reset statistics for a single
relation shared across all databases in the cluster to zero, which
was implemented by the following commit.
* Enhance pg_stat_reset_single_table_counters function(e04267844)

The following commit accidentally deleted the feature.
* pgstat: store statistics in shared memory(5891c7a8e)

Needs to be backpatched down to 15.

Reported-by: Mitsuru Hinata
---
 src/backend/utils/adt/pgstatfuncs.c |  9 +++--
 src/test/regress/expected/stats.out | 31 +
 src/test/regress/sql/stats.sql  | 13 
 3 files changed, 51 insertions(+), 2 deletions(-)

diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 2a4c8ef87f..2b9742ad21 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -17,6 +17,7 @@
 #include "access/htup_details.h"
 #include "access/xlog.h"
 #include "access/xlogprefetcher.h"
+#include "catalog/catalog.h"
 #include "catalog/pg_authid.h"
 #include "catalog/pg_type.h"
 #include "common/ip.h"
@@ -1776,13 +1777,17 @@ pg_stat_reset_shared(PG_FUNCTION_ARGS)
 	PG_RETURN_VOID();
 }
 
-/* Reset a single counter in the current database */
+/*
+ * Reset statistics for a single object, which may belong to the current
+ * database or be shared across all databases in the cluster.
+ */
 Datum
 pg_stat_reset_single_table_counters(PG_FUNCTION_ARGS)
 {
 	Oid			taboid = PG_GETARG_OID(0);
+	Oid			dboid = (IsSharedRelation(taboid) ? InvalidOid : MyDatabaseId);
 
-	pgstat_reset(PGSTAT_KIND_RELATION, MyDatabaseId, taboid);
+	pgstat_reset(PGSTAT_KIND_RELATION, dboid, taboid);
 
 	PG_RETURN_VOID();
 }
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 319164a5e9..11cb841386 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -764,6 +764,37 @@ FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
 2 | t  |3 | t
 (1 row)
 
+-
+-- Test to reset stats for a table shared across all databases (ex. pg_shdescription)
+-
+BEGIN;
+SELECT current_database() as current_database \gset
+COMMENT ON DATABASE :current_database IS 'This is a test comment';  -- insert or update in 'pg_shdescription'
+SELECT pg_stat_force_next_flush();
+ pg_stat_force_next_flush 
+--
+ 
+(1 row)
+
+COMMIT;
+SELECT n_tup_ins + n_tup_upd > 0 FROM pg_stat_all_tables WHERE relid = 'pg_shdescription'::regclass;
+ ?column? 
+--
+ t
+(1 row)
+
+SELECT pg_stat_reset_single_table_counters('pg_shdescription'::regclass);
+ pg_stat_reset_single_table_counters 
+-
+ 
+(1 row)
+
+SELECT n_tup_ins + n_tup_upd FROM pg_stat_all_tables WHERE relid = 'pg_shdescription'::regclass;
+ ?column? 
+--
+0
+(1 row)
+
 -
 -- Test that various stats views are being properly populated
 -
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index 9a16df1c49..d113aed257 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -376,6 +376,19 @@ COMMIT;
 SELECT seq_scan, :'test_last_seq' = last_seq_scan AS seq_ok, idx_scan, :'test_last_idx' < last_idx_scan AS idx_ok
 FROM pg_stat_all_tables WHERE relid = 'test_last_scan'::regclass;
 
+-
+-- Test to reset stats for a table shared across all databases (ex. pg_shdescription)
+-
+
+BEGIN;
+SELECT current_database() as current_database \gset
+COMMENT ON DATABASE :current_database IS 'This is a test comment';  -- insert or update in 'pg_shdescription'
+SELECT pg_stat_force_next_flush();
+COMMIT;
+
+SELECT n_tup_ins + n_tup_upd > 0 FROM pg_stat_all_tables WHERE relid = 'pg_shdescription'::regclass;
+SELECT pg_stat_reset_single_table_counters('pg_shdescription'::regclass);
+SELECT n_tup_ins + n_tup_upd FROM 

Re: Extract numeric filed in JSONB more effectively

2023-08-14 Thread Pavel Stehule
po 14. 8. 2023 v 9:06 odesílatel Andy Fan  napsal:

>
>> We'd still have functions like jsonb_field_as_numeric() under the
>> hood, but there's not an expectation that users call them explicitly.
>>
>
> To avoid lots of functions like jsonb_field_as_int2/int4, I ended up
> defining Datum jsonb_object_field_type(.., Oid target_oid), so the
> function must return "internal" or "anyelement".  Then we can see:
>
> select jsonb_object_field_type(tb.a, 'a'::text, 1700) from tb;
> ERROR:  cannot display a value of type anyelement.
>

you cannot use a type as a parameter. There should be some typed value, like

jsonb_object_field('{"a":10}', 'a', NULL::int)

and return type should be anyelement.

Another solution would be a deeper change, like the implementation of
"coalesce".



>
> The reason is clear to me, but I'm not sure how to fix it, or whether it
> deserves a fix. Or shall I define jsonb_object_field_int2/int8 to avoid this?
>
> This is an unresolved issue at the latest patch.
> --
> Best Regards
> Andy Fan
>


Re: pgbench with libevent?

2023-08-14 Thread Thomas Munro
On Mon, Aug 14, 2023 at 6:07 PM Tatsuo Ishii  wrote:
> Interesting. In my understanding this would also require making Latch
> frontend-friendly?

It could be refactored to support a different subset of event types --
maybe just sockets, no latches and obviously no 'postmaster death'.
But figuring out how to make latches work between threads might also
be interesting for future projects...

Maybe Fabien has completion-based I/O in mind (not just "readiness").
That's something that some of those libraries can do, IIUC.  For
example, when your thread wakes up, it tells you "your socket read is
finished, the data is already in your target buffer".  As opposed to
"you can now call recv() without blocking", so you avoid another trip
into the kernel.  But that's also something we'll eventually want to
figure out in the server.
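For illustration only (not PostgreSQL code), the readiness-based model
that pgbench's select/poll handling and WaitEventSet implement can be
sketched in Python: the kernel only reports that a socket is readable,
and the recv() that actually moves the bytes is the extra kernel trip a
completion-based API would avoid.

```python
import selectors
import socket

# Readiness-based model: the kernel tells us a socket is readable,
# and we still have to call recv() ourselves to move the bytes.
a, b = socket.socketpair()
sel = selectors.DefaultSelector()
sel.register(b, selectors.EVENT_READ)

a.sendall(b"ping")

# Wait until the kernel reports readiness on the other end.
events = sel.select(timeout=1.0)
assert events, "socket never became readable"
data = b.recv(4)  # the second trip into the kernel
print(data)       # b'ping'

sel.close()
a.close()
b.close()
```

In a completion-based design, the wakeup would instead carry "your read
already finished, the data is in your buffer".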




Re: Report planning memory in EXPLAIN ANALYZE

2023-08-14 Thread Ashutosh Bapat
On Mon, Aug 14, 2023 at 8:22 AM Andrey Lepikhov
 wrote:
>
> Really, the current approach with the final value of consumed memory
> smooths peaks of memory consumption. I recall examples like massive
> million-sized arrays or reparameterization with many partitions, where
> the optimizer consumes much additional memory during planning.
> Ideally, to dive into planner issues, we should have something like
> an in-progress report as in vacuum, reporting memory consumption at
> each subquery and join level. But that looks like too much for typical queries.

The planner usually finishes within a second. When partitioning is
involved it might take a few dozen seconds, but that is still within a
minute, and we are working to reduce it as well, to a couple hundred
milliseconds at most. Tracking memory usage over such a short time may
not be worth it. The tracking itself might make planning inefficient,
and we might still miss spikes in memory allocation if they are very
short-lived. If the planner runs for more than a few minutes, maybe we
could add some tracking.

-- 
Best Wishes,
Ashutosh Bapat




Re: Extract numeric filed in JSONB more effectively

2023-08-14 Thread Andy Fan
>
>
> We'd still have functions like jsonb_field_as_numeric() under the
> hood, but there's not an expectation that users call them explicitly.
>

To avoid having lots of functions like jsonb_field_as_int2/int4, I defined
Datum jsonb_object_field_type(.., Oid target_oid) in the end, so the
function must return "internal" or "anyelement".  Then we can see:

select jsonb_object_field_type(tb.a, 'a'::text, 1700) from tb;
ERROR:  cannot display a value of type anyelement.

The reason is clear to me, but I'm not sure how to fix that, or whether it
deserves a fix. Or shall I define jsonb_object_field_int2/int8 to avoid this?

This is an unresolved issue at the latest patch.
-- 
Best Regards
Andy Fan


Re: Report planning memory in EXPLAIN ANALYZE

2023-08-14 Thread Ashutosh Bapat
On Mon, Aug 14, 2023 at 5:23 AM David Rowley  wrote:
>
> On Thu, 10 Aug 2023 at 20:33, Ashutosh Bapat
>  wrote:
> > My point is what's relevant here is how much net memory planner asked
> > for.
>
> But that's not what your patch is reporting. All you're reporting is
> the difference in memory that's *currently* palloc'd from before and
> after the planner ran.  If we palloc'd 600 exabytes then pfree'd it
> again, your metric won't change.

Maybe I didn't use the right phrase "asked for", but I expected that to
be read in the context of "net memory" - "net" as an adjective [1].
Anyway, I will make it clearer below.

>
> I'm struggling a bit to understand your goals here.  If your goal is
> to make a series of changes that reduces the amount of memory that's
> palloc'd at the end of planning, then your patch seems to suit that
> goal, but per the quote above, it seems you care about how many bytes
> are palloc'd during planning and your patch does not seem track that.

There are three metrics here. M1: the total number of bytes that the
planner "requests". M2: the total number of bytes that "remain
palloc'ed" at a given point in time. M3: the maximum number of bytes
that were palloc'ed at any point in time during planning. The following
sequence of operations explains the difference:
p1 palloc: 100, M1 = 100, M2 = 100, M3 = 100
p2 palloc: 100, M1 = 200, M2 = 200, M3 = 200
p3 pfree: 100, M1 = 200, M2 = 100, M3 = 200
p4 palloc: 100, M1 = 300, M2 = 200, M3 = 200
p5 palloc: 100, M1 = 400, M2 = 300, M3 = 300
p6 pfree: 100, M1 = 400, M2 = 200, M3 = 300

The patch reports M2 at the end of planning.
MemoryContextData::mem_allocated is not exactly the same as M3 but
gives indication of M3.
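For illustration, the arithmetic of the three metrics can be replayed
with a toy tracker (hypothetical Python, not PostgreSQL code); running
the p1..p6 sequence ends with M1=400, M2=200, M3=300:

```python
class AllocTracker:
    """Toy model of the three metrics: m1 = total bytes ever requested,
    m2 = bytes currently allocated, m3 = high-water mark of m2."""
    def __init__(self):
        self.m1 = self.m2 = self.m3 = 0

    def palloc(self, n):
        self.m1 += n
        self.m2 += n
        self.m3 = max(self.m3, self.m2)

    def pfree(self, n):
        self.m2 -= n

t = AllocTracker()
t.palloc(100)   # p1
t.palloc(100)   # p2
t.pfree(100)    # p3
t.palloc(100)   # p4
t.palloc(100)   # p5
t.pfree(100)    # p6
print(t.m1, t.m2, t.m3)   # 400 200 300
```

Note that a pfree followed by a palloc leaves M2 unchanged while M1 keeps
growing, which is why M2 alone smooths out peaks.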

My goal is to reduce all three of M1, M2 and M3. I don't think tracking
M1 is easy, or worth the complexity. M2 and M3 instead act as rough
indicators of M1.
>
> With your patch as it is, to improve the metric you're reporting we
> could go off and do things like pfree Paths once createplan.c is done,
> but really, why would we do that? Just to make the "Planning Memory"
> metric looks better doesn't seem like a worthy goal.
>

As I mentioned earlier, the CurrentMemoryContext at the time of
planning is also used during query execution. There are other contexts,
like per-row and per-operator contexts, but anything that is specific to
the running query and does not fit those contexts is allocated in this
context. If we reduce the memory that remains palloc'ed (M2) at the end
of planning, it can be used during execution. So it looks like a
worthy goal.

> Instead, if we reported the context's mem_allocated, then it would
> give us the flexibility to make changes to the memory context code to
> have the metric look better. It might also alert us to planner
> inefficiencies and problems with new code that may cause a large spike
> in the amount of memory that gets allocated.

This is M3. I agree with your reasoning here. We should report M3 as
well. I will make changes to the patch.

> Now, I'm not saying we
> should add a patch that shows mem_allocated. I'm just questioning if
> your proposed patch meets the goals you're trying to achieve.  I just
> suggested that you might want to consider something else as a metric
> for your memory usage reduction work.

Ok. Thanks for the suggestions. More suggestions are welcome too.

[1] https://www.merriam-webster.com/dictionary/net

-- 
Best Wishes,
Ashutosh Bapat




Re: Adding a LogicalRepWorker type field

2023-08-14 Thread Peter Smith
The main patch for adding the worker type enum has been pushed [1].

Here is the remaining (rebased) patch for changing some previous
cascading if/else to switch on the LogicalRepWorkerType enum instead.

PSA v8.

--
[1] 
https://github.com/postgres/postgres/commit/2a8b40e3681921943a2989fd4ec6cdbf8766566c

Kind Regards,
Peter Smith.
Fujitsu Australia


v8-0001-Switch-on-worker-type.patch
Description: Binary data


Re: Support to define custom wait events for extensions

2023-08-14 Thread Michael Paquier
On Mon, Aug 14, 2023 at 12:31:05PM +0900, Masahiro Ikeda wrote:
> Thanks! I confirmed the changes, and all tests passed.

Okay, cool.  I got some extra time today and applied that, with a few
more tweaks.
--
Michael


signature.asc
Description: PGP signature


Re: pgbench with libevent?

2023-08-14 Thread Tatsuo Ishii
> On Mon, Aug 14, 2023 at 12:35 PM Fabien COELHO  wrote:
>> > Pgbench is managing client I/Os manually with select or poll. Much of this
>> > could be managed by libevent.
>>
>> Or maybe libuv (used by nodejs?).
>>
>> From preliminary testing, libevent seems not too good at the fine-grained
>> time management used for throttling, whereas libuv advertises that it
>> is good at it, although what it does is yet to be seen.
> 
> Do you think our WaitEventSet stuff could be good here, if made
> frontend-friendly?

Interesting. In my understanding this would also require making Latch
frontend-friendly?

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp