Re: bgwrite process is too lazy

2024-10-06 Thread wenhui qiu
Hi Andres Freund
  Thank you for explanation

> I doubt that slowdown is caused by bgwriter not being active enough. I suspect
> what you're seeing is one or more of:

> a) The overhead of doing full page writes (due to increasing the WAL
>   volume). You could verify whether that's the case by turning
>   full_page_writes off (but note that that's not generally safe!) or see if
>   the overhead shrinks if you set wal_compression=zstd or wal_compression=lz4
>   (don't use pglz, it's too slow).

> b) The overhead of renaming WAL segments during recycling. You could see if
>   this is related by specifying --wal-segsize 512 or such during initdb.
I am aware of these optimizations, but they only mitigate the impact. I
deliberately left WAL compression off during the stress test.

shared_buffers = '32GB'
bgwriter_delay = '10ms'
bgwriter_lru_maxpages = '8192'
bgwriter_lru_multiplier = '10.0'
wal_buffers = '64MB'
checkpoint_completion_target = '0.999'
checkpoint_timeout = '600'
max_wal_size = '32GB'
min_wal_size = '16GB'

I think in business scenarios with many reads and few writes, it is indeed
desirable to keep as many dirty pages in memory as possible. However, in
scenarios such as push systems and task-scheduling systems, which have
heavy read and write traffic, the impact of checkpoints is much more
visible. An adaptive bgwriter, or one triggered when dirty pages reach a
certain watermark, would eliminate the performance jitter that checkpoints
cause. From what I understand, quite a few commercial databases based on
PostgreSQL have added an adaptive dirty-page flushing feature, and
according to their internal reports the whole stress-testing process was
very smooth! Since it's a trade secret, I don't know how they implemented
this feature.
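As a side note, whether the bgwriter is being capped by
bgwriter_lru_maxpages can be watched from pg_stat_bgwriter (a monitoring
sketch; what counts as "too high" is workload-dependent):

```sql
-- maxwritten_clean counts the rounds in which the bgwriter stopped a
-- cleaning scan because it had already written bgwriter_lru_maxpages
-- buffers; if it keeps climbing, the bgwriter is the bottleneck.
SELECT buffers_clean, maxwritten_clean, buffers_alloc, stats_reset
FROM pg_stat_bgwriter;
```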


Re: Psql meta-command conninfo+

2024-10-06 Thread Hunaid Sohail
Hi David,

Thank you for your feedback.

On Fri, Oct 4, 2024 at 11:56 AM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> It seems to me a more useful definition for what this command should print
> out is basically the entire contents of:
>
> https://www.postgresql.org/docs/current/libpq-status.html
>
> That page has three sections:
> Connection Invariants
> Current Status
> Encryption (TLS)
>
> I would suggest that we thus produce three tables - one for each.  In the
> case of SSL, a message saying “not used” instead of a table full of blanks
> probably suffices, though I’d lean to print all of what is available at all
> times.
>

We can try this approach. I would also like to hear others' opinions on
it.

Most functions are already used, while some are not required (IMO). I have
listed all the functions from the doc link you provided, along with my
brief comments based on the latest patch (v35).

PQdb - already used
PQuser - already used
PQpass - no need
PQhost - already used
PQhostaddr - already used
PQport - already used
PQtty - no need
PQoptions - can be used

PQstatus - no need
PQtransactionStatus - can be used
PQparameterStatus - already used
PQprotocolVersion - already used
PQserverVersion  - no need
PQerrorMessage  - no need
PQsocket  - no need
PQbackendPID - already used
PQconnectionNeedsPassword  - no need
PQconnectionUsedPassword - can be used
PQconnectionUsedGSSAPI - already used

PQsslInUse - already used
PQsslAttribute - only key_bits attribute not used
PQsslAttributeNames  - no need
PQsslStruct  - no need
PQgetssl  - no need

For PQparameterStatus, some parameters are already used.
server_version and application_name were already discussed and removed in
v12 and v29 respectively. Do we need other parameters?


> Within that framework having \conninfo[+[CSE][…]] be the command -
> printing out only the table specified would be the behavior (specifying no
> suffix letters prints all three) - would be an option.
>

3 separate tables without suffix?

If others are okay with this, I can work on this approach and will provide
a patch before the next CF.

Regards,
Hunaid Sohail


Re: Add parallel columns for seq scan and index scan on pg_stat_all_tables and _indexes

2024-10-06 Thread Guillaume Lelarge
Le lun. 7 oct. 2024 à 02:41, Michael Paquier  a écrit :

> On Mon, Oct 07, 2024 at 12:43:18AM +0300, Alena Rybakina wrote:
> > Maybe I'm not aware of the whole context of the thread and maybe my
> > questions will seem a bit stupid, but honestly
> > it's not entirely clear to me how this statistics will help to adjust the
> > number of parallel workers.
> > We may have situations when, during overestimation of the cardinality
> > during query optimization, several parallel workers were unjustifiably
> > generated and vice versa -
> > due to a high workload only a few workers were generated.
> > How do we identify such cases so as not to increase or decrease the
> > number of parallel workers when it is not necessary?
>
> Well.  For spiky workloads, these numbers alone are not going to help.
> You would need to map them to the number of times a query related to
> these tables has been called, which is something pg_stat_statements
> would be able to show more information about.
>
> FWIW, I have doubts that these numbers attached to this portion of the
> system are always useful.  For OLTP workloads, parallel workers would
> unlikely be spawned because even with JOINs we won't work with a high
> number of tuples that require them.  This could be interesting with
> analytics, however complex query sequences mean that we'd still need
> to look at all the plans involving the relations where there is an
> imbalance of planned/spawned workers, because these can usually
> involve quite a few gather nodes.  At the end of the day, it seems to
> me that we would still need data that involves statements to track
> down specific plans that are starving.  If your application does not
> have that many statements, looking at individual plans is OK, but if
> you have hundreds of them to dig into, this is time-consuming and
> stats at table/index level don't offer data in terms of stuff that
> stands out and needs adjustments.
>
> And this is without the argument of bloating the stats entries further
> for each table, even if that matters less now that these stats live in
> shared memory.
>

We need granularity because we have granularity in the config. There is
pg_stat_database because it gives the whole picture and it is easier to
monitor. And then, there is pg_stat_statements to analyze problematic
statements. And finally there is pg_stat_all* because you can set
parallel_workers on a specific table.
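For example, the per-table knob these views would pair with (the table
name is made up):

```sql
-- Per-table override of the planner's worker estimate; the proposed
-- per-table counters would show whether a setting like this pays off.
ALTER TABLE big_events SET (parallel_workers = 4);
```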

Anyway, offering various ways of getting the same information is not
unheard of. Pretty much like temp_files/temp_bytes in pg_stat_database,
temp_blks_read/temp_blks_written in pg_stat_statements and log_temp_files
in log files if you ask me :)


-- 
Guillaume.


Re: Should rolpassword be toastable?

2024-10-06 Thread Hannu Krosing
On Fri, Oct 4, 2024 at 4:48 PM Nathan Bossart  wrote:
>
..
> Since BLCKSZ can be as low as 1024, I think 512 would be a good choice.
>


Where did you get the minimal value of 1024 from?

I vaguely remember someone testing with 256 at some point in the past.

---
Hannu




System views for versions reporting

2024-10-06 Thread Dmitry Dolgov
Hi,

Based on the feedback in [1], here is my attempt at implementing system
views for versions reporting. It adds pg_system_versions for showing
things like core version, compiler, LLVM, etc, and pg_system_libraries
for showing linked shared objects. I think everyone has agreed that the
first was a good idea, while the second was only suggested -- after some
thinking I find shared objects useful enough to include here as well.

The main idea is to facilitate bug reporting. In particular, in many JIT
related bug reports one of the first questions is often "which LLVM
version is used?". Gathering such information is a manual process,
mistakes could be made when verifying llvm-config output or anywhere
else. Having a system view for such purposes makes the process a bit
more robust.

The first three patches are essential for this purpose, the fourth one
is somewhat independent and could be considered in isolation. The output
looks like this:

=# select * from pg_system_versions;
   name   |   version    |     type
----------+--------------+--------------
 Arch     | x86_64-linux | Compile Time
 ICU      | 15.1         | Compile Time
 Core     | 18devel      | Compile Time
 Compiler | gcc-14.0.1   | Compile Time
 LLVM     | 18.1.6       | Run Time

=# select * from pg_system_libraries;
        name
---------------------
 /lib64/libkrb5.so.3
 /lib64/libz.so.1
 linux-vdso.so.1
 /lib64/libxml2.so.2
 [...]

Any feedback is appreciated.

0001-Add-infrastructure-for-pg_system_versions-view

Prepares the infrastructure, adds the view without populating it with
the data just yet. The view is backed by a hash table, which contains
callbacks returning a version string for a particular component. The
idea is to allow some flexibility in reporting, making components
responsible for how and when the information is exposed, e.g. allowing a
component to report that its version is not available.

0002-Add-core-versions-to-pg_system_versions.patch

Actually populates the view with some predefined compile time versions.
The versions are registered in PostgresMain, right after
BeginReportingGUCOptions -- there is no strong reasoning for that except
that it looks fine to me, so feel free to suggest a better place.

0003-Add-JIT-provider-version-to-pg_system_versions.patch

Finally adds the LLVM version to the view by extending the set of JIT
provider callbacks. The registration happens together with the core
versions from the previous patch, because the JIT provider itself is
initialized only when the first expression is compiled.

0004-Add-pg_system_libraries-view.patch

Strictly speaking independent from the above patches. Adds the view to
show linked shared objects by iterating dl_phdr_info with
dl_iterate_phdr. It belongs to the standard C library, so I assume it's
portable enough.

[1]: 
https://www.postgresql.org/message-id/flat/znc72ymyoelvk5rjk5ub254v3qvcczfrk6autygjdobfvx2e7p%40s3dssvf34twa
From d903142d178dd8a9c03d07ee4809ac582b9b7818 Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Sat, 5 Oct 2024 18:27:57 +0200
Subject: [PATCH v1 1/4] Add infrastructure for pg_system_versions view

Introduce a unified way of reporting versions (PostgreSQL itself, the
compiler, the host system, compile and runtime dependencies, etc.) via a
new system view pg_system_versions. This is going to be useful for
troubleshooting and should enhance bug reports, replacing manual
bug-prone collecting of the same information.

The view is backed by a hash table that contains callbacks returning a
version string for a particular component. The idea is to allow some
flexibility in reporting, making components responsible for how and when
the information is exposed.
---
 doc/src/sgml/system-views.sgml  |  56 
 src/backend/catalog/system_views.sql|   8 ++
 src/backend/utils/misc/Makefile |   3 +-
 src/backend/utils/misc/meson.build  |   1 +
 src/backend/utils/misc/system_version.c | 108 
 src/include/catalog/pg_proc.dat |   6 ++
 src/include/utils/system_version.h  |  40 +
 src/test/regress/expected/rules.out |   8 ++
 8 files changed, 229 insertions(+), 1 deletion(-)
 create mode 100644 src/backend/utils/misc/system_version.c
 create mode 100644 src/include/utils/system_version.h

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 634a4c0..df0b9d3 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -226,6 +226,11 @@
   wait events
  
 
+ 
+  pg_system_versions
+  system versions
+ 
+
 

   
@@ -5064,4 +5069,55 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
   
  
 
+ 
+  pg_system_versions
+
+  
+   pg_system_versions
+  
+
+  
+   The view pg_system_versions provides description
+   about versions of various

Re: System views for versions reporting

2024-10-06 Thread Joe Conway

On 10/6/24 11:36, Dmitry Dolgov wrote:

Hi,

Based on the feedback in [1], here is my attempt at implementing system
views for versions reporting. It adds pg_system_versions for showing
things like core version, compiler, LLVM, etc, and pg_system_libraries
for showing linked shared objects. I think everyone has agreed that the
first was a good idea, while the second was only suggested -- after some
thinking I find shared objects useful enough to include here as well.

The main idea is to facilitate bug reporting. In particular, in many JIT
related bug reports one of the first questions is often "which LLVM
version is used?". Gathering such information is a manual process,
mistakes could be made when verifying llvm-config output or anywhere
else. Having a system view for such purposes makes the process a bit
more robust.

The first three patches are essential for this purpose, the fourth one
is somewhat independent and could be considered in isolation. The output
looks like this:

=# select * from pg_system_versions;
   name   |   version    |     type
----------+--------------+--------------
 Arch     | x86_64-linux | Compile Time
 ICU      | 15.1         | Compile Time
 Core     | 18devel      | Compile Time
 Compiler | gcc-14.0.1   | Compile Time
 LLVM     | 18.1.6       | Run Time


I'm not sure why ICU is "Compile Time" rather than "Run Time" when it is 
not statically linked. Also, if we are going to include ICU here, 
shouldn't we also include libc version?



=# select * from pg_system_libraries;
        name
---------------------
 /lib64/libkrb5.so.3
 /lib64/libz.so.1
 linux-vdso.so.1
 /lib64/libxml2.so.2
 [...]

Any feedback is appreciated.


I think it would be nice to include a sha256 hash (or something similar) 
of the libraries as well, so that they can be checked against known good 
values.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add minimal C example and SQL registration example for custom table access methods.

2024-10-06 Thread Michael Paquier
On Fri, May 24, 2024 at 03:59:08PM -0300, Fabrízio de Royes Mello wrote:
> Nice... LGTM!

I have noticed that this was still in the CF.  After fixing a couple
of inconsistencies in the markups and the names, trimming down the
list of headers to avoid rot, and adding a static in front of the const,
the result looked pretty much OK, so applied.
--
Michael


signature.asc
Description: PGP signature


RE: Make default subscription streaming option as Parallel

2024-10-06 Thread Hayato Kuroda (Fujitsu)
Dear Vignesh,

> One key point to consider is that the lock on transaction objects will
> be held for a longer duration when using streaming in parallel. This
> occurs because the parallel apply worker initiates the transaction as
> soon as streaming begins, maintaining the lock until the transaction
> is fully completed. As a result, for long-running transactions, this
> extended lock can hinder concurrent access that requires a lock.

So, the current default is the most conservative and can run anywhere;
your proposal is a bit more optimistic, right? Since long transactions
should be avoided in any case, and everyone knows it, I agree with your
point.

One comment on your patch:
Shouldn't we add a streaming=off case to the pg_dump test? You added the
lines, but nothing exercises them.

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Re: [PROPOSAL] : Disallow use of empty column name in (column_name '') in ALTER or CREATE of foreign table.

2024-10-06 Thread Michael Paquier
On Thu, Aug 22, 2024 at 04:00:13PM +0530, Nishant Sharma wrote:
> I may be wrong, but just wanted to share my thoughts on the differences.
> So, it
> can be considered a different issue/mistake and can be handled separately in
> another email thread.

+else if (strcmp(def->defname, "column_name") == 0)
+{
+    char       *col_name_opt = defGetString(def);
+
+    /*
+     * PostgresSQL follows SQL syntax, so we do not allow empty
+     * column_name option.
+     */
+    if (col_name_opt && col_name_opt[0] == '\0')
+        ereport(ERROR,
+                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                 errmsg("colum_name option cannot be empty for postgres_fdw")));
+}

If we begin to care about empty names in column_name in the FDW
command, shouldn't we also care about empty values in schema_name and
table_name?

Typos: PostgresSQL -> PostgreSQL and colum_name -> column_name.

Once you associate table_name and schema_name, you can save in
translation by rewording the errmsg like that (no need to mention
postgres_fdw, note the quotes around the option name):
errmsg("cannot use empty value for option \"%s\"",
   "column_name");
--
Michael


signature.asc
Description: PGP signature


Re: Set query_id for query contained in utility statement

2024-10-06 Thread Michael Paquier
On Fri, Oct 04, 2024 at 08:16:00PM +0800, jian he wrote:
> about v5 0001
> select_with_parens:
> '(' select_no_parens ')'{ $$ = $2; }
> | '(' select_with_parens ')'{ $$ = $2; }
> ;
> 
> 
>  toplevel | calls | query
> --+---+---
>  t| 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
>  t| 0 | SELECT toplevel, calls, query FROM pg_stat_statements+
>   |   |   ORDER BY query COLLATE "C", toplevel
>  t| 1 | explain (select $1)
>  f| 1 | select $1);
> 
> query "select $1);" looks weird. not sure how to improve it,
> or this should be the expected behavior?

Good point, this is confusing.  The point is that having only
stmt_location is not enough to detect where the element in the query
you want to track is because it only points at its start location in
the full query string.  In an ideal world, what we should have is its
start and end, pass it down to pgss_store(), and store only this
subquery between the start and end positions in the stats entry.
Making that right through the parser may be challenging, though.

This concept is something that's perhaps larger than this thread?  I
think that we want the same kind of thing for values in IN() and ANY()
clauses, where we want to track an area for a single normalization
parameter, perhaps with a separate node_attr.  I am not sure if using
the same trackers would make sense, so I am just waving hands a bit
here, but the concepts required are quite close.

Saying that, a patch set implemented this way would ensure a strict
1:1 mapping between a query ID and the internal query in these EXPLAIN
and CREATE commands, which would be good.

The first step should be IMO to expand the tests of pgss and track all
the behaviors we have historically in the tree about all that.  Then,
it becomes much easier to track how much we want to tweak them
depending on if pgss.track is set to "top" or "all", and easier to see
how a behavior changes when manipulating the parse node structures
with location data.
--
Michael


signature.asc
Description: PGP signature


Make default subscription streaming option as Parallel

2024-10-06 Thread vignesh C
Hi,

Currently, streaming of in-progress transactions is disabled by default
for subscriptions: every transaction is fully decoded on the publisher
before being sent to the subscriber. This can lead to increased latency
and reduced performance, particularly under heavy load. We could instead
enable the parallel streaming option by default, so that incoming
changes are applied directly by one of the available parallel apply
workers. This significantly improves the performance of commit
operations.
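For reference, the behavior being proposed as the default is what you
get today with the following (subscription, connection, and publication
names are placeholders):

```sql
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION mypub
    WITH (streaming = parallel);

-- or, for an existing subscription:
ALTER SUBSCRIPTION mysub SET (streaming = parallel);
```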

I conducted a series of tests using logical replication, comparing SQL
execution times with streaming set to both parallel and off. The tests
varied the logical_decoding_work_mem setting and included the
following scenarios: a) Insert, b) Delete, c) Update, d) rollback 5%
records, e) rollback 10% records, f) rollback 20% records, g) rollback
50% records. I have written tap tests for the same; the attached files
can be copied to src/test/subscription/t, and the
logical_decoding_work_mem configuration and the streaming option in the
CREATE SUBSCRIPTION command should be changed accordingly before running
the tests. The tests were executed 5 times and the average was taken.
The execution time is in seconds.

Insert 5kk records
Logical Decoding mem | Parallel |   off   | % Improvement
---------------------|----------|---------|--------------
 64 KB               |  37.304  |  69.465 |  46.298
 256 KB              |  36.327  |  70.671 |  48.597
 64 MB               |  41.173  |  69.228 |  40.526

Delete 5kk records
Logical Decoding mem | Parallel |   off   | % Improvement
---------------------|----------|---------|--------------
 64 KB               |  42.322  |  69.404 |  39.021
 256 KB              |  43.250  |  66.973 |  35.422
 64 MB               |  44.183  |  67.873 |  34.903

Update 5kk records
Logical Decoding mem | Parallel |   off   | % Improvement
---------------------|----------|---------|--------------
 64 KB               |  93.953  | 127.691 |  26.422
 256 KB              |  94.166  | 128.541 |  26.743
 64 MB               |  93.367  | 134.275 |  30.465

Rollback 05% records
Logical Decoding mem | Parallel |   off   | % Improvement
---------------------|----------|---------|--------------
 64 KB               |  36.968  |  67.161 |  44.957
 256 KB              |  38.059  |  68.021 |  44.047
 64 MB               |  39.431  |  66.878 |  41.041

Rollback 10% records
Logical Decoding mem | Parallel |   off   | % Improvement
---------------------|----------|---------|--------------
 64 KB               |  35.966  |  63.968 |  43.775
 256 KB              |  36.597  |  64.836 |  43.554
 64 MB               |  39.069  |  64.357 |  39.292

Rollback 20% records
Logical Decoding mem | Parallel |   off   | % Improvement
---------------------|----------|---------|--------------
 64 KB               |  37.616  |  58.903 |  36.139
 256 KB              |  37.330  |  58.606 |  36.303
 64 MB               |  38.720  |  60.236 |  35.720

Rollback 50% records
Logical Decoding mem | Parallel |   off   | % Improvement
---------------------|----------|---------|--------------
 64 KB               |  38.999  |  44.776 |  12.902
 256 KB              |  36.567  |  44.530 |  17.882
 64 MB               |  38.592  |  45.346 |  14.893

The machine configuration that was used is also attached.

The tests demonstrate a significant performance improvement when using
the parallel streaming option: insert shows a 40-48% improvement, delete
34-39%, and update 26-30%. For rollback, the improvement is between 12%
and 44%, and it shrinks slightly as the amount of data being rolled back
grows. If there is a significant amount of data to roll back, the
performance of streaming in parallel may be comparable to, or in some
instances slightly lower than, streaming off. However, this is
acceptable since commit operations are generally more frequent than
rollback operations.

One key point to consider is that the lock on transaction objects will
be held for a longer duration when using streaming in parallel. This
occurs because the parallel apply worker initiates the transaction as
soon as streaming begins, maintaining the lock until the transaction
is fully completed. As a result, for long-running transactions, this
extended lock c

Re: GUC names in messages

2024-10-06 Thread Michael Paquier
On Tue, Sep 10, 2024 at 05:11:13PM +1000, Peter Smith wrote:
> I have rebased the two remaining patches. See v12 attached.

I've looked over the patch set again, and applied 0002.

0001 could be more ambitious and more consistent, like:
- The GUC name coming from the table's record is only used for
PGC_ENUM; let's use conf->gen.name across the board so that this also
covers custom GUCs.
- Once we do the first bullet point, parse_and_validate_value() can be
simplified as it does not need its "name" argument anymore.
- Once the second bullet point is simplified, going one way up
reveals more in set_config_with_handle().  I was wondering about
pg_parameter_aclcheck() for a bit until I've noticed
convert_GUC_name_for_parameter_acl() that applies a conversion with
the contents of the catalogs when checking for a parameter ACL,
similar to guc_name_compare().

One limitation is in AlterSystemSetConfigFile() where the parameter
name comes from the command and not a GUC record as the ACL check
happens before the GUC table lookup, but we could live with that.

At the end, I get the attached revised 0001.  WDYT?
--
Michael
From 3b3f4a241a2d127b421a968f5fb19d555e55b91c Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Mon, 7 Oct 2024 13:17:01 +0900
Subject: [PATCH v13] Apply GUC name from central table in more places of guc.c

---
 src/backend/utils/misc/guc.c  | 65 +++
 src/test/regress/expected/guc.out |  4 ++
 src/test/regress/sql/guc.sql  |  3 ++
 3 files changed, 39 insertions(+), 33 deletions(-)

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 13527fc258..507a5d329a 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3115,7 +3115,6 @@ config_enum_get_options(struct config_enum *record, const char *prefix,
  * and also calls any check hook the parameter may have.
  *
  * record: GUC variable's info record
- * name: variable name (should match the record of course)
  * value: proposed value, as a string
  * source: identifies source of value (check hooks may need this)
  * elevel: level to log any error reports at
@@ -3127,7 +3126,7 @@ config_enum_get_options(struct config_enum *record, const char *prefix,
  */
 static bool
 parse_and_validate_value(struct config_generic *record,
-		 const char *name, const char *value,
+		 const char *value,
 		 GucSource source, int elevel,
 		 union config_var_val *newval, void **newextra)
 {
@@ -3142,7 +3141,7 @@ parse_and_validate_value(struct config_generic *record,
 	ereport(elevel,
 			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 			 errmsg("parameter \"%s\" requires a Boolean value",
-	name)));
+	conf->gen.name)));
 	return false;
 }
 
@@ -3162,7 +3161,7 @@ parse_and_validate_value(struct config_generic *record,
 	ereport(elevel,
 			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 			 errmsg("invalid value for parameter \"%s\": \"%s\"",
-	name, value),
+	conf->gen.name, value),
 			 hintmsg ? errhint("%s", _(hintmsg)) : 0));
 	return false;
 }
@@ -3181,7 +3180,7 @@ parse_and_validate_value(struct config_generic *record,
 			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 			 errmsg("%d%s%s is outside the valid range for parameter \"%s\" (%d%s%s .. %d%s%s)",
 	newval->intval, unitspace, unit,
-	name,
+	conf->gen.name,
 	conf->min, unitspace, unit,
 	conf->max, unitspace, unit)));
 	return false;
@@ -3203,7 +3202,7 @@ parse_and_validate_value(struct config_generic *record,
 	ereport(elevel,
 			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 			 errmsg("invalid value for parameter \"%s\": \"%s\"",
-	name, value),
+	conf->gen.name, value),
 			 hintmsg ? errhint("%s", _(hintmsg)) : 0));
 	return false;
 }
@@ -3222,7 +3221,7 @@ parse_and_validate_value(struct config_generic *record,
 			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 			 errmsg("%g%s%s is outside the valid range for parameter \"%s\" (%g%s%s .. %g%s%s)",
 	newval->realval, unitspace, unit,
-	name,
+	conf->gen.name,
 	conf->min, unitspace, unit,
 	conf->max, unitspace, unit)));
 	return false;
@@ -3278,7 +3277,7 @@ parse_and_validate_value(struct config_generic *record,
 	ereport(elevel,
 			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 			 errmsg("invalid value for parameter \"%s\": \"%s\"",
-	name, value),
+	conf->gen.name, value),
 			 hintmsg ? errhint("%s", _(hintmsg)) : 0));
 
 	if (hintmsg)
@@ -3460,7 +3459,7 @@ set_config_with_handle(const char *name, config_handle *handle,
 		ereport(elevel,
 (errcode(ERRCODE_INVALID_TRANSACTION_STATE),
  errmsg("parameter \"%s\" cannot be set during a parallel operation",
-		name)));
+		record->name)));
 		return 0;
 	}
 
@@ -3476,7 +3475,7 @@ set_config_with_handle(const char *name, config_handle *handle,
 	

Re: Add parallel columns for pg_stat_statements

2024-10-06 Thread Guillaume Lelarge
Hi Michael,

Le jeu. 3 oct. 2024 à 09:15, Michael Paquier  a écrit :

> On Thu, Aug 29, 2024 at 10:08:23PM +0200, Guillaume Lelarge wrote:
> > This patch was a bit discussed on [1], and with more details on [2]. It's
> > based on another patch sent in 2022 (see [3]). It introduces seven new
> > columns in pg_stat_statements:
> >
> >  * parallelized_queries_planned, number of times the query has been
> planned
> > to be parallelized,
> >  * parallelized_queries_launched, number of times the query has been
> > executed with parallelization,
>
> Comparing the numbers of workers planned and launched with the number
> of times a query has been called and planned should provide a rather
> good equivalent, no?  I am not sure that these two are mandatory to
> have.
>
>
I'm not sure I follow. That would mean that every time a query is
executed, it always gets the same number of workers, which is not
guaranteed to be true.

I would agree, though, that parallelized_queries_launched is probably not
that interesting. I could get rid of it if you think it should go away.

>  * parallelized_workers_planned, number of parallel workers planned for
> > this query,
> >  * parallelized_workers_launched, number of parallel workers executed for
> > this query,
>
> Yep.  Definitely OK with these two.  There is an overlap with what
> Benoit has sent here when it comes to publish this data to the
> executor state:
>
> https://www.postgresql.org/message-id/783bc7f7-659a-42fa-99dd-ee0565644...@dalibo.com
>
>
Well, I don't see this as an overlap; rather, it is additional information.


> >  * parallelized_nodes, number of parallelized nodes,
> >  * parallelized_nodes_all_workers, number of parallelized nodes which had
> > all requested workers,
> >
> >  * parallelized_nodes_no_worker, number of parallelized nodes which had
> no
> > requested workers.
>
> I can see why you want to register this extra data on a node-basis,
> but how does that help when it comes to tune the parallel GUCs?  We
> cannot control them at node level and the launched/planned ratio
> offers an equivalent of that.  Not exactly, but that's enough to get a
> picture if there is a draught.
>
>
On this, I would agree with you. They are not particularly useful for
tuning the parallel GUCs. I can drop them if you want.

> A test script (test2.sql) is attached. You can execute it with "psql -Xef
> > test2.sql your_database" (your_database should not contain a t1 table as
> it
> > will be dropped and recreated).
>
> Let's add proper regression tests instead, including
> oldextversions.sql as this bumps the version of the module.  See for
> example the tests of 6fd5071909a2 that can force workers to spawn
> for BRIN and btree queries, validating some of the stats published
> here.
>

Did this on the v2 version of the patch (attached here).

Thanks for your review. If you want the parallelized_queries_launched
column and the parallelized_nodes_* columns dropped, I can do that on a v3
patch.
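With the v2 columns, the planned/launched ratio discussed above could be
read directly, e.g. (column names as in the patch):

```sql
SELECT query,
       parallelized_workers_planned  AS wrk_planned,
       parallelized_workers_launched AS wrk_launched,
       round(100.0 * parallelized_workers_launched
             / nullif(parallelized_workers_planned, 0), 1) AS pct_launched
FROM pg_stat_statements
WHERE parallelized_queries_planned > 0
ORDER BY pct_launched NULLS FIRST;
```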

Regards.


-- 
Guillaume.
From 8228f8d521eb9853cba6b5d185e4d9965f29d398 Mon Sep 17 00:00:00 2001
From: Guillaume Lelarge 
Date: Wed, 28 Aug 2024 15:30:05 +0200
Subject: [PATCH v2] Add parallel columns to pg_stat_statements

There are seven new columns:
 * parallelized_queries_planned (number of times the query has been planned to
   be parallelized),
 * parallelized_queries_launched (number of times the query has been executed
   with parallelization),
 * parallelized_workers_planned (number of parallel workers planned for
   this query),
 * parallelized_workers_launched (number of parallel workers executed for
   this query),
 * parallelized_nodes (number of parallelized nodes),
 * parallelized_nodes_all_workers (number of parallelized nodes which
   had all requested workers),
 * parallelized_nodes_no_worker (number of parallelized nodes which had
   no requested workers).

These new columns will help to monitor and better configure query
parallelization.
---
 contrib/pg_stat_statements/Makefile   |   4 +-
 .../expected/oldextversions.out   |  69 +++
 .../pg_stat_statements/expected/parallel.out  |  38 ++
 .../pg_stat_statements--1.11--1.12.sql|  80 +
 .../pg_stat_statements/pg_stat_statements.c   | 108 --
 .../pg_stat_statements.control|   2 +-
 .../pg_stat_statements/sql/oldextversions.sql |   5 +
 contrib/pg_stat_statements/sql/parallel.sql   |  27 +
 doc/src/sgml/pgstatstatements.sgml|  63 ++
 src/backend/executor/execUtils.c  |   7 ++
 src/backend/executor/nodeGather.c |   9 +-
 src/backend/executor/nodeGatherMerge.c|   8 ++
 src/include/nodes/execnodes.h |   6 +
 13 files changed, 415 insertions(+), 11 deletions(-)
 create mode 100644 contrib/pg_stat_statements/expected/parallel.out
 create mode 100644 contrib/pg_stat_statements/pg_stat_statements--1.11--1.12.sql
 create mode 100644 contrib/pg_stat_statements/sql/parall

Re: Should CSV parsing be stricter about mid-field quotes?

2024-10-06 Thread Andrew Dunstan


On 2024-10-04 Fr 12:19 PM, Joel Jacobson wrote:

On Sun, Jul 2, 2023, at 07:45, Noah Misch wrote:

On Sat, May 20, 2023 at 09:16:30AM +0200, Joel Jacobson wrote:

On Fri, May 19, 2023, at 18:06, Daniel Verite wrote:

COPY FROM file CSV somewhat differs as your example shows,
but it still mishandle \. when unquoted. For instance, consider this
file to load with COPY  t FROM '/tmp/t.csv' WITH CSV
$ cat /tmp/t.csv
line 1
\.
line 3
line 4

It results in having only "line 1" being imported.

Hmm, this is a problem for one of the new use-cases I brought up that would be
possible with DELIMITER NONE QUOTE NONE, i.e. to import unstructured log files,
where each raw line should be imported "as is" into a single text column.

Is there a valid reason why \. is needed for COPY FROM filename?

No.


It seems to me it would only be necessary for the COPY FROM STDIN case,
since files have a natural end-of-file and a known file size.

Right.  Even for COPY FROM STDIN, it's not needed anymore since the
removal of
protocol v2.  psql would still use it to find the end of inline COPY
data,
though.  Here's another relevant thread:
https://postgr.es/m/flat/bfcd57e4-8f23-4c3e-a5db-2571d09208e2%40beta.fastmail.com

I was very pleased to see commit 7702337:

 Do not treat \. as an EOF marker in CSV mode for COPY IN.

Great job!

Thanks to this fix, maybe there is now interest to resume the discussion on
the ideas discussed in this thread?

Recap of ideas:

1. Stricter parsing, reject mid-field quotes

The idea is to prevent balanced mid-field quotes from being silently removed.

Example:

% cat example.csv
id,post
1,Hello there!
2,<a href="http://example.com">Click me!</a>

% psql
# \copy posts from example.csv with csv header;
COPY 2
# SELECT * FROM posts;
 id |                   post
----+------------------------------------------
  1 | Hello there!
  2 | <a href=http://example.com>Click me!</a>
(2 rows)

Note how the quotes around the URL disappeared.

2. Avoid needing hacks like using E'\x01' as quoting char.

Introduce QUOTE NONE and DELIMITER NONE,
to allow raw lines to be imported "as is" into a single text column.
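The hack that idea 2 would replace can be sketched like this (the file path here is just an illustration): abuse CSV mode with control bytes as the quote and delimiter characters, betting that neither byte ever occurs in the data.

```sql
-- Current workaround: pick quote/delimiter bytes that (hopefully)
-- never appear in the log file, so each raw line lands in one column.
CREATE TABLE rawlog (line text);
COPY rawlog FROM '/var/log/app.log'
    WITH (FORMAT csv, QUOTE E'\x01', DELIMITER E'\x02');

-- What the proposal would allow instead (hypothetical, not implemented):
-- COPY rawlog FROM '/var/log/app.log' WITH (QUOTE NONE, DELIMITER NONE);
```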




As I think I previously indicated, I'm perfectly happy about 2, because 
it replaces a far from obvious hack, but I am at best dubious about 1.



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com


Re: Should rolpassword be toastable?

2024-10-06 Thread Nathan Bossart
On Sun, Oct 06, 2024 at 11:42:53AM +0200, Hannu Krosing wrote:
> On Fri, Oct 4, 2024 at 4:48 PM Nathan Bossart  
> wrote:
>> Since BLCKSZ can be as low as 1024, I think 512 would be a good choice.
> 
> Where did you get the minimal value of 1024 from ?


https://www.postgresql.org/docs/devel/install-make.html#CONFIGURE-OPTION-WITH-BLOCKSIZE

-- 
nathan




Re: [BUG FIX]Connection fails with whitespace after keepalives parameter value

2024-10-06 Thread Michael Paquier
On Thu, Oct 03, 2024 at 08:12:28PM -0400, Tom Lane wrote:
> OK, if there's no objections let's push both remaining patches
> to HEAD only.

Done as of f22e84df1dea and 430ce189fc45.
--
Michael


signature.asc
Description: PGP signature


Re: Add parallel columns for seq scan and index scan on pg_stat_all_tables and _indexes

2024-10-06 Thread Alena Rybakina

Hi!

Finally found some time to work on this. Tests added on v5 patch 
(attached).
Maybe I'm not aware of the whole context of the thread, and maybe my 
questions will seem a bit stupid, but honestly
it's not entirely clear to me how these statistics will help to adjust 
the number of parallel workers.
We may have situations where, due to an overestimation of cardinality 
during query optimization, several parallel workers were 
unjustifiably launched, and vice versa:
due to a high workload, only a few workers were launched.
How do we identify such cases, so as not to increase or decrease the 
number of parallel workers when it is not necessary?


--
Regards,
Alena Rybakina
Postgres Professional


Re: Converting tab-complete.c's else-if chain to a switch

2024-10-06 Thread Thomas Munro
On Mon, Oct 7, 2024 at 12:11 PM Tom Lane  wrote:
> Hmm, I should think that if you break anything in tab-complete.in.c,
> the breakage would propagate to tab-complete.c without difficulty.
> Do you have an example of something that the preprocessor would mask?

Fair point, and nope.  Thought withdrawn.




Re: Add parallel columns for pg_stat_statements

2024-10-06 Thread Michael Paquier
On Sun, Oct 06, 2024 at 03:32:02PM +0200, Guillaume Lelarge wrote:
> I'm not sure I follow. That would mean that every time a query is executed,
> it always gets the same amount of workers. Which is not guaranteed to be
> true.
>
> I would agree, though, that parallelized_queries_launched is probably not
> that interesting. I could get rid of it if you think it should go away.

My point is that these stats are useful to know which action may have
to be taken when reaching a mean, and numbers in pg_stat_statements
offer hints that something is going wrong and that a closer lookup at
an EXPLAIN plan may be required, particularly if the total number of
workers planned and launched aggregated in the counters is unbalanced
across queries.  If the planned/launched ratio is balanced across most
queries, a GUC adjustment may be OK.  If the ratio is very
unbalanced in a smaller set of queries, I'd also look at tweaking GUCs
like max_parallel_workers_per_gather.  These counters give information that one
or the other may be required.
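
As a sketch of the check described above (hypothetical: it assumes the parallelized_workers_* column names from the v2 patch on this thread), the planned/launched ratio could be eyeballed with:

```sql
-- Find queries whose workers are starved relative to what was planned.
SELECT query,
       parallelized_workers_planned  AS planned,
       parallelized_workers_launched AS launched,
       round(100.0 * parallelized_workers_launched
             / NULLIF(parallelized_workers_planned, 0), 1) AS launched_pct
FROM pg_stat_statements
WHERE parallelized_workers_planned > 0
ORDER BY launched_pct ASC NULLS LAST;
```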

> Well, I don't see this as an overlap. Rather more information.

Later versions of Benoit's patch have been accumulating this data in
the executor state.  v4 posted at [1] has the following diffs:
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -724,6 +724,9 @@ typedef struct EState
 */
List   *es_insert_pending_result_relations;
List   *es_insert_pending_modifytables;
+
+   int es_workers_launched;
+   int es_workers_planned;
 } EState;

Your v2 posted on this thread has that:
@@ -707,6 +707,12 @@ typedef struct EState
struct EPQState *es_epq_active;
 
booles_use_parallel_mode;   /* can we use parallel workers? 
*/
+   booles_used_parallel_mode;  /* was executed in parallel */
+   int es_parallelized_workers_launched;
+   int es_parallelized_workers_planned;
+   int es_parallelized_nodes; /* # of parallelized 
nodes */
+   int es_parallelized_nodes_all_workers; /* # of 
nodes with all workers launched */
+   int es_parallelized_nodes_no_worker; /* # of nodes 
with no workers launched */
 
es_parallelized_workers_launched and es_workers_launched are the same
thing in both.

> On this, I would agree with you. They are not that particularly useful to
> get better setting for parallel GUCs. I can drop them if you want.

Yep.  I would remove them for now.  This leads to more bloat.

> Did this on the v2 version of the patch (attached here).
> 
> Thanks for your review. If you want the parallelized_queries_launched
> column and the parallelized_nodes_* columns dropped, I can do that on a v3
> patch.

I'd recommend to split that into more independent patches:
- Introduce the two counters in EState with the incrementations done
in nodeGatherMerge.c and nodeGather.c (mentioned that at [2], you may
want to coordinate with Benoit to avoid duplicating the work). 
- Expand pg_stat_statements to use them for DMLs, SELECTs, well where
they matter. 
- Look at expanding that for utilities that can do parallel jobs:
CREATE INDEX and VACUUM, but this has lower priority to me, and this
can reuse the same counters as the ones added by patch 2.

[1]: 
https://www.postgresql.org/message-id/6ecad3ad-835c-486c-9ebd-da87a9a97...@dalibo.com
[2]: https://www.postgresql.org/message-id/zv46wtmjltuu2...@paquier.xyz
--
Michael


signature.asc
Description: PGP signature


Re: Add parallel columns for seq scan and index scan on pg_stat_all_tables and _indexes

2024-10-06 Thread Michael Paquier
On Mon, Oct 07, 2024 at 12:43:18AM +0300, Alena Rybakina wrote:
> Maybe I'm not aware of the whole context of the thread and maybe my
> questions will seem a bit stupid, but honestly
> it's not entirely clear to me how this statistics will help to adjust the
> number of parallel workers.
> We may have situations when during overestimation of the cardinality during
> query optimization a several number of parallel workers were unjustifiably
> generated and vice versa -
> due to a high workload only a few number of workers were generated.
> How do we identify such cases so as not to increase or decrease the number
> of parallel workers when it is not necessary?

Well.  For spiky workloads, these numbers alone are not going to help.
They help if you can map them to the number of times a query related
to these tables has been called, something that pg_stat_statements
would be able to show more information about.

FWIW, I have doubts that these numbers attached to this portion of the
system are always useful.  For OLTP workloads, parallel workers would
unlikely be spawned because even with JOINs we won't work with a high
number of tuples that require them.  This could be interesting with
analytics, however complex query sequences mean that we'd still need
to look at all the plans involving the relations where there is an
unbalance of planned/spawned workers, because these can usually
involve quite a few gather nodes.  At the end of the day, it seems to
me that we would still need data that involves statements to track
down specific plans that are starving.  If your application does not
have that many statements, looking at individual plans is OK, but if
you have hundreds of them to dig into, this is time-consuming and
stats at table/index level don't offer data in terms of stuff that
stands out and needs adjustments.

And this is without the argument of bloating more the stats entries
for each table, even if it matters less now that these stats are in
shmem lately.
--
Michael


signature.asc
Description: PGP signature


Re: BUG #18641: Logical decoding of two-phase commit fails with TOASTed default values

2024-10-06 Thread Amit Kapila
On Tue, Oct 1, 2024 at 7:07 PM Takeshi Ideriha  wrote:
>
> >We forgot to set/unset the flag in functions
> >systable_beginscan_ordered and systable_endscan_ordered. BTW,
>
> Thank you for the clarification.
>
> >shouldn't this occur even without prepare transaction? If so, we need
> >to backpatch this till 14.
>
> Yes, it occurred also at PG14.
> I did some tests using 015_stream.pl with some modification like
> below, which tests the subscription about stream is on but two-phase
> is off.
> The same issue occurred at both current head source code and PG14.
>
> ```
> --- a/src/test/subscription/t/015_stream.pl
> +++ b/src/test/subscription/t/015_stream.pl
> @@ -134,9 +134,11 @@ my $node_subscriber =
> PostgreSQL::Test::Cluster->new('subscriber');
>  $node_subscriber->init;
>  $node_subscriber->start;
>
> +my $default = join('', map {chr(65 + rand 26)} (1 .. 1));
> +
>  # Create some preexisting content on publisher
>  $node_publisher->safe_psql('postgres',
> -   "CREATE TABLE test_tab (a int primary key, b bytea)");
> +   "CREATE TABLE test_tab (a int primary key, b bytea, t text
> DEFAULT '$default')");
>  $node_publisher->safe_psql('postgres',
> "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");
>
> @@ -144,7 +146,7 @@ $node_publisher->safe_psql('postgres', "CREATE
> TABLE test_tab_2 (a int)");
>
>  # Setup structure on subscriber
>  $node_subscriber->safe_psql('postgres',
> -   "CREATE TABLE test_tab (a int primary key, b bytea, c
> timestamptz DEFAULT now(), d bigint DEFAULT 999)"
> +   "CREATE TABLE test_tab (a int primary key, b bytea, t text
> DEFAULT '$default', c timestamptz DEFAULT now(), d bigint DEFAULT
> 999)"
>  );
> ```
>
> Logs from head source:
> ```
> 2024-10-01 12:34:56.694 UTC [575202] LOG:  starting PostgreSQL 18devel
> on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20230605 (Red Hat
> 11.4.1-2), 64-bit
> ...
> 2024-10-01 12:34:57.506 UTC [575258] 015_stream.pl LOG:  statement: BEGIN;
> 2024-10-01 12:34:57.506 UTC [575258] 015_stream.pl LOG:  statement:
> INSERT INTO test_tab SELECT i, sha256(i::text::bytea) FROM
> generate_series(3, 5000) s(i);
> 2024-10-01 12:34:57.524 UTC [575242] tap_sub ERROR:  unexpected
> table_index_fetch_tuple call during logical decoding
> 2024-10-01 12:34:57.524 UTC [575242] tap_sub STATEMENT:
> START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
> streaming 'on', origin 'any', publication_names '"tap_pub"')
> 2024-10-01 12:34:57.525 UTC [575242] tap_sub LOG:  released logical
> replication slot "tap_sub"
> 2024-10-01 12:34:57.525 UTC [575242] tap_sub LOG:  could not send data
> to client: Broken pipe
> 2024-10-01 12:34:57.525 UTC [575242] tap_sub FATAL:  connection to client lost
> ...
> 2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
> START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
> streaming 'on', origin 'any', publication_names '"tap_pub"')
> 2024-10-01 12:34:57.829 UTC [575260] tap_sub LOG:  starting logical
> decoding for slot "tap_sub"
> 2024-10-01 12:34:57.829 UTC [575260] tap_sub DETAIL:  Streaming
> transactions committing after 0/1583A68, reading WAL from 0/1583A30.
> 2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
> START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
> streaming 'on', origin 'any', publication_names '"tap_pub"')
> 2024-10-01 12:34:57.829 UTC [575260] tap_sub LOG:  logical decoding
> found consistent point at 0/1583A30
> 2024-10-01 12:34:57.829 UTC [575260] tap_sub DETAIL:  There are no
> running transactions.
> 2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
> START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
> streaming 'on', origin 'any', publication_names '"tap_pub"')
> 2024-10-01 12:34:57.831 UTC [575260] tap_sub ERROR:  unexpected
> table_index_fetch_tuple call during logical decoding
> 2024-10-01 12:34:57.831 UTC [575260] tap_sub STATEMENT:
> START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
> streaming 'on', origin 'any', publication_names '"tap_pub"')
> 2024-10-01 12:34:57.832 UTC [575260] tap_sub LOG:  released logical
> replication slot "tap_sub"
> 2024-10-01 12:34:57.832 UTC [575260] tap_sub LOG:  could not send data
> to client: Broken pipe
> 2024-10-01 12:34:57.832 UTC [575260] tap_sub FATAL:  connection to client lost
> ```
>
> Logs from PG14 source:
>
> ```
> 2024-10-01 13:20:29.409 UTC [593696] LOG:  starting PostgreSQL 14.9 on
> x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20230605 (Red Hat
> 11.4.1-2), 64-bit
> ...
> 2024-10-01 13:20:31.285 UTC [593750] 015_stream.pl LOG:  statement: BEGIN;
> 2024-10-01 13:20:31.285 UTC [593750] 015_stream.pl LOG:  statement:
> INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3,
> 5000) s(i);
> 2024-10-01 13:20:31.301 UTC [593740] tap_sub ERROR:  unexpected
> table_index_fetch_tuple call during logical decoding
> 2024-10-01 13:20:31.301 UTC [593740] tap_sub STATEMENT:
> START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_ve

Re: Converting tab-complete.c's else-if chain to a switch

2024-10-06 Thread Thomas Munro
On Sat, Jul 27, 2024 at 8:03 AM Tom Lane  wrote:
> I wrote:
> > I modified the preprocessor to work like that, and I like the results
> > better than what I had.
>
> I thought this version of the patch would be less subject to merge
> conflicts than v1, but it didn't take long for somebody to break it.
> Rebased v3 attached -- no nontrivial changes from v2.
>
> I'd like to get this merged soon ...

+1

This works nicely for me and I couldn't find anything that doesn't
work as expected, the mechanism seems reasonable, etc.  I like how
it's moving slowly towards a grammar DSL.

Should we try to make sure that tab-complete.in.c remains compilable
directly with a build mode that does that with 'cp' instead of the
perl script?

One very trivial thing, where you say:

+# where N is the replacement case label, "HeadMatch" is the original
+# function name shorn of "es", and the rest are the function arguments.

... maybe simpler language: 'with "es" removed'?  (Thinking of
non-native speakers; I understood that perfectly but I'm anglophone in
a country overrun with sheep.)




Re: On disable_cost

2024-10-06 Thread Alena Rybakina

On 06.10.2024 02:22, David Rowley wrote:

To be honest, I don’t understand at all why we don’t count disabled nodes for 
append here? As I understand it, this is due to the fact that the partitioned 
table can also be scanned by an index. Besides mergeappend, in general it’s 
difficult for me to generalize for which nodes this rule applies, can you 
explain here?

There are no special rules here of what to display based on the node
type. Maybe you think there are some special rules because of the
special cases for Append and MergeAppend in the patch? Those are
handled specially as they don't use the Plan's lefttree and righttree
fields.


To be honest, I didn't quite understand initially why we don't display 
information about disabled nodes for Append and MergeAppend, therefore I 
had a question about other cases.  Thank you for your explanation, it was 
helpful!


I also checked the code to see what parameters there are for these nodes 
(Append and MergeAppend) and how they are processed.
To sum up, they provide for collecting information from child nodes. I 
agree that they do not need additional display about disabled nodes.



Are you saying that the "Disabled: true" should propagate to the root
of the plan tree?  That fact that master does that is what Laurenz and
I are complaining about. I'm not sure if I follow what you're asking.


I agree that it's better to display such information for a specific 
disabled node. It's clearer what's going on and what it means.


--
Regards,
Alena Rybakina
Postgres Professional


Re: Converting tab-complete.c's else-if chain to a switch

2024-10-06 Thread Tom Lane
Thomas Munro  writes:
> On Sat, Jul 27, 2024 at 8:03 AM Tom Lane  wrote:
>> I thought this version of the patch would be less subject to merge
>> conflicts than v1, but it didn't take long for somebody to break it.
>> Rebased v3 attached -- no nontrivial changes from v2.

> +1

Thanks for looking at it!

> Should we try to make sure that tab-complete.in.c remains compilable
> directly with a build mode that does that with 'cp' instead of the
> perl script?

Hmm, I should think that if you break anything in tab-complete.in.c,
the breakage would propagate to tab-complete.c without difficulty.
Do you have an example of something that the preprocessor would mask?

> One very trivial thing, where you say:
> +# where N is the replacement case label, "HeadMatch" is the original
> +# function name shorn of "es", and the rest are the function arguments.
> ... maybe simpler language: 'with "es" removed'?  (Thinking of
> non-native speakers; I understood that perfectly but I'm anglophone in
> a country overrun with sheep.)

Sure, that's changeable.

regards, tom lane




Re: Use heap scan routines directly in vac_update_datfrozenxid()

2024-10-06 Thread Tom Lane
Soumyadeep Chakraborty  writes:
> Attached is a simple patch to directly use heap scan routines in
> vac_update_datfrozenxid(), avoiding the multilayer overhead from the
> sysscan infrastructure.

I would think the overhead of that is minuscule.  If it isn't,
we should try to make it so, not randomly hack up some of the callers
to avoid it.  The intention certainly was that it wouldn't cost
anything compared to what happens within the actual table access.

regards, tom lane




Re: Function for listing pg_wal/summaries directory

2024-10-06 Thread Michael Paquier
On Fri, Oct 04, 2024 at 10:02:11AM -0500, Nathan Bossart wrote:
> Could you explain why you feel the existing support functions are
> insufficient?

Because it is not possible to outsource the scan of pg_wal/summaries/
to a different role, no?

On HEAD, one would require a full access to the data folder, as
pg_ls_waldir() does not recurse.  Perhaps, rather than introducing a
new function, we should just extend pg_ls_dir_files() so as it can
optionally recurse, then use this option in the existing
pg_ls_waldir() to show stats for pg_wal/ and pg_wal/summaries/?  You'd
optionally need to print the relative path of the file in the output of
the function, meaning that any summary file should be shown as
summaries/blah.summary.  Just an idea for the bucket of ideas.

This would be inconsistent with the existing
pg_ls_archive_statusdir(), though, so using a new function may be just
better and simpler.
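
In the meantime, one way to outsource just that directory listing to a monitoring role is a SECURITY DEFINER wrapper around the existing pg_ls_dir() (a sketch; the function name here is made up):

```sql
-- Created by a superuser; lets pg_monitor list pg_wal/summaries/
-- without full access to the data directory.
CREATE FUNCTION pg_ls_summariesdir_names()
RETURNS SETOF text LANGUAGE sql SECURITY DEFINER
AS $$ SELECT pg_ls_dir('pg_wal/summaries') $$;

REVOKE ALL ON FUNCTION pg_ls_summariesdir_names() FROM PUBLIC;
GRANT EXECUTE ON FUNCTION pg_ls_summariesdir_names() TO pg_monitor;
```

This only returns file names, not the size/mtime stats a dedicated function would provide.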
--
Michael


signature.asc
Description: PGP signature


Re: System username in pg_stat_activity

2024-10-06 Thread Michael Paquier
On Tue, Feb 20, 2024 at 10:32:53PM +0100, Magnus Hagander wrote:
> In a way, that's yet another different type of values though -- it
> contains accumulated stats. So we really have 3 types -- "info" that's
> not really stats (username, etc), "current state" (query, wait events,
state) and "accumulated stats" (counters since start). If we don't want
> to combine them all, we should perhaps not combine any and actually
> have 3 views?

This is the last message sent on this thread, and an entry was still
registered in the CF app, so I am marking that as RwF for now.
--
Michael


signature.asc
Description: PGP signature


Re: POC, WIP: OR-clause support for indexes

2024-10-06 Thread Alexander Korotkov
On Fri, Oct 4, 2024 at 4:34 PM Robert Haas  wrote:
> On Mon, Sep 23, 2024 at 7:11 AM Alexander Korotkov  
> wrote:
> > Makes sense.  Please, check the attached patch freeing the consts list
> > while returning NULL from match_orclause_to_indexcol().
>
> Some review comments:
>
> I agree with the comments already given to the effect that the patch
> looks much better now. I was initially surprised to see this happening
> in match_clause_to_indexcol() but after studying it I think it looks
> like the right place. I think it makes sense to think about moving
> forward with this, although it would be nice to get Tom's take if we
> can.
>
> I see that the patch makes no update to the header comment for
> match_clause_to_indexcol() nor to the comment just above the cascade
> of if-statements. I think both need to be updated.
>
> More generally, many of the comments in this patch seem to just
> explain what the code does, and I'd like to reiterate my usual
> complaint: as far as possible, comments should explain WHY the code
> does what it does. Certainly, in some cases there's nothing to be said
> about that e.g. /* Lookup for operator to fetch necessary information
> for the SAOP node */ isn't really saying anything non-obvious but it's
> reasonable to have the comment here anyway. However, when there is
> something more interesting to be said, then we should do that rather
> than just reiterate what the reader who knows C can anyway see. For
> instance, the lengthy comment beginning with "Iterate over OR
> entries." could either be shorter and recapitulate less of the code
> that follows, or it could say something more interesting about why
> we're doing it like that.
>
> + /* We allow constant to be Const or Param */
> + if (!IsA(constExpr, Const) && !IsA(constExpr, Param))
> + break;
>
> This restriction is a lot tighter than the one mentioned in the header
> comment of match_clause_to_indexcol ("Our definition of const is
> exceedingly liberal"). If there's a reason for that, the comments
> should talk about it. If there isn't, it's better to be consistent.
>
>  + /*
> + * Check operator is present in the opfamily, expression collation
> + * matches index collation.  Also, there must be an array type in
> + * order to construct an array later.
> + */
> + if (!IndexCollMatchesExprColl(index->indexcollations[indexcol],
> inputcollid) ||
> + !op_in_opfamily(matchOpno, index->opfamily[indexcol]) ||
> + !OidIsValid(arraytype))
> + break;
>
> I spent some time wondering whether this was safe. The
> IndexCollMatchesExprColl() guarantees that either the input collation
> is equal to the index collation, or the index collation is 0. If the
> index collation is 0 then that I *think* that guarantees that the
> indexed type is non-collatable, but this could be a cross-type
> comparison, and it's possible that the other type is collatable. In
> that case, I don't think anything would prevent us from merging a
> bunch of OR clauses with different collations into a single SAOP. I
> don't really see how that could be a problem, because if the index is
> of a non-collatable type, then presumably the operator doesn't care
> about what the collation is, so it should all be fine, I guess? But
> I'm not very confident about that conclusion.
>
> I'm unclear what the current thinking is about the performance of this
> patch, both as to planning and as to execution. Do we believe that
> this transformation is a categorical win at execution-time? In theory,
> OR format allows for short-circuit execution, but because of the
> Const-or-Param restriction above, I think that's mostly a
> non-issue. But maybe not completely, because I can see from the
> regression test changes that it's possible for us to apply this
> transformation when the Param is set by an InitPlan or SubPlan. If we
> have something like WHERE tenthous = 1 OR tenthous =
> (very_expensive_computation() + 1), maybe the patch could lose,
> because we'll have to do the very expensive calculation to evaluate
> the SAOP, and the OR could stop as soon as we establish that tenthous
> != 1. If we only did the transformation when the Param is an external
> parameter, then we wouldn't have this issue. Maybe this isn't worth
> worrying about; I'm not sure. Are there any other cases where the
> transformation can produce something that executes more slowly?
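
For reference, the transformation being discussed rewrites the OR chain into a single ScalarArrayOpExpr, i.e. written out by hand (the patch builds this internally, it is not user-visible syntax):

```sql
-- Before: two indexable OR arms, one of them a Param.
SELECT * FROM tenk1 WHERE tenthous = 1 OR tenthous = $1;

-- After: a single SAOP, which can match one index clause but must
-- evaluate $1 before the array can be probed.
SELECT * FROM tenk1 WHERE tenthous = ANY (ARRAY[1, $1]);
```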
>
> As far as planning time is concerned, I don't think this is going to
> be too bad, because most of the work only needs to be done if there
> are OR-clauses, and my intuition is that the optimization will often
> apply in such cases, so it seems alright. But I wonder how much
> testing has been done of adversarial cases, e.g. lots of non-indexable
> clause in the query; or lots of OR clauses in the query but all of
> them turn out on inspection to be non-indexable. My expectation would
> be that there's no real problem here, but it would be good to verify
> that experimentally.

Thank you so much for the review.  I'

Re: System views for versions reporting

2024-10-06 Thread Tom Lane
Joe Conway  writes:
> I think it would be nice to include a sha256 hash (or something similar) 
> of the libraries as well, so that they can be checked against known good 
> values.

That seems well outside the charter of this patch.  Also, how would
we even get that information?  A typical application doesn't know
exactly what libraries it's linked with or where they came from on
the filesystem.  Maybe one could find that out with sufficient
platform-specific hackery, but I don't believe we could do it
portably.

regards, tom lane




Re: Add parallel columns for seq scan and index scan on pg_stat_all_tables and _indexes

2024-10-06 Thread Guillaume Lelarge
Hello,

Le jeu. 5 sept. 2024 à 08:19, Guillaume Lelarge  a
écrit :

> Le jeu. 5 sept. 2024 à 07:36, Bertrand Drouvot <
> bertranddrouvot...@gmail.com> a écrit :
>
>> Hi,
>>
>> On Wed, Sep 04, 2024 at 04:37:19PM +0200, Guillaume Lelarge wrote:
>> > Hi,
>> >
>> > Le mer. 4 sept. 2024 à 16:18, Bertrand Drouvot <
>> bertranddrouvot...@gmail.com>
>> > a écrit :
>> > > What about adding a comment instead of this extra check?
>> > >
>> > >
>> > Done too in v3.
>>
>> Thanks!
>>
>> 1 ===
>>
>> +   /*
>> +* Don't check counts.parallelnumscans because counts.numscans
>> includes
>> +* counts.parallelnumscans
>> +*/
>>
>> "." is missing at the end of the comment.
>>
>>
> Fixed in v4.
>
>
>> 2 ===
>>
>> -   if (t > tabentry->lastscan)
>> +   if (t > tabentry->lastscan && lstats->counts.numscans)
>>
>> The extra check on lstats->counts.numscans is not needed as it's already
>> done
>> a few lines before.
>>
>>
> Fixed in v4.
>
>
>> 3 ===
>>
>> +   if (t > tabentry->parallellastscan &&
>> lstats->counts.parallelnumscans)
>>
>> This one makes sense.
>>
>> And now I'm wondering if the extra comment added in v3 is really worth it
>> (and
>> does not sound confusing)? I mean, the parallel check is done once we
>> passe
>> the initial test on counts.numscans. I think the code is clear enough
>> without
>> this extra comment, thoughts?
>>
>>
> I'm not sure I understand you here. I kinda like the extra comment though.
>
>
>> 4 ===
>>
>> What about adding a few tests? or do you want to wait a bit more to see
>> if "
>> there's an agreement on this patch" (as you stated at the start of this
>> thread).
>>
>>
> Guess I can start working on that now. It will take some time as I've
> never done it before. Good thing I added the patch on the November commit
> fest :)
>
>

Finally found some time to work on this. Tests added on v5 patch (attached).
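
A usage sketch of the new counters (hypothetical: it assumes the parallel_seq_scan column as added by the v5 patch), showing which tables are mostly scanned in parallel:

```sql
-- Share of sequential scans that ran under a parallel plan, per table.
SELECT relname, seq_scan, parallel_seq_scan,
       round(100.0 * parallel_seq_scan / NULLIF(seq_scan, 0), 1)
           AS parallel_pct
FROM pg_stat_all_tables
WHERE seq_scan > 0
ORDER BY parallel_pct DESC NULLS LAST;
```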

Regards.


-- 
Guillaume.
From 92474720b3178f74517958fededcf6797de58552 Mon Sep 17 00:00:00 2001
From: Guillaume Lelarge 
Date: Sun, 6 Oct 2024 21:50:17 +0200
Subject: [PATCH v5] Add parallel columns for pg_stat_all_tables,indexes

pg_stat_all_tables gets 4 new columns: parallel_seq_scan,
last_parallel_seq_scan, parallel_idx_scan, last_parallel_idx_scan.

pg_stat_all_indexes gets 2 new columns: parallel_idx_scan,
last_parallel_idx_scan.
---
 doc/src/sgml/monitoring.sgml |  69 ++-
 src/backend/access/brin/brin.c   |   2 +-
 src/backend/access/gin/ginscan.c |   2 +-
 src/backend/access/gist/gistget.c|   4 +-
 src/backend/access/hash/hashsearch.c |   2 +-
 src/backend/access/heap/heapam.c |   2 +-
 src/backend/access/nbtree/nbtsearch.c|  12 +-
 src/backend/access/spgist/spgscan.c  |   2 +-
 src/backend/catalog/system_views.sql |   6 +
 src/backend/utils/activity/pgstat_relation.c |   8 +
 src/backend/utils/adt/pgstatfuncs.c  |   6 +
 src/include/catalog/pg_proc.dat  |   8 +
 src/include/pgstat.h |  17 +-
 src/test/regress/expected/rules.out  |  18 ++
 src/test/regress/expected/stats.out  | 194 +++
 src/test/regress/sql/stats.sql   |  92 +
 16 files changed, 421 insertions(+), 23 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 331315f8d3..aeaabb0ffe 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3803,7 +3803,7 @@ description | Waiting for a newly initialized WAL file to reach durable storage
seq_scan bigint
   
   
-   Number of sequential scans initiated on this table
+   Number of sequential scans (including parallel ones) initiated on this table
   
  
 
@@ -3812,7 +3812,26 @@ description | Waiting for a newly initialized WAL file to reach durable storage
last_seq_scan timestamp with time zone
   
   
-   The time of the last sequential scan on this table, based on the
+   The time of the last sequential scan (including parallel ones) on this table, based on the
+   most recent transaction stop time
+  
+ 
+
+ 
+  
+   parallel_seq_scan bigint
+  
+  
+   Number of parallel sequential scans initiated on this table
+  
+ 
+
+ 
+  
+   last_parallel_seq_scan timestamp with time zone
+  
+  
+   The time of the last parallel sequential scan on this table, based on the
most recent transaction stop time
   
  
@@ -3831,7 +3850,7 @@ description | Waiting for a newly initialized WAL file to reach durable storage
idx_scan bigint
   
   
-   Number of index scans initiated on this table
+   Number of index scans (including parallel ones) initiated on this table
   
  
 
@@ -3840,7 +3859,26 @@ description | Waiting for a newly initialized WAL file to reach durable storage
last_idx_scan times

Use heap scan routines directly in vac_update_datfrozenxid()

2024-10-06 Thread Soumyadeep Chakraborty
Hi hackers,

Attached is a simple patch to directly use heap scan routines in
vac_update_datfrozenxid(), avoiding the multilayer overhead from the
sysscan infrastructure. The speedup can be noticeable in databases
containing a large number of relations (perhaps due to heavy partition
table usage). This was proposed in [1].

Experiment setup:

* Use an -O3 optimized build without asserts, with fsync and autovacuum off,
on my laptop. All other GUCs are at their defaults.

* Create tables using pgbench to inflate pg_class to a decent size.

$ cat << EOF > bench.sql
> select txid_current() AS txid \gset
> CREATE TABLE t:txid(a int);
> EOF

$ pgbench -f ./bench.sql -t 20 -c 100 -n bench

select pg_size_pretty(pg_relation_size('pg_class'));
 pg_size_pretty

 3508 MB
(1 row)

* Use instr_time to record the scan time. See attached instr_vac.diff.

* Run VACUUM on any of the created empty tables in the bench database.

Results:

* main as of 68dfecbef2:

bench=# vacuum t1624;
NOTICE:  scan took 796.862142 ms
bench=# vacuum t1624;
NOTICE:  scan took 793.730688 ms
bench=# vacuum t1624;
NOTICE:  scan took 793.963655 ms

* patch:

bench=# vacuum t1624;
NOTICE:  scan took 682.283366 ms
bench=# vacuum t1624;
NOTICE:  scan took 670.816975 ms
bench=# vacuum t1624;
NOTICE:  scan took 683.821717 ms
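
For reference, averaging the three NOTICE timings from each run above
(numbers copied verbatim; a rough illustration, not a rigorous benchmark):

```python
# Average the per-run scan times (ms) reported above and compute the
# relative improvement of the patched scan over main.
main = [796.862142, 793.730688, 793.963655]
patched = [682.283366, 670.816975, 683.821717]

avg_main = sum(main) / len(main)            # ~794.85 ms
avg_patched = sum(patched) / len(patched)   # ~678.97 ms
improvement = (1 - avg_patched / avg_main) * 100

print(f"main: {avg_main:.2f} ms, patched: {avg_patched:.2f} ms, "
      f"improvement: {improvement:.1f}%")
```

That works out to roughly a 14-15% reduction in scan time on this workload.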

Regards,
Soumyadeep (Broadcom)


[1] 
https://www.postgresql.org/message-id/20221229030329.fbpiitatmowzza6c%40awork3.anarazel.de
From 320e54894a1ad45e2c25a4ee88a6409a9dc1a527 Mon Sep 17 00:00:00 2001
From: Soumyadeep Chakraborty 
Date: Sat, 5 Oct 2024 16:22:56 -0700
Subject: [PATCH v1 1/1] Use heap_getnext() in vac_update_datfrozenxid()

Since we are going to do a full sequential scan without a filter, we can
avoid overhead from the extra layers of sysscan.
---
 src/backend/commands/vacuum.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index ac8f5d9c25..717e310054 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -1588,7 +1588,7 @@ vac_update_datfrozenxid(void)
 	HeapTuple	tuple;
 	Form_pg_database dbform;
 	Relation	relation;
-	SysScanDesc scan;
+	TableScanDesc scan;
 	HeapTuple	classTup;
 	TransactionId newFrozenXid;
 	MultiXactId newMinMulti;
@@ -1638,10 +1638,9 @@ vac_update_datfrozenxid(void)
 	 */
 	relation = table_open(RelationRelationId, AccessShareLock);
 
-	scan = systable_beginscan(relation, InvalidOid, false,
-			  NULL, 0, NULL);
+	scan = table_beginscan_catalog(relation, 0, NULL);
 
-	while ((classTup = systable_getnext(scan)) != NULL)
+	while ((classTup = heap_getnext(scan, ForwardScanDirection)) != NULL)
 	{
 		volatile FormData_pg_class *classForm = (Form_pg_class) GETSTRUCT(classTup);
 		TransactionId relfrozenxid = classForm->relfrozenxid;
@@ -1707,7 +1706,7 @@ vac_update_datfrozenxid(void)
 	}
 
 	/* we're done with pg_class */
-	systable_endscan(scan);
+	table_endscan(scan);
 	table_close(relation, AccessShareLock);
 
 	/* chicken out if bogus data found */
-- 
2.43.0

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 717e310054..db3e8a4baf 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -1599,6 +1599,9 @@ vac_update_datfrozenxid(void)
 	ScanKeyData key[1];
 	void	   *inplace_state;
 
+	instr_time	before;
+	instr_time	after;
+
 	/*
 	 * Restrict this task to one backend per database.  This avoids race
 	 * conditions that would move datfrozenxid or datminmxid backward.  It
@@ -1636,6 +1639,7 @@ vac_update_datfrozenxid(void)
 	 *
 	 * See vac_truncate_clog() for the race condition to prevent.
 	 */
+	INSTR_TIME_SET_CURRENT(before);
 	relation = table_open(RelationRelationId, AccessShareLock);
 
 	scan = table_beginscan_catalog(relation, 0, NULL);
@@ -1708,7 +1712,9 @@ vac_update_datfrozenxid(void)
 	/* we're done with pg_class */
 	table_endscan(scan);
 	table_close(relation, AccessShareLock);
-
+	INSTR_TIME_SET_CURRENT(after);
+	INSTR_TIME_SUBTRACT(after, before);
+	elog(NOTICE, "scan took %f ms", INSTR_TIME_GET_MILLISEC(after));
 	/* chicken out if bogus data found */
 	if (bogus)
 		return;