Re: Doc: typo in config.sgml
On Mon, 30 Sep 2024 11:59:48 +0200 Daniel Gustafsson wrote: > > On 30 Sep 2024, at 11:03, Tatsuo Ishii wrote: > > > I think there's an unnecessary underscore in config.sgml. > > > > I was wrong. The particular byte sequence just looked like an underscore > > in my editor, but it is actually 0xc2a0, which must be a > > "non-breaking space" encoded in UTF-8. I guess someone mistakenly > > inserted a non-breaking space while editing config.sgml. > > I wonder if it would be worth adding a check for this like the one we have for tabs? > The attached adds a rule to "make -C doc/src/sgml check" for trapping nbsp > (doing so made me realize we don't have an equivalent meson target). Your patch couldn't detect 0xA0 in config.sgml on my machine, but it works when I use `grep -P "[\xA0]"` instead of `grep -e "\xA0"`. However, it also detects the following line in charset.sgml. (https://www.postgresql.org/docs/current/collation.html) For example, locale und-u-kb sorts 'àe' before 'aé'. This is not a non-breaking space, so it should not be detected as an error. Regards, Yugo Nagata > -- > Daniel Gustafsson > -- Yugo Nagata
Re: ACL_MAINTAIN, Lack of comment content
On Mon, 30 Sep 2024 11:40:29 +0200 Daniel Gustafsson wrote: > - * MATERIALIZED VIEW, and REINDEX on all relations. > + * MATERIALIZED VIEW, REINDEX and LOCK TABLE on all relations. Should we put a comma between REINDEX and "and" as follows? "... MATERIALIZED VIEW, REINDEX, and LOCK TABLE on all relations." Regards, Yugo Nagata -- Yugo Nagata
Re: pg_walsummary, Character-not-present-in-option
I forgot to attach the patch file, so I'm attaching it in reply. On 2024-09-30 15:02, btsugieyuusuke wrote: Hi hackers, I found something in pg_walsummary that probably needs fixing. pg_walsummary specifies “f:iqw:” as the third argument of getopt_long(). /* process command-line options */ while ((c = getopt_long(argc, argv, "f:iqw:", long_options, &optindex)) != -1) However, only i and q are valid options. switch (c) { case 'i': break; case 'q': opt.quiet = true; break; default: /* getopt_long already emitted a complaint */ pg_log_error_hint("Try \"%s --help\" for more information.", progname); exit(1); } Therefore, shouldn't “f:” and “w:” be removed? Best regards, Yusuke Sugie From 7ced18bad958156a18bab6b6a9abe554fb44f970 Mon Sep 17 00:00:00 2001 From: Yuusuke Sugie Date: Fri, 27 Sep 2024 17:06:10 +0900 Subject: [PATCH] Character not present in option --- src/bin/pg_walsummary/pg_walsummary.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/bin/pg_walsummary/pg_walsummary.c b/src/bin/pg_walsummary/pg_walsummary.c index 93f6e6d408..f6a262d318 100644 --- a/src/bin/pg_walsummary/pg_walsummary.c +++ b/src/bin/pg_walsummary/pg_walsummary.c @@ -71,7 +71,7 @@ main(int argc, char *argv[]) handle_help_version_opts(argc, argv, progname, help); /* process command-line options */ - while ((c = getopt_long(argc, argv, "f:iqw:", + while ((c = getopt_long(argc, argv, "iq", long_options, &optindex)) != -1) { switch (c) -- 2.40.1
Re: May be BUG. Periodic burst growth of the checkpoint_req counter on replica.
On 30.09.2024 06:26, Fujii Masao wrote: Thanks for the review! I've pushed the 0001 patch. Thanks a lot! As for switching the pg_proc.dat entries, the idea was to put them in order so that the pg_stat_get_checkpointer* functions were grouped together. I don't know if this is the common and accepted practice; I simply like it better this way. Sure, if you think it's unnecessary, let it stay as is with a minimal diff. I understand your point, but I didn't make that change, to keep the diff minimal, which should make future back-patching easier. Agreed. It's quite reasonable. I hadn't taken the back-patching possibility into account at all. This is of course wrong. In addition, checkpoints may be skipped due to the "checkpoints are occurring too frequently" error. Not sure, but maybe add this information to the new description? From what I can see in the code, that error message doesn't seem to indicate the checkpoint is being skipped. In fact, checkpoints are actually still happening when that message appears. Am I misunderstanding something? No, you are right! This is my oversight. I didn't notice that the elevel is just a LOG, not an ERROR. Thanks! With the best wishes, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Conflict Detection and Resolution
On Fri, Sep 27, 2024 at 2:33 PM shveta malik wrote: > > On Fri, Sep 27, 2024 at 10:44 AM shveta malik wrote: > > > > > > > > > > Thanks for the review. > > > > Here is the v14 patch-set fixing review comments in [1] and [2]. > > > > > > > > > > Thanks for the patches. I am reviewing patch001, it is WIP, but please > > > find initial set of comments: > > > > > Please find the next set of comments: 1) parse_subscription_conflict_resolvers() Shall we free the 'SeenTypes' list at the end? 2) General logic comment: I think SetSubConflictResolver should also accept a list, similar to UpdateSubConflictResolvers(), instead of an array. Then we can even try merging these 2 functions later (once we do this change, it will be clearer). For SetSubConflictResolver to accept a list, SetDefaultResolvers should give a list as output instead of the current array. 3) Existing logic: case ALTER_SUBSCRIPTION_RESET_ALL_CONFLICT_RESOLVERS: { ConflictTypeResolver conflictResolvers[CONFLICT_NUM_TYPES]; /* Remove the existing conflict resolvers. */ RemoveSubscriptionConflictResolvers(subid); /* * Create list of conflict resolvers and set them in the * catalog. */ SetDefaultResolvers(conflictResolvers); SetSubConflictResolver(subid, conflictResolvers, CONFLICT_NUM_TYPES); } Suggestion: If we fix comment #2 and make SetSubConflictResolver and SetDefaultResolvers deal with a list, then here we can get rid of RemoveSubscriptionConflictResolvers(); we can simply make a default list using SetDefaultResolvers and call UpdateSubConflictResolvers(). No need for 2 separate calls for delete and insert/set. 4) Shall the ResetConflictResolver() function also call UpdateSubConflictResolvers internally? It would get rid of a lot of code duplication. ResetConflictResolver()'s new approach could be: a) validate the conflict type and get the enum value. To do this job, make a sub-function validate_conflict_type() which will be called both from here and from validate_conflict_type_and_resolver(). 
b) get the default resolver for the given conflict-type enum and then get the resolver string for it, to help step c. c) create a list of a single ConflictTypeResolver element and call UpdateSubConflictResolvers. 5) typedefs.list ConflictResolver is missing? 6) subscriptioncmds.c /* Get the list of conflict types and resolvers and validate them. */ conflict_resolvers = GetAndValidateSubsConflictResolverList(stmt->resolvers); No full stop is needed in a one-line comment. But since it is >80 chars, it is good to split it into multiple lines, and then the full stop can be retained. 7) Shall we move the call to conf_detection_check_prerequisites() to GetAndValidateSubsConflictResolverList(), similar to how we do it for parse_subscription_conflict_resolvers()? (I still prefer that GetAndValidateSubsConflictResolverList and parse_subscription_conflict_resolvers should be merged in the first place. The array-to-list conversion suggested in comment #2 will make these two functions more similar, and then we can review merging them.) 8) Shall parse_subscription_conflict_resolvers() be moved to conflict.c as well? Or since it is subscription options' parsing, is it more suited to the current file? Thoughts? 9) Existing: /* * Parsing function for conflict resolvers in CREATE SUBSCRIPTION command. * This function will report an error if mutually exclusive or duplicate * options are specified. */ Suggestion: /* * Parsing function for conflict resolvers in CREATE SUBSCRIPTION command. * * In addition to parsing and validating the resolvers' configuration, this function * also reports an error if mutually exclusive options are specified. 
*/ 10) Test comments (subscription.sql): -- a) -- fail - invalid conflict resolvers CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub CONFLICT RESOLVER (insert_exists = foo) WITH (connect = false); -- fail - invalid conflict types CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub CONFLICT RESOLVER (foo = 'keep_local') WITH (connect = false); We should swap the order of these 2 tests. Make it similar to ALTER tests. b) -- fail - invalid conflict resolvers resolvers-->resolver -- fail - invalid conflict types types-->type -- fail - duplicate conflict types types->type c) -- creating subscription should create default conflict resolvers Suggestion: -- creating subscription with no explicit conflict resolvers should configure default conflict resolvers d) -- ok - valid conflict type and resolvers type-->types e) -- fail - altering with duplicate conflict types types --> type -- thanks Shveta
Re: Possibilities on code change to implement pseudodatatypes
On Mon, Sep 30, 2024 at 9:01 AM Vinícius Abrahão wrote: > Greetings > > I understand the complexity of implementing a pseudo data type when > passing it through parameters, or using it when creating an object. > vide: git grep pseudo | egrep -v -e "po|out|sgml|sql" | more > > My initial problem was saving history (for backup to be used during > troubleshooting analysis) of plan changes and so on.. of the pg_statistic > table/object. > > I was surprised - in a good way - to observe so much effort when handling > it for that specific purpose. I started to wonder, if/when I want to create > an object of other *pseudodatatypes* or pass one through a function > parameter, whether the same level of effort would be required during code > implementation. > > I understand this is much more a code architecture discussion than > a specific question. > > I have posted in pgsql-admin a question which is a "simple problem" when > creating a table using anyarray, and indeed the problem is simple - the > solution might not be. > > What would folks more experienced in this subject recommend as a > starting point to achieve that objective? > > Kind regards, > > Bazana Schmidt, Vinícius > > PS: apologies in advance for the HTML email. > Complementing - from this perspective: [vinnix@vesuvio postgres]$ git grep CheckAttributeType src/backend/catalog/heap.c: * flags controls which datatypes are allowed, cf CheckAttributeType. 
src/backend/catalog/heap.c: CheckAttributeType(NameStr(TupleDescAttr(tupdesc, i)->attname), src/backend/catalog/heap.c: * CheckAttributeType src/backend/catalog/heap.c:CheckAttributeType(const char *attname, src/backend/catalog/heap.c: CheckAttributeType(attname, getBaseType(atttypid), attcollation, src/backend/catalog/heap.c: CheckAttributeType(NameStr(attr->attname), src/backend/catalog/heap.c: CheckAttributeType(attname, get_range_subtype(atttypid), src/backend/catalog/heap.c: CheckAttributeType(attname, att_typelem, attcollation, src/backend/catalog/index.c: CheckAttributeType(NameStr(to->attname), src/backend/commands/tablecmds.c: CheckAttributeType(NameStr(attribute->attname), attribute->atttypid, attribute->attcollation, src/backend/commands/tablecmds.c: CheckAttributeType(colName, targettype, targetcollid, src/backend/commands/tablecmds.c: CheckAttributeType(partattname, src/include/catalog/heap.h:/* flag bits for CheckAttributeType/CheckAttributeNamesTypes */ src/include/catalog/heap.h:extern void CheckAttributeType(const char *attname, and this line: https://github.com/postgres/postgres/blob/master/src/backend/catalog/heap.c#L562 if (att_typtype == TYPTYPE_PSEUDO) { /* * We disallow pseudo-type columns, with the exception of ANYARRAY," <<== What are WE missing? WHY? How could we make this case valid for CREATE TABLE commands? I will try to find time to keep researching this - if you folks have any insights, please let me know.
Re: Conflict Detection and Resolution
On Mon, Sep 30, 2024 at 4:29 PM shveta malik wrote: > > On Mon, Sep 30, 2024 at 11:04 AM Peter Smith wrote: > > > > On Mon, Sep 30, 2024 at 2:27 PM shveta malik wrote: > > > > > > On Fri, Sep 27, 2024 at 1:00 PM Peter Smith wrote: > > > > > > > ~~~ > > > > > > > > 14. > > > > 99. General - ordering of conflict_resolver > > > > > > > > nit - ditto. Let's name these in alphabetical order. IMO it makes more > > > > sense than the current random ordering. > > > > > > > > > > I feel ordering of resolvers should be same as that of conflict > > > types, i.e. resolvers of insert variants first, then update variants, > > > then delete variants. But would like to know what others think on > > > this. > > > > > > > Resolvers in v14 were documented in this random order: > > error > > skip > > apply_remote > > keep_local > > apply_or_skip > > apply_or_error > > > > Yes, these should be changed. > > > Some of these are resolvers for different conflicts. How can you order > > these as "resolvers for insert" followed by "resolvers for update" > > followed by "resolvers for delete" without it all still appearing in > > random order? > > I was thinking of ordering them like this: > > apply_remote: applicable to insert_exists, update_exists, > update_origin_differ, delete_origin_differ > keep_local: applicable to insert_exists, > update_exists, update_origin_differ, delete_origin_differ > apply_or_skip: applicable to update_missing > apply_or_error :applicable to update_missing > skip: applicable to update_missing and > delete_missing > error: applicable to all. > > i.e. in order of how they are applicable to conflict_types starting > from insert_exists till delete_origin_differ (i.e. reading > ConflictTypeResolverMap, from left to right and then top to bottom). > Except I have kept 'error' at the end instead of keeping it after > 'keep_local' as the former makes more sense there. 
> This proves my point because, without your complicated explanation to accompany it, the final order (below) just looks random to me: apply_remote keep_local apply_or_skip apply_or_error skip error Unless there is some compelling reason to do it differently, I still prefer A-Z (the KISS principle). == Kind Regards, Peter Smith. Fujitsu Australia
Re: Doc: typo in config.sgml
On Mon, 30 Sep 2024 18:03:44 +0900 (JST) Tatsuo Ishii wrote: > >>> I think there's an unnecessary underscore in config.sgml. > > I was wrong. The particular byte sequence just looked like an underscore > in my editor, but it is actually 0xc2a0, which must be a > "non-breaking space" encoded in UTF-8. I guess someone mistakenly > inserted a non-breaking space while editing config.sgml. > > However, the mistake does not affect the patch. It looks like our mails have crossed. Anyway, I agree with removing non-breaking spaces, as well as the one found in line 85 of ref/drop_extension.sgml. Regards, Yugo Nagata > > Best regards, > -- > Tatsuo Ishii > SRA OSS K.K. > English: http://www.sraoss.co.jp/index_en/ > Japanese:http://www.sraoss.co.jp -- Yugo NAGATA
Re: Pgoutput not capturing the generated columns
Hi Vignesh, On Mon, Sep 23, 2024 at 10:49 PM vignesh C wrote: > > 3) In the create publication column list/publish_generated_columns > documentation we should mention that if a generated column is mentioned > in the column list, generated columns mentioned in the column list will be > replicated irrespective of the publish_generated_columns option. > v34-0003 introduced a new Chapter 29 (Logical Replication) section for "Generated Column Replication". This version also added a link from the CREATE PUBLICATION 'publish_generated_column' parameter to this new section. To address your column list point, in v35-0003 I added more information about Generated Columns in the Chapter 29 section "Column List". The CREATE PUBLICATION column lists docs already linked to that. See [1] == [1] v35-0003 - https://www.postgresql.org/message-id/CAHut%2BPvoQS9HjcGFZrTHrUQZ8vzyfAcSgeTgQEoO_-f8CrhW4A%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
Re: Doc: typo in config.sgml
>>> I think there's an unnecessary underscore in config.sgml. I was wrong. The particular byte sequence just looked like an underscore in my editor, but it is actually 0xc2a0, which must be a "non-breaking space" encoded in UTF-8. I guess someone mistakenly inserted a non-breaking space while editing config.sgml. However, the mistake does not affect the patch. Best regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
Re: Doc: typo in config.sgml
On Mon, 30 Sep 2024 17:23:24 +0900 (JST) Tatsuo Ishii wrote: > >> I think there's an unnecessary underscore in config.sgml. > >> Attached patch fixes it. > > > > I could not apply the patch with an error. > > > > error: patch failed: doc/src/sgml/config.sgml:9380 > > error: doc/src/sgml/config.sgml: patch does not apply > > Strange. I have no problem applying the patch here. > > > I found your patch contains an odd character (ASCII code 240?) > > by running the `od -c` command on the file. See the attached file. > > Yes, octal 240 (== 0xA0) is in the patch, but it's because the current > config.sgml includes the character. You can check it by looking at > line 9383 of config.sgml. Yes, you are right, I can find the 0xc2 char in config.sgml using od -c, although I still could not apply the patch. I think this is the UTF-8 non-breaking space (C2 A0). I guess my terminal normally regards this as a space, so applying the patch fails. I also found it in line 85 of ref/drop_extension.sgml. > > I think it was introduced by 28e858c0f95. > > Best regards, > -- > Tatsuo Ishii > SRA OSS K.K. > English: http://www.sraoss.co.jp/index_en/ > Japanese:http://www.sraoss.co.jp -- Yugo NAGATA
Re: Doc: typo in config.sgml
On Mon, 30 Sep 2024 15:34:04 +0900 (JST) Tatsuo Ishii wrote: > I think there's an unnecessary underscore in config.sgml. > Attached patch fixes it. I could not apply the patch; I got an error. error: patch failed: doc/src/sgml/config.sgml:9380 error: doc/src/sgml/config.sgml: patch does not apply I found your patch contains an odd character (ASCII code 240?) by running the `od -c` command on the file. See the attached file. Regards, Yugo Nagata > > Best regards, > -- > Tatsuo Ishii > SRA OSS K.K. > English: http://www.sraoss.co.jp/index_en/ > Japanese:http://www.sraoss.co.jp -- Yugo Nagata fix_config.patch.od Description: Binary data
Re: ACL_MAINTAIN, Lack of comment content
> On 30 Sep 2024, at 10:29, btsugieyuusuke > wrote: > > Hi hackers, > I found a flaw in the ACL_MAINTAIN comment. > > Commands such as VACUUM are listed as commands that are allowed to be > executed by the MAINTAIN privilege. > However, LOCK TABLE is missing from the comment. > >> /* >> * Check if ACL_MAINTAIN is being checked and, if so, and not already set >> * as part of the result, then check if the user is a member of the >> * pg_maintain role, which allows VACUUM, ANALYZE, CLUSTER, REFRESH >> * MATERIALIZED VIEW, and REINDEX on all relations. >> */ > > Therefore, shouldn't LOCK TABLE be added to the comment? That's correct; for the list to be complete, LOCK TABLE should be added, as per the attached. -- Daniel Gustafsson acl_maintain_comment.diff Description: Binary data
Re: allowing extensions to control planner behavior
On 18/9/2024 17:48, Robert Haas wrote: Comments? Let me share my personal experience with path management in the planner. The most important thing for extensions is flexibility - I would like to discuss a design that is not limited to join ordering but could also be applied to implement an index-picking strategy, the Memoize/Material choice versus a non-cached one, the choice of custom paths, etc. The most flexible way I have found so far is a collaboration between the get_relation_info_hook and an add_path hook. In get_relation_info, we have enough context and can add some information to RelOptInfo - I added an extra list to this structure where extensions can add helpful info. Using extensible nodes, we can tackle interference between extensions. The add_path hook can analyse new and old paths and also look into the extensible data inside RelOptInfo. The issue of frequent calls is eased by returning quickly for out-of-scope paths: extensions usually manage some specific path type or relation, and a quick test of RelOptInfo::extra_list allows unnecessary cases to be sorted out. Being flexible, this approach is also less invasive. I now use it to implement heuristics demanded by clients for cases when the estimator predicts only one row - usually, this means that the optimiser underestimates cardinality. For example, switching off NestLoop in place if it uses too many clauses, or rules to pick an index scan when we have alternative scans, each of which predicts only one tuple. Positive outcomes include: we don't alter path costs, and an extension may be sure that core doesn't remove a path from the list if the extension forbids it. Attached are hooks for add_path and add_partial_path. As you can see, because of differences between these routines, the hooks are also implemented differently. Also, compare_path_costs_fuzzily is exported; it is really useful stuff for an extension. -- regards, Andrei Lepikhov From af8f5bd65e33c819723231ce433dcd51438b7ef0 Mon Sep 17 00:00:00 2001 From: "Andrei V. 
Lepikhov" Date: Thu, 26 Sep 2024 10:34:08 +0200 Subject: [PATCH 1/2] Introduce compare_path_hook. The add_path function is the only interface to safely add and remove paths in the pathlist. Postgres core provides a few optimisation hooks that an extension can use to offer additional paths. However, it is still uncertain whether such a path will be replaced until the end of the path population process. This hook allows the extension to control the comparison of a new path with each pathlist's path and denote the optimiser which path is better. It doesn't point to the optimiser directly about what exactly to do with the incoming path and paths, already existed in pathlist. Tag: optimizer. caused by: PGPRO-10445, PGPRO-11017. --- src/backend/optimizer/util/pathnode.c | 21 +++-- src/include/optimizer/pathnode.h | 17 + 2 files changed, 28 insertions(+), 10 deletions(-) diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c index fc97bf6ee2..ae57932862 100644 --- a/src/backend/optimizer/util/pathnode.c +++ b/src/backend/optimizer/util/pathnode.c @@ -31,13 +31,6 @@ #include "utils/memutils.h" #include "utils/selfuncs.h" -typedef enum -{ - COSTS_EQUAL,/* path costs are fuzzily equal */ - COSTS_BETTER1, /* first path is cheaper than second */ - COSTS_BETTER2, /* second path is cheaper than first */ - COSTS_DIFFERENT,/* neither path dominates the other on cost */ -} PathCostComparison; /* * STD_FUZZ_FACTOR is the normal fuzz factor for compare_path_costs_fuzzily. 
@@ -46,6 +39,9 @@ typedef enum */ #define STD_FUZZ_FACTOR 1.01 +/* Hook for plugins to get control over the add_path decision */ +compare_path_hook_type compare_path_hook = NULL; + static List *translate_sub_tlist(List *tlist, int relid); static int append_total_cost_compare(const ListCell *a, const ListCell *b); static int append_startup_cost_compare(const ListCell *a, const ListCell *b); @@ -178,7 +174,7 @@ compare_fractional_path_costs(Path *path1, Path *path2, * (But if total costs are fuzzily equal, we compare startup costs anyway, * in hopes of eliminating one path or the other.) */ -static PathCostComparison +PathCostComparison compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor) { #define CONSIDER_PATH_STARTUP_COST(p) \ @@ -490,8 +486,13 @@ add_path(RelOptInfo *parent_rel, Path *new_path) /* * Do a fuzzy cost comparison with standard fuzziness limit. */ - costcmp = compare_path_costs_fuzzily(new_path, old_path, - STD_FUZZ_FACTOR); + + if (compare_path_hook) + costcmp = (*compare_path_hook) (parent_rel, new_path, old_path, +
Re: Doc: typo in config.sgml
> On 30 Sep 2024, at 11:03, Tatsuo Ishii wrote: > I think there's an unnecessary underscore in config.sgml. > > I was wrong. The particular byte sequence just looked like an underscore > in my editor, but it is actually 0xc2a0, which must be a > "non-breaking space" encoded in UTF-8. I guess someone mistakenly > inserted a non-breaking space while editing config.sgml. I wonder if it would be worth adding a check for this like the one we have for tabs? The attached adds a rule to "make -C doc/src/sgml check" for trapping nbsp (doing so made me realize we don't have an equivalent meson target). -- Daniel Gustafsson check_nbsp.diff Description: Binary data
Re: ALTER TABLE ONLY .. DROP CONSTRAINT on partitioned tables
Hello, On 2024-Sep-27, Amit Langote wrote: > On Fri, Sep 27, 2024 at 2:52 AM Alvaro Herrera > wrote: > > While studying a review note from Jian He on not-null constraints, I > > came across some behavior introduced by commit 9139aa19423b[1] that I > > think is mistaken. > Yeah, I don’t quite recall why I thought the behavior for both ADD and > DROP had to be the same. I went back and reviewed the thread, trying > to understand why DROP was included in the decision, but couldn’t find > anything that explained it. It also doesn’t seem to be related to the > pg_dump issue that was being discussed at the time. Right. > So, I think you might be right that the restriction on DROP is > overkill, and we should consider removing it, at least in the master > branch. Thanks for looking! I have pushed the removal now. -- Álvaro HerreraBreisgau, Deutschland — https://www.EnterpriseDB.com/ really, I see PHP as like a strange amalgamation of C, Perl, Shell inflex: you know that "amalgam" means "mixture with mercury", more or less, right? i.e., "deadly poison"
RE: long-standing data loss bug in initial sync of logical replication
Dear Shlok, > I have addressed the comment for 0002 patch and attached the patches. > Also, I have moved the tests in the 0002 to 0001 patch. Thanks for updating the patch. The 0002 patch seems to remove cache invalidations from publication_invalidation_cb(). Related to it, I found an issue and have a concern. 1. The replication continues even after ALTER PUBLICATION RENAME is executed. For example - assuming that a subscriber subscribes only to "pub": ``` pub=# INSERT INTO tab values (1); INSERT 0 1 pub=# ALTER PUBLICATION pub RENAME TO pub1; ALTER PUBLICATION pub=# INSERT INTO tab values (2); INSERT 0 1 sub=# SELECT * FROM tab ; -- (2) should not be replicated however... a --- 1 2 (2 rows) ``` This happens because 1) the ALTER PUBLICATION RENAME statement won't invalidate the relation cache, and 2) publications are reloaded only when an invalid RelationSyncEntry is found. In the given example, the first INSERT creates the valid cache and the second INSERT reuses it. Therefore, the pubname check is skipped. For now, the actual renaming is done at AlterObjectRename_internal(), a generic function. I think we must implement a dedicated function for publications and invalidate relcaches there. 2. Similarly to the above, the relcache won't be invalidated when ALTER PUBLICATION OWNER TO is executed. This means that privilege checks may be ignored if the entry is still valid. Not sure, but is there a possibility this causes an inconsistency? Best regards, Hayato Kuroda FUJITSU LIMITED
Re: Doc: typo in config.sgml
>> I think there's an unnecessary underscore in config.sgml. >> Attached patch fixes it. > > I could not apply the patch with an error. > > error: patch failed: doc/src/sgml/config.sgml:9380 > error: doc/src/sgml/config.sgml: patch does not apply Strange. I have no problem applying the patch here. > I found your patch contains an odd character (ASCII code 240?) > by running the `od -c` command on the file. See the attached file. Yes, octal 240 (== 0xA0) is in the patch, but it's because the current config.sgml includes the character. You can check it by looking at line 9383 of config.sgml. I think it was introduced by 28e858c0f95. Best regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
Re: documentation structure
In 0001-func.sgml-Consistently-use-optional-to-indicate-opti.patch: -format(formatstr text [, formatarg "any" [, ...] ]) +format(formatstr text , formatarg "any" [, ...] ) I changed it further to +format(formatstr text , formatarg "any" , ... ) I made this kind of change to format, concat_ws, and concat. I've rebased your patch and added a commitfest entry: https://commitfest.postgresql.org/50/5278/ It seems I cannot mark you as the author in the commitfest. Anyway, you (Dagfinn Ilmari Mannsåker) are the author of it. From bc4d0b217da13fc212cb9eddf5a958887d5c7dc6 Mon Sep 17 00:00:00 2001 From: jian he Date: Mon, 30 Sep 2024 15:54:57 +0800 Subject: [PATCH v2 1/1] func.sgml: Consistently use <optional> to indicate optional parameters Some functions were using square brackets instead; replace them all with <optional>. discussion: https://www.postgresql.org/message-id/87sezjj0ny.fsf%40wibble.ilmari.org --- doc/src/sgml/func.sgml | 48 +- 1 file changed, 24 insertions(+), 24 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index d6acdd3059..5bfd73951d 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -3036,7 +3036,7 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in concat concat ( val1 "any" - [, val2 "any" [, ...] ] ) + , val2 "any" , ... ) text @@ -3056,7 +3056,7 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in concat_ws ( sep text, val1 "any" -[, val2 "any" [, ...] ] ) +, val2 "any" , ... ) text @@ -3076,7 +3076,7 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in format format ( formatstr text -[, formatarg "any" [, ...] ] ) +, formatarg "any" , ... 
) text @@ -3170,7 +3170,7 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in parse_ident parse_ident ( qualified_identifier text -[, strict_mode boolean DEFAULT true ] ) +, strict_mode boolean DEFAULT true ) text[] @@ -3309,8 +3309,8 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in regexp_count regexp_count ( string text, pattern text - [, start integer - [, flags text ] ] ) + , start integer + , flags text ) integer @@ -3331,11 +3331,11 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in regexp_instr regexp_instr ( string text, pattern text - [, start integer - [, N integer - [, endoption integer - [, flags text - [, subexpr integer ] ] ] ] ] ) + , start integer + , N integer + , endoption integer + , flags text + , subexpr integer ) integer @@ -3360,7 +3360,7 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in regexp_like regexp_like ( string text, pattern text - [, flags text ] ) + , flags text ) boolean @@ -3380,7 +3380,7 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in regexp_match -regexp_match ( string text, pattern text [, flags text ] ) +regexp_match ( string text, pattern text , flags text ) text[] @@ -3400,7 +3400,7 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in regexp_matches -regexp_matches ( string text, pattern text [, flags text ] ) +regexp_matches ( string text, pattern text , flags text ) setof text[] @@ -3473,7 +3473,7 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in regexp_split_to_array -regexp_split_to_array ( string text, pattern text [, flags text ] ) +regexp_split_to_array ( string text, pattern text , flags text ) text[] @@ -3492,7 +3492,7 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in regexp_split_to_table -regexp_split_to_table ( string text, pattern text [, flags text ] ) 
+regexp_split_to_table ( string text, pattern text , flags text ) setof text @@ -3516,10 +3516,10 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in regexp_substr regexp_substr ( string text, pattern text - [, start integer - [, N integer - [, flags text - [, subexpr integer ] ]
ACL_MAINTAIN, Lack of comment content
Hi hackers, I found a flaw in the ACL_MAINTAIN comment. Commands such as VACUUM are listed as commands that are allowed to be executed by the MAINTAIN privilege. However, LOCK TABLE is missing from the comment. /* * Check if ACL_MAINTAIN is being checked and, if so, and not already set * as part of the result, then check if the user is a member of the * pg_maintain role, which allows VACUUM, ANALYZE, CLUSTER, REFRESH * MATERIALIZED VIEW, and REINDEX on all relations. */ Therefore, shouldn't LOCK TABLE be added to the comment? Best regards, Yusuke Sugie
Possibilities on code change to implement pseudodatatypes
Greetings I understand the complexity of implementing a pseudo data type when passing it over parameters, or using it when creating an object. vide: git grep pseudo | egrep -v -e "po|out|sgml|sql" | more My initial problem was saving a history (as a backup to be used during troubleshooting analysis) of plan changes and so on for the pg_statistic table/object. I was surprised - in a good way - to observe so much effort when handling it for that specific purpose. I started to wonder whether, if I wanted to create an object of another pseudo data type or pass one through a function parameter, the same level of effort would be required in the code. I understand this is much more a matter of code architecture discretion than a specific question. I have posted in pgsql-admin a question which is a "simple problem" when creating a table using anyarray, and indeed the problem is simple - the solution might not be. What would folks more experienced in this subject recommend as a starting point to achieve that objective? Kind regards, Bazana Schmidt, Vinícius PS.: apologies in advance for the HTML email.
Re: XMLSerialize: version and explicit XML declaration
Hi Tom On 25.09.24 18:02, Tom Lane wrote: > AFAICS, all we do with an embedded XML version string is pass it to > libxml2's xmlNewDoc(), which is the authority on whether it means > anything. I'd be inclined to do the same here. Thanks. I used xml_is_document(), which calls xmlNewDoc(), to check if the returned document is valid or not. It then decides if an unexpected version deserves an error or just a warning. Attached v1 with the first attempt to implement these features. INCLUDING / EXCLUDING XMLDECLARATION (SQL/XML X078) The flags INCLUDING XMLDECLARATION and EXCLUDING XMLDECLARATION include or remove the XML declaration in the XMLSerialize output of the given DOCUMENT or CONTENT, respectively. SELECT xmlserialize( DOCUMENT '42'::xml AS text INCLUDING XMLDECLARATION); xmlserialize --- 42 (1 row) SELECT xmlserialize( DOCUMENT '42'::xml AS text EXCLUDING XMLDECLARATION); xmlserialize -- 42 (1 row) If omitted, the output will contain an XML declaration only if the given XML value had one. SELECT xmlserialize( DOCUMENT '42'::xml AS text); xmlserialize 42 (1 row) SELECT xmlserialize( DOCUMENT '42'::xml AS text); xmlserialize -- 42 (1 row) VERSION (SQL/XML X076) VERSION can be used to specify the version in the XML declaration of the serialized DOCUMENT or CONTENT. SELECT xmlserialize( DOCUMENT '42'::xml AS text VERSION '1.0' INCLUDING XMLDECLARATION); xmlserialize --- 42 (1 row) In case of XML values of type DOCUMENT, the version will be validated by libxml2's xmlNewDoc(), which will raise an error for invalid versions or a warning for unsupported ones. For CONTENT values no validation is performed. 
SELECT xmlserialize( DOCUMENT '42'::xml AS text VERSION '1.1' INCLUDING XMLDECLARATION); WARNING: line 1: Unsupported version '1.1' 42 ^ xmlserialize --- 42 (1 row) SELECT xmlserialize( DOCUMENT '42'::xml AS text VERSION '2.0' INCLUDING XMLDECLARATION); ERROR: Invalid XML declaration: VERSION '2.0' SELECT xmlserialize( CONTENT '42'::xml AS text VERSION '2.0' INCLUDING XMLDECLARATION); xmlserialize --- 42 (1 row) This option is ignored if the XML value had no XML declaration and INCLUDING XMLDECLARATION was not used. SELECT xmlserialize( CONTENT '42'::xml AS text VERSION ''); xmlserialize -- 42 (1 row) Best, Jim From 8bcb91f8b163a9efda1de33ebb7538767c860ad9 Mon Sep 17 00:00:00 2001 From: Jim Jones Date: Thu, 26 Sep 2024 13:35:28 +0200 Subject: [PATCH v1] Add XMLSerialize: version and explicit XML declaration * XMLSerialize: explicit XML declaration (SQL/XML X078) This adds the options INCLUDING XMLDECLARATION and EXCLUDING XMLDECLARATION to XMLSerialize, which include or remove the XML declaration of the given DOCUMENT or CONTENT, respectively. If not specified, the output will contain an XML declaration only if the given XML value had one. * XMLSerialize: version (SQL/XML X076) The option VERSION can be used to specify the version in the XML declaration of the serialized DOCUMENT or CONTENT. In case of XML values of type DOCUMENT, the version will be validated by libxml2's xmlNewDoc(), which will raise an error for invalid versions or a warning for unsupported ones. For CONTENT values no validation is performed. This option is ignored if the XML value had no XML declaration and INCLUDING XMLDECLARATION was not used. Usage examples: SELECT xmlserialize( DOCUMENT xmlval AS text VERSION '1.0' INCLUDING XMLDECLARATION); SELECT xmlserialize( DOCUMENT xmlval AS text EXCLUDING XMLDECLARATION); This patch also includes regression tests and documentation. 
--- doc/src/sgml/datatype.sgml| 48 ++- src/backend/catalog/sql_features.txt | 4 +- src/backend/executor/execExprInterp.c | 4 +- src/backend/parser/gram.y | 29 +- src/backend/parser/parse_expr.c | 2 + src/backend/utils/adt/xml.c | 119 +-- src/include/nodes/parsenodes.h| 2 + src/include/nodes/primnodes.h | 11 + src/include/parser/kwlist.h | 1 + src/include/utils/xml.h | 3 +- src/test/regress/expected/xml.out | 428 +- src/test/regress/expected/xml_1.out | 275 + src/test/regress/expected/xml_2.out | 401 ++
Re: [Bug Fix]standby may crash when switching-over in certain special cases
Thanks for responding. > It is odd that the standby server crashes when replication fails because the standby would keep retrying to get the next record even in such case. As I mentioned earlier, when replication fails, it retries to establish streaming replication. At this point, the value of walrcv->flushedUpto is not necessarily the data actually flushed to disk. However, the startup process mistakenly believes that the latest flushed LSN is walrcv->flushedUpto and attempts to open the corresponding WAL file, which doesn't exist, leading to a file open failure and causing the startup process to PANIC. Regards, Pixian Shi Yugo NAGATA wrote on Mon, Sep 30, 2024, 13:47: > On Wed, 21 Aug 2024 09:11:03 +0800 > px shi wrote: > > > Yugo Nagata wrote on Wed, Aug 21, 2024, 00:49: > > > > > > > > > > > > Is s1 a cascading standby of s2? If otherwise s1 and s2 are the standbys > > > of > > > > the primary server respectively, it is not surprising that s2 has > > > progressed > > > > farther than s1 when the primary fails. I believe that this is the case you > > > should > > > > use pg_rewind. Even if flushedUpto is reset as proposed in your patch, > > > s2 might > > > > already have applied a WAL record that s1 has not processed yet, and > > > there > > > > would be no guarantee that subsequent applies succeed. > > > > > > Thank you for your response. In my scenario, s1 and s2 are the standbys of > > the primary server respectively, and s1 is a synchronous standby and s2 is an > > asynchronous standby. You mentioned that if s2's replay progress is ahead > > of s1, pg_rewind should be used. However, what I'm trying to address is an > > issue where s2 crashes during replay after s1 has been promoted to primary, > > even though s2's progress hasn't surpassed s1. > > I understood your point. It is odd that the standby server crashes when > replication fails because the standby would keep retrying to get the > next record even in such case. 
> > Regards, > Yugo Nagata > > > > > Regards, > > Pixian Shi > > > -- > Yugo NAGATA >
Re: Pgoutput not capturing the generated columns
Hi Shubham. Here are my review comments for patch v34-0002. == doc/src/sgml/ref/create_publication.sgml 1. + + This parameter can only be set true if copy_data is + set to false. + Huh? AFAIK the patch implements COPY for generated columns, so why are you stating this limitation? == src/backend/replication/logical/tablesync.c 2. reminder Previously (18/9) [1 #4] I wrote that maybe the other copy_data=false "missing" case error could be improved to share the same error message that you have in make_copy_attnamelist. And you replied [2] it would be addressed in the next patchset, but that was at least two versions back and I don't see any change yet. == [1] 18/9 review https://www.postgresql.org/message-id/CAHut%2BPusbhvPrL1uN1TKY%3DFd4zu3h63eDebZvsF%3Duy%2BLBKTwgA%40mail.gmail.com [2] https://www.postgresql.org/message-id/CAHv8RjJ5_dmyCH58xQ0StXMdPt9gstemMMWytR79%2BLfOMAHdLw%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
Re: documentation structure
turns out I hardcoded some line number information, which makes the script v7_split_func_sgml.py cannot apply anymore. this time, it should be bullet-proof. same as before: step1. python3 v8_split_func_sgml.py step2. git apply v8-0001-all-filelist-for-directory-doc-src-sgml-func.patch (step2, cannot use "git am"). I don't know perl, but written in perl, I guess the logic will be the same, based on line number, do the copy, delete on func.sgml. From d123a7c9ef6ad45e3b697aa20bcfc831f594b45d Mon Sep 17 00:00:00 2001 From: jian he Date: Sun, 21 Jul 2024 20:43:45 +0800 Subject: [PATCH v6 1/1] all filelist for directory doc/src/sgml/func --- doc/src/sgml/filelist.sgml | 5 - doc/src/sgml/func/allfiles.sgml | 40 + 2 files changed, 44 insertions(+), 1 deletion(-) create mode 100644 doc/src/sgml/func/allfiles.sgml diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml index a7ff5f82..d9f36933 100644 --- a/doc/src/sgml/filelist.sgml +++ b/doc/src/sgml/filelist.sgml @@ -17,7 +17,10 @@ - + + +%allfiles_func; + diff --git a/doc/src/sgml/func/allfiles.sgml b/doc/src/sgml/func/allfiles.sgml new file mode 100644 index ..34eec608 --- /dev/null +++ b/doc/src/sgml/func/allfiles.sgml @@ -0,0 +1,40 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + \ No newline at end of file -- 2.34.1 import subprocess import os import re top_dir = subprocess.check_output(["git", "rev-parse", "--show-toplevel"]) top_dir = top_dir.decode().rstrip('\n') top_src_dir = top_dir + "/doc/src/sgml" #goto doc/src/sgml directory, exit when not exists try: os.chdir(top_src_dir) except FileNotFoundError: print(f'{top_src_dir} not does not exist. 
SKIPPED')
    quit()

target_file="func.sgml"
#
func_logical="func-logical.sgml"
func_comparison="func-comparison.sgml"
func_math="func-math.sgml"
func_string="func-string.sgml"
func_binarystring="func-binarystring.sgml"
func_bitstring="func-bitstring.sgml"
func_matching="func-matching.sgml"
func_formatting="func-formatting.sgml"
func_datetime="func-datetime.sgml"
func_enum="func-enum.sgml"
func_geometry="func-geometry.sgml"
func_net="func-net.sgml"
func_textsearch="func-textsearch.sgml"
func_uuid="func-uuid.sgml"
func_xml="func-xml.sgml"
func_json="func-json.sgml"
func_sequence="func-sequence.sgml"
func_conditional="func-conditional.sgml"
func_array="func-array.sgml"
func_range="func-range.sgml"
func_aggregate="func-aggregate.sgml"
func_window="func-window.sgml"
func_merge_support ="func-merge-support.sgml"
func_subquery="func-subquery.sgml"
func_comparisons="func-comparisons.sgml"
func_srf="func-srf.sgml"
func_info="func-info.sgml"
func_admin="func-admin.sgml"
func_trigger="func-trigger.sgml"
func_event_triggers="func-event-triggers.sgml"
func_statistics="func-statistics.sgml"
#
func_filenames = list()
func_filenames.append(func_logical)
func_filenames.append(func_comparison)
func_filenames.append(func_math)
func_filenames.append(func_string)
func_filenames.append(func_binarystring)
func_filenames.append(func_bitstring)
func_filenames.append(func_matching)
func_filenames.append(func_formatting)
func_filenames.append(func_datetime)
func_filenames.append(func_enum)
func_filenames.append(func_geometry)
func_filenames.append(func_net)
func_filenames.append(func_textsearch)
func_filenames.append(func_uuid)
func_filenames.append(func_xml)
func_filenames.append(func_json)
func_filenames.append(func_sequence)
func_filenames.append(func_conditional)
func_filenames.append(func_array)
func_filenames.append(func_range)
func_filenames.append(func_aggregate)
func_filenames.append(func_window)
func_filenames.append(func_merge_support )
func_filenames.append(func_subquery)
func_filenames.append(func_comparisons)
func_filenames.append(func_srf)
func_filenames.append(func_info)
func_filenames.append(func_admin)
func_filenames.append(func_trigger)
func_filenames.append(func_event_triggers)
func_filenames.append(func_statistics)
#
func_string_begin_lineno= -1
func_logical_begin_lineno=-1
func_comparison_begin_lineno=-1
func_math_begin_lineno=-1
func_string_begin_lineno=-1
func_binarystring_begin_lineno=-1
func_bitstring_begin_lineno=-1
func_matching_begin_lineno=-1
func_formatting_begin_lineno=-1
func_datetime_begin_lineno=-1
func_enum_begin_lineno=-1
func_geometry_begin_lineno=-1
func_net_begin_lineno=-1
func_textsearch_begin_lineno=-1
func_uuid_begin_lineno=-1
func_xml_begin_lineno=-1
func_json_begin_lineno=-1
func_sequence_begin_lineno=-1
func_conditional_begin_lineno=-1
func_array_begin_lineno=-1
func_range_begin_lineno=-1
func_aggregate_begin_lineno=-1
func_window_begin_lineno=-1
func_merge_support_begin_lineno=-1
func_subquery_begin_lineno=-1
func_comparisons_begin_lineno=-1
func_srf_begin_lineno=-1
func_info_begin_lineno=-1
func_admin_begin_lineno=-1
func_trigger_begin_lineno=-1
func_event_triggers_begin_lineno=-1
func_statistics_begin_lineno=-1
#
func_logical_end_lineno=-1
func_comparison_end_lineno=-1
fu
Re: msys inet_pton strangeness
Hello Andrew and Thomas, 29.09.2024 18:47, Andrew Dunstan wrote: I'm inclined to think we might need to reverse the order of the last two. TBH I don't really understand how this has worked up to now. I've looked at the last successful run [1] and discovered that fe-secure-common.c didn't compile cleanly either: ccache gcc ... /home/pgrunner/bf/root/REL_15_STABLE/pgsql.build/../pgsql/src/interfaces/libpq/fe-secure-common.c C:/tools/nmsys64/home/pgrunner/bf/root/REL_15_STABLE/pgsql/src/interfaces/libpq/fe-secure-common.c: In function 'pq_verify_peer_name_matches_certificate_ip': C:/tools/nmsys64/home/pgrunner/bf/root/REL_15_STABLE/pgsql/src/interfaces/libpq/fe-secure-common.c:219:21: warning: implicit declaration of function 'inet_pton'; did you mean 'inet_aton'? [-Wimplicit-function-declaration] 219 | if (inet_pton(AF_INET6, host, &addr) == 1) | ^ | inet_aton So it worked just because that missing declaration generated just a warning, not an error. 30.09.2024 01:28, Thomas Munro wrote: Just an idea... --- a/src/include/port/win32.h +++ b/src/include/port/win32.h @@ -16,7 +16,7 @@ * get support for GetLocaleInfoEx() with locales. For everything else * the minimum version is Windows XP (0x0501). */ -#if defined(_MSC_VER) && _MSC_VER >= 1900 +#if !defined(_MSC_VER) || _MSC_VER >= 1900 #define MIN_WINNT 0x0600 #else #define MIN_WINNT 0x0501 This change works for me in the msys case. I have no VS 2013 on hand to test the other branch, but it looks like HAVE_INET_PTON is set to 1 unconditionally in src/tools/msvc/Solution.pm, so we will probably stumble upon the same issue with _MSC_VER = 1800. What if we just set MIN_WINNT 0x0600 for REL_15_STABLE? Or maybe it would make sense to get that old Visual Studio and recheck? The other question that I still have is: where do we expect to get the system _WIN32_WINNT from? 
As far as I can see, in the fe-secure-common.c case we have the following include chain:

#include "postgres_fe.h"
#include "c.h" // no other includes above
#include "postgres_ext.h"
#include "pg_config_ext.h"
...
#include "pg_config.h"
#include "pg_config_manual.h" /* must be after pg_config.h */
#include "pg_config_os.h" /* must be before any system header files */
// checks _WIN32_WINNT:
#if defined(_WIN32_WINNT) && _WIN32_WINNT < MIN_WINNT

So if pg_config_os.h is really included before any system headers, checking _WIN32_WINNT makes sense only when that define is passed with -D_WIN32_WINNT, no? [1] https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=fairywren&dt=2024-09-19%2023%3A10%3A10&stg=build Best regards, Alexander
Re: msys inet_pton strangeness
On 2024-09-29 Su 6:28 PM, Thomas Munro wrote: Just an idea... --- a/src/include/port/win32.h +++ b/src/include/port/win32.h @@ -16,7 +16,7 @@ * get support for GetLocaleInfoEx() with locales. For everything else * the minimum version is Windows XP (0x0501). */ -#if defined(_MSC_VER) && _MSC_VER >= 1900 +#if !defined(_MSC_VER) || _MSC_VER >= 1900 #define MIN_WINNT 0x0600 #else #define MIN_WINNT 0x0501 This seems reasonable as just about the most minimal change we can make work. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: AIO v2.0
Hi, On 2024-09-17 11:08:19 -0700, Noah Misch wrote: > > - I am worried about the need for bounce buffers for writes of checksummed > > buffers. That quickly ends up being a significant chunk of memory, > > particularly when using a small shared_buffers with a higher than default > > number of connection. I'm currently hacking up a prototype that'd prevent > > us > > from setting hint bits with just a share lock. I'm planning to start a > > separate thread about that. > > AioChooseBounceBuffers() limits usage to 256 blocks (2MB) per MaxBackends. > Doing better is nice, but I don't consider this a blocker. I recommend > dealing with the worry by reducing the limit initially (128 blocks?). Can > always raise it later. On storage that has nontrivial latency, like just about all cloud storage, even 256 will be too low. Particularly for checkpointer. Assuming 1ms latency - which isn't the high end of cloud storage latency - 256 blocks in flight limits you to <= 256MByte/s, even on storage that can have a lot more throughput. With 3ms, which isn't uncommon, it's 85MB/s. Of course this could be addressed by tuning, but it seems like something that shouldn't need to be tuned by the majority of folks running postgres. We also discussed the topic at https://postgr.es/m/20240925020022.c5.nmisch%40google.com > ... neither BM_SETTING_HINTS nor keeping bounce buffers looks like a bad > decision. From what I've heard so far of the performance effects, if it were > me, I would keep the bounce buffers. I'd pursue BM_SETTING_HINTS and bounce > buffer removal as a distinct project after the main AIO capability. Bounce > buffers have an implementation. They aren't harming other design decisions. > The AIO project is big, so I'd want to err on the side of not designating > other projects as its prerequisites. 
Given the issues that modifying pages while in flight causes, not just with PG level checksums, but also filesystem level checksum, I don't feel like it's a particularly promising approach. However, I think this doesn't have to mean that the BM_SETTING_HINTS stuff has to be completed before we can move forward with AIO. If I split out the write portion from the read portion a bit further, the main AIO changes and the shared-buffer read user can be merged before there's a dependency on the hint bit stuff being done. Does that seem reasonable? Greetings, Andres Freund
Do not lock temp relations
Hi! Working with temp relations is something of a bottleneck in Postgres, in my view. There is no problem if you only do it from time to time, not arguing that. But if you have to do massive temp table creation/deletion, you'll soon run into performance degradation. To the best of my knowledge, there are two obvious problems: 1. We have to add or remove an entry in pg_class when a temp table is created or deleted, resulting in "bloating" of pg_class. Thus, auto-vacuum is needed, but it will acquire a lock, slowing things down. 2. Temp tables are almost universally treated as regular tables. This is 100% correct and makes the code much simpler, but it also involves the whole locking mechanism. As for the first issue, I do not see how any significant improvements can be made, unfortunately. But for the second one: do we really need any lock for temp relations? AFAICU they are backend-only, apart from the pg_class entries. I do not have any particular solution for now, only some kind of concept: we could put checks for temp relations in LockAcquire/LockRelease in order to skip locking. Am I missing something and the idea is doomed, or are there no visible obstacles and it's worth the effort to make a POC patch? -- Best regards, Maxim Orlov.
Re: pg_basebackup and error messages dependent on the order of the arguments
I wrote: > As this example shows, we really ought to validate the compression > argument on sight in order to get sensible error messages. The > trouble is that for server-side compression the code wants to just > pass the string through to the server and not form its own opinion > as to whether it's a known algorithm. > Perhaps it would help if we simply rejected strings beginning > with a dash? I haven't tested, but roughly along the lines of Taking a closer look, many of the other switches-requiring-an-argument also just absorb "optarg" without checking its value till much later, so I'm not sure how far we could move the needle by special-casing --compress. regards, tom lane
Re: May be BUG. Periodic burst growth of the checkpoint_req counter on replica.
On 2024/09/30 16:00, Anton A. Melnikov wrote: On 30.09.2024 06:26, Fujii Masao wrote: Thanks for the review! I've pushed the 0001 patch. Thanks a lot! As for switching in the pg_proc.dat entries the idea was to put them in order so that the pg_stat_get_checkpointer* functions were grouped together. I don't know if this is the common and accepted practice. I simply like it better this way. Sure, if you think it's unnecessary, let it stay as is with minimal diff. I understand your point, but I didn't make that change, to keep the diff minimal, which should make future back-patching easier. Agreed. It's quite reasonable. I had not taken the back-patching possibility into account at all. This is of course wrong. In addition, checkpoints may be skipped due to the "checkpoints are occurring too frequently" error. Not sure, but maybe add this information to the new description? From what I can see in the code, that error message doesn't seem to indicate the checkpoint is being skipped. In fact, checkpoints are actually still happening when that message appears. Am I misunderstanding something? No, you are right! This is my oversight. I didn't notice that elevel is just LOG, not an error. Thanks! Ok, so I pushed 0002.patch. Thanks for the review! Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: On disable_cost
On Sat, 2024-09-28 at 00:04 +1200, David Rowley wrote: > On Fri, 27 Sept 2024 at 20:42, Laurenz Albe wrote: > > 2. The "disabled nodes" are not only shown at the nodes where nodes > >were actually disabled, but also at every nodes above these nodes. > > I'm also not a fan either and I'd like to see this output improved. > > It seems like it's easy enough to implement some logic to detect when > a given node is disabled just by checking if the disable_nodes count > is higher than the sum of the disabled_node field of the node's > children. If there are no children (a scan node) and disabed_nodes > > 0 then it must be disabled. There's even a nice fast path where we > don't need to check the children if disabled_nodes == 0. > > Here's a POC grade patch of how I'd rather see it looking. > > I opted to have a boolean field as I didn't see any need for an > integer count. I also changed things around so we always display the > boolean property in non-text EXPLAIN. Normally, we don't mind being > more verbose there. > > I also fixed a bug in make_sort() where disabled_nodes isn't being set > properly. I'll do an independent patch for that if this goes nowhere. Thanks, and the patch looks good. Why did you change "Disabled" from an integer to a boolean? If you see a join where two plans were disabled, that's useful information. I would still prefer to see the disabled nodes only in VERBOSE explain, but I'm satisfied if the disabled nodes don't show up all over the place. Yours, Laurenz Albe
Re: pg_upgrade check for invalid databases
Nathan Bossart writes: > Should we have pg_upgrade skip invalid databases? If the only valid action > is to drop them, IMHO it seems unnecessary to require users to manually > drop them before retrying pg_upgrade. I was thinking the same. But I wonder if there is any chance of losing data that could be recoverable. It feels like this should not be a default behavior. TBH I'm not finding anything very much wrong with the current behavior... this has to be a rare situation, do we need to add debatable behavior to make it easier? regards, tom lane
Re: pg_basebackup and error messages dependent on the order of the arguments
On 2024/09/30 20:10, Daniel Westermann (DWE) wrote: Hi, shouldn't this give the same error message? $ pg_basebackup --checkpoint=fast --format=t --compress --pgdata=/var/tmp/dummy pg_basebackup: error: must specify output directory or backup target pg_basebackup: hint: Try "pg_basebackup --help" for more information. $ pg_basebackup --pgdata=/var/tmp/dummy --checkpoint=fast --format=t --compress pg_basebackup: option '--compress' requires an argument pg_basebackup: hint: Try "pg_basebackup --help" for more information. I don't see why the first case doesn't give the same message as the second, especially as the "output directory" is specified. I guess because "--pgdata=/var/tmp/dummy" is processed as the argument of the --compress option in the first case, but not in the second case. You can see the same error if you specify another option requiring an argument, such as --label, in the first case. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: msys inet_pton strangeness
On 2024-09-30 Mo 10:08 AM, Tom Lane wrote: Andrew Dunstan writes: Ah, so this is because gcc 14.1.0 treats this as an error but gcc 12.2.0 treats it as a warning. Now it makes sense. Not entirely ... if fairywren had been generating that warning all along, I would have noticed it long ago, because I periodically scrape the BF database for compiler warnings. There has to have been some recent change in the system include files. here's what I see on vendikar: pgbfprod=> select min(snapshot) from build_status_log where log_stage in ('build.log', 'make.log') and branch = 'REL_15_STABLE' and sysname = 'fairywren' and snapshot > now() - interval '1500 days' and log_text ~ 'inet_pton'; min - 2022-06-30 18:04:08 (1 row) cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: pg_verifybackup: TAR format backup verification
On Sat, Sep 28, 2024 at 6:59 PM Tom Lane wrote: > Now, manifest_file.size is in fact a size_t, so %zu is the correct > format spec for it. But astreamer_verify.checksum_bytes is declared > uint64. This leads me to question whether size_t was the correct > choice for manifest_file.size. If it is, is it OK to use size_t > for checksum_bytes as well? If not, your best bet is likely > to write %lld and add an explicit cast to long long, as we do in > dozens of other places. I see that these messages are intended to be > translatable, so INT64_FORMAT is not usable here. It seems that manifest_file.size is size_t because parse_manifest.h is using size_t for json_manifest_per_file_callback. What's happening is that json_manifest_finalize_file() is parsing the file size information out of the manifest. It uses strtoul to do that and assigns the result to a size_t. I don't think I had any principled reason for making that decision; size_t is, I think, the size of an object in memory, and this is not that. This is just a string in a JSON file that represents an integer which will hopefully turn out to be the size of the file on disk. I guess I don't know what type I should be using here. Most things in PostgreSQL use a type like uint32 or uint64, but technically what we're going to be comparing against in the end is probably an off_t produced by stat(), but the return value of strtoul() or strtoull() is unsigned long or unsigned long long, which is not any of those things. If you have a suggestion here, I'm all ears. > Aside from that, I'm unimpressed with expending a five-line comment > at line 376 to justify casting control_file_bytes to int, when you > could similarly cast it to long long, avoiding the need to justify > something that's not even in tune with project style. I don't know what you mean by this. The comment explains that I used %d here because that's what pg_rewind does, and %d corresponds to int, not long long. 
If you think it should be some other way, you can change it, and perhaps you'd like to change pg_rewind to match while you're at it. But the fact that there's a comment here explaining the reasoning is a feature, not a bug. It's weird to me to get criticized for failing to follow project style when I literally copied something that already exists. -- Robert Haas EDB: http://www.enterprisedb.com
Re: msys inet_pton strangeness
Andrew Dunstan writes: > On 2024-09-30 Mo 10:08 AM, Tom Lane wrote: >> Not entirely ... if fairywren had been generating that warning all >> along, I would have noticed it long ago, because I periodically >> scrape the BF database for compiler warnings. There has to have >> been some recent change in the system include files. > here's what I see on vendikar: Oh, wait, I forgot this is only about the v15 branch. I seldom search for warnings except on HEAD. Still, I'd have expected to notice it while v15 was development tip. Maybe we changed something since then? Anyway, it's pretty moot, I see no reason not to push forward with the proposed fix. regards, tom lane
Re: Do not lock temp relations
Maxim Orlov writes: > But for the second one: do we really need any lock for temp relations? Yes. Our implementation restrictions preclude access to the contents of another session's temp tables, but it is not forbidden to do DDL on them so long as no content access is required. (Without this, it'd be problematic for example to clean out a crashed session's temp tables. See the "orphan temporary tables" logic in autovacuum.c.) You need fairly realistic locking to ensure that's OK. regards, tom lane
Re: general purpose array_sort
On Mon, Sep 30, 2024 at 1:01 PM Junwang Zhao wrote: > > > I would suggest accepting: > > asc > > desc > > asc nulls first > > asc nulls last * > > desc nulls first * > > desc nulls last > > > > As valid inputs for "dir" - and that the starred options are the defaults > > when null position is omitted. > > > > In short, mimic create index. > > > > David J. > > > > PFA v3 with David's suggestion addressed. > I think just adding two bool arguments (asc/desc, nulls last or not) would be easier, but either way (I don't have a huge opinion). Please document the second argument; imagine the case SELECT array_sort('{a,B}'::text[] , E'aSc NulLs LaST \t\r\n'); wouldn't that be tricky? errmsg("multidimensional arrays sorting are not supported"))); Writing an SQL test to trigger that error message would be great. You can add one or two examples to collate.icu.utf8.sql to demo that it actually works with COLLATE collation_name, like: SELECT array_sort('{a,B}'::text[] COLLATE case_insensitive); SELECT array_sort('{a,B}'::text[] COLLATE "C"); #define WHITESPACE " \t\n\r" You may also check the function scanner_isspace. + typentry = (TypeCacheEntry *) fcinfo->flinfo->fn_extra; + if (typentry == NULL || typentry->type_id != elmtyp) + { + typentry = lookup_type_cache(elmtyp, sort_asc ? TYPECACHE_LT_OPR : TYPECACHE_GT_OPR); + fcinfo->flinfo->fn_extra = (void *) typentry; + } You need a one-time check that typentry->lt_opr or typentry->gt_opr exists; see CreateStatistics. /* Disallow data types without a less-than operator */ type = lookup_type_cache(attForm->atttypid, TYPECACHE_LT_OPR); if (type->lt_opr == InvalidOid) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("column \"%s\" cannot be used in statistics because its type %s has no default btree operator class", attname, format_type_be(attForm->atttypid;
Re: pg_verifybackup: TAR format backup verification
On Sun, Sep 29, 2024 at 1:03 PM Tom Lane wrote:
> *** CID 1620458: Resource leaks (RESOURCE_LEAK)
> /srv/coverity/git/pgsql-git/postgresql/src/bin/pg_verifybackup/pg_verifybackup.c: 1025 in verify_tar_file()
> 1019                             relpath);
> 1020
> 1021         /* Close the file. */
> 1022         if (close(fd) != 0)
> 1023             report_backup_error(context, "could not close file \"%s\": %m",
> 1024                                 relpath);
> >>> CID 1620458: Resource leaks (RESOURCE_LEAK)
> >>> Variable "buffer" going out of scope leaks the storage it points to.
> 1025 }
> 1026
> 1027 /*
> 1028  * Scan the hash table for entries where the 'matched' flag is not set; report
> 1029  * that such files are present in the manifest but not on disk.
> 1030  */

This looks like a real leak. It can only happen once per tarfile when verifying a tar backup so it can never add up to much, but it makes sense to fix it.

> *** CID 1620457: Memory - illegal accesses (OVERRUN)
> /srv/coverity/git/pgsql-git/postgresql/src/bin/pg_verifybackup/astreamer_verify.c: 349 in member_copy_control_data()
> 343          */
> 344         if (mystreamer->control_file_bytes <= sizeof(ControlFileData))
> 345         {
> 346             int remaining;
> 347
> 348             remaining = sizeof(ControlFileData) - mystreamer->control_file_bytes;
> >>> CID 1620457: Memory - illegal accesses (OVERRUN)
> >>> Overrunning array of 296 bytes at byte offset 296 by dereferencing pointer "(char *)&mystreamer->control_file + mystreamer->control_file_bytes".
> 349             memcpy(((char *) &mystreamer->control_file)
> 350                    + mystreamer->control_file_bytes,
> 351                    data, Min(len, remaining));
> 352         }
> 353
> 354         /* Remember how many bytes we saw, even if we didn't buffer them. */

I think this might be complaining about a potential zero-length copy. Seems like perhaps the <= sizeof(ControlFileData) test should actually be < sizeof(ControlFileData).
> *** CID 1620456: Null pointer dereferences (FORWARD_NULL)
> /srv/coverity/git/pgsql-git/postgresql/src/bin/pg_verifybackup/pg_verifybackup.c: 939 in precheck_tar_backup_file()
> 933                             "file \"%s\" is not expected in a tar format backup",
> 934                             relpath);
> 935         tblspc_oid = (Oid) num;
> 936     }
> 937
> 938     /* Now, check the compression type of the tar */
> >>> CID 1620456: Null pointer dereferences (FORWARD_NULL)
> >>> Passing null pointer "suffix" to "strcmp", which dereferences it.
> 939     if (strcmp(suffix, ".tar") == 0)
> 940         compress_algorithm = PG_COMPRESSION_NONE;
> 941     else if (strcmp(suffix, ".tgz") == 0)
> 942         compress_algorithm = PG_COMPRESSION_GZIP;
> 943     else if (strcmp(suffix, ".tar.gz") == 0)
> 944         compress_algorithm = PG_COMPRESSION_GZIP;

This one is happening, I believe, because report_backup_error() doesn't perform a non-local exit, but we have a bit of code here that acts like it does. Patch attached.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

0001-Fix-issues-reported-by-Coverity.patch
Description: Binary data
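To see the zero-length-copy complaint concretely, here is a small Python model of member_copy_control_data()'s buffering condition. It is a hypothetical stand-in that only tracks sizes, not the real streamer code; 296 is the ControlFileData size Coverity reports:

```python
CONTROL_FILE_SIZE = 296  # sizeof(ControlFileData) per the Coverity report

def bytes_to_copy(control_file_bytes, len_, use_lte):
    """Model the branch guard.  With `<=`, the branch is entered even
    when the buffer is already full: remaining == 0 and the memcpy
    destination pointer sits one past the end of the array, which is
    what Coverity flags.  With `<`, the branch is skipped instead."""
    if use_lte:
        entered = control_file_bytes <= CONTROL_FILE_SIZE
    else:
        entered = control_file_bytes < CONTROL_FILE_SIZE
    if entered:
        remaining = CONTROL_FILE_SIZE - control_file_bytes
        return min(len_, remaining)  # number of bytes memcpy would move
    return None  # branch not taken

print(bytes_to_copy(296, 100, use_lte=True))   # 0 -> memcpy(buf+296, ..., 0)
print(bytes_to_copy(296, 100, use_lte=False))  # None -> branch skipped
```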
Re: pg_verifybackup: TAR format backup verification
Robert Haas writes: > On Sat, Sep 28, 2024 at 6:59 PM Tom Lane wrote: >> Now, manifest_file.size is in fact a size_t, so %zu is the correct >> format spec for it. But astreamer_verify.checksum_bytes is declared >> uint64. This leads me to question whether size_t was the correct >> choice for manifest_file.size. > It seems that manifest_file.size is size_t because parse_manifest.h is > using size_t for json_manifest_per_file_callback. What's happening is > that json_manifest_finalize_file() is parsing the file size > information out of the manifest. It uses strtoul to do that and > assigns the result to a size_t. I don't think I had any principled > reason for making that decision; size_t is, I think, the size of an > object in memory, and this is not that. Correct, size_t is defined to measure in-memory object sizes. It's the argument type of malloc(), the result type of sizeof(), etc. It does not restrict the size of disk files. > This is just a string in a > JSON file that represents an integer which will hopefully turn out to > be the size of the file on disk. I guess I don't know what type I > should be using here. Most things in PostgreSQL use a type like uint32 > or uint64, but technically what we're going to be comparing against in > the end is probably an off_t produced by stat(), but the return value > of strtoul() or strtoull() is unsigned long or unsigned long long, > which is not any of those things. If you have a suggestion here, I'm > all ears. I don't know if it's realistic to expect that this code might be used to process JSON blobs exceeding 4GB. But if it is, I'd be inclined to use uint64 and strtoull for these purposes, if only to avoid cross-platform hazards with varying sizeof(long) and sizeof(size_t). Um, wait ... we do have strtou64(), so you should use that. >> Aside from that, I'm unimpressed with expending a five-line comment >> at line 376 to justify casting control_file_bytes to int, > I don't know what you mean by this. 
I mean that we have a widely-used, better solution. If you copied this from someplace else, the someplace else could stand to be improved too. regards, tom lane
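A small Python model of the cross-platform hazard discussed above: on a platform where unsigned long is 32 bits (e.g. 64-bit Windows), strtoul() silently wraps a file size above 4GB, while a 64-bit parse (what strtou64() provides) preserves it. The modulo arithmetic below is an illustrative stand-in, not PostgreSQL code:

```python
def strtoul_32(s):
    """Simulate strtoul() where unsigned long is 32 bits: the parsed
    value wraps modulo 2**32 (hypothetical model for illustration)."""
    return int(s) % 2**32

def strtou64(s):
    """Simulate a 64-bit unsigned parse: wraps modulo 2**64, so any
    realistic file size survives intact."""
    return int(s) % 2**64

manifest_size = str(6 * 2**30)      # a 6 GiB file size recorded in a manifest
print(strtoul_32(manifest_size))    # 2147483648 -- silently wrong
print(strtou64(manifest_size))      # 6442450944 -- preserved
```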
Re: Set query_id for query contained in utility statement
PREPARE test_prepare_pgss1(int, int) AS select generate_series($1, $2);
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
EXECUTE test_prepare_pgss1 (1,2);
EXECUTE test_prepare_pgss1 (1,3);
SELECT toplevel, calls, query, queryid, rows FROM pg_stat_statements
  ORDER BY query COLLATE "C", toplevel;
SELECT pg_stat_statements_reset() IS NOT NULL AS t;
-- the above works just fine; the following is just for demo purposes
explain(verbose) EXECUTE test_prepare_pgss1(1, 2);
explain(verbose) EXECUTE test_prepare_pgss1(1, 3);
SELECT toplevel, calls, query, queryid, rows FROM pg_stat_statements
  ORDER BY query COLLATE "C", toplevel;

 toplevel | calls |                                  query                                  |       queryid        | rows
----------+-------+-------------------------------------------------------------------------+----------------------+------
 f        |     2 | PREPARE test_prepare_pgss1(int, int) AS select generate_series($1, $2)  | -3421048434214482065 |    0
 t        |     1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t                      |  3366652201587963567 |    1
 t        |     0 | SELECT toplevel, calls, query, queryid, rows FROM pg_stat_statements   +| -6410939316132384446 |    0
          |       |   ORDER BY query COLLATE "C", toplevel                                  |                      |
 t        |     1 | explain(verbose) EXECUTE test_prepare_pgss1(1, 2)                       |  7618807962395633001 |    0
 t        |     1 | explain(verbose) EXECUTE test_prepare_pgss1(1, 3)                       | -2281958002956676857 |    0

Is it possible to normalize top-level utility EXPLAIN queries, so that these two get the same queryid?

explain(verbose) EXECUTE test_prepare_pgss1(1, 2);
explain(verbose) EXECUTE test_prepare_pgss1(1, 3);

I guess this is a corner case.
Re: pg_verifybackup: TAR format backup verification
Robert Haas writes: > On Sun, Sep 29, 2024 at 1:03 PM Tom Lane wrote: >>> CID 1620458: Resource leaks (RESOURCE_LEAK) >>> Variable "buffer" going out of scope leaks the storage it points to. > This looks like a real leak. It can only happen once per tarfile when > verifying a tar backup so it can never add up to much, but it makes > sense to fix it. +1 >>> CID 1620457: Memory - illegal accesses (OVERRUN) >>> Overrunning array of 296 bytes at byte offset 296 by dereferencing pointer >>> "(char *)&mystreamer->control_file + mystreamer->control_file_bytes". > I think this might be complaining about a potential zero-length copy. > Seems like perhaps the <= sizeof(ControlFileData) test should actually > be < sizeof(ControlFileData). That's clearly an improvement, but I was wondering if we should also change "len" and "remaining" to be unsigned (probably size_t). Coverity might be unhappy about the off-the-end array reference, but perhaps it's also concerned about what happens if len is negative. >>> CID 1620456: Null pointer dereferences (FORWARD_NULL) >>> Passing null pointer "suffix" to "strcmp", which dereferences it. > This one is happening, I believe, because report_backup_error() > doesn't perform a non-local exit, but we have a bit of code here that > acts like it does. Check. > Patch attached. WFM, modulo the suggestion about changing data types. regards, tom lane
Re: Add on_error and log_verbosity options to file_fdw
On 2024/09/26 21:57, torikoshia wrote: Updated the patches. Thanks for updating the patches! I’ve made some changes based on your work, which are attached. Barring any objections, I'm thinking to push these patches. For patches 0001 and 0003, I ran pgindent and updated the commit message. Regarding patch 0002: - I updated the regression test to run ANALYZE on the file_fdw foreign table since the on_error option also affects the ANALYZE command. To ensure test stability, the test now runs ANALYZE with log_verbosity = 'silent'. - I removed the code that updated the count of skipped rows for the pg_stat_progress_copy view. As far as I know, file_fdw doesn’t currently support tracking pg_stat_progress_copy.tuples_processed. Supporting only tuples_skipped seems inconsistent, so I suggest creating a separate patch to extend file_fdw to track both tuples_processed and tuples_skipped in this view. - I refactored the for-loop handling on_error = 'ignore' in fileIterateForeignScan() by replacing it with a goto statement for improved readability. - I modified file_fdw to log a NOTICE message about skipped rows at the end of ANALYZE if any rows are skipped due to the on_error = 'ignore' setting. Regarding the "file contains XXX rows" message reported by the ANALYZE VERBOSE command on the file_fdw foreign table, what number should be reflected in XXX, especially when some rows are skipped due to on_error = 'ignore'? Currently, the count only includes valid rows, excluding any skipped rows. I haven't modified this code yet. Should we instead count all rows (valid and erroneous) and report that total? I noticed the code for reporting the number of skipped rows due to on_error = 'ignore' appears in three places. I’m considering creating a common function for this reporting to eliminate redundancy but haven’t implemented it yet. - I’ve updated the commit message and run pgindent. 
Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

From 6bcf56dc0556b3e9ded7200229c05c69e9c4fd6a Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi
Date: Wed, 25 Sep 2024 21:28:15 +0900
Subject: [PATCH v6 1/3] Add log_verbosity = 'silent' support to COPY command.

Previously, when the on_error option was set to ignore, the COPY command
would always log NOTICE messages for input rows discarded due to data type
incompatibility. Users had no way to suppress these messages.

This commit introduces a new log_verbosity setting, 'silent', which
prevents the COPY command from emitting NOTICE messages when
on_error = 'ignore' is used, even if rows are discarded. This feature is
particularly useful when processing malformed files frequently, where a
flood of NOTICE messages can be undesirable.

For example, when frequently loading malformed files via the COPY command
or querying foreign tables using file_fdw (with an upcoming patch to add
on_error support for file_fdw), users may prefer to suppress these messages
to reduce log noise and improve clarity.

Author: Atsushi Torikoshi
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/ab59dad10490ea3734cf022b16c24...@oss.nttdata.com
---
 doc/src/sgml/ref/copy.sgml          | 10 +++---
 src/backend/commands/copy.c         |  4 +++-
 src/backend/commands/copyfrom.c     |  3 ++-
 src/bin/psql/tab-complete.c         |  2 +-
 src/include/commands/copy.h         |  4 +++-
 src/test/regress/expected/copy2.out |  4 +++-
 src/test/regress/sql/copy2.sql      |  4
 7 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 1518af8a04..d87684a5be 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -407,6 +407,8 @@ COPY { table_name [ (
 verbose, a NOTICE message containing the line of the
 input file and the column name whose input conversion has failed is
 emitted for each discarded row.
+ When it is set to silent, no message is emitted
+ regarding ignored rows.
@@ -428,9 +430,11 @@ COPY { table_name [ (
 Specify the amount of messages emitted by a COPY
- command: default or verbose. If
- verbose is specified, additional messages are emitted
- during processing.
+ command: default, verbose, or
+ silent.
+ If verbose is specified, additional messages are
+ emitted during processing.
+ silent suppresses both verbose and default messages.
 This is currently used in COPY FROM command when
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 3bb579a3a4..03eb7a4eba 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -427,9 +427,11 @@ defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
 	char	   *sval;

 	/*
-	 * Allow "default", or "verbose" values.
+	 * Allow "silent", "default", or "verbose" values.
Re: ACL_MAINTAIN, Lack of comment content
Nathan Bossart writes: > On Mon, Sep 30, 2024 at 04:13:55PM +0200, Daniel Gustafsson wrote: >> I'm not a native speaker so I'm not sure which is right, but grepping for >> other >> lists of items shows that the last "and" item is often preceded by a comma so >> I'll do that. > I'm not aware of a project policy around the Oxford comma [0], but I tend > to include one. Yeah, as that wikipedia article suggests, you can find support for either choice. I'd say do what looks best in context. regards, tom lane
Re: msys inet_pton strangeness
On 2024-09-30 Mo 11:11 AM, Tom Lane wrote: Andrew Dunstan writes: On 2024-09-30 Mo 10:08 AM, Tom Lane wrote: Not entirely ... if fairywren had been generating that warning all along, I would have noticed it long ago, because I periodically scrape the BF database for compiler warnings. There has to have been some recent change in the system include files. here's what I see on vendikar: Oh, wait, I forgot this is only about the v15 branch. I seldom search for warnings except on HEAD. Still, I'd have expected to notice it while v15 was development tip. Maybe we changed something since then? Anyway, it's pretty moot, I see no reason not to push forward with the proposed fix. Thanks, done. cheers andrew -- Andrew Dunstan EDB:https://www.enterprisedb.com
Re: AIO v2.0
On Mon, 30 Sept 2024 at 16:49, Andres Freund wrote: > On 2024-09-17 11:08:19 -0700, Noah Misch wrote: > > > - I am worried about the need for bounce buffers for writes of checksummed > > > buffers. That quickly ends up being a significant chunk of memory, > > > particularly when using a small shared_buffers with a higher than > > > default > > > number of connection. I'm currently hacking up a prototype that'd > > > prevent us > > > from setting hint bits with just a share lock. I'm planning to start a > > > separate thread about that. > > > > AioChooseBounceBuffers() limits usage to 256 blocks (2MB) per MaxBackends. > > Doing better is nice, but I don't consider this a blocker. I recommend > > dealing with the worry by reducing the limit initially (128 blocks?). Can > > always raise it later. > > On storage that has nontrivial latency, like just about all cloud storage, > even 256 will be too low. Particularly for checkpointer. > > Assuming 1ms latency - which isn't the high end of cloud storage latency - 256 > blocks in flight limits you to <= 256MByte/s, even on storage that can have a > lot more throughput. With 3ms, which isn't uncommon, it's 85MB/s. FYI, I think you're off by a factor 8, i.e. that would be 2GB/sec and 666MB/sec respectively, given a normal page size of 8kB and exactly 1ms/3ms full round trip latency: 1 page/1 ms * 8kB/page * 256 concurrency = 256 pages/ms * 8kB/page = 2MiB/ms ~= 2GiB/sec. for 3ms divide by 3 -> ~666MiB/sec. Kind regards, Matthias van de Meent Neon (https://neon.tech)
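A quick cross-check of the arithmetic in the two mails above, using a simple latency-bound throughput model (nothing here is from the AIO patch itself): with a fixed number of requests in flight and a fixed round-trip latency, the throughput ceiling is in-flight blocks × block size / latency.

```python
def max_throughput(blocks_in_flight, block_size, latency_s):
    """Best-case bytes/sec when each of `blocks_in_flight` outstanding
    requests completes one full round trip per `latency_s` interval."""
    return blocks_in_flight * block_size / latency_s

# 256 in-flight 8 kB blocks at 1 ms round-trip latency:
print(max_throughput(256, 8192, 0.001) / 2**30)   # ~1.95 GiB/s
# ... and at 3 ms:
print(max_throughput(256, 8192, 0.003) / 2**20)   # ~667 MiB/s
```

This matches Matthias's correction: roughly 2 GiB/s at 1 ms and roughly 666 MiB/s at 3 ms, a factor of 8 above the MB-level figures in the quoted text.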
Re: [PoC] Federated Authn/z with OAUTHBEARER
Antonin Houska wrote: > Jacob Champion wrote: > > Now, the token introspection endpoint I mentioned upthread > > Can you please point me to the particular message? Please ignore this dumb question. You probably referred to the email I was responding to. -- Antonin Houska Web: https://www.cybertec-postgresql.com
Request for Insights on ID Column Migration Approach
I am contacting you about a current issue with our database. We have run out of positive sequence values in one of our tables and are now operating with negative values. To address this, we plan to migrate the id column from int4 to int8. The plan involves renaming the int8 column to id and setting it as the primary key. However, this process will require downtime, which may be substantial in a production environment. Fortunately, we have noted that other tables do not use the id column as a foreign key, which may help mitigate some concerns.

Our approach:

1. Copy data: Copy all the data to the new id_copy column.

2. Create a unique index: We will then create a unique index on the new id column before renaming it, and alter it to non-nullable. This step will necessitate scanning the entire table to verify uniqueness and non-nullability.

3. Add primary key: After ensuring uniqueness, we will add the id column as the primary key. By doing this, we hope to bypass the additional scanning for uniqueness and nullability, as the column will already be set as not nullable and will have the uniqueness constraint from the unique index.

We want to confirm whether this approach will work as expected. If there are any potential pitfalls or considerations we should be aware of, could you please provide insights or point us toward relevant documentation?

Thank you so much for your help, and I look forward to your guidance.

Best regards,
Aditya Narayan Singh
Loyalty Juggernaut Inc.
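For step 1, a tiny sketch of how the backfill might be driven in short batches so that no single transaction holds row locks for long. This is a hypothetical helper, not part of the question above; each yielded range would feed an UPDATE such as "UPDATE t SET id_copy = id WHERE id > low AND id <= high":

```python
INT4_MAX = 2**31 - 1  # 2147483647: where the old int4 id column runs out

def backfill_batches(max_id, batch_size):
    """Yield half-open (low, high] id ranges covering 1..max_id, so the
    id_copy backfill can be committed in small chunks instead of one
    giant table-rewriting transaction."""
    low = 0
    while low < max_id:
        high = min(low + batch_size, max_id)
        yield (low, high)
        low = high

print(list(backfill_batches(10, 4)))  # [(0, 4), (4, 8), (8, 10)]
```

Batching like this keeps replication lag and lock contention bounded while the int8 column fills in; the final unique-index build and primary-key swap still need their own maintenance window.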
Re: Doc: typo in config.sgml
>> I wonder if it would be worth to add a check for this like we have to tabs?

+1.

>> The attached adds a rule to "make -C doc/src/sgml check" for trapping nbsp
>> (doing so made me realize we don't have an equivalent meson target).
>
> Your patch couldn't detect 0xA0 in config.sgml in my machine, but it works
> when I use `grep -P "[\xA0]"` instead of `grep -e "\xA0"`.
>
> However, it also detects the following line in charset.sgml.
> (https://www.postgresql.org/docs/current/collation.html)
>
> For example, locale und-u-kb sorts 'àe' before 'aé'.
>
> This is not non-breaking space, so should not be detected as an error.

That's because a non-breaking space (nbsp) is not encoded as 0xa0 in UTF-8. nbsp in UTF-8 is "0xc2 0xa0" (2 bytes); 0xa0 is nbsp's code point in Unicode, i.e. U+00A0. So grep -P "[\xC2\xA0]" should work to detect nbsp.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp
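A short Python check of the byte sequences involved. It also shows why a single-byte 0xA0 pattern misfires on the charset.sgml example: 'à' (U+00E0) is encoded as 0xC3 0xA0, so its second byte matches, while the two-byte nbsp sequence 0xC2 0xA0 does not occur in that line at all.

```python
nbsp = "\u00a0"  # the character mistaken for an underscore in config.sgml
assert nbsp.encode("utf-8") == b"\xc2\xa0"  # two bytes, not a bare 0xA0

# The charset.sgml sentence that produced the false positive:
line = "For example, locale und-u-kb sorts 'àe' before 'aé'.".encode("utf-8")
print(b"\xa0" in line)      # True  -> single-byte 0xA0 search misfires on 'à'
print(b"\xc2\xa0" in line)  # False -> two-byte C2 A0 search is safe
```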
Re: msys inet_pton strangeness
Andrew Dunstan writes: > Ah, so this is because gcc 14.1.0 treats this as an error but gcc 12.2.0 > treats it as a warning. Now it makes sense. Not entirely ... if fairywren had been generating that warning all along, I would have noticed it long ago, because I periodically scrape the BF database for compiler warnings. There has to have been some recent change in the system include files. regards, tom lane
Re: pg_upgrade check for invalid databases
On Sun, Sep 29, 2024 at 08:45:50PM -0400, Thomas Krennwallner wrote: > if a cluster contains invalid databases that we cannot connect to anymore, > pg_upgrade would currently fail when trying to connect to the first > encountered invalid database with > > [...] > > If there is more than one invalid database, we need to run pg_upgrade more > than once (unless the user inspects pg_database). > > I attached two small patches for PG 17 and PG 18 (can be easily backported > to all previous versions upon request). Instead of just failing to connect > with an error, we collect all invalid databases in a report file > invalid_databases.txt: Should we have pg_upgrade skip invalid databases? If the only valid action is to drop them, IMHO it seems unnecessary to require users to manually drop them before retrying pg_upgrade. -- nathan
Re: pg_basebackup and error messages dependent on the order of the arguments
"Daniel Westermann (DWE)" writes:
> shouldn't this give the same error message?
> $ pg_basebackup --checkpoint=fast --format=t --compress --pgdata=/var/tmp/dummy
> pg_basebackup: error: must specify output directory or backup target
> pg_basebackup: hint: Try "pg_basebackup --help" for more information.
> $ pg_basebackup --pgdata=/var/tmp/dummy --checkpoint=fast --format=t --compress
> pg_basebackup: option '--compress' requires an argument
> pg_basebackup: hint: Try "pg_basebackup --help" for more information.
> I don't see why the first case gives not the same message as the second,
> especially as the "output directory" is specified.

It appears that the first case treats "--pgdata=/var/tmp/dummy" as the argument of --compress, and it doesn't bother to check that that specifies a valid compression algorithm until much later. As this example shows, we really ought to validate the compression argument on sight in order to get sensible error messages. The trouble is that for server-side compression the code wants to just pass the string through to the server and not form its own opinion as to whether it's a known algorithm. Perhaps it would help if we simply rejected strings beginning with a dash? I haven't tested, but roughly along the lines of

case 'Z':
+    /* we don't want to check the algorithm yet, but ... */
+    if (optarg[0] == '-')
+        pg_fatal("invalid compress option \"%s\"", optarg);
     backup_parse_compress_options(optarg, &compression_algorithm,
                                   &compression_detail, &compressloc);
     break;

regards, tom lane
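A minimal Python model of the early check Tom sketches: reject a --compress argument that itself looks like another option, instead of shipping it off as a bogus algorithm name. This is a hypothetical stand-in for the C code, with the error wording borrowed from the sketch:

```python
def check_compress_optarg(optarg):
    """Reject an option argument that begins with a dash, which almost
    certainly means getopt consumed the next option as the missing
    argument (as in the --compress --pgdata=... case above)."""
    if optarg.startswith("-"):
        raise SystemExit(f'pg_basebackup: error: invalid compress option "{optarg}"')
    return optarg

print(check_compress_optarg("server-zstd:5"))  # a plausible value, accepted
# check_compress_optarg("--pgdata=/var/tmp/dummy") would exit with an error
```

With this in place, the first invocation in the report would fail immediately with a message pointing at --compress, rather than the misleading "must specify output directory" one.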
Re: [PoC] Federated Authn/z with OAUTHBEARER
Jacob Champion wrote: > On Fri, Sep 27, 2024 at 10:58 AM Antonin Houska wrote: > > Have you considered sending the token for validation to the server, like > > this > > > > curl -X GET "https://www.googleapis.com/oauth2/v3/userinfo"; -H > > "Authorization: Bearer $TOKEN" > > In short, no, but I'm glad you asked. I think it's going to be a > common request, and I need to get better at explaining why it's not > safe, so we can document it clearly. Or else someone can point out > that I'm misunderstanding, which honestly would make all this much > easier and less complicated. I would love to be able to do it that > way. > > We cannot, for the same reason libpq must send the server an access > token instead of an ID token. The /userinfo endpoint tells you who the > end user is, but it doesn't tell you whether the Bearer is actually > allowed to access the database. That difference is critical: it's > entirely possible for an end user to be authorized to access the > database, *and yet* the Bearer token may not actually carry that > authorization on their behalf. (In fact, the user may have actively > refused to give the Bearer that permission.) > That's why people are so pedantic about saying that OAuth is an > authorization framework and not an authentication framework. This statement alone sounds as if you missed *authentication*, but you seem to admit above that the /userinfo endpoint provides it ("tells you who the end user is"). I agree that it does. My understanding is that this endpoint, as well as the concept of "claims" and "scopes", is introduced by OpenID, which is an *authentication* framework, although it's built on top of OAuth. Regarding *authorization*, I agree that the bearer token may not contain enough information to determine whether the owner of the token is allowed to access the database. However, I consider database a special kind of "application", which can handle authorization on its own. 
In this case, the authorization can be controlled by (not) assigning the user the LOGIN attribute, as well as by (not) granting it privileges on particular database objects. In short, I think that *authentication* is all we need. > To illustrate, think about all the third-party web services out there > that ask you to Sign In with Google. They ask Google for permission to > access your personal ID, and Google asks you if you're okay with that, > and you either allow or deny it. Now imagine that I ran one of those > services, and I decided to become evil. I could take my legitimately > acquired Bearer token -- which should only give me permission to query > your Google ID -- and send it to a Postgres database you're authorized > to access. > > The server is supposed to introspect it, say, "hey, this token doesn't > give the bearer access to the database at all," and shut everything > down. For extra credit, the server could notice that the client ID > tied to the access token isn't even one that it recognizes! But if all > the server does is ask Google, "what's the email address associated > with this token's end user?", then it's about to make some very bad > decisions. The email address it gets back doesn't belong to Jacob the > Evil Bearer; it belongs to you. Are you sure you can legitimately acquire the bearer token containing my email address? I think the email address returned by the /userinfo endpoint is one of the standard claims [1]. Thus by returning the particular value of "email" from the endpoint the identity provider asserts that the token owner does have this address. (And that, if "email_verified" claim is "true", it spent some effort to verify that the email address is controlled by that user.) > Now, the token introspection endpoint I mentioned upthread Can you please point me to the particular message? > should give us the required information (scopes, etc.). But Google doesn't > implement that one. 
In fact they don't seem to have implemented custom > scopes at all in the years since I started work on this feature, which makes > me think that people are probably not going to be able to safely log into > Postgres using Google tokens. Hopefully there's some feature buried > somewhere that I haven't seen. > > Let me know if that makes sense. (And again: I'd love to be proven > wrong. It would improve the reach of the feature considerably if I > am.) Another question, assuming the token verification is resolved somehow: wouldn't it be sufficient for the initial implementation if the client could pass the bearer token to libpq in the connection string? Obviously, one use case is than an application / web server which needs the token to authenticate the user could eventually pass the token to the database server. Thus, if users could authenticate to the database using their individual ids, it would no longer be necessary to store a separate userid / password for the application in a configuration file. Also, if libpq accepted the bearer token via the connection string,
Re: msys inet_pton strangeness
On 2024-09-30 Mo 7:00 AM, Alexander Lakhin wrote: Hello Andrew and Thomas, 29.09.2024 18:47, Andrew Dunstan пишет: I'm inclined to think we might need to reverse the order of the last two. TBH I don't really understand how this has worked up to now. I've looked at the last successful run [1] and discovered that fe-secure-common.c didn't compile cleanly too: ccache gcc ... /home/pgrunner/bf/root/REL_15_STABLE/pgsql.build/../pgsql/src/interfaces/libpq/fe-secure-common.c C:/tools/nmsys64/home/pgrunner/bf/root/REL_15_STABLE/pgsql/src/interfaces/libpq/fe-secure-common.c: In function 'pq_verify_peer_name_matches_certificate_ip': C:/tools/nmsys64/home/pgrunner/bf/root/REL_15_STABLE/pgsql/src/interfaces/libpq/fe-secure-common.c:219:21: warning: implicit declaration of function 'inet_pton'; did you mean 'inet_aton'? [-Wimplicit-function-declaration] 219 | if (inet_pton(AF_INET6, host, &addr) == 1) | ^ | inet_aton So it worked just because that missing declaration generated just a warning, not an error. Ah, so this is because gcc 14.1.0 treats this as an error but gcc 12.2.0 treats it as a warning. Now it makes sense. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
RE: long-standing data loss bug in initial sync of logical replication
> 2.
> Similarly to the above, the relcache won't be invalidated when ALTER
> PUBLICATION OWNER TO is executed. This means that privilege checks may be
> ignored if the entry is still valid. Not sure, but is there a possibility
> this causes an inconsistency?

Hmm, IIUC, the attribute pubowner is not used for now. The paragraph "There are currently no privileges on publications" [1] may show the current status. However, to keep the current behavior, I suggest invalidating the relcache of pubrelations when the owner is altered.

[1]: https://www.postgresql.org/docs/devel/logical-replication-security.html

Best regards,
Hayato Kuroda
FUJITSU LIMITED
Re: pg_upgrade check for invalid databases
On 30/09/2024 17.29, Daniel Gustafsson wrote: On 30 Sep 2024, at 16:55, Tom Lane wrote: TBH I'm not finding anything very much wrong with the current behavior... this has to be a rare situation, do we need to add debatable behavior to make it easier? One argument would be to make the checks consistent, pg_upgrade generally tries to report all the offending entries to help the user when fixing the source database. Not sure if it's a strong enough argument for carrying code which really shouldn't see much use though. In general, I agree that this situation should be rare for deliberate DROP DATABASE interrupted in interactive sessions. Unfortunately, for (popular) tools that perform automatic "temporary database" cleanup, we could recently see an increase in invalid databases. The additional check for pg_upgrade was made necessary due to several unrelated customers having invalid databases that stem from left-over Prisma Migrate "shadow databases" [1]. We could not reproduce this Prisma Migrate issue yet, as those migrations happened some time ago. Maybe this bug really stems from a much older Prisma Migrate version and we only see the fallout now. This is still a TODO item. But it appears that this tool can get interrupted "at the wrong time" while it is deleting temporary databases (probably a manual Ctrl-C), and clients are unaware that this can then leave behind invalid databases. Those temporary databases do not cause any harm as they are not used anymore. But eventually, PG installations will be upgraded to the next major version, and it is only then when those invalid databases resurface after pg_upgrade fails to run the checks. Long story short: interactive DROP DATABASE interrupts are rare (they do exist, but customers are usually aware). Automation tools on the other hand may run DROP DATABASE and when they get interrupted at the wrong time they will then produce several left-over invalid databases. pg_upgrade will then fail to run the checks. 
[1] https://www.prisma.io/docs/orm/prisma-migrate/understanding-prisma-migrate/shadow-database
Re: Doc: typo in config.sgml
>> That's because non-breaking space (nbsp) is not encoded as 0xa0 in
>> UTF-8. nbsp in UTF-8 is "0xc2 0xa0" (2 bytes) (0xa0 is nbsp's code
>> point in Unicode, i.e. U+00A0).
>> So grep -P "[\xC2\xA0]" should work to detect nbsp.
>
> `LC_ALL=C grep -P "\xC2\xA0"` works for my environment.
> ([ and ] were not necessary.)
>
> When LC_ALL is null, `grep -P "\xA0"` could not detect any characters in
> charset.sgml, but I think it is better to specify both LC_ALL=C and
> "\xC2\xA0" to make sure nbsp is detected.
>
> One problem is that the -P option is available only in GNU grep; grep on
> mac doesn't support it.
>
> On bash, we can also use `grep $'\xc2\xa0'`, but I am not sure we can
> assume the shell is bash.
>
> Maybe a better way is to use perl itself rather than grep, as follows:
>
> `perl -ne '/\xC2\xA0/ and print'`
>
> I attached a patch fixed in this way.

GNU sed can also be used without setting LC_ALL:

sed -n /"\xC2\xA0"/p

However, I am not sure if non-GNU sed can do this too...

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp
Re: query_id, pg_stat_activity, extended query protocol
On Mon, Sep 30, 2024 at 10:07:55AM +0900, Michael Paquier wrote: > The first point is just some prevention for the future. The second > case is something we should fix and test. I am attaching a patch that > addresses both. Note that the test case cannot use a transaction > block as query IDs are only reported for the top queries, and we can > do a scan of pg_stat_activity to see if the query ID is set. The > assertion was getting more complicated, so I have hidden that behind a > macro in execMain.c. All that should complete this project. And done this part. While looking at the interactions between query ID and debug_string, I've bumped into something that could be a fun project for a contributor. Will post that in a bit, that may interest some. -- Michael signature.asc Description: PGP signature
pg_basebackup and error messages dependent on the order of the arguments
Hi, shouldn't this give the same error message? $ pg_basebackup --checkpoint=fast --format=t --compress --pgdata=/var/tmp/dummy pg_basebackup: error: must specify output directory or backup target pg_basebackup: hint: Try "pg_basebackup --help" for more information. $ pg_basebackup --pgdata=/var/tmp/dummy --checkpoint=fast --format=t --compress pg_basebackup: option '--compress' requires an argument pg_basebackup: hint: Try "pg_basebackup --help" for more information. I don't see why the first case doesn't give the same message as the second, especially since the "output directory" is specified. Tested on v17 and devel, but I guess it was always like this. Regards Daniel
Re: Add new COPY option REJECT_LIMIT
On 2024-09-28 10:48, Jian he wrote: Thanks for your comments! +/* + * Extract REJECT_LIMIT value from a DefElem. + */ +static int64 +defGetCopyRejectLimitOptions(DefElem *def) +{ + int64 reject_limit; + + if (def->arg == NULL) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg("REJECT_LIMIT requires a positive integer"))); + + if (nodeTag(def->arg) == T_Integer) + { + reject_limit = defGetInt64(def); + if (reject_limit <= 0) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("REJECT_LIMIT (%lld) must be greater than zero", + (long long) reject_limit))); + } + else + { + char *sval = defGetString(def); + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("REJECT_LIMIT (%s) must be a positive integer", + sval))); + } + + return reject_limit; +} there will be an integer overflow issue? Can you try the following? Since defGetInt64() checks not only whether the input value exceeds the range of bigint but also whether a value is specified, the attached v6 patch checks them by directly calling defGetInt64(). static int64 defGetCopyRejectLimitOptions(DefElem *def) { int64 reject_limit; reject_limit = defGetInt64(def); if (reject_limit <= 0) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("REJECT_LIMIT (%lld) must be greater than zero", (long long) reject_limit))); } REJECT_LIMIT integer i think, you want REJECT_LIMIT be bigint? so here it should be REJECT_LIMIT bigint\ ? Hmm, bigint and integer include numbers less than 0, but REJECT_LIMIT only accepts numbers greater than 0. I now feel both may not be appropriate. Referring to the manual of CREATE SEQUENCE[1], here we may be able to use not only the data types, such as bigint and integer but something like minvalue, maxvalue. I'm wondering if we can use the wording maxerror as in the attached patch. 
[1] https://www.postgresql.org/docs/devel/sql-createsequence.html -- Regards, -- Atsushi Torikoshi NTT DATA Group CorporationFrom 55a99fc186c263cdd7741a38a9c684c9cb8ac1d1 Mon Sep 17 00:00:00 2001 From: Atsushi Torikoshi Date: Mon, 30 Sep 2024 20:33:26 +0900 Subject: [PATCH v6] Add new COPY option REJECT_LIMIT 9e2d870 enabled the COPY FROM command to skip soft errors, but there was no limitation about the number of errors and skipped all the soft errors. In some cases there would be a limitation how many errors are tolerable. This patch introduces REJECT_LIMIT to specify how many errors can be skipped. --- doc/src/sgml/ref/copy.sgml | 19 src/backend/commands/copy.c | 34 + src/backend/commands/copyfrom.c | 5 + src/include/commands/copy.h | 1 + src/test/regress/expected/copy2.out | 10 + src/test/regress/sql/copy2.sql | 21 ++ 6 files changed, 90 insertions(+) diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml index 1518af8a04..2d645ed255 100644 --- a/doc/src/sgml/ref/copy.sgml +++ b/doc/src/sgml/ref/copy.sgml @@ -44,6 +44,7 @@ COPY { table_name [ ( column_name [, ...] ) | * } FORCE_NULL { ( column_name [, ...] ) | * } ON_ERROR error_action +REJECT_LIMIT maxerror ENCODING 'encoding_name' LOG_VERBOSITY verbosity @@ -411,6 +412,24 @@ COPY { table_name [ ( + +REJECT_LIMIT + + + Specifies the maximum number of errors tolerated while converting a + column's input value to its data type, when ON_ERROR is + set to ignore. + If the input causes more errors than the specified value, the COPY + command fails, even with ON_ERROR set to ignore. + This clause must be used with ON_ERROR=ignore + and maxerror must be positive. + If not specified, ON_ERROR=ignore + allows an unlimited number of errors, meaning COPY will + skip all erroneous data. 
+ + + + ENCODING diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index 3bb579a3a4..f3edc01605 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -418,6 +418,23 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from) return COPY_ON_ERROR_STOP; /* keep compiler quiet */ } +/* + * Extract REJECT_LIMIT value from a DefElem. + */ +static int64 +defGetCopyRejectLimitOption(DefElem *def) +{ + int64 reject_limit = defGetInt64(def); + + if (reject_limit <= 0) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("REJECT_LIMIT (%lld) must be greater than zero", + (long long) reject_limit))); + + return reject_limit; +} + /* * Extract a CopyLogVerbosityChoice value from a DefElem. */ @@ -470,6 +487,7 @@ ProcessCopyOptions(ParseState *pstate, bool header_specified = false; bool on_error_specified = false; bool log_verbosity_specified = false; + bool reject_limit_specified = false; ListCell *option; /* Sup
Re: pg_walsummary, Character-not-present-in-option
btsugieyuusuke writes: >> Therefore, shouldn't “f:” and “w:” be removed? Looks like that to me too. Pushed. regards, tom lane
Re: general purpose array_sort
Hi Jian, On Mon, Sep 30, 2024 at 11:13 PM jian he wrote: > > On Mon, Sep 30, 2024 at 1:01 PM Junwang Zhao wrote: > > > > > I would suggest accepting: > > > asc > > > desc > > > asc nulls first > > > asc nulls last * > > > desc nulls first * > > > desc nulls last > > > > > > As valid inputs for "dir" - and that the starred options are the defaults > > > when null position is omitted. > > > > > > In short, mimic create index. > > > > > > David J. > > > > > > > PFA v3 with David's suggestion addressed. > > > > I think just adding 2 bool arguments (asc/desc, nulls last/not nulls > last) would be easier. Yeah, this would be easier, it's just the intarray module use the direction parameter, I keep it here for the same user experience, I don't insist if some committer thinks 2 bool arguments would be a better option. > but either way, (i don't have a huge opinion) > but document the second argument, imagine case > SELECT array_sort('{a,B}'::text[] , E'aSc NulLs LaST \t\r\n'); > would be tricky? The case you provide should give the correct results, but I doubt users will do this. I'm not good at document wording, so you might give me some help with the document part. > > > errmsg("multidimensional arrays sorting are not supported"))); > write a sql test to trigger the error message that would be great. > > you can add two or one example to collate.icu.utf8.sql to demo that it > actually works with COLLATE collation_name > like: > SELECT array_sort('{a,B}'::text[] COLLATE case_insensitive); > SELECT array_sort('{a,B}'::text[] COLLATE "C"); > Fixed. > > #define WHITESPACE " \t\n\r" > you may also check function scanner_isspace > Fixed. > > + typentry = (TypeCacheEntry *) fcinfo->flinfo->fn_extra; > + if (typentry == NULL || typentry->type_id != elmtyp) > + { > + typentry = lookup_type_cache(elmtyp, sort_asc ? 
TYPECACHE_LT_OPR : > TYPECACHE_GT_OPR); > + fcinfo->flinfo->fn_extra = (void *) typentry; > + } > you need to one-time check typentry->lt_opr or typentry->gt_opr exists? > see CreateStatistics. > /* Disallow data types without a less-than operator */ > type = lookup_type_cache(attForm->atttypid, TYPECACHE_LT_OPR); > if (type->lt_opr == InvalidOid) > ereport(ERROR, > (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > errmsg("column \"%s\" cannot be used in > statistics because its type %s has no default btree operator class", > attname, format_type_be(attForm->atttypid; I added an Assert for this part, not sure if that is enough. -- Regards Junwang Zhao v4-0001-general-purpose-array_sort.patch Description: Binary data
Re: Psql meta-command conninfo+
On 01.10.24 06:27, Hunaid Sohail wrote: > There are two users in the conninfo+: 'session' and 'authenticated'. > Both are documented. Right. I meant "Session User" > Authenticated User: The name of the user returned by PQuser() > Session User: The session user's name. Thanks -- Jim
Re: SET or STRICT modifiers on function affect planner row estimates
Hi Michal, It is difficult to understand the exact problem from your description. Can you please provide EXPLAIN outputs showing the expected plan and the unexpected plan; plans on the node where the query is run and where the partitions are located. On Mon, Sep 30, 2024 at 12:19 AM Michał Kłeczek wrote: > > Hi Hackers, > > I am not sure if this is a bug or I am missing something: > > There is a partitioned table with partitions being a mix of foreign and > regular tables. > I have a function: > > report(param text) RETURNS TABLE(…) STABLE LANGUAGE sql AS > $$ > SELECT col1, expr1(col2), expr2(col2), sum(col3) FROM tbl GROUP BY col1, > expr1(col2), expr2(col2) > $$ > > EXPLAIN SELECT * FROM report(‘xyz’); > > returns expected plan pushing down aggregate expression to remote server. > > When I add STRICT or SET search_path to the function definition, the plan is > (as expected) a simple function scan. > But - to my surprise - auto explain in the logs shows unexpected plan with > all nodes scanning partitions having row estimates = 1 > > Is it expected behavior? > > — > Michal > -- Best Wishes, Ashutosh Bapat
Re: ACL_MAINTAIN, Lack of comment content
> On 30 Sep 2024, at 12:38, Yugo Nagata wrote: > > On Mon, 30 Sep 2024 11:40:29 +0200 > Daniel Gustafsson wrote: > >> - * MATERIALIZED VIEW, and REINDEX on all relations. >> + * MATERIALIZED VIEW, REINDEX and LOCK TABLE on all relations. > > Should we put a comma between REINDEX and "and" as following? > > "... MATERIALIZED VIEW, REINDEX, and LOCK TABLE on all relations." I'm not a native speaker so I'm not sure which is right, but grepping for other lists of items shows that the last "and" item is often preceded by a comma so I'll do that. -- Daniel Gustafsson
Re: ACL_MAINTAIN, Lack of comment content
On Mon, Sep 30, 2024 at 04:13:55PM +0200, Daniel Gustafsson wrote: >> On 30 Sep 2024, at 12:38, Yugo Nagata wrote: >> >> Should we put a comma between REINDEX and "and" as following? >> >> "... MATERIALIZED VIEW, REINDEX, and LOCK TABLE on all relations." > > I'm not a native speaker so I'm not sure which is right, but grepping for > other > lists of items shows that the last "and" item is often preceded by a comma so > I'll do that. I'm not aware of a project policy around the Oxford comma [0], but I tend to include one. [0] https://en.wikipedia.org/wiki/Serial_comma -- nathan
Re: Doc: typo in config.sgml
On Mon, 30 Sep 2024 20:07:31 +0900 (JST) Tatsuo Ishii wrote: > >> I wonder if it would be worth to add a check for this like we have to tabs? > > +1. > > >> The attached adds a rule to "make -C doc/src/sgml check" for trapping nbsp > >> (doing so made me realize we don't have an equivalent meson target). > > > > Your patch couldn't detect 0xA0 in config.sgml in my machine, but it works > > when I use `grep -P "[\xA0]"` instead of `grep -e "\xA0"`. > > > > However, it also detects the following line in charset.sgml. > > (https://www.postgresql.org/docs/current/collation.html) > > > > For example, locale und-u-kb sorts 'àe' before 'aé'. > > > > This is not non-breaking space, so should not be detected as an error. > > That's because non-breaking space (nbsp) is not encoded as 0xa0 in > UTF-8. nbsp in UTF-8 is "0xc2 0xa0" (2 bytes) (A 0xa0 is a nbsp's code > point in Unicode. i.e. U+00A0). > So grep -P "[\xC2\xA0]" should work to detect nbsp. `LC_ALL=C grep -P "\xC2\xA0"` works for my environment. ([ and ] were not necessary.) When LC_ALL is null, `grep -P "\xA0"` could not detect any characters in charset.sgml, but I think it is better to specify both LC_ALL=C and "\xC2\xA0" for making sure detecting nbsp. One problem is that -P option can be used in only GNU grep, and grep in mac doesn't support it. On bash, we can also use `grep $'\xc2\xa0'`, but I am not sure we can assume the shell is bash. Maybe, better way is use perl itself rather than grep as following. `perl -ne '/\xC2\xA0/ and print' ` I attached a patch fixed in this way. Regards, Yugo Nagata > > Best reagards, > -- > Tatsuo Ishii > SRA OSS K.K. 
> English: http://www.sraoss.co.jp/index_en/ > Japanese:http://www.sraoss.co.jp -- Yugo NAGATA diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile index 9c9bbfe375..2081ba1ffc 100644 --- a/doc/src/sgml/Makefile +++ b/doc/src/sgml/Makefile @@ -194,7 +194,7 @@ MAKEINFO = makeinfo ## # Quick syntax check without style processing -check: postgres.sgml $(ALLSGML) check-tabs +check: postgres.sgml $(ALLSGML) check-tabs check-nbsp $(XMLLINT) $(XMLINCLUDE) --noout --valid $< @@ -259,6 +259,9 @@ endif # sqlmansectnum != 7 check-tabs: @( ! grep ' ' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl) ) || (echo "Tabs appear in SGML/XML files" 1>&2; exit 1) +check-nbsp: + @( ! $(PERL) -ne '/\xC2\xA0/ and print "$$ARGV $$_"' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl) ) || (echo "Non-breaking space appear in SGML/XML files" 1>&2; exit 1) + ## ## Clean ##
Re: Changing the default random_page_cost value
On Fri, Sep 27, 2024 at 12:03 PM Dagfinn Ilmari Mannsåker wrote: > It might also be worth mentioning cloudy block storage (e.g. AWS' EBS), > which is typically backed by SSDs, but has extra network latency. > That seems a little too in the weeds for me, but wording suggestions are welcome. To get things moving forward, I made a doc patch which changes a few things, namely: * Mentions the distinction between ssd and hdd right up front. * Moves the tablespace talk to the very end, as tablespace use is getting rarer (again, thanks in part to ssds) * Mentions the capability to set per-database and per-role since we mention per-tablespace. * Removes a lot of the talk of caches and justifications for the 4.0 setting. While those are interesting, I've been tuning this parameter for many years and never really cared about the "90% cache rate". The proof is in the pudding: rpc is the canonical "try it and see" parameter. Tweak. Test. Repeat. Cheers, Greg 0001-Lower-random_page_cost-default-to-1.2-and-update-docs-about-it.patch Description: Binary data
Re: Conflict Detection and Resolution
On Mon, Sep 30, 2024 at 2:55 PM Peter Smith wrote: > > On Mon, Sep 30, 2024 at 4:29 PM shveta malik wrote: > > > > On Mon, Sep 30, 2024 at 11:04 AM Peter Smith wrote: > > > > > > On Mon, Sep 30, 2024 at 2:27 PM shveta malik > > > wrote: > > > > > > > > On Fri, Sep 27, 2024 at 1:00 PM Peter Smith > > > > wrote: > > > > > > > > > ~~~ > > > > > > > > > > 14. > > > > > 99. General - ordering of conflict_resolver > > > > > > > > > > nit - ditto. Let's name these in alphabetical order. IMO it makes more > > > > > sense than the current random ordering. > > > > > > > > > > > > > I feel ordering of resolvers should be same as that of conflict > > > > types, i.e. resolvers of insert variants first, then update variants, > > > > then delete variants. But would like to know what others think on > > > > this. > > > > > > > > > > Resolvers in v14 were documented in this random order: > > > error > > > skip > > > apply_remote > > > keep_local > > > apply_or_skip > > > apply_or_error > > > > > > > Yes, these should be changed. > > > > > Some of these are resolvers for different conflicts. How can you order > > > these as "resolvers for insert" followed by "resolvers for update" > > > followed by "resolvers for delete" without it all still appearing in > > > random order? > > > > I was thinking of ordering them like this: > > > > apply_remote: applicable to insert_exists, update_exists, > > update_origin_differ, delete_origin_differ > > keep_local: applicable to insert_exists, > > update_exists, update_origin_differ, delete_origin_differ > > apply_or_skip: applicable to update_missing > > apply_or_error :applicable to update_missing > > skip: applicable to update_missing and > > delete_missing > > error: applicable to all. > > > > i.e. in order of how they are applicable to conflict_types starting > > from insert_exists till delete_origin_differ (i.e. reading > > ConflictTypeResolverMap, from left to right and then top to bottom). 
> > Except I have kept 'error' at the end instead of keeping it after > > 'keep_local' as the former makes more sense there. > > > > This proves my point because, without your complicated explanation to > accompany it, the final order (below) just looks random to me: > apply_remote > keep_local > apply_or_skip > apply_or_error > skip > error > > Unless there is some compelling reason to do it differently, I still > prefer A-Z (the KISS principle). > The "applicable to conflict_types" against each resolver (which will be mentioned in doc too) is a pretty good reason in itself to keep the resolvers in the suggested order. To me, it seems more logical than placing 'apply_or_error' which only applies to the 'update_missing' conflict_type at the top, while 'error,' which applies to all conflict_types, placed in the middle. But I understand that preferences may vary, so I'll leave this to the discretion of others. thanks Shveta
Re: Conflict Detection and Resolution
On Tue, Oct 1, 2024 at 9:48 AM shveta malik wrote: > I have started reviewing patch002, it is WIP, but please find initial set of comments: 1. ExecSimpleRelationInsert(): + /* Check for conflict and return to caller for resolution if found */ + if (resolver != CR_ERROR && + has_conflicting_tuple(estate, resultRelInfo, &(*conflictslot), + CT_INSERT_EXISTS, resolver, slot, subid, + apply_remote)) Why are we calling has_conflicting_tuple only if the resolver is not 'ERROR '? Is it for optimization to avoid pre-scan for ERROR cases? If so, please add a comment. 2) has_conflicting_tuple(): + /* + * Return if any conflict is found other than one with 'ERROR' + * resolver configured. In case of 'ERROR' resolver, emit error here; + * otherwise return to caller for resolutions. + */ + if (FindConflictTuple(resultRelInfo, estate, uniqueidx, slot, + &(*conflictslot))) has_conflicting_tuple() is called only from ExecSimpleRelationInsert() when the resolver of 'insert_exists' is not 'ERROR', then why do we have the above comment in has_conflicting_tuple()? 3) Since has_conflicting_tuple() is only called for insert_exists conflict, better to name it as 'has_insert_conflicting_tuple' or 'find_insert_conflicting_tuple'. My preference is the second one, similar to FindConflictTuple(). 4) We can have an ASSERT in has_conflicting_tuple() that conflict_type is only insert_exists. 5) has_conflicting_tuple(): + } + return false; +} we can have a blank line before returning. 6) Existing has_conflicting_tuple header comment: +/* + * Check all the unique indexes for conflicts and return true if found. + * If the configured resolver is in favour of apply, give the conflicted + * tuple information in conflictslot. + */ Suggestion: /* * Check the unique indexes for conflicts. Return true on finding the first conflict itself. * * If the configured resolver is in favour of apply, give the conflicted * tuple information in conflictslot. 
*/ 7) Can we please rearrange 'has_conflicting_tuple' arguments. First non-pointers, then single pointers and then double pointers. Oid subid, ConflictType type, ConflictResolver resolver, bool apply_remote, ResultRelInfo *resultRelInfo, EState *estate, TupleTableSlot *slot, TupleTableSlot **conflictslot 8) Now since we are doing a pre-scan of indexes before the actual table-insert, this existing comment needs some change. Also we need to mention why we are scanning again when we have done pre-scan already. /* * Checks the conflict indexes to fetch the conflicting local tuple * and reports the conflict. We perform this check here, instead of * performing an additional index scan before the actual insertion and * reporting the conflict if any conflicting tuples are found. This is * to avoid the overhead of executing the extra scan for each INSERT * operation, */ thanks Shveta
Re: Psql meta-command conninfo+
Hi, On Mon, Sep 30, 2024 at 11:16 PM Jim Jones wrote: > On 26.09.24 09:15, Hunaid Sohail wrote: > > This patch renames "Current User" to "Authenticated User" as suggested > > by me in my last email. I have also updated the documentation > accordingly. > > Could you perhaps in the documentation elaborate a bit more on the > difference between "Current User" and "Authenticated User"? IMHO a > couple of words on how exactly they differ would be very helpful, as > they show the same user in many cases. > There is no "Current User" in the conninfo+; as I mentioned, I have renamed it to "Authenticated User". There are two users in the conninfo+: 'session' and 'authenticated'. Both are documented. > Although a bit redundant, I'd argue that "SSL Compression" is better > than just "Compression". > Noted. Regards, Hunaid Sohail
Re: Doc: typo in config.sgml
> On Mon, 30 Sep 2024 17:23:24 +0900 (JST) > Tatsuo Ishii wrote: > >> >> I think there's an unnecessary underscore in config.sgml. >> >> Attached patch fixes it. >> > >> > I could not apply the patch with an error. >> > >> > error: patch failed: doc/src/sgml/config.sgml:9380 >> > error: doc/src/sgml/config.sgml: patch does not apply >> >> Strange. I have no problem applying the patch here. >> >> > I found your patch contains an odd character (ASCII Code 240?) >> > by performing `od -c` command on the file. See the attached file. >> >> Yes, 240 in octal (== 0xc2) is in the patch but it's because current >> config.sgml includes the character. You can check it by looking at >> line 9383 of config.sgml. > > Yes, you are right, I can find the 0xc2 char in config.sgml using od -c, > although I still could not apply the patch. > > I think this is non-breaking space of (C2A0) of utf-8. I guess my > terminal normally regards this as a space, so applying patch fails. > > I found it also in line 85 of ref/drop_extension.sgml. Thanks. I have pushed the fix for ref/drop_extension.sgml along with config.sgml. Best regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
Re: Doc: typo in config.sgml
On Tue, 01 Oct 2024 10:33:50 +0900 (JST) Tatsuo Ishii wrote: > >> That's because non-breaking space (nbsp) is not encoded as 0xa0 in > >> UTF-8. nbsp in UTF-8 is "0xc2 0xa0" (2 bytes) (A 0xa0 is a nbsp's code > >> point in Unicode. i.e. U+00A0). > >> So grep -P "[\xC2\xA0]" should work to detect nbsp. > > > > `LC_ALL=C grep -P "\xC2\xA0"` works for my environment. > > ([ and ] were not necessary.) > > > > When LC_ALL is null, `grep -P "\xA0"` could not detect any characters in > > charset.sgml, > > but I think it is better to specify both LC_ALL=C and "\xC2\xA0" for making > > sure detecting > > nbsp. > > > > One problem is that -P option can be used in only GNU grep, and grep in mac > > doesn't support it. > > > > On bash, we can also use `grep $'\xc2\xa0'`, but I am not sure we can > > assume the shell is bash. > > > > Maybe, better way is use perl itself rather than grep as following. > > > > `perl -ne '/\xC2\xA0/ and print' ` > > > > I attached a patch fixed in this way. > > GNU sed can also be used without setting LC_ALL: > > sed -n /"\xC2\xA0"/p > > However I am not sure if non-GNU sed can do this too... Although I've not checked it myself, BSD sed doesn't support \x escapes according to [1]. [1] https://stackoverflow.com/questions/24275070/sed-not-giving-me-correct-substitute-operation-for-newline-with-mac-difference By the way, I've attached a slightly modified patch that uses the plural form, the same as check-tabs: Non-breaking **spaces** appear in SGML/XML files Regards, Yugo Nagata > > Best reagards, > -- > Tatsuo Ishii > SRA OSS K.K. 
> English: http://www.sraoss.co.jp/index_en/ > Japanese:http://www.sraoss.co.jp -- Yugo NAGATA diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile index 9c9bbfe375..17feae9ed0 100644 --- a/doc/src/sgml/Makefile +++ b/doc/src/sgml/Makefile @@ -194,7 +194,7 @@ MAKEINFO = makeinfo ## # Quick syntax check without style processing -check: postgres.sgml $(ALLSGML) check-tabs +check: postgres.sgml $(ALLSGML) check-tabs check-nbsp $(XMLLINT) $(XMLINCLUDE) --noout --valid $< @@ -259,6 +259,9 @@ endif # sqlmansectnum != 7 check-tabs: @( ! grep ' ' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl) ) || (echo "Tabs appear in SGML/XML files" 1>&2; exit 1) +check-nbsp: + @( ! $(PERL) -ne '/\xC2\xA0/ and print "$$ARGV $$_"' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl) ) || (echo "Non-breaking spaces appear in SGML/XML files" 1>&2; exit 1) + ## ## Clean ##
Re: Address the -Wuse-after-free warning in ATExecAttachPartition()
Hi, On Mon, Jul 8, 2024 at 7:08 PM Junwang Zhao wrote: > On Mon, Jul 8, 2024 at 3:22 PM Nitin Jadhav > wrote: > > > > In [1], Andres reported a -Wuse-after-free bug in the > > ATExecAttachPartition() function. I've created a patch to address it > > with pointers from Amit offlist. > > > > The issue was that the partBoundConstraint variable was utilized after > > the list_concat() function. This could potentially lead to accessing > > the partBoundConstraint variable after its memory has been freed. > > > > The issue was resolved by using the return value of the list_concat() > > function, instead of using the list1 argument of list_concat(). I > > copied the partBoundConstraint variable to a new variable named > > partConstraint and used it for the previous references before invoking > > get_proposed_default_constraint(). I confirmed that the > > eval_const_expressions(), make_ands_explicit(), > > map_partition_varattnos(), QueuePartitionConstraintValidation() > > functions do not modify the memory location pointed to by the > > partBoundConstraint variable. Therefore, it is safe to use it for the > > next reference in get_proposed_default_constraint() > > > > Attaching the patch. Please review and share the comments if any. > > Thanks to Andres for spotting the bug and some off-list advice on how > > to reproduce it. > > The patch LGTM. I reviewed the faulty code in ATExecAttachPartition() that this patch aims to address, and it seems that the fix suggested by Alvaro in the original report [1] might be more appropriate. It also allows us to resolve a minor issue related to how partBoundConstraint is handled later in the function. Specifically, the constraint checked against the default partition, if one exists on the parent table, should not include the negated version of the parent's constraint, which the current code mistakenly does. 
This occurs because the list_concat() operation modifies the partBoundConstraint when combining it with the parent table’s partition constraint. Although this doesn't cause a live bug, the inclusion of the redundant parent constraint introduces unnecessary complexity in the combined constraint expression. This can cause slight delays in evaluating the constraint, as the redundant check always evaluates to false for rows in the default partition. Ultimately, the constraint failure is still determined by the negated version of the partition constraint of the table being attached. I'm inclined to apply the attached patch only to the master branch as the case for applying this to back-branches doesn't seem all that strong to me. Thoughts? -- Thanks, Amit Langote v2-0001-Fix-expression-list-handling-in-ATExecAttachParti.patch Description: Binary data
Re: pg_basebackup and error messages dependent on the order of the arguments
>I wrote: >> As this example shows, we really ought to validate the compression >> argument on sight in order to get sensible error messages. The >> trouble is that for server-side compression the code wants to just >> pass the string through to the server and not form its own opinion >> as to whether it's a known algorithm. >> Perhaps it would help if we simply rejected strings beginning >> with a dash? I haven't tested, but roughly along the lines of >Taking a closer look, many of the other switches-requiring-an-argument >also just absorb "optarg" without checking its value till much later, >so I'm not sure how far we could move the needle by special-casing >--compress. My point was not so much about --compress but rather giving a good error message. Looking at this: $ pg_basebackup --checkpoint=fast --format=t --compress --pgdata=/var/tmp/dummy pg_basebackup: error: must specify output directory or backup target ... the error message is misleading and will confuse users more than it helps. Regards Daniel
Re: Psql meta-command conninfo+
Hi On 26.09.24 09:15, Hunaid Sohail wrote: > This patch renames "Current User" to "Authenticated User" as suggested > by me in my last email. I have also updated the documentation accordingly. Could you perhaps in the documentation elaborate a bit more on the difference between "Current User" and "Authenticated User"? IMHO a couple of words on how exactly they differ would be very helpful, as they show the same user in many cases. > Authenticated User: The name of the user returned by PQuser() > Session User: The session user's name. Although a bit redundant, I'd argue that "SSL Compression" is better than just "Compression". Other than that, it looks much better now:

$ /usr/local/postgres-dev/bin/psql -x "\
host=server.uni-muenster.de hostaddr=192.168.178.27 user=jim dbname=db port=54322 sslmode=verify-full sslrootcert=server-certificates/server.crt sslcert=jim-certificates/jim.crt sslkey=jim-certificates/jim.key"
psql (18devel)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off, ALPN: postgresql)
Type "help" for help.

db=# SET client_encoding TO 'LATIN1';
SET
db=# \conninfo+
Connection Information
-[ RECORD 1 ]--------+-----------------------
Database             | db
Authenticated User   | jim
Session User         | jim
Host                 | server.uni-muenster.de
Host Address         | 192.168.178.27
Port                 | 54322
Protocol Version     | 3
SSL Connection       | true
SSL Protocol         | TLSv1.3
SSL Cipher           | TLS_AES_256_GCM_SHA384
Compression          | off
ALPN                 | postgresql
GSSAPI Authenticated | false
Client Encoding      | LATIN1
Server Encoding      | UTF8
Backend PID          | 874890

Thanks!
Re: Inconsistency in reporting checkpointer stats
On 2024/09/22 20:44, Nitin Jadhav wrote: + CheckpointStats.ckpt_slru_written, + (double) CheckpointStats.ckpt_slru_written * 100 / NBuffers, I don't think NBuffers represents the maximum number of SLRU buffers. We might need to calculate this based on specific GUC settings, like transaction_buffers. Great observation. Since the SLRU buffers written during a checkpoint can include transaction_buffers, commit_timestamp_buffers, subtransaction_buffers, multixact_member_buffers, multixact_offset_buffers, and serializable_buffers, the total count of SLRU buffers should be the sum of all these types. We might need to introduce a global variable, such as total_slru_count, in the globals.c file to hold this sum. The num_slots variable in the SlruSharedData structure needs to be accessed from all types of SLRU and stored in total_slru_count. This can then be used during logging to calculate the percentage of SLRU buffers written. However, I’m unsure if this effort is justified. While it makes sense for normal buffers to display the percentage, the additional code required might not provide significant value to users. Therefore, I have removed this in the attached v6 patch. +1 Thanks for updating the patch! It looks good to me. Barring any objections, I will commit this patch. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: AIO v2.0
On Mon, Sep 30, 2024 at 10:49:17AM -0400, Andres Freund wrote: > We also discussed the topic at > https://postgr.es/m/20240925020022.c5.nmisch%40google.com > > ... neither BM_SETTING_HINTS nor keeping bounce buffers looks like a bad > > decision. From what I've heard so far of the performance effects, if it > > were > > me, I would keep the bounce buffers. I'd pursue BM_SETTING_HINTS and bounce > > buffer removal as a distinct project after the main AIO capability. Bounce > > buffers have an implementation. They aren't harming other design decisions. > > The AIO project is big, so I'd want to err on the side of not designating > > other projects as its prerequisites. > > Given the issues that modifying pages while in flight causes, not just with PG > level checksums, but also filesystem level checksum, I don't feel like it's a > particularly promising approach. > > However, I think this doesn't have to mean that the BM_SETTING_HINTS stuff has > to be completed before we can move forward with AIO. If I split out the write > portion from the read portion a bit further, the main AIO changes and the > shared-buffer read user can be merged before there's a dependency on the hint > bit stuff being done. > > Does that seem reasonable? Yes.
Re: pg_upgrade check for invalid databases
> On 30 Sep 2024, at 16:55, Tom Lane wrote: > TBH I'm not finding anything very much wrong with the current > behavior... this has to be a rare situation, do we need to add > debatable behavior to make it easier? One argument would be to make the checks consistent, pg_upgrade generally tries to report all the offending entries to help the user when fixing the source database. Not sure if it's a strong enough argument for carrying code which really shouldn't see much use though. -- Daniel Gustafsson
Re: Increase of maintenance_work_mem limit in 64-bit Windows
David Rowley wrote 2024-09-24 01:07:
> On Tue, 24 Sept 2024 at 02:47, Vladlen Popolitov wrote:
>> I agree, it is better to fix them all together. I also do not like this
>> hack; it will be removed from the patch once I check and change all of
>> them at once. I think it will take about 1 week to fix and test all the
>> changes. I will estimate the total volume of the changes and think about
>> how to group them in the patch (I hope it will be only one patch).
>
> There's a few places that do this:
>
> 	Size		maxBlockSize = ALLOCSET_DEFAULT_MAXSIZE;
>
> 	/* choose the maxBlockSize to be no larger than 1/16 of work_mem */
> 	while (16 * maxBlockSize > work_mem * 1024L)
>
> I think since maxBlockSize is a Size variable, the above should probably be:
>
> 	while (16 * maxBlockSize > (Size) work_mem * 1024)
>
> Maybe there can be a precursor patch to fix all those to get rid of the
> 'L' and cast to the type we're comparing to or assigning to rather than
> trying to keep the result of the multiplication as a long.

Hi

I rechecked everything that depends on the MAX_KILOBYTES limit and fixed all casts that are affected by the 4-byte long type on 64-bit Windows. Now the following variables are limited to 2TB on all 64-bit systems:

  maintenance_work_mem
  work_mem
  logical_decoding_work_mem
  max_stack_depth
  autovacuum_work_mem
  gin_pending_list_limit
  wal_skip_threshold

Also wal_keep_size_mb, min_wal_size_mb, max_wal_size_mb and max_slot_wal_keep_size_mb are no longer affected by a "long" cast.
From 6d275cb66cb39b1f209a6b43db2ce377ec0d7ba8 Mon Sep 17 00:00:00 2001 From: Vladlen Popolitov Date: Tue, 1 Oct 2024 00:10:37 +0300 Subject: [PATCH v2] work_mem_vars limit increased in 64bit Windows --- src/backend/access/gin/ginfast.c| 2 +- src/backend/access/gin/ginget.c | 2 +- src/backend/access/hash/hash.c | 2 +- src/backend/access/heap/vacuumlazy.c| 2 +- src/backend/access/nbtree/nbtpage.c | 2 +- src/backend/commands/vacuumparallel.c | 2 +- src/backend/executor/execUtils.c| 2 +- src/backend/executor/nodeBitmapIndexscan.c | 2 +- src/backend/executor/nodeBitmapOr.c | 2 +- src/backend/nodes/tidbitmap.c | 6 +++--- src/backend/optimizer/path/costsize.c | 12 ++-- src/backend/replication/logical/reorderbuffer.c | 6 +++--- src/backend/tcop/postgres.c | 4 ++-- src/backend/utils/sort/tuplestore.c | 2 +- src/include/nodes/tidbitmap.h | 2 +- src/include/utils/guc.h | 2 +- 16 files changed, 26 insertions(+), 26 deletions(-) diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c index eeca3ed318..d707e459bb 100644 --- a/src/backend/access/gin/ginfast.c +++ b/src/backend/access/gin/ginfast.c @@ -456,7 +456,7 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector) * ginInsertCleanup() should not be called inside our CRIT_SECTION. 
*/ cleanupSize = GinGetPendingListCleanupSize(index); - if (metadata->nPendingPages * GIN_PAGE_FREESIZE > cleanupSize * 1024L) + if (metadata->nPendingPages * GIN_PAGE_FREESIZE > cleanupSize * (Size)1024L) needCleanup = true; UnlockReleaseBuffer(metabuffer); diff --git a/src/backend/access/gin/ginget.c b/src/backend/access/gin/ginget.c index 0b4f2ebadb..1f295d5907 100644 --- a/src/backend/access/gin/ginget.c +++ b/src/backend/access/gin/ginget.c @@ -125,7 +125,7 @@ collectMatchBitmap(GinBtreeData *btree, GinBtreeStack *stack, Form_pg_attribute attr; /* Initialize empty bitmap result */ - scanEntry->matchBitmap = tbm_create(work_mem * 1024L, NULL); + scanEntry->matchBitmap = tbm_create(work_mem * INT64CONST(1024), NULL); /* Null query cannot partial-match anything */ if (scanEntry->isPartialMatch && diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c index 5ce3609394..0063902021 100644 --- a/src/backend/access/hash/hash.c +++ b/src/backend/access/hash/hash.c @@ -120,7 +120,7 @@ hashbuild(Relation heap, Relation index, IndexInfo *indexInfo) double reltuples; double allvisfrac; uint32 num_buckets; - long sort_threshold; + uint64 sort_threshold; HashBuildState buildstate; /* diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index d82aa3d489..eb658f66f1 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -2883,7 +2883,7 @@ dead_items_alloc(LVRelState *vacrel, int nworkers) */ dead_items_info = (VacDeadItemsInfo *) palloc(sizeof(VacDeadItemsInfo)); - dead_items_info->max_bytes = vac_work_mem * 1024L; + dead_items_info->max_bytes = vac_work_mem * (size_t)1024; dead_items_info->num_items = 0; vacrel->dead_items_info = dead_items_info; diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c index 01bbece6bf..4ab3f6129f 100644 --- a/src/backend/access/nbtree/nbtpage.c +++ b/src/backend/access/nbtree/nbtpage
Re: SET or STRICT modifiers on function affect planner row estimates
Hi,

Thanks for taking a look.

> On 30 Sep 2024, at 14:14, Ashutosh Bapat wrote:
>
> Hi Michal,
> It is difficult to understand the exact problem from your description.
> Can you please provide EXPLAIN outputs showing the expected plan and
> the unexpected plan; plans on the node where the query is run and
> where the partitions are located.

The table structure is as follows:

CREATE TABLE tbl (…) PARTITION BY RANGE (year(col02_date));
CREATE TABLE tbl_2015 PARTITION OF tbl
    FOR VALUES FROM (2023) TO (2024)
    PARTITION BY HASH (col01_no);
… subsequent years
CREATE TABLE tbl_2021_32_9 PARTITION OF tbl_2021
    FOR VALUES WITH (MODULUS 32, REMAINDER 9);
…
CREATE FOREIGN TABLE tbl_2022_16_9 PARTITION OF tbl_2022
    FOR VALUES WITH (MODULUS 32, REMAINDER 9);

All tables are freshly ANALYZEd. I have a function:

CREATE FUNCTION report(col01 text, year_from, month_from, year_to, month_to)
RETURNS … LANGUAGE sql
$$
    SELECT col01_no, year(col02_date), month(col02_date), sum(col03)
    FROM tbl
    WHERE col02_no = col02 AND (col02_date conditions)
    GROUP BY 1, 2, 3
$$

EXPLAIN (…) SELECT * FROM report(…); gives Plan 1 (performs pushdown of aggregates):

 Append  (cost=9.33..76322501.41 rows=3 width=58) (actual time=5.051..6.414 rows=21 loops=1)
   InitPlan 1 (returns $0)
     ->  Result  (cost=0.00..0.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)
           Output: '2021-01-01'::date
   InitPlan 2 (returns $1)
     ->  Result  (cost=0.00..0.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)
           Output: '2023-12-31'::date
   ->  GroupAggregate  (cost=9.31..9.85 rows=1 width=58) (actual time=0.073..0.074 rows=0 loops=1)
         Output: op.col01_no, (year(op.col02_date)), (month(op.col02_date)), sum(op.col03) FILTER (WHERE (op.debit_flag = '0'::bpchar)), sum(op.col03) FILTER (WHERE (op.debit_flag <> '0'::bpchar)), count(1)
         Group Key: (year(op.col02_date)), (month(op.col02_date))
         ->  Sort  (cost=9.31..9.32 rows=1 width=44) (actual time=0.072..0.073 rows=0 loops=1)
               Output: (year(op.col02_date)), (month(op.col02_date)), op.col01_no, op.col03, op.debit_flag
               Sort Key: (year(op.col02_date)), (month(op.col02_date))
               Sort Method: quicksort  Memory: 25kB
               ->  Index Only Scan using tbl_2021_32_9_universal_gist_idx_3a2df25af5bc48a on cbt.tbl_2021_32_9 op  (cost=0.28..9.30 rows=1 width=44) (actual time=0.063..0.063 rows=0 loops=1)
                     Output: year(op.col02_date), month(op.col02_date), op.col01_no, op.col03, op.debit_flag
                     Index Cond: ((op.col01_no = '2210902066000110831697'::text) AND (op.col02_date >= $0) AND (op.col02_date <= $1))
                     Filter: ((year(op.col02_date) >= 2021) AND (year(op.col02_date) <= 2023))
                     Heap Fetches: 0
   ->  Async Foreign Scan  (cost=100.02..76322480.36 rows=1 width=58) (actual time=0.753..0.755 rows=11 loops=1)
         Output: op_1.col01_no, (year(op_1.col02_date)), (month(op_1.col02_date)), (sum(op_1.col03) FILTER (WHERE (op_1.debit_flag = '0'::bpchar))), (sum(op_1.col03) FILTER (WHERE (op_1.debit_flag <> '0'::bpchar))), ((count(1))::double precision)
         Relations: Aggregate on (cbt_c61d467d1b5fd1d218d9e6e7dd44a333.tbl_2022_16_9 op_1)
         Remote SQL: SELECT col01_no, cbt.year(col02_date), cbt.month(col02_date), sum(col03) FILTER (WHERE (debit_flag = '0')), sum(col03) FILTER (WHERE (debit_flag <> '0')), count(1) FROM cbt.tbl_2022_16_9 WHERE ((cbt.year(col02_date) >= 2021)) AND ((cbt.year(col02_date) <= 2023)) AND ((col02_date >= $1::date)) AND ((col02_date <= $2::date)) AND ((col01_no = '2210902066000110831697')) GROUP BY 1, 2, 3
   ->  GroupAggregate  (cost=10.63..11.17 rows=1 width=58) (actual time=4.266..4.423 rows=10 loops=1)
         Output: op_2.col01_no, (year(op_2.col02_date)), (month(op_2.col02_date)), sum(op_2.col03) FILTER (WHERE (op_2.debit_flag = '0'::bpchar)), sum(op_2.col03) FILTER (WHERE (op_2.debit_flag <> '0'::bpchar)), count(1)
         Group Key: (year(op_2.col02_date)), (month(op_2.col02_date))
         ->  Sort  (cost=10.63..10.64 rows=1 width=44) (actual time=4.238..4.273 rows=735 loops=1)
               Output: (year(op_2.col02_date)), (month(op_2.col02_date)), op_2.col01_no, op_2.col03, op_2.debit_flag
               Sort Key: (year(op_2.col02_date)), (month(op_2.col02_date))
               Sort Method: quicksort  Memory: 82kB
               ->  Index Only Scan using tbl_2023_128_9_universal_gist_idx_3a2df25af5bc48a on cbt.tbl_2023_128_9 op_2  (cost=0.54..10.62 rows=1 width=44) (actual time=0.295..4.059 rows=735 loops=1)
                     Output: year(op_2.col02_date), month(op_2.col02_date), op_2.col01_no, op_2.col03, op_2.debit_flag
                     Index Cond: ((op_2.col01_no = '2210902066000110831697'::text) AND (op_2.col02_date >= $0) AND (op_2.col02_date <= $1))
                     Filter: ((year(op_2.col02_date) >= 2021) AND (year(op_2.col02_date) <= 2023))
Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
On Wed, Sep 25, 2024 at 03:46:27PM -0700, Masahiko Sawada wrote:
> On Wed, Sep 18, 2024 at 6:46 PM Noah Misch wrote:
> > On Tue, Sep 17, 2024 at 09:43:41PM -0700, Masahiko Sawada wrote:
> > > On Mon, Sep 16, 2024 at 9:24 AM Noah Misch wrote:
> > > > On Thu, Sep 12, 2024 at 03:42:48PM -0700, Masahiko Sawada wrote:
> > > > > On Tue, Sep 10, 2024 at 3:05 PM Noah Misch wrote:
> > > > > > On Tue, Sep 10, 2024 at 05:56:47PM -0400, Tom Lane wrote:
> > > > > > > Got it. So now I'm wondering if we need all the complexity of
> > > > > > > storing stuff in the GIN metapages. Could we simply read the
> > > > > > > (primary's) signedness out of pg_control and use that?
>
> > > Thanks. I like the simplicity of this approach. If we agree with this
> > > approach, I'd like to proceed with it.
> >
> > Works for me.
> >
> > > Regardless of what approach we take, I wanted to provide some
> > > regression tests for these changes, but I could not come up with a
> > > reasonable idea. It would be great if we could do something like
> > > 027_stream_regress.pl on cross-architecture replication. But just
> > > executing 027_stream_regress.pl on cross-architecture replication
> > > could not be sufficient since we would like to confirm query results
> > > as well. If we could have a reasonable tool or way, it would also help
> > > find other similar bugs related to architectural differences.
> >
> > Perhaps add a regress.c function that changes the control file flag and
> > flushes the change to disk?
>
> I've tried this idea but found out that although extensions can flush the
> control file to disk after flipping the char signedness value, they
> cannot update the control file data in shared memory. And, when the
> server shuts down, the on-disk control file is updated with the in-memory
> control file data, so the char signedness is reverted.
>
> We can add a function to set the char signedness of on-memory
> controlfile data or add a new option to pg_resetwal to change the char
> signedness on-disk controlfile data but they seem to be overkill to me
> and I'm concerned about misusing these option and function.

Before the project is over, pg_upgrade will have some way to set the control file signedness to the source cluster signedness. I think pg_upgrade should also take an option for overriding the source cluster signedness for source clusters v17 and older. That way, a user knowing they copied the v17 source cluster from x86 to ARM can pg_upgrade properly without the prerequisite of acquiring an x86 VM. Once that all exists, perhaps it will be clearer how best to change signedness for testing.

> While this change does not encourage the use of 'char' without
> explicitly specifying its signedness, it provides a hint to ensure
> consistent behavior for indexes (e.g., GIN and GiST) and tables that
> already store data sorted by the 'char' type on disk, especially in
> cross-platform replication scenarios.

Let's put that sort of information in the GetDefaultCharSignedness() header comment. New code should use explicit "signed char" or "unsigned char" when it matters for data file compatibility. GetDefaultCharSignedness() exists for code that deals with pre-v18 data files, where we accidentally let C implementation signedness affect persistent data.
> + */ > +#if CHAR_MIN != 0 > + ControlFile->default_char_signedness = true; > +#else > + ControlFile->default_char_signedness = false; > +#endif This has initdb set the field to the current C implementation's signedness. I wonder if, instead, initdb should set signedness=true unconditionally. Then the only source of signedness=false will be pg_upgrade. Advantage: signedness=false will get rarer over time. If we had known about this problem during the last development cycle that forced initdb (v8.3), we would have made all clusters signed or all clusters unsigned. Making pg_upgrade the only source of signedness=false will cause the population of database clusters to converge toward that retrospective ideal. Disadvantage: testing signedness=false will require more planning. What other merits should we consider as part of deciding?
Re: [PoC] Federated Authn/z with OAUTHBEARER
On Mon, Sep 30, 2024 at 6:38 AM Antonin Houska wrote: > > Jacob Champion wrote: > > > On Fri, Sep 27, 2024 at 10:58 AM Antonin Houska wrote: > > That's why people are so pedantic about saying that OAuth is an > > authorization framework and not an authentication framework. > > This statement alone sounds as if you missed *authentication*, but you seem to > admit above that the /userinfo endpoint provides it ("tells you who the end > user is"). I agree that it does. My understanding is that this endpoint, as > well as the concept of "claims" and "scopes", is introduced by OpenID, which > is an *authentication* framework, although it's built on top of OAuth. OpenID is an authentication framework, but it's generally focused on a type of client known as a Relying Party. In the architecture of this patchset, the Relying Party would be libpq, which has the option of retrieving authentication claims from the provider. Unfortunately for us, libpq has no use for those claims. It's not trying to authenticate the user for its own purposes. The Postgres server, on the other hand, is not a Relying Party. (It's an OAuth resource server, in this architecture.) It's not performing any of the OIDC flows, it's not talking to the end user and the provider at the same time, and it is very restricted in its ability to influence the client exchange via the SASL mechanism. > Regarding *authorization*, I agree that the bearer token may not contain > enough information to determine whether the owner of the token is allowed to > access the database. However, I consider database a special kind of > "application", which can handle authorization on its own. In this case, the > authorization can be controlled by (not) assigning the user the LOGIN > attribute, as well as by (not) granting it privileges on particular database > objects. In short, I think that *authentication* is all we need. Authorizing the *end user's* access to the database using scopes is optional. 
Authorizing the *bearer's* ability to connect on behalf of the end user, however, is mandatory. Hopefully the below clarifies. (I agree that most people probably want to use authentication, so that the database can then make decisions based on HBA settings. OIDC is a fine way to do that.) > Are you sure you can legitimately acquire the bearer token containing my email > address? Yes. In general that's how OpenID-based "Sign in with " works. All those third-party services are running around with tokens that identify you, but unless they've asked for more abilities and you've granted them the associated scopes, identifying you is all they can do. > I think the email address returned by the /userinfo endpoint is one > of the standard claims [1]. Thus by returning the particular value of "email" > from the endpoint the identity provider asserts that the token owner does have > this address. We agree that /userinfo gives authentication claims for the end user. It's just insufficient for our use case. For example, there are enterprise applications out there that will ask for read access to your Google Calendar. If you're willing to grant that, then you probably won't mind if those applications also know your email address, but you probably do mind if they're suddenly able to access your production databases just because you gave them your email. Put another way: if you log into Postgres using OAuth, and your provider doesn't show you a big message saying "this application is about to access *your* prod database using *your* identity; do you want to allow that?", then your DBA has deployed a really dangerous configuration. That's a critical protection feature you get from your OAuth provider. Otherwise, what's stopping somebody else from setting up their own malicious service to farm access tokens? All they'd have to do is ask for your email. 
> Another question, assuming the token verification is resolved somehow: > wouldn't it be sufficient for the initial implementation if the client could > pass the bearer token to libpq in the connection string? It was discussed wayyy upthread: https://postgr.es/m/CAAWbhmhmBe9v3aCffz5j8Sg4HMWWkB5FvTDCSZ_Vh8E1fX91Gw%40mail.gmail.com Basically, at that point the entire implementation becomes an exercise for the reader. I want to avoid that if possible. I'm not adamantly opposed to it, but I think the client-side hook implementation is going to be better for the use cases that have been discussed so far. > Also, if libpq accepted the bearer token via the connection string, it would > be possible to implement the authorization as a separate front-end application > (e.g. pg_oauth_login) rather than adding more complexity to libpq itself. The application would still need to parse the server error response. There was (a small) consensus at the time [1] that parsing error messages for that purpose would be really unpleasant; hence the hook architecture. > (I'm learning this stuff on-the-fly, so there might be something naive
Re: Fixing backslash dot for COPY FROM...CSV
Artur Zakirov wrote: > I've tested the patch and it seems it works as expected. Thanks for looking at this! > It seems it isn't necessary to handle "\." within > "CopyAttributeOutCSV()" (file "src/backend/commands/copyto.c") > anymore. It's still useful to produce CSV data that can be safely reloaded by previous versions. Until these versions are EOL'ed, I assume we'd better continue to quote "\." > Another thing is that the comparison "copystream == > pset.cur_cmd_source" happens twice within "handleCopyIn()". TBH it is > a bit confusing to me what is the real purpose of that check, but one > of the comparisons looks unnecessary. Indeed, good catch. The redundant comparison is removed in the attached v6. Best regards, -- Daniel Vérité https://postgresql.verite.pro/ Twitter: @DanielVerite From b05ebe4c4d5123592797c59282e3fab6e8eeed04 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Daniel=20V=C3=A9rit=C3=A9?= Date: Mon, 30 Sep 2024 20:57:50 +0200 Subject: [PATCH v6] Support backslash-dot on a line by itself as valid data in COPY FROM...CSV --- doc/src/sgml/ref/copy.sgml | 6 +-- doc/src/sgml/ref/psql-ref.sgml | 4 -- src/backend/commands/copyfromparse.c | 57 +++- src/bin/psql/copy.c | 28 +- src/test/regress/expected/copy.out | 18 + src/test/regress/sql/copy.sql| 12 ++ 6 files changed, 63 insertions(+), 62 deletions(-) diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml index 1518af8a04..f7eb59dbac 100644 --- a/doc/src/sgml/ref/copy.sgml +++ b/doc/src/sgml/ref/copy.sgml @@ -814,11 +814,7 @@ COPY count format, \., the end-of-data marker, could also appear as a data value. To avoid any misinterpretation, a \. data value appearing as a lone entry on a line is automatically -quoted on output, and on input, if quoted, is not interpreted as the -end-of-data marker. If you are loading a file created by another -application that has a single unquoted column and might have a -value of \., you might need to quote that value in the -input file. +quoted on output. 
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 3fd9959ed1..0570211cf2 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -1179,10 +1179,6 @@ SELECT $1 \parse stmt1 destination, because all data must pass through the client/server connection. For large amounts of data the SQL command might be preferable. -Also, because of this pass-through method, \copy -... from in CSV mode will erroneously -treat a \. data value alone on a line as an -end-of-input marker. diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c index 97a4c387a3..d0df2d807f 100644 --- a/src/backend/commands/copyfromparse.c +++ b/src/backend/commands/copyfromparse.c @@ -136,14 +136,6 @@ if (1) \ } \ } else ((void) 0) -/* Undo any read-ahead and jump out of the block. */ -#define NO_END_OF_COPY_GOTO \ -if (1) \ -{ \ - input_buf_ptr = prev_raw_ptr + 1; \ - goto not_end_of_copy; \ -} else ((void) 0) - /* NOTE: there's a copy of this in copyto.c */ static const char BinarySignature[11] = "PGCOPY\n\377\r\n\0"; @@ -1182,7 +1174,6 @@ CopyReadLineText(CopyFromState cstate) boolresult = false; /* CSV variables */ - boolfirst_char_in_line = true; boolin_quote = false, last_was_esc = false; charquotec = '\0'; @@ -1377,10 +1368,9 @@ CopyReadLineText(CopyFromState cstate) } /* -* In CSV mode, we only recognize \. alone on a line. This is because -* \. is a valid CSV data value. +* In CSV mode, backslash is a normal character. */ - if (c == '\\' && (!cstate->opts.csv_mode || first_char_in_line)) + if (c == '\\' && !cstate->opts.csv_mode) { charc2; @@ -1413,21 +1403,15 @@ CopyReadLineText(CopyFromState cstate) if (c2 == '\n') { - if (!cstate->opts.csv_mode) - ereport(ERROR, - (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), - errmsg("end-of-copy marker does not match previous newline style"))); - else - NO_END_OF_COPY_GOTO; + ereport(ERROR, +
Re: set_rel_pathlist function unnecessary params.
Kirill Reshke writes:
> I happened to notice that `set_rel_pathlist` params, RelOptInfo *rel
> and RangeTblEntry *rte are unnecessary, because upon all usages,
> `rte=root->simple_rte_array[rti]` and
> `rel=root->simple_rel_array[rti]` holds. What's the point of providing
> the same information 3 times?

To avoid having to re-fetch it from those arrays?

> So, I propose to refactor this a little bit.
> Am I missing something?

I'm -1 on changing this. It'd provide no detectable benefit while creating back-patching hazards in this code.

			regards, tom lane
Re: Inval reliability, especially for inplace updates
Rebased. Author: Noah Misch Commit: Noah Misch For inplace update, send nontransactional invalidations. The inplace update survives ROLLBACK. The inval didn't, so another backend's DDL could then update the row without incorporating the inplace update. In the test this fixes, a mix of CREATE INDEX and ALTER TABLE resulted in a table with an index, yet relhasindex=f. That is a source of index corruption. Core code no longer needs XLOG_INVALIDATIONS, but this leaves removing it for future work. Extensions could be relying on that mechanism, so that removal would not be back-patch material. Back-patch to v12 (all supported versions). Reviewed by FIXME and (in earlier versions) Andres Freund. Discussion: https://postgr.es/m/20240523000548.58.nmi...@google.com diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index da5e656..7c82a95 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -6327,6 +6327,9 @@ heap_inplace_update_and_unlock(Relation relation, HeapTupleHeader htup = oldtup->t_data; uint32 oldlen; uint32 newlen; + int nmsgs = 0; + SharedInvalidationMessage *invalMessages = NULL; + boolRelcacheInitFileInval = false; Assert(ItemPointerEquals(&oldtup->t_self, &tuple->t_self)); oldlen = oldtup->t_len - htup->t_hoff; @@ -6334,6 +6337,29 @@ heap_inplace_update_and_unlock(Relation relation, if (oldlen != newlen || htup->t_hoff != tuple->t_data->t_hoff) elog(ERROR, "wrong tuple length"); + /* +* Construct shared cache inval if necessary. Note that because we only +* pass the new version of the tuple, this mustn't be used for any +* operations that could change catcache lookup keys. But we aren't +* bothering with index updates either, so that's true a fortiori. 
+*/ + CacheInvalidateHeapTupleInplace(relation, tuple, NULL); + + /* Like RecordTransactionCommit(), log only if needed */ + if (XLogStandbyInfoActive()) + nmsgs = inplaceGetInvalidationMessages(&invalMessages, + &RelcacheInitFileInval); + + /* +* Unlink relcache init files as needed. If unlinking, acquire +* RelCacheInitLock until after associated invalidations. By doing this +* in advance, if we checkpoint and then crash between inplace +* XLogInsert() and inval, we don't rely on StartupXLOG() -> +* RelationCacheInitFileRemove(). That uses elevel==LOG, so replay would +* neglect to PANIC on EIO. +*/ + PreInplace_Inval(); + /* NO EREPORT(ERROR) from here till changes are logged */ START_CRIT_SECTION(); @@ -6363,9 +6389,16 @@ heap_inplace_update_and_unlock(Relation relation, XLogRecPtr recptr; xlrec.offnum = ItemPointerGetOffsetNumber(&tuple->t_self); + xlrec.dbId = MyDatabaseId; + xlrec.tsId = MyDatabaseTableSpace; + xlrec.relcacheInitFileInval = RelcacheInitFileInval; + xlrec.nmsgs = nmsgs; XLogBeginInsert(); - XLogRegisterData((char *) &xlrec, SizeOfHeapInplace); + XLogRegisterData((char *) &xlrec, MinSizeOfHeapInplace); + if (nmsgs != 0) + XLogRegisterData((char *) invalMessages, +nmsgs * sizeof(SharedInvalidationMessage)); XLogRegisterBuffer(0, buffer, REGBUF_STANDARD); XLogRegisterBufData(0, (char *) htup + htup->t_hoff, newlen); @@ -6377,17 +6410,28 @@ heap_inplace_update_and_unlock(Relation relation, PageSetLSN(BufferGetPage(buffer), recptr); } - END_CRIT_SECTION(); - - heap_inplace_unlock(relation, oldtup, buffer); + LockBuffer(buffer, BUFFER_LOCK_UNLOCK); /* -* Send out shared cache inval if necessary. Note that because we only -* pass the new version of the tuple, this mustn't be used for any -* operations that could change catcache lookup keys. But we aren't -* bothering with index updates either, so that's true a fortiori. +* Send invalidations to shared queue. SearchSysCacheLocked1() assumes we +* do this before UnlockTuple(). 
* -* XXX ROLLBACK discards the invalidation. See test inplace-inval.spec. +* If we're mutating a tuple visible only to this transaction, there's an +* equivalent transactional inval from the action that created the tuple, +* and this inval is superfluous. +*/ + AtInplace_Inval(); + + END_CRIT_SECTION(); + UnlockTuple(relation, &tuple->t_self, Inpl
set_rel_pathlist function unnecessary params.
I happened to notice that the `set_rel_pathlist` parameters RelOptInfo *rel and RangeTblEntry *rte are unnecessary, because at every call site `rte = root->simple_rte_array[rti]` and `rel = root->simple_rel_array[rti]` hold. What's the point of providing the same information three times? Is it kept like that for extension backward compatibility?

So, I propose to refactor this a little bit. Am I missing something?

-- 
Best regards,
Kirill Reshke
Re: pg_basebackup and error messages dependent on the order of the arguments
"Daniel Westermann (DWE)" writes: >> Taking a closer look, many of the other switches-requiring-an-argument >> also just absorb "optarg" without checking its value till much later, >> so I'm not sure how far we could move the needle by special-casing >> --compress. > My point was not so much about --compress but rather giving a good error > message. Right, and my point was that the issue is bigger than --compress. For example, you get exactly the same misbehavior with $ pg_basebackup --checkpoint=fast --format=t -d --pgdata=/var/tmp/dummy pg_basebackup: error: must specify output directory or backup target pg_basebackup: hint: Try "pg_basebackup --help" for more information. I'm not sure how to solve the problem once you consider this larger scope. I don't think we could forbid arguments beginning with "-" for all of these switches. regards, tom lane
Re: pg_verifybackup: TAR format backup verification
On Mon, Sep 30, 2024 at 11:24 AM Tom Lane wrote: > > This is just a string in a > > JSON file that represents an integer which will hopefully turn out to > > be the size of the file on disk. I guess I don't know what type I > > should be using here. Most things in PostgreSQL use a type like uint32 > > or uint64, but technically what we're going to be comparing against in > > the end is probably an off_t produced by stat(), but the return value > > of strtoul() or strtoull() is unsigned long or unsigned long long, > > which is not any of those things. If you have a suggestion here, I'm > > all ears. > > I don't know if it's realistic to expect that this code might be used > to process JSON blobs exceeding 4GB. But if it is, I'd be inclined to > use uint64 and strtoull for these purposes, if only to avoid > cross-platform hazards with varying sizeof(long) and sizeof(size_t). > > Um, wait ... we do have strtou64(), so you should use that. The thing we should be worried about is not how large a JSON blob might be, but rather how large any file that appears in the data directory might be. So uint32 is out; and I think I hear you voting for uint64 over size_t. But then how do you think we should print that? Cast to unsigned long long and use %llu? > >> Aside from that, I'm unimpressed with expending a five-line comment > >> at line 376 to justify casting control_file_bytes to int, > > > I don't know what you mean by this. > > I mean that we have a widely-used, better solution. If you copied > this from someplace else, the someplace else could stand to be > improved too. I don't understand what you think the widely-used, better solution is here. As far as I can see, there are two goods here, between which one must choose. One can decide to use the same error message string, and I hope we can agree that's good, because I've been criticized in the past when I have done otherwise, as have many others. The other good is to use the most appropriate data type. 
One cannot have both of those things in this instance, unless one goes and fixes the other code also, but such a change had no business being part of this patch. If the issue had been serious and likely to occur in real life, I would have probably fixed it in a preparatory patch, but it isn't, so I settled for adding a comment. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Fixing backslash dot for COPY FROM...CSV
"Daniel Verite" writes: > [ v6-0001-Support-backslash-dot-on-a-line-by-itself-as-vali.patch ] I did some more work on the docs and comments, and pushed that. Returning to my upthread thought that >>> I think we should fix it so that \. that's not alone on a line >>> throws an error, but I wouldn't go further than that. here's a quick follow-on patch to make that happen. It could probably do with a test case to demonstrate the error, but I didn't bother yet pending approval that we want to do this. (This passes check-world as it stands, indicating we have no existing test that expects this case to work.) Also, I used the same error message "end-of-copy marker corrupt" that we have for the case of junk following the marker, but I can't say that I think much of that phrasing. What do people think of "end-of-copy marker is not alone on its line", instead? regards, tom lane

diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index a280efe23f..2ea9679b3c 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -1426,13 +1426,17 @@ CopyReadLineText(CopyFromState cstate)
 			}
 
 			/*
-			 * Transfer only the data before the \. into line_buf, then
-			 * discard the data and the \. sequence.
+			 * If there is any data on this line before the \., complain.
+			 */
+			if (cstate->line_buf.len > 0 ||
+				prev_raw_ptr > cstate->input_buf_index)
+				ereport(ERROR,
+						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
+						 errmsg("end-of-copy marker corrupt")));
+
+			/*
+			 * Discard the \. and newline, then report EOF.
 			 */
-			if (prev_raw_ptr > cstate->input_buf_index)
-				appendBinaryStringInfo(&cstate->line_buf,
-									   cstate->input_buf + cstate->input_buf_index,
-									   prev_raw_ptr - cstate->input_buf_index);
 			cstate->input_buf_index = input_buf_ptr;
 			result = true;		/* report EOF */
 			break;
Re: Changing the state of data checksums in a running cluster
Hi, On Mon, Sep 30, 2024 at 11:21:30PM +0200, Daniel Gustafsson wrote: > > Yeah, I think a view like pg_stat_progress_checksums would work. > > Added in the attached version. It probably needs some polish (the docs for > sure do) but it's at least a start. Just a nitpick, but we call it data_checksums just about everywhere, yet the new view is called pg_stat_progress_datachecksums - I think pg_stat_progress_data_checksums would look better even if it gets quite long. Michael
Re: SET or STRICT modifiers on function affect planner row estimates
=?utf-8?Q?Micha=C5=82_K=C5=82eczek?= writes: >> On 30 Sep 2024, at 14:14, Ashutosh Bapat >> wrote: >> It is difficult to understand the exact problem from your description. >> Can you please provide EXPLAIN outputs showing the expected plan and >> the unexpected plan; plans on the node where the query is run and >> where the partitions are located. > The table structure is as follows: > CREATE TABLE tbl (…) PARTITION BY RANGE year(col02_date) You're still expecting people to magically intuit what all those "..."s are. I could spend many minutes trying to reconstruct a runnable example from these fragments, and if it didn't behave as you say, it'd be wasted effort because I didn't guess right about some un-mentioned detail. Please provide a *self-contained* example if you want someone to poke into this in any detail. You have not mentioned your PG version, either. My first guess would be that adding STRICT or adding a SET clause prevents function inlining, because it does. However, your Plan 2 doesn't seem to involve a FunctionScan node, so either these plans aren't really what you say or there's something else going on. regards, tom lane
Re: pg_verifybackup: TAR format backup verification
On Mon, Sep 30, 2024 at 11:31 AM Tom Lane wrote: > WFM, modulo the suggestion about changing data types. I would prefer not to make the data type change here because it has quite a few tentacles. If I change member_copy_control_data() then I have to change astreamer_verify_content() which means changing the astreamer interface which means adjusting all of the other astreamers. That can certainly be done, but it's quite possible it might provoke some other Coverity warning. Since this is a length, it might've been better to use an unsigned data type, but there's no reason that I can see why it should be size_t specifically: the origin of the value could be either the return value of read(), which is ssize_t not size_t, or the number of bytes returned by a decompression library or the number of bytes present in a protocol message. Trying to make things fit better here is just likely to make them fit worse someplace else. "You are in a maze of twisty little data types, all alike." -- Robert Haas EDB: http://www.enterprisedb.com
Re: apply_scanjoin_target_to_paths and partitionwise join
On 24/7/2024 15:22, Ashutosh Bapat wrote: On Wed, Jul 24, 2024 at 9:42 AM Richard Guo wrote: Is there a specific query that demonstrates benefits from this change? I'm curious about scenarios where a partitionwise join runs slower than a non-partitionwise join. [1] provides a testcase where a non-partitionwise join is better than a partitionwise join. This testcase is derived from a bug reported by an EDB customer. [2] is another bug report on pgsql-bugs. I haven't gone through the patch yet, but can this issue affect the decision about what to push down to foreign servers: the whole join, or just scans of the two partitions? If the patch bears on the pushdown decision, this is quite an annoying problem for me. From time to time I see cases where a join produces more tuples than both partitions contain in total - in that case, it would be better to transfer the tables' tuples to the main instance before joining them. -- regards, Andrei Lepikhov
Re: pg_verifybackup: TAR format backup verification
Robert Haas writes: > On Mon, Sep 30, 2024 at 11:24 AM Tom Lane wrote: >> Um, wait ... we do have strtou64(), so you should use that. > The thing we should be worried about is not how large a JSON blob > might be, but rather how large any file that appears in the data > directory might be. So uint32 is out; and I think I hear you voting > for uint64 over size_t. Yes. size_t might only be 32 bits. > But then how do you think we should print > that? Cast to unsigned long long and use %llu? Our two standard solutions are to do that or to use UINT64_FORMAT. But UINT64_FORMAT is problematic in translatable strings because then the .po files would become platform-specific, so long long is what to use in that case. For a non-translated format string you can do either. > I don't understand what you think the widely-used, better solution is > here. What we just said above. regards, tom lane
Re: ACL_MAINTAIN, Lack of comment content
> On 30 Sep 2024, at 17:43, Tom Lane wrote: > > Nathan Bossart writes: >> On Mon, Sep 30, 2024 at 04:13:55PM +0200, Daniel Gustafsson wrote: >>> I'm not a native speaker so I'm not sure which is right, but grepping for >>> other >>> lists of items shows that the last "and" item is often preceded by a comma >>> so >>> I'll do that. > >> I'm not aware of a project policy around the Oxford comma [0], but I tend >> to include one. > > Yeah, as that wikipedia article suggests, you can find support for > either choice. I'd say do what looks best in context. Thanks for input, I ended up keeping the comma. -- Daniel Gustafsson
Re: pg_verifybackup: TAR format backup verification
Robert Haas writes: > On Mon, Sep 30, 2024 at 11:31 AM Tom Lane wrote: >> WFM, modulo the suggestion about changing data types. > I would prefer not to make the data type change here because it has > quite a few tentacles. I see your point for the function's "len" argument, but perhaps it's worth doing

-	int			remaining;
+	size_t		remaining;
 
 	remaining = sizeof(ControlFileData) - mystreamer->control_file_bytes;
 	memcpy(((char *) &mystreamer->control_file) + mystreamer->control_file_bytes,
-		   data, Min(len, remaining));
+		   data, Min((size_t) len, remaining));

This is enough to ensure the Min() remains safe. regards, tom lane
Re: pg_upgrade check for invalid databases
Daniel Gustafsson writes: >> On 30 Sep 2024, at 16:55, Tom Lane wrote: >> TBH I'm not finding anything very much wrong with the current >> behavior... this has to be a rare situation, do we need to add >> debatable behavior to make it easier? > One argument would be to make the checks consistent, pg_upgrade generally > tries > to report all the offending entries to help the user when fixing the source > database. Not sure if it's a strong enough argument for carrying code which > really shouldn't see much use though. OK, but the consistency argument would be to just report and fail. I don't think there's a precedent in other pg_upgrade checks for trying to fix problems automatically. regards, tom lane
Re: Fix orphaned 2pc file which may cause instance restart failure
On Wed, Sep 11, 2024 at 06:21:37PM +0900, Michael Paquier wrote: >> If gxact1's local xid is not frozen, the startup process will remove >> the orphaned 2pc file. However, if the xid's corresponding clog file is >> truncated by `vacuum`, the startup process will raise the error 'could not >> access status of transaction xxx', because it cannot find the >> transaction's status in the `pg_xact` directory. > Hmm. I've not seen that in the field. Let me check that.. Ah, yes, that's clearly broken. The origin of the problem was effe7d9552dd, which was from me, so backpatched all the way down after adding a comment explaining why we'd better read the value before the 2PC state lock is released. -- Michael