pgsql: doc: Fix typos in protocol.sgml

2022-08-02 Thread Michael Paquier
doc: Fix typos in protocol.sgml

Author: Ekaterina Kiryanova
Discussion: 
https://postgr.es/m/745414e7-efb2-a6ae-5b83-fcbdf35aa...@postgrespro.ru
Backpatch-through: 15

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/a69959fab2f3633992b5cabec85acecbac6074c8

Modified Files
--
doc/src/sgml/protocol.sgml | 14 +++---
1 file changed, 7 insertions(+), 7 deletions(-)



pgsql: doc: Fix typos in protocol.sgml

2022-08-02 Thread Michael Paquier
doc: Fix typos in protocol.sgml

Author: Ekaterina Kiryanova
Discussion: 
https://postgr.es/m/745414e7-efb2-a6ae-5b83-fcbdf35aa...@postgrespro.ru
Backpatch-through: 15

Branch
--
REL_15_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/5b94d3ccb7ad9be902c37505ed54aabd2aeeccf1

Modified Files
--
doc/src/sgml/protocol.sgml | 14 +++---
1 file changed, 7 insertions(+), 7 deletions(-)



pgsql: Improve performance of ORDER BY / DISTINCT aggregates

2022-08-02 Thread David Rowley
Improve performance of ORDER BY / DISTINCT aggregates

ORDER BY / DISTINCT aggregates have, since implemented in Postgres, been
executed by always performing a sort in nodeAgg.c to sort the tuples in
the current group into the correct order before calling the transition
function on the sorted tuples.  This was not great as often there might be
an index that could have provided pre-sorted input and allowed the
transition functions to be called as the rows come in, rather than having
to store them in a tuplestore in order to sort them once all the tuples
for the group have arrived.

Here we change the planner so it requests a path with a sort order which
supports the greatest number of ORDER BY / DISTINCT aggregate functions and
add new code to the executor to allow it to support the processing of
ORDER BY / DISTINCT aggregates where the tuples are already sorted in the
correct order.

Since there can be many ORDER BY / DISTINCT aggregates in any given query
level, it's very possible that we can't find an order that suits all of
these aggregates.  The sort order that the planner chooses is simply the
one that suits the most aggregate functions.  We take the most strictly
sorted variation of each order and see how many aggregate functions can
use that, then we try again with the order of the remaining aggregates to
see if another order would suit more aggregate functions.  For example:

SELECT agg(a ORDER BY a),agg2(a ORDER BY a,b) ...

would request the sort order to be {a, b} because {a} is a subset of the
sort order of {a,b}, but:

SELECT agg(a ORDER BY a),agg2(a ORDER BY c) ...

would just pick a plan ordered by {a} (we give precedence to aggregates
which are earlier in the targetlist).

SELECT agg(a ORDER BY a),agg2(a ORDER BY b),agg3(a ORDER BY b) ...

would choose to order by {b} since two aggregates suit that vs just one
that requires input ordered by {a}.
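
The selection rule in the three examples above can be sketched roughly as
follows.  This is a simplified model, not the planner code: each aggregate's
ORDER BY list is represented as a string with one character per column, and
an aggregate is assumed to be able to use a sort order when its own list is a
leading prefix of that order.  The function names are hypothetical and do not
appear in the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if an aggregate wanting order `want`
 * can consume input sorted by `have`, i.e. `want` is a leading prefix
 * of `have`.  Each character stands for one ORDER BY column.
 */
int
order_is_satisfied_by(const char *want, const char *have)
{
    return strncmp(want, have, strlen(want)) == 0;
}

/*
 * Hypothetical helper: tries each aggregate's order as the candidate
 * sort order and counts how many aggregates could use it.  Returns the
 * index of the aggregate whose order suits the most aggregates, or -1
 * when there are none.  The strict '>' keeps the earliest candidate on
 * a tie, modelling the precedence given to earlier targetlist entries.
 */
int
choose_sort_order(const char *const orders[], int n)
{
    int best = -1;
    int best_count = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
    {
        int count = 0;

        for (int j = 0; j < n; j++)
        {
            if (order_is_satisfied_by(orders[j], orders[i]))
                count++;
        }
        if (count > best_count)
        {
            best = i;
            best_count = count;
        }
    }
    return best;
}
```

With the orders {"a", "ab"} the sketch picks index 1 ({a,b}), with
{"a", "c"} it picks index 0 ({a}), and with {"a", "b", "b"} it picks
index 1 ({b}), matching the three cases described above.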

Author: David Rowley
Reviewed-by: Ronan Dunklau, James Coleman, Ranier Vilela, Richard Guo, Tom Lane
Discussion: 
https://postgr.es/m/CAApHDvpHzfo92%3DR4W0%2BxVua3BUYCKMckWAmo-2t_KiXN-wYH%3Dw%40mail.gmail.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/1349d2790bf48a4de072931c722f39337e72055e

Modified Files
--
contrib/postgres_fdw/expected/postgres_fdw.out    |  32 ++-
contrib/postgres_fdw/sql/postgres_fdw.sql         |   2 +
src/backend/executor/execExpr.c                   |  52 +++-
src/backend/executor/execExprInterp.c             | 102 +++
src/backend/executor/nodeAgg.c                    |  34 ++-
src/backend/jit/llvm/llvmjit_expr.c               |  48
src/backend/jit/llvm/llvmjit_types.c              |   2 +
src/backend/optimizer/path/pathkeys.c             |  45 +++-
src/backend/optimizer/plan/planagg.c              |   2 +-
src/backend/optimizer/plan/planner.c              | 310 --
src/backend/optimizer/prep/prepagg.c              |   7 +-
src/backend/parser/parse_expr.c                   |   1 +
src/backend/parser/parse_func.c                   |   1 +
src/include/catalog/catversion.h                  |   2 +-
src/include/executor/execExpr.h                   |  17 ++
src/include/executor/nodeAgg.h                    |   8 +
src/include/nodes/pathnodes.h                     |  16 +-
src/include/nodes/primnodes.h                     |   7 +
src/include/optimizer/paths.h                     |   4 +-
src/test/regress/expected/aggregates.out          |  80 +-
src/test/regress/expected/partition_aggregate.out | 118
src/test/regress/expected/sqljson.out             |  12 +-
src/test/regress/expected/tuplesort.out           |  42 +--
src/test/regress/sql/aggregates.sql               |  43 +++
24 files changed, 849 insertions(+), 138 deletions(-)



pgsql: Change type "char"'s I/O format for non-ASCII characters.

2022-08-02 Thread Tom Lane
Change type "char"'s I/O format for non-ASCII characters.

Previously, a byte with the high bit set was just transmitted
as-is by charin() and charout().  This is problematic if the
database encoding is multibyte, because the result of charout()
won't be validly encoded, which breaks various stuff that
expects all text strings to be validly encoded.  We've
previously decided to enforce encoding validity rather than try
to individually harden each place that might have a problem with
such strings, so it's time to do something about "char".

To fix, represent high-bit-set characters as \ooo (backslash
and three octal digits), following the ancient "escape" format
for bytea.  charin() will continue to accept the old way as well,
though that is only reachable in single-byte encodings.
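
As a minimal sketch of the \ooo representation described above (not the
actual charout()/charin() code), the following C functions convert one byte
to its text form and back.  The names byte_to_text and text_to_byte are
hypothetical, and the input side here restricts the first octal digit to
0-3 so the value fits in one byte, which is a choice made for this sketch.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical output routine: bytes with the high bit set become a
 * backslash followed by three octal digits; other bytes are emitted
 * unchanged.  `out` must have room for 5 chars including the NUL.
 */
void
byte_to_text(unsigned char ch, char out[5])
{
    assert(out != NULL);
    if (ch & 0x80)
    {
        out[0] = '\\';
        out[1] = (char) ('0' + (ch >> 6));
        out[2] = (char) ('0' + ((ch >> 3) & 7));
        out[3] = (char) ('0' + (ch & 7));
        out[4] = '\0';
    }
    else
    {
        out[0] = (char) ch;
        out[1] = '\0';
    }
}

/*
 * Hypothetical input routine: accepts the \ooo form, and otherwise
 * falls back to taking the first byte as-is (the "old way").
 */
unsigned char
text_to_byte(const char *s)
{
    assert(s != NULL);
    if (strlen(s) == 4 && s[0] == '\\' &&
        s[1] >= '0' && s[1] <= '3' &&
        s[2] >= '0' && s[2] <= '7' &&
        s[3] >= '0' && s[3] <= '7')
        return (unsigned char) (((s[1] - '0') << 6) |
                                ((s[2] - '0') << 3) |
                                (s[3] - '0'));
    return (unsigned char) s[0];
}
```

For example, the byte 0xE9 (octal 351) is written as the four characters
\351, which are plain ASCII and therefore valid in any multibyte encoding.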

Add some test cases just so there is coverage for this code.
We'll otherwise leave this question undocumented as it was before,
because we don't really want to encourage end-user use of "char".

For the moment, back-patch into v15 so that this change appears
in 15beta3.  If there's not great pushback we should consider
absorbing this change into the older branches.

Discussion: https://postgr.es/m/2318797.1638558...@sss.pgh.pa.us

Branch
--
REL_15_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/c034b629cc6f44099c9f54f3f0b3f4340e02d9bc

Modified Files
--
doc/src/sgml/datatype.sgml           | 10 +++--
src/backend/utils/adt/char.c         | 72
src/test/regress/expected/char.out   | 63 ++-
src/test/regress/expected/char_1.out | 63 ++-
src/test/regress/expected/char_2.out | 63 ++-
src/test/regress/sql/char.sql        | 20 +-
6 files changed, 263 insertions(+), 28 deletions(-)



pgsql: Change type "char"'s I/O format for non-ASCII characters.

2022-08-02 Thread Tom Lane

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/ec62ce55a813db5c925d89a53b5b22baa509abb6

Modified Files
--
doc/src/sgml/datatype.sgml           | 10 +++--
src/backend/utils/adt/char.c         | 72
src/test/regress/expected/char.out   | 63 ++-
src/test/regress/expected/char_1.out | 63 ++-
src/test/regress/expected/char_2.out | 63 ++-
src/test/regress/sql/char.sql        | 20 +-
6 files changed, 263 insertions(+), 28 deletions(-)



pgsql: Remove unused fields from ExprEvalStep

2022-08-02 Thread David Rowley
Remove unused fields from ExprEvalStep

These were added recently by 1349d2790.

Reported-by: Zhihong Yu
Discussion: 
https://postgr.es/m/CALNJ-vTi+YDuAWKp4Z_Dv=mrz=aq81qtg0d7wzc8y7rs_+i...@mail.gmail.com

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/9fc1776dda9f1ba6d36c4e7970218c3391f1bb2c

Modified Files
--
src/include/executor/execExpr.h | 3 ---
1 file changed, 3 deletions(-)



pgsql: Be more wary about 32-bit integer overflow in pg_stat_statements

2022-08-02 Thread Tom Lane
Be more wary about 32-bit integer overflow in pg_stat_statements.

We've heard a couple of reports of people having trouble with
multi-gigabyte-sized query-texts files.  It occurred to me that on
32-bit platforms, there could be an issue with integer overflow
of calculations associated with the total query text size.
Address that with several changes:

1. Limit pg_stat_statements.max to INT_MAX / 2 not INT_MAX.
The hashtable code will bound it to that anyway unless "long"
is 64 bits.  We still need overflow guards on its use, but
this helps.

2. Add a check to prevent extending the query-texts file to
more than MaxAllocHugeSize.  If it got that big, qtext_load_file
would certainly fail, so there's not much point in allowing it.
Without this, we'd need to consider whether extent, query_offset,
and related variables should be off_t rather than size_t.

3. Adjust the comparisons in need_gc_qtexts() to be done in 64-bit
arithmetic on all platforms.  It appears possible that under duress
those multiplications could overflow 32 bits, yielding a false
conclusion that we need to garbage-collect the texts file, which
could lead to repeatedly garbage-collecting after every hash table
insertion.
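
To illustrate point 3, here is a small C sketch of the kind of comparison
involved, under the assumption that the file size is compared against
products of a per-entry length and the entry limit.  It is not the
need_gc_qtexts() code; the names wrapped32_product and file_looks_bloated,
their parameters, and the constant 512 as used here are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper showing what a product looks like when it is
 * truncated to 32 bits: the high bits are silently discarded.
 */
uint32_t
wrapped32_product(uint32_t a, uint32_t b)
{
    return (uint32_t) ((uint64_t) a * (uint64_t) b);
}

/*
 * Hypothetical check: decides whether a texts file of `extent` bytes
 * looks bloated relative to `max_entries` entries of `mean_len` bytes.
 * Each operand is widened to 64 bits before multiplying, so the
 * thresholds cannot wrap around on platforms with 32-bit int/size_t.
 */
bool
file_looks_bloated(uint64_t extent, int mean_len, int max_entries)
{
    assert(mean_len >= 0 && max_entries >= 0);

    /* Not excessively large in absolute terms: nothing to do. */
    if (extent < (uint64_t) 512 * (uint64_t) max_entries)
        return false;

    /* Less than twice the expected size: not bloated. */
    if (extent < (uint64_t) mean_len * (uint64_t) max_entries * 2)
        return false;

    return true;
}
```

With mean_len = 4096 and max_entries = 1000000, the true threshold is
8192000000 bytes, but truncated to 32 bits it becomes 3897032704.  A file of
4000000000 bytes is below the true threshold yet above the wrapped one, which
is how a false "needs garbage collection" conclusion could arise.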

Per report from Bruno da Silva.  I'm not convinced that these
issues fully explain his problem; there may be some other bug that's
contributing to the query-texts file becoming so large in the first
place.  But it did get that big, so #2 is a reasonable defense,
and #3 could explain the reported performance difficulties.

(See also commit 8bbe4cbd9, which addressed some related bugs.
The second Discussion: link is the thread that led up to that.)

This issue is old, and is primarily a problem for old platforms,
so back-patch.

Discussion: 
https://postgr.es/m/cab+nuk93fl1q9elocotvlp07g7rav4vbdrkm0cvqohdvmpa...@mail.gmail.com
Discussion: https://postgr.es/m/5601d354.5000...@bluetreble.com

Branch
--
REL_10_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/dd414bf4e047e55028db28172e8184fcd2ee1201

Modified Files
--
contrib/pg_stat_statements/pg_stat_statements.c | 26 +
1 file changed, 22 insertions(+), 4 deletions(-)



pgsql: Be more wary about 32-bit integer overflow in pg_stat_statements

2022-08-02 Thread Tom Lane

Branch
--
REL_13_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/6b67db10c366ee825345ef81dcca57d29ad4c7f1

Modified Files
--
contrib/pg_stat_statements/pg_stat_statements.c | 26 +
1 file changed, 22 insertions(+), 4 deletions(-)



pgsql: Be more wary about 32-bit integer overflow in pg_stat_statements

2022-08-02 Thread Tom Lane

Branch
--
REL_12_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/6608a43056365dc866fc8bd2c9f323aea4725210

Modified Files
--
contrib/pg_stat_statements/pg_stat_statements.c | 26 +
1 file changed, 22 insertions(+), 4 deletions(-)



pgsql: Be more wary about 32-bit integer overflow in pg_stat_statements

2022-08-02 Thread Tom Lane

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/c67c2e2a29392b85ba7c728d3ceed986808eeec3

Modified Files
--
contrib/pg_stat_statements/pg_stat_statements.c | 26 +
1 file changed, 22 insertions(+), 4 deletions(-)



pgsql: Be more wary about 32-bit integer overflow in pg_stat_statements

2022-08-02 Thread Tom Lane

Branch
--
REL_14_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/17fd203b414e9a1d649fb22ab11afd8355947476

Modified Files
--
contrib/pg_stat_statements/pg_stat_statements.c | 26 +
1 file changed, 22 insertions(+), 4 deletions(-)



pgsql: Be more wary about 32-bit integer overflow in pg_stat_statements

2022-08-02 Thread Tom Lane

Branch
--
REL_11_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/06f6a07ba465a6c2731697a7548e7be363cf4c57

Modified Files
--
contrib/pg_stat_statements/pg_stat_statements.c | 26 +
1 file changed, 22 insertions(+), 4 deletions(-)



pgsql: Be more wary about 32-bit integer overflow in pg_stat_statements

2022-08-02 Thread Tom Lane

Branch
--
REL_15_STABLE

Details
---
https://git.postgresql.org/pg/commitdiff/82ebc70d1c7fd9b301e15cec658696d28df01835

Modified Files
--
contrib/pg_stat_statements/pg_stat_statements.c | 26 +
1 file changed, 22 insertions(+), 4 deletions(-)