Re: Incremental View Maintenance, take 2

2024-07-27 Thread Kirill Reshke
Hi!
Cloudberry DB (a Greenplum fork) [1] uses the IMMV feature for its
AQUMV (auto query use of matviews) feature, so I got interested in how
it is implemented.

On Thu, 11 Jul 2024 at 09:24, Yugo NAGATA  wrote:
>
> I updated the patch to bump up the version numbers in psql and pg_dump codes
> from 17 to 18.

A few suggestions:

1) The `Add-relisivm-column-to-pg_class-system-catalog` commit message
should be fixed; there is "isimmv" in the last line.
2) I don't get why `Add-Incremental-View-Maintenance-support.patch`
goes after 0005 & 0004. Shouldn't we first implement the feature on the
server side, and only then on the client (psql & pg_dump) side?
3) Can we provide regression tests for each function separately? A test
for the main feature in the main patch, a test for DISTINCT support in
v34-0007-Add-DISTINCT-support-for-IVM.patch, etc.? This way the patchset
will be easier to review and can be committed separately.
4) v34-0006-Add-Incremental-View-Maintenance-support.patch no longer
applies due to 4b74ebf726d444ba820830cad986a1f92f724649. After
resolving issues manually, it does not compile, because
4b74ebf726d444ba820830cad986a1f92f724649 also removes
save_userid/save_sec_context fields from ExecCreateTableAs.

>   if (RelationIsIVM(matviewRel) && stmt->skipData)
Now this function accepts a skipData parameter.

5) For DISTINCT support, the patch uses hidden __ivm* columns. Is this
design discussed anywhere? I wonder if this is a necessity (the only
solution) or if there are alternatives.
6) What are the caveats of supporting some simple cases of expressions
over aggregate functions, like in this example?
```
regress=# CREATE INCREMENTAL MATERIALIZED VIEW mv_ivm_2 AS SELECT
sum(j) + sum(i) from mv_base_a;
ERROR:  expression containing an aggregate in it is not supported on
incrementally maintainable materialized view
```
I can see some difficulties with division (CREATE IMMV  AS SELECT
1/sum(i) from mv_base_a; the sum(i) == 0 case), but addition &
multiplication should be OK, shouldn't they?


Overall, the patchset looks mature; however, it is far from being
committable due to a lack of testing/feedback/discussion. There is only
one way to fix this: test and discuss it!


[1] https://github.com/cloudberrydb/cloudberrydb




Re: POC, WIP: OR-clause support for indexes

2024-07-27 Thread Alexander Korotkov
On Thu, Jul 25, 2024 at 5:04 PM Alena Rybakina
 wrote:
> To be honest, I have found a big problem in this patch - we try to perform 
> the transformation every time we examine a column:
>
> for (indexcol = 0; indexcol < index->nkeycolumns; indexcol++) { ...
>
> }
>
> I have fixed it and moved the transformation before going through the loop.

What makes you think there is a problem?  Do you have a test case
illustrating a slow planning time?

When v27 performs the transformation for a particular column, it simply
stops upon facing the first unmatched OR entry.  So,
match_orclause_to_indexcol() examines just the first OR entry for all
the columns except at most one.  So, the check
match_orclause_to_indexcol() does is not much slower than what the other
match_*_to_indexcol() functions do.

I actually think this could help performance in many cases, not hurt
it.  At least we get rid of O(n^2) complexity over the number of OR
entries, which could be very many.
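
To illustrate that early-exit behaviour, here is a tiny self-contained
sketch (not the actual planner code; OrArg and match_or_args_to_column()
are made-up stand-ins) showing why a column that doesn't match is
rejected after examining a single OR arm:

```
#include <stdbool.h>
#include <stdio.h>

/* hypothetical stand-in for one OR arm: which index column it references */
typedef struct OrArg
{
    int         indexcol;
} OrArg;

/*
 * Walk the OR arms in order for a given index column and give up at the
 * first one that cannot be matched, so non-matching columns are rejected
 * after looking at just one entry.
 */
static bool
match_or_args_to_column(const OrArg *args, int nargs, int indexcol)
{
    for (int i = 0; i < nargs; i++)
    {
        if (args[i].indexcol != indexcol)
            return false;       /* first unmatched arm aborts */
    }
    return true;                /* every arm can use this column */
}

int
main(void)
{
    OrArg       args[] = {{2}, {2}, {5}};

    printf("col 2: %d\n", match_or_args_to_column(args, 3, 2)); /* 0: third arm fails */
    printf("col 1: %d\n", match_or_args_to_column(args, 3, 1)); /* 0 after one check */
    return 0;
}
```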

--
Regards,
Alexander Korotkov
Supabase




Re: Building with meson on NixOS/nixpkgs

2024-07-27 Thread Heikki Linnakangas

On 26/07/2024 23:01, Tristan Partin wrote:

Heikki asked me to take a look at this patchset for the commitfest.
Looks good to me.

Heikki, just be careful rebasing the first patch. You need to make sure
the newly set `required: false` gets carried forward.


Committed and backpatched to v16 and v17. Thanks for the good 
explanations in the commit messages, Walther!


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Speed up JSON escape processing with SIMD plus other optimisations

2024-07-27 Thread David Rowley
On Wed, 24 Jul 2024 at 22:55, Heikki Linnakangas  wrote:
>
> On 02/07/2024 07:49, David Rowley wrote:
> > I've attached a rebased set of patches.  The previous set no longer applied.
>
> I looked briefly at the first patch. Seems reasonable.
>
> One little thing that caught my eye is that in populate_scalar(), you
> sometimes make a temporary copy of the string to add the
> null-terminator, but then call escape_json() which doesn't need the
> null-terminator anymore. See attached patch to avoid that. However, it's
> not clear to me how to reach that codepath, or if it is reachable at all. I
> tried to add a NOTICE there and ran the regression tests, but got no
> failures.

Thanks for noticing that. It seems like a good simplification
regardless. I've incorporated it.

I made another pass over the 0001 and 0003 patches and, after a bit of
renaming, I pushed the result.  I ended up keeping escape_json() as-is
and giving the new function the name escape_json_with_len().  The text
version is named escape_json_text(). I think I originally did it the
other way around as I thought I'd be able to adjust more locations than
I did. Having it this way around is slightly less churn.

I did another round of testing on the SIMD patch (attached as v5-0001)
as I wondered if the SIMD loop maybe shouldn't wait too long before
copying the bytes to the destination string.  I had wondered whether,
if the JSON string is very large and we look ahead too far, then by the
time we flush those bytes out to the destination buffer we'd have
started evicting L1 cachelines for parts of the buffer that are still
to be flushed.  I put this to the test (test 3) and found that with a
1MB JSON string it is faster to flush every 512 bytes than it is to
only flush after checking the entire 1MB.  With a 10kB JSON string
(test 2), the extra code to flush every 512 bytes seems to slow things
down.  I'm a bit undecided about whether the flushing is worthwhile or
not. It really depends on the length of the JSON strings we'd like to
optimise for. It might be possible to get the best of both, but I think
it might require manually implementing portions of
appendBinaryStringInfo(). I'd rather not go there. Does anyone have any
thoughts about that?

Test 2 (10KB) shows a ~261% performance increase, which drops to ~227%
when flushing every 512 bytes. Test 3 (1MB) shows a ~99% increase
without early flushing, rising to ~156% when flushing every 512 bytes.
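
To make the trade-off concrete, below is a minimal, self-contained
sketch (not the code in v5-0001) of the idea being discussed: scan the
input 16 bytes at a time with SSE2 looking for bytes that need JSON
escaping, and flush the accumulated clean run to the output once it
exceeds a threshold rather than only at the end.  FLUSH_THRESHOLD,
emit_escaped() and escape_json_sketch() are made-up names; it builds
with gcc or clang on x86-64.

```
#include <emmintrin.h>          /* SSE2 */
#include <stdio.h>
#include <string.h>

#define FLUSH_THRESHOLD 512     /* pending clean bytes before we flush */

/* hypothetical helper: escape a single byte */
static void
emit_escaped(FILE *out, unsigned char c)
{
    if (c == '"' || c == '\\')
        fprintf(out, "\\%c", c);
    else
        fprintf(out, "\\u%04x", c);     /* control characters */
}

static void
escape_json_sketch(FILE *out, const char *s, size_t len)
{
    const __m128i quote = _mm_set1_epi8('"');
    const __m128i bslash = _mm_set1_epi8('\\');
    const __m128i ctrl_max = _mm_set1_epi8(0x1F);
    size_t      i = 0;
    size_t      run_start = 0;  /* start of not-yet-flushed clean bytes */

    while (i + 16 <= len)
    {
        __m128i     chunk = _mm_loadu_si128((const __m128i *) (s + i));
        __m128i     bad = _mm_or_si128(_mm_cmpeq_epi8(chunk, quote),
                                       _mm_cmpeq_epi8(chunk, bslash));

        /* unsigned "byte <= 0x1F" via the min trick */
        bad = _mm_or_si128(bad, _mm_cmpeq_epi8(_mm_min_epu8(chunk, ctrl_max), chunk));

        int         mask = _mm_movemask_epi8(bad);

        if (mask == 0)
        {
            i += 16;
            /* flush periodically so the pending run stays cache-resident */
            if (i - run_start >= FLUSH_THRESHOLD)
            {
                fwrite(s + run_start, 1, i - run_start, out);
                run_start = i;
            }
            continue;
        }

        /* flush the clean prefix, escape the offending byte, and resume */
        size_t      off = i + (size_t) __builtin_ctz(mask);

        fwrite(s + run_start, 1, off - run_start, out);
        emit_escaped(out, (unsigned char) s[off]);
        i = off + 1;
        run_start = i;
    }

    /* scalar tail for the final < 16 bytes */
    for (; i < len; i++)
    {
        unsigned char c = (unsigned char) s[i];

        if (c == '"' || c == '\\' || c < 0x20)
        {
            fwrite(s + run_start, 1, i - run_start, out);
            emit_escaped(out, c);
            run_start = i + 1;
        }
    }
    fwrite(s + run_start, 1, len - run_start, out);
}

int
main(void)
{
    const char *str = "a \"quoted\" string with a backslash \\ and a tab\t";

    putchar('"');
    escape_json_sketch(stdout, str, strlen(str));
    puts("\"");
    return 0;
}
```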

bench.sql: select row_to_json(j1)::jsonb from j1;

## Test 1 (variable JSON strings up to 1KB)
create table j1 (very_long_column_name_to_test_json_escape text);
insert into j1 select repeat('x', x) from generate_series(0,1024)x;
vacuum freeze j1;

master @ 17a5871d:
$ for i in {1..3}; do pgbench -n -f bench.sql -T 10 -M prepared
postgres | grep tps; done
tps = 364.410386 (without initial connection time)
tps = 367.914165 (without initial connection time)
tps = 365.794513 (without initial connection time)

master + v5-0001
$ for i in {1..3}; do pgbench -n -f bench.sql -T 10 -M prepared
postgres | grep tps; done
tps = 683.570613 (without initial connection time)
tps = 685.206578 (without initial connection time)
tps = 679.014056 (without initial connection time)

## Test 2 (10KB JSON strings)
create table j1 (very_long_column_name_to_test_json_escape text);
insert into j1 select repeat('x', 1024*10) from generate_series(0,1024)x;
vacuum freeze j1;

master @ 17a5871d:
$ for i in {1..3}; do pgbench -n -f bench.sql -T 10 -M prepared
postgres | grep tps; done
tps = 23.872630 (without initial connection time)
tps = 26.232014 (without initial connection time)
tps = 26.495739 (without initial connection time)

master + v5-0001
$ for i in {1..3}; do pgbench -n -f bench.sql -T 10 -M prepared
postgres | grep tps; done
tps = 96.813515 (without initial connection time)
tps = 96.023632 (without initial connection time)
tps = 99.630428 (without initial connection time)

master + v5-0001 ESCAPE_JSON_MAX_LOOKHEAD 512
$ for i in {1..3}; do pgbench -n -f bench.sql -T 10 -M prepared
postgres | grep tps; done
tps = 83.597442 (without initial connection time)
tps = 85.045554 (without initial connection time)
tps = 82.105907 (without initial connection time)

## Test 3 (1MB JSON strings)
create table j1 (very_long_column_name_to_test_json_escape text);
insert into j1 select repeat('x', 1024*1024) from generate_series(0,10)x;
vacuum freeze j1;

master @ 17a5871d:
$ for i in {1..3}; do pgbench -n -f bench.sql -T 10 -M prepared
postgres | grep tps; done
tps = 18.885922 (without initial connection time)
tps = 18.829701 (without initial connection time)
tps = 18.889369 (without initial connection time)

master v5-0001
$ for i in {1..3}; do pgbench -n -f bench.sql -T 10 -M prepared
postgres | grep tps; done
tps = 37.464967 (without initial connection time)
tps = 37.536676 (without initial connection time)
tps = 37.561387 (without initial connection time)

master + v5-0001 ESCAPE_JSON_MAX_LOOKHEAD 512
$ for i in {1..3}; do pgbench -n -f b

why is pg_upgrade's regression run so slow?

2024-07-27 Thread Andrew Dunstan



As part of its 002_pgupgrade.pl, the pg_upgrade tests do a rerun of the 
normal regression tests. That in itself is sad, but what concerns me 
here is why it's so much slower than the regular run? This is apparent 
everywhere (e.g. on crake the standard run takes about 30 to 90 s, but 
pg_upgrade's run takes 5 minutes or more). On Windows, it's 
catastrophic, and only hasn't been noticed because the buildfarm client 
wasn't counting a timeout as a failure. That was an error on my part and 
I have switched a few of my machines to code that checks more robustly 
for failure of meson tests - specifically by looking for the absence of 
test.success rather than the presence of test.fail. That means that 
drongo and fairywren are getting timeout errors. e.g. on the latest run 
on fairywren, the regular regression run took 226s, but pg_upgrade's run 
of what should be the same set of tests took 2418s. What the heck is 
going on here? Is it because there are concurrent tests running? 
That doesn't seem enough to make the tests run more than 10 times as long.


I have a strong suspicion this is exacerbated by "debug_parallel_query = 
regress", especially since the tests run much faster on REL_17_STABLE 
where I am not setting that, but that can't be the whole explanation, 
since that setting should apply to both sets of tests.



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: why is pg_upgrade's regression run so slow?

2024-07-27 Thread Tom Lane
Andrew Dunstan  writes:
> As part of its 002_pgupgrade.pl, the pg_upgrade tests do a rerun of the 
> normal regression tests. That in itself is sad, but what concerns me 
> here is why it's so much slower than the regular run? This is apparent 
> everywhere (e.g. on crake the standard run takes about 30 to 90 s, but 
> pg_upgrade's run takes 5 minutes or more).

Just to add some more fuel to the fire: I do *not* observe such an
effect on my own animals.  For instance, sifaka's last run shows
00:09 for install-check-C and the same (within rounding error)
for the regression test step in 002_pgupgrade; on the much slower
mamba, the last run took 07:33 for install-check-C and 07:40 for the
002_pgupgrade regression test step.

I'm still using the makefile-based build, and I'm not using
debug_parallel_query, but it doesn't make a lot of sense to me
that either of those things should affect this.

Looking at crake's last run, there are other oddities: why does
the "check" step take 00:24 while "install-check-en_US.utf8" (which
should be doing strictly less work) takes 01:00?  Again, there are
not similar discrepancies on my animals.  Are you sure there's not
background load on the machine?

regards, tom lane




Re: [18] separate collation and ctype versions, and cleanup of pg_database locale fields

2024-07-27 Thread Jeff Davis
On Thu, 2024-07-25 at 13:29 -0700, Jeff Davis wrote:
> it may be a good idea to version collation and ctype
> separately. The ctype version is, more or less, the Unicode version,
> and we know what that is for the builtin provider as well as ICU.

Attached a rough patch for the purposes of discussion. It tracks the
ctype version separately, but doesn't do anything with it yet.

The main problem is that it's one more slightly confusing thing to
understand, especially in pg_database because it's the ctype version of
the database default collation, not necessarily datctype.

Maybe we can do something with the naming or catalog representation to
make this more clear?

Regards,
Jeff Davis

From c4f637cebf96c9243ee866e15066b67838745c58 Mon Sep 17 00:00:00 2001
From: Jeff Davis 
Date: Fri, 26 Jul 2024 17:28:02 -0700
Subject: [PATCH v1] Add datctypeversion and collctypeversion.

The ctype version can be distinct from the collation version in some
cases, so track it separately in the catalog. For the builtin
"C.UTF-8" locale, the ctype version is the version of Unicode that
Postgres was built with. For ICU, the ctype version is the version of
Unicode that ICU was built with. For libc, it's the same as the
collation version.

These catalog fields are not used yet, but maintain them for the
future.
---
 src/backend/catalog/pg_collation.c   |   5 +
 src/backend/commands/collationcmds.c |  95 
 src/backend/commands/dbcommands.c| 158 ---
 src/backend/utils/adt/pg_locale.c|  38 +++
 src/bin/initdb/initdb.c  |   2 +
 src/include/catalog/pg_collation.h   |   2 +
 src/include/catalog/pg_database.h|   3 +
 src/include/catalog/pg_proc.dat  |  10 ++
 src/include/utils/pg_locale.h|   1 +
 9 files changed, 297 insertions(+), 17 deletions(-)

diff --git a/src/backend/catalog/pg_collation.c b/src/backend/catalog/pg_collation.c
index 7f2f7012299..822bccbd715 100644
--- a/src/backend/catalog/pg_collation.c
+++ b/src/backend/catalog/pg_collation.c
@@ -48,6 +48,7 @@ CollationCreate(const char *collname, Oid collnamespace,
 const char *colllocale,
 const char *collicurules,
 const char *collversion,
+const char *collctypeversion,
 bool if_not_exists,
 bool quiet)
 {
@@ -202,6 +203,10 @@ CollationCreate(const char *collname, Oid collnamespace,
 		values[Anum_pg_collation_collversion - 1] = CStringGetTextDatum(collversion);
 	else
 		nulls[Anum_pg_collation_collversion - 1] = true;
+	if (collctypeversion)
+		values[Anum_pg_collation_collctypeversion - 1] = CStringGetTextDatum(collctypeversion);
+	else
+		nulls[Anum_pg_collation_collctypeversion - 1] = true;
 
 	tup = heap_form_tuple(tupDesc, values, nulls);
 
diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index 63ef9a08411..d8525dbfceb 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -64,6 +64,7 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 	DefElem*deterministicEl = NULL;
 	DefElem*rulesEl = NULL;
 	DefElem*versionEl = NULL;
+	DefElem*ctypeversionEl = NULL;
 	char	   *collcollate;
 	char	   *collctype;
 	const char *colllocale;
@@ -72,6 +73,7 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 	int			collencoding;
 	char		collprovider;
 	char	   *collversion = NULL;
+	char	   *collctypeversion = NULL;
 	Oid			newoid;
 	ObjectAddress address;
 
@@ -103,6 +105,8 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 			defelp = &rulesEl;
 		else if (strcmp(defel->defname, "version") == 0)
 			defelp = &versionEl;
+		else if (strcmp(defel->defname, "ctype_version") == 0)
+			defelp = &ctypeversionEl;
 		else
 		{
 			ereport(ERROR,
@@ -211,6 +215,9 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 		if (versionEl)
 			collversion = defGetString(versionEl);
 
+		if (ctypeversionEl)
+			collctypeversion = defGetString(ctypeversionEl);
+
 		if (collproviderstr)
 		{
 			if (pg_strcasecmp(collproviderstr, "builtin") == 0)
@@ -360,6 +367,18 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 		collversion = get_collation_actual_version(collprovider, locale);
 	}
 
+	if (!collctypeversion)
+	{
+		const char *locale;
+
+		if (collprovider == COLLPROVIDER_LIBC)
+			locale = collctype;
+		else
+			locale = colllocale;
+
+		collctypeversion = get_ctype_actual_version(collprovider, locale);
+	}
+
 	newoid = CollationCreate(collName,
 			 collNamespace,
 			 GetUserId(),
@@ -371,6 +390,7 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e
 			 colllocale,
 			 collicurules,
 			 collversion,
+			 collctypeversion,
 			 if_not_exists,
 			 false);	/* not quiet */
 
@@ -578,6 +598,77 @@ pg_collation_actual_version(PG_FUNCTION_ARGS)
 }
 
 
+Datum
+pg_ctype_actua

Re: Protocol question regarding Portal vs Cursor

2024-07-27 Thread Dave Cramer
Dave Cramer


On Sat, 27 Jul 2024 at 01:55, Tatsuo Ishii  wrote:

> > So while the API's are "virtually" identical AFAICT there is no way to
> > create a "WITH HOLD" portal ?
>
> I am not sure if I fully understand your question but I think you can
> create a portal with "WITH HOLD" option.
>
> BEGIN;
> DECLARE c CURSOR WITH HOLD FOR SELECT * FROM generate_series(1,10);
>
> (of course you could use extended query protocol instead of simple
> query protocol here)
>
> After this there's portal named "c" in the backend with WITH HOLD
> attribute. And you could issue a Describe message against the portal.
> Also you could issue Execute messages to fetch N rows (N can be
> specified in the Execute message), inside or outside a transaction,
> because WITH HOLD is specified.
>
> Here is a sample session. The generate_series() generates 10 rows. You
> can fetch 5 rows from portal "c" inside the transaction. After the
> transaction is closed, you can fetch the remaining 5 rows as expected.
>
> FE=> Query (query="BEGIN")
> <= BE CommandComplete(BEGIN)
> <= BE ReadyForQuery(T)
> FE=> Query (query="DECLARE c CURSOR WITH HOLD FOR SELECT * FROM
> generate_series(1,10)")
> <= BE CommandComplete(DECLARE CURSOR)
> <= BE ReadyForQuery(T)
> FE=> Describe(portal="c")
> FE=> Execute(portal="c")
> FE=> Sync
> <= BE RowDescription
> <= BE DataRow
> <= BE DataRow
> <= BE DataRow
> <= BE DataRow
> <= BE DataRow
> <= BE PortalSuspended
> <= BE ReadyForQuery(T)
> FE=> Query (query="END")
> <= BE CommandComplete(COMMIT)
> <= BE ReadyForQuery(I)
> FE=> Execute(portal="c")
> FE=> Sync
> <= BE DataRow
> <= BE DataRow
> <= BE DataRow
> <= BE DataRow
> <= BE DataRow
> <= BE PortalSuspended
> <= BE ReadyForQuery(I)
> FE=> Terminate
>
> Best regards,
>


Yes, sorry, I should have said one cannot create a WITH HOLD portal using
the BIND command

Dave


Unexpected Null Pointer For Static Shared Memory Segment

2024-07-27 Thread Aditya Gupta
Hello,

I hope this message finds you well. I am currently working on a PostgreSQL 
extension and have encountered an issue where a static pointer becomes null 
between different AM routines. My problem is as follows:

I am working with a few AM routines, specifically “ambuild” and “amrescan”. 
During “ambuild”, I use “ShmemInitStruct” to initialize a segment of shared 
memory and save the pointer to this location in my static, global pointer. I 
then set some values of the structure that the pointer points to, which I 
believe works correctly. I have ensured to acquire, and later release, the 
“AddinShmemInitLock” as well as check if we have found a segment of the same 
name in shared memory. I can access the pointer and any data I save in the 
struct perfectly fine during this AM routine.

When the extension later performs “amrescan”, the static pointer I had set is 
null. I am not quite sure why this is happening. I would greatly appreciate any 
guidance or suggestions! Perhaps I need to use the startup hooks when calling 
the “ShmemInitStruct” function (although I would like to avoid this as the size 
of the segment I am initializing varies at run time) or use dynamic shared 
memory?
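
For context, here is a minimal sketch (my own, not the code described
above) of the conventional startup-hook pattern, where each backend
attaches to the named segment and sets its own per-process static
pointer.  MySharedState, the segment name "my_extension" and the wiring
in _PG_init() are hypothetical, a fixed-size struct is assumed, and the
matching RequestAddinShmemSpace() call is omitted for brevity.

```
#include "postgres.h"

#include "fmgr.h"
#include "storage/ipc.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"

PG_MODULE_MAGIC;

/* hypothetical shared state; a fixed size is assumed here */
typedef struct MySharedState
{
    int         some_counter;
} MySharedState;

static MySharedState *my_state = NULL;  /* per-backend static pointer */
static shmem_startup_hook_type prev_shmem_startup_hook = NULL;

static void
my_shmem_startup(void)
{
    bool        found;

    if (prev_shmem_startup_hook)
        prev_shmem_startup_hook();

    /* attach to (or create) the named segment and remember the pointer */
    LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
    my_state = ShmemInitStruct("my_extension",
                               sizeof(MySharedState),
                               &found);
    if (!found)
        my_state->some_counter = 0;     /* first attacher initializes */
    LWLockRelease(AddinShmemInitLock);
}

void
_PG_init(void)
{
    prev_shmem_startup_hook = shmem_startup_hook;
    shmem_startup_hook = my_shmem_startup;
}
```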

Please let me know if there are any more details I can provide or if anything 
is unclear. Thanks for your time and assistance!

Best,
Aditya Gupta



Re: tiny step toward threading: reduce dependence on setlocale()

2024-07-27 Thread Andreas Karlsson

On 7/26/24 10:35 PM, Jeff Davis wrote:

database_ctype_is_c refers to the LC_CTYPE environment of the database
-- pg_database.datctype. default_locale.ctype_is_c is the ctype of the
database's default collation.

Confusing, I know, but it matters for a few things that still depend on
the LC_CTYPE, such as tsearch and maybe a few extensions. See
f413941f41.


Ah, right! That was a thinko on my part.

The set of patches looks good to me now. There is further refactoring
that can be done in this area (and should be done, given all the calls
to e.g. isalpha()), but I think this set of patches improves code
readability while moving us away from setlocale().


And even if we take a tiny performance hit here, which your tests did 
not measure, I would say it is worth it both due to code clarity and due 
to not relying on thread unsafe state.


I do not see these patches in the commitfest app but if they were I 
would have marked them as ready for committer.


Andreas





Fix overflow in pg_size_pretty

2024-07-27 Thread Joseph Koshakow
Hi all,

Attached is a patch that resolves an overflow in pg_size_pretty() that
resulted in unexpected behavior when PG_INT64_MIN was passed in as an
argument.

The pg_abs_s64() helper function is extracted and simplified from patch
0001 from [0]. I didn't add similar functions for other sized integers
since they'd be unused, but I'd be happy to add them if others
disagree.

`SELECT -9223372036854775808::bigint` results in an out of range error,
even though `-9223372036854775808` can fit in a `bigint` and
`SELECT pg_typeof(-9223372036854775808)` returns `bigint`. That's why
the `::bigint` cast is omitted from my test.

[0]
https://www.postgresql.org/message-id/flat/caavxfhdbpoyegs7s+xf4iaw0-cgiq25jpydwbqqqvltle_t...@mail.gmail.com

Thanks,
Joseph Koshakow
From 6ec885412f2e0f3a3e019ec1906901e39c6d517a Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sat, 27 Jul 2024 15:06:09 -0400
Subject: [PATCH] Fix overflow in pg_size_pretty

This commit removes an overflow from pg_size_pretty that causes
PG_INT64_MIN to be displayed with the bytes unit instead of the PB
unit.
---
 src/backend/utils/adt/dbsize.c   |  3 ++-
 src/include/common/int.h | 13 +
 src/test/regress/expected/dbsize.out |  6 ++
 src/test/regress/sql/dbsize.sql  |  2 ++
 4 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 25d7110c13..5648f40101 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -21,6 +21,7 @@
 #include "catalog/pg_tablespace.h"
 #include "commands/dbcommands.h"
 #include "commands/tablespace.h"
+#include "common/int.h"
 #include "miscadmin.h"
 #include "storage/fd.h"
 #include "utils/acl.h"
@@ -577,7 +578,7 @@ pg_size_pretty(PG_FUNCTION_ARGS)
 		uint8		bits;
 
 		/* use this unit if there are no more units or we're below the limit */
-		if (unit[1].name == NULL || i64abs(size) < unit->limit)
+		if (unit[1].name == NULL || pg_abs_s64(size) < unit->limit)
 		{
 			if (unit->round)
 size = half_rounded(size);
diff --git a/src/include/common/int.h b/src/include/common/int.h
index 7fc046e78a..d86eb6dd5e 100644
--- a/src/include/common/int.h
+++ b/src/include/common/int.h
@@ -258,6 +258,19 @@ pg_mul_s64_overflow(int64 a, int64 b, int64 *result)
 #endif
 }
 
+static inline uint64
+pg_abs_s64(int64 a)
+{
+	if (unlikely(a == PG_INT64_MIN))
+	{
+		return ((uint64) PG_INT64_MAX) + 1;
+	}
+	else
+	{
+		return (uint64) i64abs(a);
+	}
+}
+
 /*
  * Overflow routines for unsigned integers
  *
diff --git a/src/test/regress/expected/dbsize.out b/src/test/regress/expected/dbsize.out
index f1121a87aa..3398a3eceb 100644
--- a/src/test/regress/expected/dbsize.out
+++ b/src/test/regress/expected/dbsize.out
@@ -12,6 +12,12 @@ SELECT size, pg_size_pretty(size), pg_size_pretty(-1 * size) FROM
  1000 | 909 TB | -909 TB
 (6 rows)
 
+SELECT pg_size_pretty(-9223372036854775808);
+ pg_size_pretty 
+
+ -8192 PB
+(1 row)
+
 SELECT size, pg_size_pretty(size), pg_size_pretty(-1 * size) FROM
 (VALUES (10::numeric), (1000::numeric), (100::numeric),
 (10::numeric), (1::numeric),
diff --git a/src/test/regress/sql/dbsize.sql b/src/test/regress/sql/dbsize.sql
index b34cf33385..e3e18e948e 100644
--- a/src/test/regress/sql/dbsize.sql
+++ b/src/test/regress/sql/dbsize.sql
@@ -3,6 +3,8 @@ SELECT size, pg_size_pretty(size), pg_size_pretty(-1 * size) FROM
 (10::bigint), (1::bigint),
 (1000::bigint)) x(size);
 
+SELECT pg_size_pretty(-9223372036854775808);
+
 SELECT size, pg_size_pretty(size), pg_size_pretty(-1 * size) FROM
 (VALUES (10::numeric), (1000::numeric), (100::numeric),
 (10::numeric), (1::numeric),
-- 
2.34.1



Re: Protocol question regarding Portal vs Cursor

2024-07-27 Thread Tom Lane
Dave Cramer  writes:
> On Sat, 27 Jul 2024 at 01:55, Tatsuo Ishii  wrote:
>>> So while the API's are "virtually" identical AFAICT there is no way to
>>> create a "WITH HOLD" portal ?

> Yes, sorry, I should have said one cannot create a WITH HOLD portal using
> the BIND command

Yeah.  The two APIs (cursors and extended query protocol) manipulate
the same underlying Portal objects, but the features exposed by the
APIs aren't all identical.  We've felt that this isn't high priority
to sync up, since you can create a Portal with one API then manipulate
it through the other if need be.

regards, tom lane




Re: add function argument names to regex* functions.

2024-07-27 Thread Tom Lane
jian he  writes:
> On Fri, Jul 26, 2024 at 10:40 PM Tom Lane  wrote:
>> Hmm, yeah, you're right.  I didn't want to write two separate
>> synopses there, but maybe there's no choice.

> Now the output is
> It has the syntax regexp_replace(string, pattern, replacement [, flags
> ]) and regexp_replace(string, pattern, replacement, start [, N [,
> flags ]]).

Pushed, thanks.

regards, tom lane




Re: xid_wraparound tests intermittent failure.

2024-07-27 Thread Andrew Dunstan


On 2024-07-26 Fr 1:46 PM, Masahiko Sawada wrote:

On Thu, Jul 25, 2024 at 6:52 PM Andrew Dunstan  wrote:


On 2024-07-25 Th 3:40 PM, Masahiko Sawada wrote:

On Thu, Jul 25, 2024 at 11:06 AM Masahiko Sawada  wrote:

On Thu, Jul 25, 2024 at 10:56 AM Andrew Dunstan  wrote:

On 2024-07-23 Tu 6:59 PM, Masahiko Sawada wrote:

See 



The failure logs are from a run where both tests 1 and 2 failed.

Thank you for sharing the logs.

I think that the problem seems to match what Alexander Lakhin
mentioned[1]. Probably we can fix such a race condition somehow but
I'm not sure it's worth it as setting autovacuum = off and
autovacuum_max_workers = 1 (or a low number) is an extremely rare
case. I think it would be better to stabilize these tests. One idea is
to turn the autovacuum GUC parameter on while setting
autovacuum_enabled = off for each table. That way, we can ensure that
autovacuum workers are launched. And I think it seems to align real
use cases.

Regards,

[1] 
https://www.postgresql.org/message-id/02373ec3-50c6-df5a-0d65-5b9b1c0c86d6%40gmail.com


OK, do you want to propose a patch?

Yes, I'll prepare and share it soon.

I've attached the patch. Could you please test if the patch fixes the
instability you observed?

Since we turn off autovacuum on all three tests and we wait for
autovacuum to complete processing databases, these tests potentially
have a similar (but lower) risk. So I modified these tests to turn it
on so we can ensure the autovacuum runs periodically.


I assume you actually meant to remove the "autovacuum = off" in 
003_wraparound.pl. With that change in your patch I retried my test, but on iteration 100 
out of 100 it failed on test 002_limits.pl.


I think we need to remove the "autovacuum = off" also in 002_limits.pl
as it waits for autovacuum to process both template0 and template1
databases. Just to be clear, the failure happened even without
"autovacuum = off"?



The attached patch, a slight modification of yours, removes "autovacuum 
= off" for all three tests, and given that a set of 200 runs was clean 
for me.



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
diff --git a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
index 37550b67a4..2692b35f34 100644
--- a/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
+++ b/src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl
@@ -18,7 +18,6 @@ my $node = PostgreSQL::Test::Cluster->new('main');
 $node->init;
 $node->append_conf(
 	'postgresql.conf', qq[
-autovacuum = off # run autovacuum only when to anti wraparound
 autovacuum_naptime = 1s
 # so it's easier to verify the order of operations
 autovacuum_max_workers = 1
@@ -27,23 +26,25 @@ log_autovacuum_min_duration = 0
 $node->start;
 $node->safe_psql('postgres', 'CREATE EXTENSION xid_wraparound');
 
-# Create tables for a few different test scenarios
+# Create tables for a few different test scenarios. We disable autovacuum
+# on these tables to run it only to prevent wraparound.
 $node->safe_psql(
 	'postgres', qq[
-CREATE TABLE large(id serial primary key, data text, filler text default repeat(random()::text, 10));
+CREATE TABLE large(id serial primary key, data text, filler text default repeat(random()::text, 10))
+   WITH (autovacuum_enabled = off);
 INSERT INTO large(data) SELECT generate_series(1,3);
 
-CREATE TABLE large_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+CREATE TABLE large_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10))
+   WITH (autovacuum_enabled = off);
 INSERT INTO large_trunc(data) SELECT generate_series(1,3);
 
-CREATE TABLE small(id serial primary key, data text, filler text default repeat(random()::text, 10));
+CREATE TABLE small(id serial primary key, data text, filler text default repeat(random()::text, 10))
+   WITH (autovacuum_enabled = off);
 INSERT INTO small(data) SELECT generate_series(1,15000);
 
-CREATE TABLE small_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10));
+CREATE TABLE small_trunc(id serial primary key, data text, filler text default repeat(random()::text, 10))
+   WITH (autovacuum_enabled = off);
 INSERT INTO small_trunc(data) SELECT generate_series(1,15000);
-
-CREATE TABLE autovacuum_disabled(id serial primary key, data text) WITH (autovacuum_enabled=false);
-INSERT INTO autovacuum_disabled(data) SELECT generate_series(1,1000);
 ]);
 
 # Bump the query timeout to avoid false negatives on slow test systems.
@@ -63,7 +64,6 @@ $background_psql->query_safe(
 	DELETE FROM large_trunc WHERE id > 1;
 	DELETE FROM small WHERE id % 2 = 0;
 	DELETE FROM small_trunc WHERE id > 1000;
-	DELETE FROM autovacuum_disabled WHERE id % 2 = 0;
 ]);
 
 # Consume 2 billion XIDs, to get us very close to wraparound
@@ -107,20 +107,18 @@ $ret = 

Re: allow changing autovacuum_max_workers without restarting

2024-07-27 Thread Nathan Bossart
rebased

-- 
nathan
>From 61513f744012c2b9b59085ce8c4a960da9e56ee7 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Sat, 22 Jun 2024 15:05:44 -0500
Subject: [PATCH v9 1/1] allow changing autovacuum_max_workers without
 restarting

---
 doc/src/sgml/config.sgml  | 28 ++-
 doc/src/sgml/runtime.sgml |  4 +-
 src/backend/access/transam/xlog.c |  2 +-
 src/backend/postmaster/autovacuum.c   | 76 +++
 src/backend/postmaster/postmaster.c   |  2 +-
 src/backend/storage/lmgr/proc.c   |  6 +-
 src/backend/utils/init/postinit.c |  6 +-
 src/backend/utils/misc/guc_tables.c   | 11 ++-
 src/backend/utils/misc/postgresql.conf.sample |  3 +-
 src/include/postmaster/autovacuum.h   |  1 +
 src/test/perl/PostgreSQL/Test/Cluster.pm  |  1 +
 11 files changed, 111 insertions(+), 29 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 57cd7bb972..7fc270c0ae 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8552,6 +8552,25 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH 
csv;
   
  
 
+ 
+  autovacuum_worker_slots (integer)
+  
+   autovacuum_worker_slots configuration 
parameter
+  
+  
+  
+   
+Specifies the number of backend slots to reserve for autovacuum worker
+processes.  The default is 16.  This parameter can only be set at 
server
+start.
+   
+   
+When changing this value, consider also adjusting
+.
+   
+  
+ 
+
  
   autovacuum_max_workers (integer)
   
@@ -8562,7 +8581,14 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH 
csv;

 Specifies the maximum number of autovacuum processes (other than the
 autovacuum launcher) that may be running at any one time.  The default
-is three.  This parameter can only be set at server start.
+is three.  This parameter can only be set in the
+postgresql.conf file or on the server command 
line.
+   
+   
+Note that a setting for this value which is higher than
+ will have no effect,
+since autovacuum workers are taken from the pool of slots established
+by that setting.

   
  
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 2c4d5ef640..a1f43556ac 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -838,7 +838,7 @@ psql: error: connection to server on socket 
"/tmp/.s.PGSQL.5432" failed: No such
 When using System V semaphores,
 PostgreSQL uses one semaphore per allowed 
connection
 (), allowed autovacuum worker process
-(), allowed WAL sender process
+(), allowed WAL sender process
 (), allowed background
 process (), etc., in sets of 16.
 The runtime-computed parameter 
@@ -891,7 +891,7 @@ $ postgres -D $PGDATA -C 
num_os_semaphores
 When using POSIX semaphores, the number of semaphores needed is the
 same as for System V, that is one semaphore per allowed connection
 (), allowed autovacuum worker process
-(), allowed WAL sender process
+(), allowed WAL sender process
 (), allowed background
 process (), etc.
 On the platforms where this option is preferred, there is no specific
diff --git a/src/backend/access/transam/xlog.c 
b/src/backend/access/transam/xlog.c
index f86f4b5c4b..64c1edb798 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5373,7 +5373,7 @@ CheckRequiredParameterValues(void)
 */
if (ArchiveRecoveryRequested && EnableHotStandby)
{
-   /* We ignore autovacuum_max_workers when we make this test. */
+   /* We ignore autovacuum_worker_slots when we make this test. */
RecoveryRequiresIntParameter("max_connections",
 
MaxConnections,
 
ControlFile->MaxConnections);
diff --git a/src/backend/postmaster/autovacuum.c 
b/src/backend/postmaster/autovacuum.c
index 4e4a0ccbef..937cd940aa 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -115,6 +115,7 @@
  * GUC parameters
  */
 bool   autovacuum_start_daemon = false;
+intautovacuum_worker_slots;
 intautovacuum_max_workers;
 intautovacuum_work_mem = -1;
 intautovacuum_naptime;
@@ -210,7 +211,7 @@ typedef struct autovac_table
 /*-
  * This struct holds information about a single worker's whereabouts.  We keep
  * an array of these in shared memory, sized according to
- * autovacuum_max_workers.
+ * autovacuum_worker_slots.
  *
  * wi_linksentry into free list or running list
  * wi_dboidOID of t

Re: Fix overflow in pg_size_pretty

2024-07-27 Thread Joseph Koshakow
On Sat, Jul 27, 2024 at 3:18 PM Joseph Koshakow  wrote:
>
> `SELECT -9223372036854775808::bigint` results in an out of range error,
> even though `-9223372036854775808` can fit in a `bigint` and
> `SELECT pg_typeof(-9223372036854775808)` returns `bigint`. That's why
> the `::bigint` cast is omitted from my test.

Turns out it was just an order of operations issue. Fix is attached.

Thanks,
Joseph Koshakow
From 1224087ab4e13a107b51ac17c77e83dc7db37ef9 Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sat, 27 Jul 2024 15:06:09 -0400
Subject: [PATCH] Fix overflow in pg_size_pretty

This commit removes an overflow from pg_size_pretty that causes
PG_INT64_MIN to be displayed with the bytes unit instead of the PB
unit.
---
 src/backend/utils/adt/dbsize.c   |  3 ++-
 src/include/common/int.h | 13 +
 src/test/regress/expected/dbsize.out |  6 ++
 src/test/regress/sql/dbsize.sql  |  2 ++
 4 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 25d7110c13..5648f40101 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -21,6 +21,7 @@
 #include "catalog/pg_tablespace.h"
 #include "commands/dbcommands.h"
 #include "commands/tablespace.h"
+#include "common/int.h"
 #include "miscadmin.h"
 #include "storage/fd.h"
 #include "utils/acl.h"
@@ -577,7 +578,7 @@ pg_size_pretty(PG_FUNCTION_ARGS)
 		uint8		bits;
 
 		/* use this unit if there are no more units or we're below the limit */
-		if (unit[1].name == NULL || i64abs(size) < unit->limit)
+		if (unit[1].name == NULL || pg_abs_s64(size) < unit->limit)
 		{
 			if (unit->round)
 size = half_rounded(size);
diff --git a/src/include/common/int.h b/src/include/common/int.h
index 7fc046e78a..d86eb6dd5e 100644
--- a/src/include/common/int.h
+++ b/src/include/common/int.h
@@ -258,6 +258,19 @@ pg_mul_s64_overflow(int64 a, int64 b, int64 *result)
 #endif
 }
 
+static inline uint64
+pg_abs_s64(int64 a)
+{
+	if (unlikely(a == PG_INT64_MIN))
+	{
+		return ((uint64) PG_INT64_MAX) + 1;
+	}
+	else
+	{
+		return (uint64) i64abs(a);
+	}
+}
+
 /*
  * Overflow routines for unsigned integers
  *
diff --git a/src/test/regress/expected/dbsize.out b/src/test/regress/expected/dbsize.out
index f1121a87aa..eac878c3ec 100644
--- a/src/test/regress/expected/dbsize.out
+++ b/src/test/regress/expected/dbsize.out
@@ -12,6 +12,12 @@ SELECT size, pg_size_pretty(size), pg_size_pretty(-1 * size) FROM
  1000 | 909 TB | -909 TB
 (6 rows)
 
+SELECT pg_size_pretty((-9223372036854775808)::bigint);
+ pg_size_pretty 
+
+ -8192 PB
+(1 row)
+
 SELECT size, pg_size_pretty(size), pg_size_pretty(-1 * size) FROM
 (VALUES (10::numeric), (1000::numeric), (100::numeric),
 (10::numeric), (1::numeric),
diff --git a/src/test/regress/sql/dbsize.sql b/src/test/regress/sql/dbsize.sql
index b34cf33385..e1ad202016 100644
--- a/src/test/regress/sql/dbsize.sql
+++ b/src/test/regress/sql/dbsize.sql
@@ -3,6 +3,8 @@ SELECT size, pg_size_pretty(size), pg_size_pretty(-1 * size) FROM
 (10::bigint), (1::bigint),
 (1000::bigint)) x(size);
 
+SELECT pg_size_pretty((-9223372036854775808)::bigint);
+
 SELECT size, pg_size_pretty(size), pg_size_pretty(-1 * size) FROM
 (VALUES (10::numeric), (1000::numeric), (100::numeric),
 (10::numeric), (1::numeric),
-- 
2.34.1



Re: Speed up collation cache

2024-07-27 Thread Andreas Karlsson

On 7/26/24 11:00 PM, Jeff Davis wrote:

Results (in ms):

           "C"    "libc_c"   overhead
master:    6350   7855       24%
v4-0001:   6091   6324        4%


I got more overhead in my quick benchmarking when I ran the same
benchmark. I also tried your idea of caching the last lookup (PoC patch
attached) and it basically removed all overhead, but I guess it will not
help if you have two different non-default locales in the same query.


             "C"    "libc_c"   overhead
before:      6695   8376       25%
after:       6605   7340       11%
cache last:  6618   6677        1%

But even without that extra optimization I think this patch is worth
merging: it is small, simple, clean, easy to understand, and just a
clear speedup. Feels like a no-brainer. I think that it is ready for
committer.


And then we can discuss after committing if an additional cache of the 
last locale is worth it or not.


Andreas
From 5f634670569a3ef8249ff1747af2157b6939f505 Mon Sep 17 00:00:00 2001
From: Andreas Karlsson 
Date: Sun, 28 Jul 2024 00:04:43 +0200
Subject: [PATCH] WIP: Ugly caching of last locale

---
 src/backend/utils/adt/pg_locale.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 4628fcd8dd..e0de7aa625 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -1566,6 +1566,9 @@ init_database_collation(void)
 	ReleaseSysCache(tup);
 }
 
+static Oid last_collid = InvalidOid;
+static pg_locale_t last_locale = NULL;
+
 /*
  * Create a locale_t from a collation OID.  Results are cached for the
  * lifetime of the backend.  Thus, do not free the result with freelocale().
@@ -1587,6 +1590,9 @@ pg_newlocale_from_collation(Oid collid)
 	if (collid == DEFAULT_COLLATION_OID)
 		return &default_locale;
 
+	if (collid == last_collid)
+		return last_locale;
+
 	cache_entry = lookup_collation_cache(collid);
 
 	if (cache_entry->locale == 0)
@@ -1712,6 +1718,9 @@ pg_newlocale_from_collation(Oid collid)
 		cache_entry->locale = resultp;
 	}
 
+	last_collid = collid;
+	last_locale = cache_entry->locale;
+
 	return cache_entry->locale;
 }
 
-- 
2.43.0



Re: Fix overflow in pg_size_pretty

2024-07-27 Thread David Rowley
On Sun, 28 Jul 2024 at 07:18, Joseph Koshakow  wrote:
> Attached is a patch that resolves an overflow in pg_size_pretty() that
> resulted in unexpected behavior when PG_INT64_MIN was passed in as an
> argument.

Could we just fix this more simply by assigning the absolute value of
the signed variable into an unsigned type?  It's a bit less code and
gets rid of the explicit test for PG_INT64_MIN.

David


pg_size_pretty_bigint_fix.patch
Description: Binary data


Re: why is pg_upgrade's regression run so slow?

2024-07-27 Thread Andrew Dunstan



On 2024-07-27 Sa 10:20 AM, Tom Lane wrote:

Andrew Dunstan  writes:

As part of its 002_pgupgrade.pl, the pg_upgrade tests do a rerun of the
normal regression tests. That in itself is sad, but what concerns me
here is why it's so much slower than the regular run? This is apparent
everywhere (e.g. on crake the standard run takes about 30 to 90 s, but
pg_upgrade's run takes 5 minutes or more).

Just to add some more fuel to the fire: I do *not* observe such an
effect on my own animals.  For instance, sifaka's last run shows
00:09 for install-check-C and the same (within rounding error)
for the regression test step in 002_pgupgrade; on the much slower
mamba, the last run took 07:33 for install-check-C and 07:40 for the
002_pgupgrade regression test step.

I'm still using the makefile-based build, and I'm not using
debug_parallel_query, but it doesn't make a lot of sense to me
that either of those things should affect this.

Looking at crake's last run, there are other oddities: why does
the "check" step take 00:24 while "install-check-en_US.utf8" (which
should be doing strictly less work) takes 01:00?  Again, there are
not similar discrepancies on my animals.  Are you sure there's not
background load on the machine?





Quite sure. Running crake and koel is all it does. It's Fedora 40 running 
on bare metal, a Lenovo Y70 with an Intel Core i7-4720HQ CPU @ 2.60GHz 
and a Samsung SSD.


The culprit appears to be meson. When I tested running crake with 
"using_meson => 0" I got results in line with yours. The regression test 
times were consistent, including the installcheck tests.  That's 
especially ugly on Windows as we don't have any alternative way of 
running, and the results are so much more catastrophic. 
"debug_parallel_query" didn't seem to matter.



cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: why is pg_upgrade's regression run so slow?

2024-07-27 Thread Tom Lane
Andrew Dunstan  writes:
> On 2024-07-27 Sa 10:20 AM, Tom Lane wrote:
>> Just to add some more fuel to the fire: I do *not* observe such an
>> effect on my own animals.

> The culprit appears to be meson. When I tested running crake with 
> "using_meson => 0" I got results in line with yours.

Interesting.  Maybe meson is over-aggressively trying to run these
test suites in parallel?

regards, tom lane




Re: Protocol question regarding Portal vs Cursor

2024-07-27 Thread Tatsuo Ishii
> Yes, sorry, I should have said one cannot create a WITH HOLD portal using
> the BIND command

Ok.

It would be possible to add a new parameter to the BIND command to
create such a portal. But it needs some changes to the existing
protocol definition and requires a protocol version bump, which is a
major pain.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Re: Fix overflow in pg_size_pretty

2024-07-27 Thread Joseph Koshakow
On Sat, Jul 27, 2024 at 6:28 PM David Rowley  wrote:
>
> On Sun, 28 Jul 2024 at 07:18, Joseph Koshakow  wrote:
>> Attached is a patch that resolves an overflow in pg_size_pretty() that
>> resulted in unexpected behavior when PG_INT64_MIN was passed in as an
>> argument.
>
> Could we just fix this more simply by assigning the absolute value of
> the signed variable into an unsigned type?

I might be misunderstanding, but my previous patch does assign the
absolute value of the signed variable into an unsigned type.

> It's a bit less code and
> gets rid of the explicit test for PG_INT64_MIN.

> + uint64 usize = size < 0 ? (uint64) (-size) : (uint64) size;

I think that the explicit test for PG_INT64_MIN is still required. If
`size` is equal to PG_INT64_MIN then `-size` will overflow. You end up
with the correct behavior if `size` wraps around, but that's only
guaranteed when compiling with the `-fwrapv` flag.

Thanks,
Joseph Koshakow


Re: Fix overflow in pg_size_pretty

2024-07-27 Thread David Rowley
On Sun, 28 Jul 2024 at 11:06, Joseph Koshakow  wrote:
> > + uint64 usize = size < 0 ? (uint64) (-size) : (uint64) size;
>
> I think that the explicit test for PG_INT64_MIN is still required. If
> `size` is equal to PG_INT64_MIN then `-size` will overflow. You end up
> with the correct behavior if `size` wraps around, but that's only
> guaranteed on platforms that support the `-fwrapv` flag.

What if we spelt it out the same way as pg_lltoa() does?

i.e: uint64 usize = size < 0 ? 0 - (uint64) size : (uint64) size;

David




Re: Fix overflow in pg_size_pretty

2024-07-27 Thread Joseph Koshakow
On Sat, Jul 27, 2024 at 8:00 PM David Rowley  wrote:
>
> On Sun, 28 Jul 2024 at 11:06, Joseph Koshakow  wrote:
>> > + uint64 usize = size < 0 ? (uint64) (-size) : (uint64) size;
>>
>> I think that the explicit test for PG_INT64_MIN is still required. If
>> `size` is equal to PG_INT64_MIN then `-size` will overflow. You end up
>> with the correct behavior if `size` wraps around, but that's only
>> guaranteed on platforms that support the `-fwrapv` flag.
>
> What if we spelt it out the same way as pg_lltoa() does?
>
> i.e: uint64 usize = size < 0 ? 0 - (uint64) size : (uint64) size;

My understanding of pg_lltoa() is that it produces an underflow and
relies on wrapping around from 0 to PG_UINT64_MAX. In fact the following
SQL, which relies on pg_lltoa() under the hood, panics with `-ftrapv`
enabled (which panics on underflows and overflows):

SELECT int8out(-9223372036854775808);

So we should actually probably modify pg_lltoa() to use pg_abs_s64()
too.

Thanks,
Joe Koshakow


Re: why is pg_upgrade's regression run so slow?

2024-07-27 Thread Thomas Munro
On Sun, Jul 28, 2024 at 10:48 AM Tom Lane  wrote:
> Interesting.  Maybe meson is over-aggressively trying to run these
> test suites in parallel?

Hypothesis: NTFS might not be as good at linking/unlinking lots of
files concurrently due to forced synchronous I/O, causing queuing?

That's what [1] was about for FreeBSD CI.  I couldn't immediately see
how to make a RAM disks on Windows but at the time I had a hunch that
that OS was struggling in the same way.

Could some tuning help?  Disable 8dot3name (a thing that creates a
ye-olde-MSDOS-compatible second directory entry for every file),
adjust disablelastaccess (something like noatime), disable USN journal
(a kind of secondary journal of all file system operations that is
used to drive the change notification API that we don't care about),
disable write cache flush so that any synchronous I/O operations don't
wait for that (at the risk of corruption on power loss, but maybe it's
OK on a device dedicated to temporary workspace)?  This is just from
some quick googling, but perhaps someone who actually knows how to
drive Windows locally and use the performance monitoring tools could
tell us what it's actually waiting on...

I noticed there is a new thing called Dev Drive[2] on Windows 11,
which claims to be faster for developer workloads and there are
graphs[3] showing various projects' test suites going faster.  It's
ReFS, a COW file system.  From some quick googling, the CopyFile()
system does a fast clone, and that should affect the robocopy command
in Cluster.pm (note: Unixoid cp in there also uses COW cloning on at
least xfs, zfs, probably apfs too).  So I would be interested to know
if that goes faster ... or slower.  I'm also interested in how it
reacts to the POSIX-semantics mode[4]; that might affect whether we
can ever pull the trigger on that idea.

[1] 
https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BFXLcEg1dyTqJjDiNQ8pGom4KrJj4wF38C90thti9dVA%40mail.gmail.com
[2] https://learn.microsoft.com/en-us/windows/dev-drive/
[3] https://devblogs.microsoft.com/visualstudio/devdrive/
[4] 
https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BajSQ_8eu2AogTncOnZ5me2D-Cn66iN_-wZnRjLN%2Bicg%40mail.gmail.com




Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2024-07-27 Thread John Naylor
On Saturday, July 27, 2024, Melanie Plageman 
wrote:
>
>
> Yes, the only thing that is important is having two rounds of index
> vacuuming and having one tuple with a value matching my cursor
> condition before the first index vacuum and one after. What do you
> mean update only the last few tuples though?
>

I meant we could update tuples with the highest offsets on each page. That
would then lead to longer arrays of bitmaps to store offsets during vacuum.
Lowering the minimum memory setting is easier to code and reason about,
however.


Re: Fix overflow in pg_size_pretty

2024-07-27 Thread David Rowley
On Sun, 28 Jul 2024 at 13:10, Joseph Koshakow  wrote:
>
> On Sat, Jul 27, 2024 at 8:00 PM David Rowley  wrote:
> > What if we spelt it out the same way as pg_lltoa() does?
> >
> > i.e: uint64 usize = size < 0 ? 0 - (uint64) size : (uint64) size;
>
> My understanding of pg_lltoa() is that it produces an underflow and
> relies on wrapping around from 0 to PG_UINT64_MAX. In fact the following
> SQL, which relies on pg_lltoa() under the hood, panics with `-ftrapv`
> enabled (which panics on underflows and overflows):
>
> SELECT int8out(-9223372036854775808);

I didn't test to see where that's coming from, but I did test the two
attached .c files.  int.c uses the 0 - (unsigned int) var method and
int2.c uses (unsigned int) (-var).  Using clang and -ftrapv, I get:

$ clang int.c -o int -O2 -ftrapv
$ ./int
2147483648
$ clang int2.c -o int2 -O2 -ftrapv
$ ./int2
Illegal instruction

Similar with gcc:
$ gcc int.c -o int -O2 -ftrapv
$ ./int
2147483648
$ gcc int2.c -o int2 -O2 -ftrapv
$ ./int2
Aborted

I suspect your trap must be coming from somewhere else. It looks to me
like the "uint64 usize = size < 0 ? 0 - (uint64) size : (uint64)
size;" will be fine.

David
#include 
#include 

int main(void)
{
volatile int v = INT_MIN;
unsigned int u = v < 0 ? 0 - (unsigned int) v : (unsigned int) v;

printf("%u\n", u);
return 0;
}
#include 
#include 

int main(void)
{
volatile int v = INT_MIN;
unsigned int u = v < 0 ? (unsigned int) -v : (unsigned int) v;

printf("%u\n", u);
return 0;
}


Re: Fix overflow in pg_size_pretty

2024-07-27 Thread Joseph Koshakow
On Sat, Jul 27, 2024 at 11:42 PM David Rowley  wrote:
>
> I didn't test to see where that's coming from, but I did test the two
> attached .c files.  int.c uses the 0 - (unsigned int) var method and
> int2.c uses (unsigned int) (-var).  Using clang and -ftrapv, I get:
>
> $ clang int.c -o int -O2 -ftrapv
> $ ./int
> 2147483648
> $ clang int2.c -o int2 -O2 -ftrapv
> $ ./int2
> Illegal instruction
>
> Similar with gcc:
> $ gcc int.c -o int -O2 -ftrapv
> $ ./int
> 2147483648
> $ gcc int2.c -o int2 -O2 -ftrapv
> $ ./int2
> Aborted
>
> I suspect your trap must be coming from somewhere else. It looks to me
> like the "uint64 usize = size < 0 ? 0 - (uint64) size : (uint64)
> size;" will be fine.

My mistake, you're absolutely right. The trap is coming from
`pg_strtoint64_safe()`.

return -((int64) tmp);

Which I had already addressed in the other thread and completely forgot
about.

I did some more research and it looks like unsigned integer arithmetic
is guaranteed to wrap around, unlike signed integer arithmetic [0].
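
For what it's worth, here is a tiny standalone demonstration of that
guarantee (my own example, not part of the patch): unsigned arithmetic
is defined to wrap modulo 2^N, so computing 0 - 1 on a uint64 yields
PG_UINT64_MAX and does not trap even when built with -ftrapv.

```
#include <stdint.h>
#include <stdio.h>

int
main(void)
{
    volatile uint64_t zero = 0;

    /* well-defined: unsigned arithmetic wraps modulo 2^64 (C99 6.2.5p9) */
    printf("%llu\n", (unsigned long long) (zero - 1)); /* 18446744073709551615 */
    return 0;
}
```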

Attached is an updated patch with your approach. I removed the 0 from
the negative case because I think it was unnecessary, but happy to add
it back in if I missed something.

Thanks for the review!

Thanks,
Joseph Koshakow

[0]
https://www.gnu.org/software/autoconf/manual/autoconf-2.63/html_node/Integer-Overflow-Basics.html
From 1811b94ba4c08a0de972e8ded4892cf294e9f687 Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sat, 27 Jul 2024 15:06:09 -0400
Subject: [PATCH] Fix overflow in pg_size_pretty

This commit removes an overflow from pg_size_pretty that causes
PG_INT64_MIN to be displayed with the bytes unit instead of the PB
unit.
---
 src/backend/utils/adt/dbsize.c   | 3 ++-
 src/test/regress/expected/dbsize.out | 6 ++
 src/test/regress/sql/dbsize.sql  | 2 ++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 25d7110c13..8d58fc24ce 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -575,9 +575,10 @@ pg_size_pretty(PG_FUNCTION_ARGS)
 	for (unit = size_pretty_units; unit->name != NULL; unit++)
 	{
 		uint8		bits;
+		uint64		usize = size < 0 ? -(uint64) size : (uint64) size;
 
 		/* use this unit if there are no more units or we're below the limit */
-		if (unit[1].name == NULL || i64abs(size) < unit->limit)
+		if (unit[1].name == NULL || usize < unit->limit)
 		{
 			if (unit->round)
 size = half_rounded(size);
diff --git a/src/test/regress/expected/dbsize.out b/src/test/regress/expected/dbsize.out
index f1121a87aa..eac878c3ec 100644
--- a/src/test/regress/expected/dbsize.out
+++ b/src/test/regress/expected/dbsize.out
@@ -12,6 +12,12 @@ SELECT size, pg_size_pretty(size), pg_size_pretty(-1 * size) FROM
  1000 | 909 TB | -909 TB
 (6 rows)
 
+SELECT pg_size_pretty((-9223372036854775808)::bigint);
+ pg_size_pretty 
+
+ -8192 PB
+(1 row)
+
 SELECT size, pg_size_pretty(size), pg_size_pretty(-1 * size) FROM
 (VALUES (10::numeric), (1000::numeric), (100::numeric),
 (10::numeric), (1::numeric),
diff --git a/src/test/regress/sql/dbsize.sql b/src/test/regress/sql/dbsize.sql
index b34cf33385..e1ad202016 100644
--- a/src/test/regress/sql/dbsize.sql
+++ b/src/test/regress/sql/dbsize.sql
@@ -3,6 +3,8 @@ SELECT size, pg_size_pretty(size), pg_size_pretty(-1 * size) FROM
 (10::bigint), (1::bigint),
 (1000::bigint)) x(size);
 
+SELECT pg_size_pretty((-9223372036854775808)::bigint);
+
 SELECT size, pg_size_pretty(size), pg_size_pretty(-1 * size) FROM
 (VALUES (10::numeric), (1000::numeric), (100::numeric),
 (10::numeric), (1::numeric),
-- 
2.34.1