date:20220828

Re: Handle infinite recursion in logical replication setup

2022-08-28 Thread Peter Smith

Here are some review comments for patch v42-0002:

==

1. doc/src/sgml/logical-replication.sgml

copy_data = true 

There are a couple of these tags where there is a trailing space
before the . I guess it is doing no harm, but it is doing no
good either, so IMO better to get rid of the space.

~~

2. doc/src/sgml/logical-replication.sgml - 31.3.1. Setting replication
between two primaries

User need to ensure that no data changes happen on table t1 on
primary1 and primary2 until the setup is completed.

SUGGESTION
The user must ensure that no data changes happen on table t1 of
primary1 and primary2 until the setup is completed.

~~~

3. doc/src/sgml/logical-replication.sgml - 31.3.2. Adding a new
primary when there is no table data on any of the primaries

User need to ensure that no data changes happen on table t1 on all the
primaries primary1, primary2 and primary3 until the setup is
completed.

SUGGESTION
The user must ensure that no data changes happen on table t1 of all
the primaries primary1, primary2 and primary3 until the setup is
completed.

~~~

4. doc/src/sgml/logical-replication.sgml - 31.3.3. Adding a new
primary when table data is present on the existing primaries

User need to ensure that no operations should be performed on table t1
on primary2 and primary3 until the setup is completed.

SUGGESTION
The user must ensure that no operations are performed on table t1 of
primary2 and primary3 until the setup is completed.

Here also you changed the wording - "no data changes happen" versus
"no operations should be performed". Was that a deliberate difference?
Maybe they should all be consistent wording? If they are
interchangeable IMO the "no operations are performed..." wording was
the better of the two.

~~~

5. doc/src/sgml/logical-replication.sgml - 31.3.4. Generic steps for
adding a new primary to an existing set of primaries

Step-2: User need to ensure that no changes happen on the required
tables of the new primary until the setup is complete.

SUGGESTION
Step-2: The user must ensure that no changes happen on the required
tables of the new primary until the setup is complete.

~~~

6.

Step-4. User need to ensure that no changes happen on the required
tables of the existing primaries except the first primary until the
setup is complete.

SUGGESTION
Step-4. The user must ensure that no changes happen on the required
tables of the existing primaries except the first primary until the
setup is complete.

~~~

7. (the example)

7a.
Step-2. User need to ensure that no changes happen on table t1 on
primary4 until the setup is completed.

SUGGESTION
Step-2. The user must ensure that no changes happen on table t1 of
primary4 until the setup is completed.

~

7b.
Step-4. User need to ensure that no changes happen on table t1 on
primary2 and primary3 until the setup is completed.

SUGGESTION
Step-4. The user must ensure that no changes happen on table t1 of
primary2 and primary3 until the setup is completed.

--
Kind Regards,
Peter Smith.
Fujitsu Australia

Re: Making Vars outer-join aware

2022-08-28 Thread Richard Guo

On Fri, Aug 19, 2022 at 2:45 AM Tom Lane  wrote:

> Here's a rebase up to HEAD, mainly to get the cfbot back in sync
> as to what's the live patch.

Noticed another different behavior from previous. When we try to reduce
JOIN_LEFT to JOIN_ANTI, we want to know if the join's own quals are
strict for any Var that was forced null by higher qual levels. We do
that by checking whether local_nonnullable_vars and forced_null_vars
overlap. However, the same Var from local_nonnullable_vars and
forced_null_vars may be labeled with different varnullingrels. If that
is the case, currently we would fail to tell they actually overlap. As
an example, consider 'b.i' in the query below

# explain (costs off) select * from a left join b on a.i = b.i where b.i is
null;
QUERY PLAN
---
 Hash Left Join
   Hash Cond: (a.i = b.i)
   Filter: (b.i IS NULL)
   ->  Seq Scan on a
   ->  Hash
 ->  Seq Scan on b
(6 rows)

Thanks
Richard

Re: Handle infinite recursion in logical replication setup

2022-08-28 Thread Peter Smith

Here are some review comments for patch v42-0001:

==

1. Commit message

A later review comment below suggests some changes to the WARNING
message so if those changes are made then the example in this commit
message also needs to be modified.

==

2. doc/src/sgml/ref/alter_subscription.sgml - copy_data

Refer to the Notes for the usage of true for copy_data parameter and
its interaction with the origin parameter.

I think saying "usage of true" sounded a bit strange.

SUGGESTION
Refer to the Notes about how copy_data = true can interact with the
origin parameter.

==

3. doc/src/sgml/ref/create_subscription.sgml - copy data

Refer to the Notes for the usage of true for copy_data parameter and
its interaction with the origin parameter.

SUGGESTION
(same as #2)

~~

4. doc/src/sgml/ref/create_subscription.sgml - origin

Refer to the Notes for the usage of true for copy_data parameter and
its interaction with the origin parameter.

SUGGESTION
(same as #2)

~~~

5. doc/src/sgml/ref/create_subscription.sgml - Notes

+  
+   If the subscription is created with origin = none and
+   copy_data = true, it will check if the publisher has
+   subscribed to the same table from other publishers and, if so, throw a
+   warning to notify user to check the publisher tables. The user can ensure

"throw a warning to notify user" -> "log a warning to notify the user"

~~

6.

+   warning to notify user to check the publisher tables. The user can ensure
+   that publisher tables do not have data which has an origin associated before
+   continuing with any other operations to prevent inconsistent data being
+   replicated.
+  

6a.

I'm not so sure about this. IMO the warning is not about the
replication – really it is about the COPY which has already happened
anyway. So the user can't really prevent anything from going wrong;
instead, they have to take some action to clean up if anything did go
wrong.

SUGGESTION
Before continuing with other operations the user should check that
publisher tables did not have data with different origins, otherwise
inconsistent data may have been copied.

~

6b.

I am also wondering what can the user do now. Assuming there was bad
COPY then the subscriber table has already got unwanted stuff copied
into it. Is there any advice we can give to help users fix this mess?

==

7. src/backend/commands/subscriptioncmds.c - AlterSubscription_refresh

+ /* Check whether we can allow copy of newly added relations. */
+ check_pub_table_subscribed(wrconn, sub->publications, copy_data,
+sub->origin, subrel_local_oids,
+subrel_count);

"whether we can allow" seems not quite the right wording here anymore,
because now there is no ERROR stopping this - so if there was unwanted
data the COPY will proceed to copy it anyhow...

~~~

8. src/backend/commands/subscriptioncmds.c - check_pub_table_subscribed

@@ -1781,6 +1794,121 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
  table_close(rel, RowExclusiveLock);
 }

+/*
+ * Check and throw a warning if the publisher has subscribed to the same table
+ * from some other publisher. This check is required only if "copy_data = true"
+ * and "origin = none" for CREATE SUBSCRIPTION and
+ * ALTER SUBSCRIPTION ... REFRESH statements to notify user that data having
+ * origin might have been copied.

"throw a warning" -> "log a warning"

"to notify user" -> "to notify the user" ?

~~~

9.

+ if (copydata == false || !origin ||
+ (pg_strcasecmp(origin, LOGICALREP_ORIGIN_NONE) != 0))
+ return;

SUGGESTION
if (!copydata || !origin ||
(pg_strcasecmp(origin, LOGICALREP_ORIGIN_NONE) != 0))

~~~

10. (Question)

I can't tell just by reading the code if FOR ALL TABLES case is
handled – e.g. will this recognise the case were the publisher might
have table data from other origins because it has a subscription on
some other node that was publishing "FOR ALL TABLES"?

~~~

11.

+ ereport(WARNING,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("publisher has subscribed table \"%s.%s\" from some other publisher",
+nspname, relname),
+ errdetail("Publisher might have subscribed one or more tables from
some other publisher."),
+ errhint("Verify that these publisher tables do not have data that
has an origin associated before proceeding to avoid inconsistency."));
+ break;

11a.
I'm not sure having that "break" code is correct logic. Won't that
mean the user will only get a warning for the first potential problem
encountered but then other potential problems tables will have no
warnings at all so the user may not be made aware of them? Perhaps you
need to first gather the list of all the suspicious tables before
logging a single warning which shows that list? Or perhaps you need to
log multiple warnings – one warning per suspicious table?

~

11b.
The WARNING message seems a bit inside-out because the errmsg is
giving more details than the errdetail.

SUGGESTIONS (or something similar)
errmsg - "subscription XXX requested origin=NONE but m

Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

2022-08-28 Thread Amit Kapila

On Sat, Aug 27, 2022 at 7:06 PM Masahiko Sawada  wrote:
>
> On Sat, Aug 27, 2022 at 7:24 PM Amit Kapila  wrote:
> >
> > On Sat, Aug 27, 2022 at 1:06 PM Masahiko Sawada  
> > wrote:
> > >
> > > > >
> > > > > I think then we should change this code in the master branch patch
> > > > > with an additional comment on the lines of: "Either all the xacts got
> > > > > purged or none. It is only possible to partially remove the xids from
> > > > > this array if one or more of the xids are still running but not all.
> > > > > That can happen if we start decoding from a point (LSN where the
> > > > > snapshot state became consistent) where all the xacts in this were
> > > > > running and then at least one of those got committed and a few are
> > > > > still running. We will never start from such a point because we won't
> > > > > move the slot's restart_lsn past the point where the oldest running
> > > > > transaction's restart_decoding_lsn is."
> > > > >
> > > >
> > > > Unfortunately, this theory doesn't turn out to be true. While
> > > > investigating the latest buildfarm failure [1], I see that it is
> > > > possible that only part of the xacts in the restored catalog modifying
> > > > xacts list needs to be purged. See the attached where I have
> > > > demonstrated it via a reproducible test. It seems the point we were
> > > > missing was that to start from a point where two or more catalog
> > > > modifying were serialized, it requires another open transaction before
> > > > both get committed, and then we need the checkpoint or other way to
> > > > force running_xacts record in-between the commit of initial two
> > > > catalog modifying xacts. There could possibly be other ways as well
> > > > but the theory above wasn't correct.
> > > >
> > >
> > > Thank you for the analysis and the patch. I have the same conclusion.
> > > Since we took this approach only on the master the back branches are
> > > not affected.
> > >
> > > The new test scenario makes sense to me and looks better than the one
> > > I have. Regarding the fix, I think we should use
> > > TransactionIdFollowsOrEquals() instead of
> > > NormalTransactionIdPrecedes():
> > >
> > >  +   for (off = 0; off < builder->catchange.xcnt; off++)
> > >  +   {
> > >  +   if (NormalTransactionIdPrecedes(builder->catchange.xip[off],
> > >  +   builder->xmin))
> > >  +   break;
> > >  +   }
> > >
> >
> > Right, fixed.
>
> Thank you for updating the patch! It looks good to me.
>

Pushed.

-- 
With Regards,
Amit Kapila.

Support tls-exporter as channel binding for TLSv1.3

2022-08-28 Thread Michael Paquier

Hi all,

RFC9266, that has been released not so long ago, has added
tls-exporter as a new channel binding type:
https://www.rfc-editor.org/rfc/rfc5929.html

An advantage over tls-server-end-point, AFAIK, is that this prevents
man-in-the-middle attacks even if the attacker holds the server's
private key, which was the kind of job that tls-unique does for
TLSv1.2, though we've decided at the end to drop it during the PG11
dev cycle because it does things poorly.

This patch provides an implementation, tests and documentation for the
so-said feature.  An environment variable called PGCHANNELBINDINGTYPE
is added, as well as new connection parameter called
channel_binding_type.  The key point of the implementation is
SSL_export_keying_material(), that is available down to 1.0.1 (oldest
version supported on HEAD), so this should not require a ./configure
check.

Perhaps the part about the new libpq parameter could be refactored as
of its own patch, with the addition of channel_binding_type in the
SCRAM status structures.  Note also that tls-exporter is aimed for
TLSv1.3 and newer protocols, but OpenSSL allows the thing to work with
older protocols (testable with ssl_max_protocol_version, for example),
and I don't see a need to prevent this scenario.  An extra thing is
that attempting to use tls-exporter with a backend <= 15 and a client
>= 16 causes a failure during the SASL exchange, where the backend
complains about tls-exporter being unsupported.

Jacob Champion should be considered as the primary author of the
patch, even if I have spent some time on this patch before sending it
here.  I am adding that to the next commit fest.

Thanks,
--
Michael
From bd086e2d8b3151ca52d2e2cd43b8f4a9f522 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Mon, 29 Aug 2022 14:52:41 +0900
Subject: [PATCH] tls-exporter as channel binding for SCRAM/SSL

---
 src/include/libpq/libpq-be.h |  6 +++
 src/backend/libpq/auth-scram.c   | 50 ++--
 src/backend/libpq/be-secure-openssl.c| 18 +
 src/interfaces/libpq/fe-auth-scram.c | 45 +
 src/interfaces/libpq/fe-connect.c| 27 +
 src/interfaces/libpq/fe-secure-openssl.c | 30 ++
 src/interfaces/libpq/libpq-int.h | 14 +++
 src/test/ssl/t/002_scram.pl  | 15 +++
 doc/src/sgml/libpq.sgml  | 28 +
 doc/src/sgml/protocol.sgml   |  3 +-
 10 files changed, 216 insertions(+), 20 deletions(-)

diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index 6d452ec6d9..78ff51d053 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -298,6 +298,12 @@ extern void be_tls_get_peer_subject_name(Port *port, char *ptr, size_t len);
 extern void be_tls_get_peer_issuer_name(Port *port, char *ptr, size_t len);
 extern void be_tls_get_peer_serial(Port *port, char *ptr, size_t len);
 
+extern unsigned char *be_tls_export_keying_material(Port *port,
+	const char *label,
+	const unsigned char *ctx,
+	size_t ctxlen,
+	size_t outlen);
+
 /*
  * Get the server certificate hash for SCRAM channel binding type
  * tls-server-end-point.
diff --git a/src/backend/libpq/auth-scram.c b/src/backend/libpq/auth-scram.c
index ee7f52218a..6029e33596 100644
--- a/src/backend/libpq/auth-scram.c
+++ b/src/backend/libpq/auth-scram.c
@@ -148,6 +148,7 @@ typedef struct
 
 	/* Fields of the first message from client */
 	char		cbind_flag;
+	char	   *channel_binding_type;
 	char	   *client_first_message_bare;
 	char	   *client_username;
 	char	   *client_nonce;
@@ -876,7 +877,6 @@ static void
 read_client_first_message(scram_state *state, const char *input)
 {
 	char	   *p = pstrdup(input);
-	char	   *channel_binding_type;
 
 
 	/*--
@@ -1009,17 +1009,17 @@ read_client_first_message(scram_state *state, const char *input)
 		 errmsg("malformed SCRAM message"),
 		 errdetail("The client selected SCRAM-SHA-256 without channel binding, but the SCRAM message includes channel binding data.")));
 
-			channel_binding_type = read_attr_value(&p, 'p');
+			state->channel_binding_type = read_attr_value(&p, 'p');
 
 			/*
-			 * The only channel binding type we support is
-			 * tls-server-end-point.
+			 * We support tls-server-end-point and tls-exporter.
 			 */
-			if (strcmp(channel_binding_type, "tls-server-end-point") != 0)
+			if (strcmp(state->channel_binding_type, "tls-server-end-point") != 0
+&& strcmp(state->channel_binding_type, "tls-exporter") != 0)
 ereport(ERROR,
 		(errcode(ERRCODE_PROTOCOL_VIOLATION),
 		 errmsg("unsupported SCRAM channel-binding type \"%s\"",
-sanitize_str(channel_binding_type;
+sanitize_str(state->channel_binding_type;
 			break;
 		default:
 			ereport(ERROR,
@@ -1286,18 +1286,46 @@ read_client_final_message(scram_state *state, const char *input)
 
 		Assert(state->cbind_flag == 'p');
 
-		/* Fetch

Re: use ARM intrinsics in pg_lfind32() where available

2022-08-28 Thread Nathan Bossart

On Mon, Aug 29, 2022 at 11:25:50AM +0700, John Naylor wrote:
> + uint32 nelem_per_vector = sizeof(Vector32) / sizeof(uint32);
> + uint32 nelem_per_iteration = 4 * nelem_per_vector;
> 
> Using local #defines would be my style. I don't have a reason to
> object to this way, but adding const makes these vars more clear.

I added const.

> Speaking of const:
> 
> - const __m128i tmp1 = _mm_or_si128(result1, result2);
> - const __m128i tmp2 = _mm_or_si128(result3, result4);
> - const __m128i result = _mm_or_si128(tmp1, tmp2);
> + tmp1 = vector32_or(result1, result2);
> + tmp2 = vector32_or(result3, result4);
> + result = vector32_or(tmp1, tmp2);
> 
> Any reason to throw away the const declarations?

The only reason is because I had to move the declarations to before the
vector32_load() calls.

> +static inline bool
> +vector32_is_highbit_set(const Vector32 v)
> +{
> +#ifdef USE_SSE2
> + return (_mm_movemask_epi8(v) & 0x) != 0;
> +#endif
> +}
> 
> I'm not sure why we need this function -- AFAICS it just adds more
> work on x86 for zero benefit. For our present application, can we just
> cast to Vector8 (for Arm's sake) and call the 8-bit version?

Good idea.

> - * operations using bitwise operations on unsigned integers.
> + * operations using bitwise operations on unsigned integers.  Note that many
> + * of the functions in this file presently do not have non-SIMD
> + * implementations.
> 
> It's unclear to the reader whether this is a matter of 'round-to-it's.
> I'd like to document what I asserted in this thread, that it's likely
> not worthwhile to do anything with a uint64 representing two 32-bit
> ints. (It *is* demonstrably worth it for handling 8 byte-values at a
> time)

Done.

>   * Use saturating subtraction to find bytes <= c, which will present as
> - * NUL bytes in 'sub'.
> + * NUL bytes.
> 
> I'd like to to point out that the reason to do it this way is to
> workaround SIMD architectures frequent lack of unsigned comparison.

Done.

> + * Return the result of subtracting the respective elements of the input
> + * vectors using saturation.
> 
> I wonder if we should explain briefly what saturating arithmetic is. I
> had never encountered it outside of a SIMD programming context.

Done.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 567c5309f3caa87c8cd7fe2de62309eea429d8c5 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Thu, 25 Aug 2022 22:18:30 -0700
Subject: [PATCH v6 1/2] abstract architecture-specific implementation details
 from pg_lfind32()

---
 src/include/port/pg_lfind.h | 55 ---
 src/include/port/simd.h | 88 +++--
 2 files changed, 104 insertions(+), 39 deletions(-)

diff --git a/src/include/port/pg_lfind.h b/src/include/port/pg_lfind.h
index a4e13dffec..1d9be4eb36 100644
--- a/src/include/port/pg_lfind.h
+++ b/src/include/port/pg_lfind.h
@@ -91,16 +91,19 @@ pg_lfind32(uint32 key, uint32 *base, uint32 nelem)
 {
 	uint32		i = 0;
 
-#ifdef USE_SSE2
+#ifndef USE_NO_SIMD
 
 	/*
-	 * A 16-byte register only has four 4-byte lanes. For better
-	 * instruction-level parallelism, each loop iteration operates on a block
-	 * of four registers. Testing has showed this is ~40% faster than using a
-	 * block of two registers.
+	 * For better instruction-level parallelism, each loop iteration operates
+	 * on a block of four registers.  Testing for SSE2 has showed this is ~40%
+	 * faster than using a block of two registers.
 	 */
-	const		__m128i keys = _mm_set1_epi32(key); /* load 4 copies of key */
-	uint32		iterations = nelem & ~0xF;	/* round down to multiple of 16 */
+	const Vector32 keys = vector32_broadcast(key);	/* load copies of key */
+	const uint32 nelem_per_vector = sizeof(Vector32) / sizeof(uint32);
+	const uint32 nelem_per_iteration = 4 * nelem_per_vector;
+
+	/* round down to multiple of elements per iteration */
+	const uint32 tail_idx = nelem & ~(nelem_per_iteration - 1);
 
 #if defined(USE_ASSERT_CHECKING)
 	bool		assert_result = false;
@@ -116,31 +119,33 @@ pg_lfind32(uint32 key, uint32 *base, uint32 nelem)
 	}
 #endif
 
-	for (i = 0; i < iterations; i += 16)
+	for (i = 0; i < tail_idx; i += nelem_per_iteration)
 	{
-		/* load the next block into 4 registers holding 4 values each */
-		const		__m128i vals1 = _mm_loadu_si128((__m128i *) & base[i]);
-		const		__m128i vals2 = _mm_loadu_si128((__m128i *) & base[i + 4]);
-		const		__m128i vals3 = _mm_loadu_si128((__m128i *) & base[i + 8]);
-		const		__m128i vals4 = _mm_loadu_si128((__m128i *) & base[i + 12]);
+		Vector32	vals1, vals2, vals3, vals4,
+	result1, result2, result3, result4,
+	tmp1, tmp2, result;
+
+		/* load the next block into 4 registers */
+		vector32_load(&vals1, &base[i]);
+		vector32_load(&vals2, &base[i + nelem_per_vector]);
+		vector32_load(&vals3, &base[i + nelem_per_vector * 2]);
+		vector32_load(&vals4, &base[i + nelem_per_vector * 3]);
 
 		/* compare each value to the key */
-		const		__m128i result1 = _mm_cmpeq_epi32(k

Re: Reducing the chunk header sizes on all memory context types

2022-08-28 Thread David Rowley

On Mon, 29 Aug 2022 at 10:39, David Rowley  wrote:
> One more try to make CFbot happy.

After a bit more revision, mostly updating outdated comments and
naming adjustments, I've pushed this.

Per the benchmark results I showed in [1], due to the performance of
having the AllocSet free list pointers stored at the end of the
allocated chunk being quite a bit slower than having them at the start
of the chunk, I adjusted the patch to have them at the start.

Time for me to go and watch the buildfarm results come in.

David

[1] 
https://postgr.es/m/caaphdvpuhcpwczkxzuqqgb8yjpnqsvnncbzz6pwphfr2qmm...@mail.gmail.com

Re: Insertion Sort Improvements

2022-08-28 Thread John Naylor

On Fri, Aug 26, 2022 at 9:06 PM Benjamin Coutu  wrote:
>
> Another idea could be to run a binary insertion sort and use a much higher 
> threshold. This could significantly cut down on comparisons (especially with 
> presorted runs, which are quite common in real workloads).

Comparisons that must go to the full tuple are expensive enough that
this idea might have merit in some cases, but that would be a research
project.

> If full binary search turned out to be an issue regarding cache locality, we 
> could do it in smaller chunks,

The main issue with binary search is poor branch prediction. Also, if
large chunks are bad for cache locality, isn't that a strike against
using a "much higher threshold"?

> With less comparisons we should start keeping track of swaps and use that as 
> an efficient way to determine presortedness. We could change the insertion 
> sort threshold to a swap threshold and do insertion sort until we hit the 
> swap threshold. I suspect that would make the current presorted check 
> obsolete (as binary insertion sort without or even with a few swaps should be 
> faster than the current presorted-check).

The thread you linked to discusses partial insertion sort as a
replacement for the pre-sorted check, along with benchmark results and
graphs IIRC. I think it's possibly worth doing, but needs more
investigation to make sure the (few) regressions I saw either: 1. were
just noise or 2. can be ameliorated. As I said in the dual pivot
thread, this would be great for dual pivot since we could reuse
partial insertion sort for choosing the pivots, reducing binary space.

-- 
John Naylor
EDB: http://www.enterprisedb.com

Re: Removing unneeded self joins

2022-08-28 Thread Andrey Lepikhov


On 8/29/22 04:39, Zhihong Yu wrote:



On Fri, Aug 26, 2022 at 3:02 PM Zhihong Yu > wrote:


Hi,
For v36-0001-Remove-self-joins.patch :

bq removes inner join of plane table to itself

plane table -> plain table

For relation_has_unique_index_ext(), it seems when extra_clauses
is NULL, there is no need to compute `exprs`.

Cheers

Done



For remove_self_joins_recurse():

+                   if (bms_num_members(relids) > join_collapse_limit)
+                       break;

The above line just comes out of the switch statement. This check should 
be done again between foreach and switch.

Otherwise the above check wouldn't achieve what you want.

Cheers

Thanks for highlighting the problem.
I guess, usage either of join_collapse_limit or from_collapse_limit 
isn't practical here.
That we really afraid here - many senseless search cycles of self-joins. 
And it may have higher limit than GUCs above. So I introduced a guc, 
called "self_join_search_limit" (so far undocumented) that is an 
explicit limit for a set of plain relations in FROM-list to search 
self-joins.


--
Regards
Andrey Lepikhov
Postgres ProfessionalFrom 6283d6e21214e34d3c1a6351fa9f6ac1aeb75ce8 Mon Sep 17 00:00:00 2001
From: "Andrey V. Lepikhov" 
Date: Fri, 26 Aug 2022 15:17:53 +0300
Subject: [PATCH] Remove self-joins.

A Self Join Removal (SJR) feature removes inner join of plain table to itself
in a query plan if can be proved that the join can be replaced with a scan.
The proof based on innerrel_is_unique machinery.

We can remove a self-join when for each outer row:
1. At most one inner row matches the join clauses.
2. If the join target list contains any inner vars, an inner row
must be (physically) the same row as the outer one.

In this patch we use Rowley's [1] approach to identify a self-join:
1. Collect all mergejoinable join quals which look like a.x = b.x
2. Check innerrel_is_unique() for the qual list from (1). If it
returns true, then outer row matches only the same row from the inner
relation.
3. If uniqueness of outer relation deduces from baserestrictinfo clause like
'x=const' on unique field(s), check that inner has the same clause with the
same constant(s).
4. Compare RowMarks of inner and outer relations. They must have the same
strength.

Some regression tests changed due to self-join removal logic.

[1] https://www.postgresql.org/message-id/raw/CAApHDvpggnFMC4yP-jUO7PKN%3DfXeErW5bOxisvJ0HvkHQEY%3DWw%40mail.gmail.com
---
 doc/src/sgml/config.sgml  |   16 +
 src/backend/optimizer/path/indxpath.c |   38 +
 src/backend/optimizer/plan/analyzejoins.c | 1046 -
 src/backend/optimizer/plan/planmain.c |5 +
 src/backend/utils/misc/guc.c  |   22 +
 src/include/optimizer/paths.h |3 +
 src/include/optimizer/planmain.h  |7 +
 src/test/regress/expected/equivclass.out  |   32 +
 src/test/regress/expected/join.out|  774 +++
 src/test/regress/expected/sysviews.out|3 +-
 src/test/regress/sql/equivclass.sql   |   16 +
 src/test/regress/sql/join.sql |  340 +++
 src/tools/pgindent/typedefs.list  |2 +
 13 files changed, 2278 insertions(+), 26 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a5cd4e44c7..2cfb62f97f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5297,6 +5297,22 @@ ANY num_sync ( 
+  enable_self_join_removal (boolean)
+  
+   enable_self_join_removal configuration parameter
+  
+  
+  
+   
+Enables or disables the query planner's optimization which analyses
+query tree and replaces self joins with semantically equivalent single
+scans. Take into consideration only plain tables.
+The default is on.
+   
+  
+ 
+
  
   enable_seqscan (boolean)
   
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 045ff2e487..c41e7256be 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -3495,6 +3495,21 @@ bool
 relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
 			  List *restrictlist,
 			  List *exprlist, List *oprlist)
+{
+	return relation_has_unique_index_ext(root, rel, restrictlist,
+		 exprlist, oprlist, NULL);
+}
+
+/*
+ * relation_has_unique_index_ext
+ * if extra_clauses isn't NULL, return baserestrictinfo clauses which were
+ * used to derive uniqueness.
+ */
+bool
+relation_has_unique_index_ext(PlannerInfo *root, RelOptInfo *rel,
+			  List *restrictlist,
+			  List *exprlist, List *oprlist,
+			  List **extra_clauses)
 {
 	ListCell   *ic;
 
@@ -3550,6 +3565,7 @@ relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
 	{
 		IndexOptInfo *ind = (IndexOptInfo *) lfirst(ic);
 		int			c;
+		List	   *exprs = NIL;
 
 		/*
 		 * If the index is not unique, o

Re: privileges for ALTER ROLE/DATABASE SET

2022-08-28 Thread Noah Misch

On Fri, Jul 22, 2022 at 03:25:16PM -0700, Nathan Bossart wrote:
> On Fri, Jul 22, 2022 at 04:16:14PM -0400, Tom Lane wrote:
> > Clearly, you need enough privilege to SET the parameter, and you need
> > some sort of management privilege on the target role or DB.  There
> > might be room to discuss what that per-role/DB privilege needs to be.
> > But I'm very skeptical that we need to manage this at the level
> > of the cross product of GUCs and roles/DBs, which is what you seem
> > to be proposing.  That seems awfully unwieldy, and is there really
> > any use-case for it?
> 
> Actually, I think my vote is to do nothing, except for perhaps updating the
> documentation to indicate that SET privileges on a parameter are sufficient
> for ALTER ROLE/DATABASE SET (given you have the other required privileges
> for altering the role/database).  I can't think of a use-case for allowing
> a role to SET a GUC but not change the session default for another role.

If I wanted to argue for a use case, I'd point to ALTER ROLE/DATABASE SET
surviving REVOKE of SET privileges.  Revoking a SET privilege promptly affects
future SET statements, but the REVOKE issuer would need to take the separate
step of clearing unwanted pg_db_role_setting entries.  Even so, ...

> And I agree that requiring extra permissions for this feels excessive.
> Maybe someone else has a use-case in mind, though.

... I, too, vote to change nothing.  We have lots of cases where REVOKE
doesn't reverse actions taken while the user held the privilege being revoked.
Changing that isn't worth much.

Re: use ARM intrinsics in pg_lfind32() where available

2022-08-28 Thread Tom Lane

John Naylor  writes:
> I wonder if we should explain briefly what saturating arithmetic is. I
> had never encountered it outside of a SIMD programming context.

+1, it's at least worth a sentence to define the term.

regards, tom lane

Re: use ARM intrinsics in pg_lfind32() where available

2022-08-28 Thread John Naylor

On Sun, Aug 28, 2022 at 10:58 AM Nathan Bossart
 wrote:
>
> Here is a new patch set in which I've attempted to address all feedback.

Looks in pretty good shape. Some more comments:

+ uint32 nelem_per_vector = sizeof(Vector32) / sizeof(uint32);
+ uint32 nelem_per_iteration = 4 * nelem_per_vector;

Using local #defines would be my style. I don't have a reason to
object to this way, but adding const makes these vars more clear.
Speaking of const:

- const __m128i tmp1 = _mm_or_si128(result1, result2);
- const __m128i tmp2 = _mm_or_si128(result3, result4);
- const __m128i result = _mm_or_si128(tmp1, tmp2);
+ tmp1 = vector32_or(result1, result2);
+ tmp2 = vector32_or(result3, result4);
+ result = vector32_or(tmp1, tmp2);

Any reason to throw away the const declarations?

+static inline bool
+vector32_is_highbit_set(const Vector32 v)
+{
+#ifdef USE_SSE2
+ return (_mm_movemask_epi8(v) & 0x) != 0;
+#endif
+}

I'm not sure why we need this function -- AFAICS it just adds more
work on x86 for zero benefit. For our present application, can we just
cast to Vector8 (for Arm's sake) and call the 8-bit version?

Aside from that, I plan on rewriting some comments for commit, some of
which pre-date this patch:

- * operations using bitwise operations on unsigned integers.
+ * operations using bitwise operations on unsigned integers.  Note that many
+ * of the functions in this file presently do not have non-SIMD
+ * implementations.

It's unclear to the reader whether this is a matter of 'round-to-it's.
I'd like to document what I asserted in this thread, that it's likely
not worthwhile to do anything with a uint64 representing two 32-bit
ints. (It *is* demonstrably worth it for handling 8 byte-values at a
time)

  * Use saturating subtraction to find bytes <= c, which will present as
- * NUL bytes in 'sub'.
+ * NUL bytes.

I'd like to to point out that the reason to do it this way is to
workaround SIMD architectures frequent lack of unsigned comparison.

+ * Return the result of subtracting the respective elements of the input
+ * vectors using saturation.

I wonder if we should explain briefly what saturating arithmetic is. I
had never encountered it outside of a SIMD programming context.

--
John Naylor
EDB: http://www.enterprisedb.com

Re: standby promotion can create unreadable WAL

2022-08-28 Thread Kyotaro Horiguchi

At Mon, 29 Aug 2022 13:13:52 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> we return it to the caller.  Is it worth to do a small refactoring
> like the attached?  If no, I'm fine with the proposed patch including
> the added assertion.

Mmm. That seems wrong. So forget about that.  The proposed patch looks
fine to me.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: standby promotion can create unreadable WAL

2022-08-28 Thread Kyotaro Horiguchi

At Sun, 28 Aug 2022 10:16:21 +0530, Dilip Kumar  wrote 
in 
> On Fri, Aug 26, 2022 at 7:53 PM Robert Haas  wrote:
> > v2 attached.
> 
> The patch LGTM, this patch will apply on master and v15.  PFA patch
> for back branches.

StandbyMode is obviously wrong.  On the other hand I thought that
!ArchiveRecoveryRequested is somewhat wrong, too (, as I stated in the
pointed thread).  On second thought, I changed my mind that it is
right. After aborted contrec is found, The cause of the confusion is
that I somehow thought that archive recovery continues from the
aborted-contrec record.  However, that assumption is wrong. The next
redo starts from the beginning of the aborted contrecord so we should
forget abouat the old missing/aborted contrec info when archive
recovery is requested.

In the end, the point is that we need to set the global variables only
when XLogPrefetcherReadRecord() (or XLogReadRecord()) returns NULL and
we return it to the caller.  Is it worth to do a small refactoring
like the attached?  If no, I'm fine with the proposed patch including
the added assertion.

# I havent reproduce the issue of the OP in the other thread yet, and
# also not found how to reproduce this issue, though..

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/access/transam/xlogrecovery.c 
b/src/backend/access/transam/xlogrecovery.c
index a59a0e826b..d354701423 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -3023,19 +3023,6 @@ ReadRecord(XLogPrefetcher *xlogprefetcher, int emode,
record = XLogPrefetcherReadRecord(xlogprefetcher, &errormsg);
if (record == NULL)
{
-   /*
-* When not in standby mode we find that WAL ends in an 
incomplete
-* record, keep track of that record.  After recovery 
is done,
-* we'll write a record to indicate to downstream WAL 
readers that
-* that portion is to be ignored.
-*/
-   if (!StandbyMode &&
-   !XLogRecPtrIsInvalid(xlogreader->abortedRecPtr))
-   {
-   abortedRecPtr = xlogreader->abortedRecPtr;
-   missingContrecPtr = 
xlogreader->missingContrecPtr;
-   }
-
if (readFile >= 0)
{
close(readFile);
@@ -3125,8 +3112,20 @@ ReadRecord(XLogPrefetcher *xlogprefetcher, int emode,
/* In standby mode, loop back to retry. Otherwise, give 
up. */
if (StandbyMode && !CheckForStandbyTrigger())
continue;
-   else
-   return NULL;
+
+   /*
+* We return NULL to the caller.  If the last record 
has failed
+* with a missing contrecord, inform that to the caller 
as well.
+* In other cases we retry from the beginning of the 
failed record
+* so no need to do this.
+*/
+   if (!XLogRecPtrIsInvalid(xlogreader->abortedRecPtr))
+   {
+   abortedRecPtr = xlogreader->abortedRecPtr;
+   missingContrecPtr = 
xlogreader->missingContrecPtr;
+   }
+
+   return NULL;
}
}
 }

patch: Add missing descriptions for rmgr APIs

2022-08-28 Thread kuroda.hay...@fujitsu.com

Hi hackers,

While reading codes related with logical decoding,
I thought that following comment in rmgrlist.h is not consistent.

> /* symbol name, textual name, redo, desc, identify, startup, cleanup */

This comment describes a set of APIs that the resource manager should have, but 
functions for {mask, decode} are missed here.

Did we have any reasons for that? I thought it might be not friendly, so I 
attached a patch.


Best Regards,
Hayato Kuroda
FUJITSU LIMITED



0001-add-a-missing-comment.patch
Description: 0001-add-a-missing-comment.patch

Re: Handle infinite recursion in logical replication setup

2022-08-28 Thread Amit Kapila

On Mon, Aug 29, 2022 at 8:24 AM vignesh C  wrote:
>
> On Fri, Aug 26, 2022 at 9:52 AM Dilip Kumar  wrote:
> >
> > IMHO, since the user has specifically asked for origin=NONE but we do
> > not have any way to detect the origin during initial sync so I think
> > this could be documented and we can also issue the WARNING.  So that
> > users notice that part and carefully set up the replication.  OTOH, I
> > do not think that giving an error is very inconvenient because we are
> > already providing a new option "origin=NONE" so along with that lets
> > force the user to choose between copy_data=off or copy_data=force and
> > with that, there is no scope for mistakes.
>

Initially, I also thought that giving an error should be okay in this
case but later multiple people voted against that idea primarily
because we won't be able to detect whether the initial sync data
contains data from multiple origins. Consider that even though the
publisher is subscribed to some other publisher but it may not have
gotten any data from it by the time we try to subscribe from it. Also,
one might have removed the subscription that led to data from multiple
origins on the publisher, in this case, may be one can say that we can
consider that the data is now owned by the publisher but I am not
sure. So, considering this, I think giving a WARNING and documenting
how to set up replication correctly sounds like a reasonable way
forward.

> Since Jonathan also had suggested to throw a warning as in [1] and
> Hou-san also had suggested to throw a warning as in [2],
>

Agreed that multiple seems to be in favor of that approach. We can
always promote it to ERROR later if others and or users felt so.

-- 
With Regards,
Amit Kapila.

Re: Handle infinite recursion in logical replication setup

2022-08-28 Thread vignesh C

On Fri, Aug 26, 2022 at 9:52 AM Dilip Kumar  wrote:
>
> On Mon, Aug 22, 2022 at 9:19 AM houzj.f...@fujitsu.com
>  wrote:
> >
> > > Jonathan, Sawada-San, Hou-San, and others, what do you think is the best 
> > > way
> > > to move forward here?
> >
> > I think it's fine to throw a WARNING in this case given that there is a
> > chance of inconsistent data.
>
> IMHO, since the user has specifically asked for origin=NONE but we do
> not have any way to detect the origin during initial sync so I think
> this could be documented and we can also issue the WARNING.  So that
> users notice that part and carefully set up the replication.  OTOH, I
> do not think that giving an error is very inconvenient because we are
> already providing a new option "origin=NONE" so along with that lets
> force the user to choose between copy_data=off or copy_data=force and
> with that, there is no scope for mistakes.

Since Jonathan also had suggested to throw a warning as in [1] and
Hou-san also had suggested to throw a warning as in [2], I have made
changes to throw a warning and also documented the following contents
for the same:
+  
+   If the subscription is created with origin = none and
+   copy_data = true, it will check if the publisher has
+   subscribed to the same table from other publishers and, if so, throw a
+   warning to notify user to check the publisher tables. The user can ensure
+   that publisher tables do not have data which has an origin associated before
+   continuing with any other operations to prevent inconsistent data being
+   replicated.
+  

The changes for the same are available in the v42 patch at [3].
[1] - 
https://www.postgresql.org/message-id/afb653ba-e2b1-33a3-a54c-849f4466e1b4%40postgresql.org
[2] - 
https://www.postgresql.org/message-id/OS0PR01MB5716C383623ADAD64CE4841194719%40OS0PR01MB5716.jpnprd01.prod.outlook.com
[3] - 
https://www.postgresql.org/message-id/CALDaNm2oLWsSYtOFLdOkbrpC94%3DJZzKMCYDJoiZaqAX_Hn%2BU9Q%40mail.gmail.com

Regards,
Vignesh

Re: [PATCH] Add native windows on arm64 support

2022-08-28 Thread Michael Paquier

On Sun, Aug 28, 2022 at 11:51:19AM -0400, Tom Lane wrote:
> I have no idea either.  I agree there *shouldn't* be any connection,
> so if ASLR is somehow triggering this then whatever is failing is
> almost certainly buggy on its own terms.  But there's a lot of
> moving parts here (mumble libxml mumble).  I'm going to wait to see
> if it reproduces before spending much effort.

I have noticed that yesterday, but cannot think much about it.  This
basically changes the position of "" for the first record,
leaving the second one untouched:



I am not used to xmltable(), but I wonder if there is something in one
of these support functions in xml.c that gets influenced by the
randomization.  That sounds a bit hairy as make check passed in
bowerbird, and I have noticed at least two other Windows hosts running
TAP that passed.  Or that's just something with libxml itself.
--
Michael


signature.asc
Description: PGP signature

Re: Fix possible bogus array out of bonds (src/backend/access/brin/brin_minmax_multi.c)

2022-08-28 Thread Kyotaro Horiguchi

At Fri, 26 Aug 2022 10:28:50 -0300, Ranier Vilela  wrote 
in 
> At function has_matching_range, if variable ranges->nranges == 0,
> we exit quickly with a result equal to false.
> 
> This means that nranges can be zero.
> It occurs then that it is possible then to occur an array out of bonds, in
> the initialization of the variable maxvalue.
> So if nranges is equal to zero, there is no need to initialize minvalue and
> maxvalue.
> 
> The patch tries to fix it, avoiding possible errors by using maxvalue.

However it seems that nranges will never be zero, still the fix looks
good to me since it is generally allowed to be zero. I don't find a
similar mistake related to Range.nranges.


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: Removing unneeded self joins

2022-08-28 Thread Zhihong Yu

On Fri, Aug 26, 2022 at 3:02 PM Zhihong Yu  wrote:

> Hi,
> For v36-0001-Remove-self-joins.patch :
>
> bq removes inner join of plane table to itself
>
> plane table -> plain table
>
> For relation_has_unique_index_ext(), it seems when extra_clauses is NULL,
> there is no need to compute `exprs`.
>
> Cheers
>

For remove_self_joins_recurse():

+   if (bms_num_members(relids) > join_collapse_limit)
+   break;

The above line just comes out of the switch statement. This check should be
done again between foreach and switch.
Otherwise the above check wouldn't achieve what you want.

Cheers

Re: effective_multixact_freeze_max_age issue

2022-08-28 Thread Nathan Bossart

On Sun, Aug 28, 2022 at 11:38:09AM -0700, Peter Geoghegan wrote:
> On Tue, Aug 2, 2022 at 4:12 PM Peter Geoghegan  wrote:
>> That is, we only need to make sure that the "multiXactCutoff must
>> always be <= oldestMxact" invariant holds once, by checking for it
>> directly. The same thing happens with OldestXmin/FreezeLimit. That
>> seems like a simpler foundation. It's also a lot more logical. Why
>> should the cutoff for freezing be held back by a long running
>> transaction, except to the extent that it is strictly necessary to do
>> so to avoid wrong answers (wrong answers seen by the long running
>> transaction)?
> 
> Anybody have any input on this? I'm hoping that this can be committed soon.
> 
> ISTM that the way that we currently derive FreezeLimit (by starting
> with OldestXmin rather than starting with the same
> ReadNextTransactionId() value that gets used for the aggressiveness
> cutoffs) is just an accident of history. The "Routine vacuuming" docs
> already describe this behavior in terms that sound closer to the
> behavior with the patch than the actual current behavior:
> 
> "When VACUUM scans every page in the table that is not already
> all-frozen, it should set age(relfrozenxid) to a value just a little
> more than the vacuum_freeze_min_age setting that was used (more by the
> number of transactions started since the VACUUM started)"

The idea seems sound to me, and IMO your patch simplifies things nicely,
which might be reason enough to proceed with it.  However, I'm struggling
to understand when this change would help much in practice.  IIUC it will
cause vacuums to freeze a bit more, but outside of extreme cases (maybe
when vacuum_freeze_min_age is set very high and there are long-running
transactions), ISTM it might not have tremendously much impact.  Is the
intent to create some sort of long-term behavior change for autovacuum, or
is this mostly aimed towards consistency among the cutoff calculations?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: Cleaning up historical portability baggage

2022-08-28 Thread Thomas Munro

On Mon, Aug 29, 2022 at 9:40 AM Tom Lane  wrote:
> Here's another bit of baggage handling: fixing up the places that
> were afraid to use fflush(NULL).  We could doubtless have done
> this years ago (indeed, I found several places already using it)
> but as long as we're making a push to get rid of obsolete code,
> doing it now seems appropriate.

+1, must be OK (pg_dump and pg_upgrade).

Archeology:

commit 79fcde48b229534fd4a5e07834e5e0e84dd38bee
Author: Tom Lane 
Date:   Sun Nov 29 01:51:56 1998 +

Portability fix for old SunOS releases: fflush(NULL)

https://www.postgresql.org/message-id/199811241847.NAA04690%40tuna.uimage.com

SunOS 4.x was still in some kind of support phase for a few more
years, but I guess they weren't working too hard on conformance and
features, given that SunOS 5 (the big BSD -> System V rewrite) had
come out in '92...

> One thing that's not clear to me is what the appropriate rules
> should be for popen().  POSIX makes clear that you shouldn't
> expect popen() to include an fflush() itself, but we seem quite
> haphazard about whether to do one or not before popen().  In
> the case of popen(..., "r") we can expect that the child can't
> write on our stdout, but stderr could be a problem anyway.
>
> Likewise, there are some places that fflush before system(),
> but they are a minority.  Again it seems like the main risk
> is duplicated or mis-ordered stderr output.
>
> I'm inclined to add fflush(NULL) before any popen() or system()
> that hasn't got one already, but did not do that in the attached.

Couldn't hurt.  (Looking around at our setvbuf() setup to check the
expected stream state in various places... and huh, I hadn't
previously noticed the thing about Windows interpreting line buffering
to mean full buffering.  Pfnghghl...)

Re: Reducing the chunk header sizes on all memory context types

2022-08-28 Thread David Rowley

On Sun, 28 Aug 2022 at 23:04, David Rowley  wrote:
> I'll try that one again...

One more try to make CFbot happy.

David
From 118022b089279ea7a6b1d43ca9a105a3203905e3 Mon Sep 17 00:00:00 2001
From: David Rowley 
Date: Thu, 9 Jun 2022 20:09:22 +1200
Subject: [PATCH v8] Improve performance of and reduce overheads of memory
 management

Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs.  This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().

For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk.  This made
the header 16 bytes in size.  This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.

For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.

The slab allocator had a 16-byte chunk header.

The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types.  For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.

Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to.  Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.

The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header.  Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits).  The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not.  External here means that the chunk header does not store the
30-bit value or the block offset.  The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them.  When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.

Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types.  This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.

With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on.  AllocSet
is able to find the context to which the chunk is owned by first obtaining
a reference to the block by subtracting the block offset as is stored in
the 'hdrmask' field and then referencing the block's 'aset' field.
The Generation context uses the same method, but GenerationBlock did not
have a field pointing back to the owning context, so one is added by this
commit.

In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks.  When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer.  The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that.  Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit.  For geneartion.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we stor

Re: Slight refactoring of state check in pg_upgrade check_ function

2022-08-28 Thread Nathan Bossart

On Sun, Aug 28, 2022 at 10:42:24PM +0200, Daniel Gustafsson wrote:
> I noticed that the pg_upgrade check_ functions were determining failures found
> in a few different ways.  Some keep a boolen flag variable, and some (like
> check_for_incompatible_polymorphics) check the state of the script filehandle
> which is guaranteed to be set (with the error message referring to the path of
> said file).  Others like check_loadable_libraries only check the flag variable
> and fclose the handle assuming it was opened.
> 
> The attached diff changes the functions to do it consistently in one way, by
> checking the state of the filehandle.  Since we are referring to the file by
> path in the printed error message it seemed the cleanest approach, and it 
> saves
> a few lines of code without IMO reducing readability.
> 
> There is no change in functionality, just code consistency.

The patch looks reasonable to me.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work

2022-08-28 Thread Nathan Bossart

On Sat, Aug 27, 2022 at 02:06:32PM +0530, Bharath Rupireddy wrote:
> PSA v17 patch with reorderbuffer.c changes reverted.

LGTM

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: Cleaning up historical portability baggage

2022-08-28 Thread Tom Lane

Here's another bit of baggage handling: fixing up the places that
were afraid to use fflush(NULL).  We could doubtless have done
this years ago (indeed, I found several places already using it)
but as long as we're making a push to get rid of obsolete code,
doing it now seems appropriate.

One thing that's not clear to me is what the appropriate rules
should be for popen().  POSIX makes clear that you shouldn't
expect popen() to include an fflush() itself, but we seem quite
haphazard about whether to do one or not before popen().  In
the case of popen(..., "r") we can expect that the child can't
write on our stdout, but stderr could be a problem anyway.

Likewise, there are some places that fflush before system(),
but they are a minority.  Again it seems like the main risk
is duplicated or mis-ordered stderr output.

I'm inclined to add fflush(NULL) before any popen() or system()
that hasn't got one already, but did not do that in the attached.

Thoughts?

regards, tom lane

diff --git a/src/backend/postmaster/fork_process.c b/src/backend/postmaster/fork_process.c
index c75be03d2c..ec67761487 100644
--- a/src/backend/postmaster/fork_process.c
+++ b/src/backend/postmaster/fork_process.c
@@ -37,13 +37,8 @@ fork_process(void)
 
 	/*
 	 * Flush stdio channels just before fork, to avoid double-output problems.
-	 * Ideally we'd use fflush(NULL) here, but there are still a few non-ANSI
-	 * stdio libraries out there (like SunOS 4.1.x) that coredump if we do.
-	 * Presently stdout and stderr are the only stdio output channels used by
-	 * the postmaster, so fflush'ing them should be sufficient.
 	 */
-	fflush(stdout);
-	fflush(stderr);
+	fflush(NULL);
 
 #ifdef LINUX_PROFILE
 
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index e3b19ca1ed..1aaab6c554 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -2503,8 +2503,7 @@ OpenPipeStream(const char *command, const char *mode)
 	ReleaseLruFiles();
 
 TryAgain:
-	fflush(stdout);
-	fflush(stderr);
+	fflush(NULL);
 	pqsignal(SIGPIPE, SIG_DFL);
 	errno = 0;
 	file = popen(command, mode);
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 95f32de4e2..cb3c289889 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -643,8 +643,7 @@ errfinish(const char *filename, int lineno, const char *funcname)
 		 * Any other code you might be tempted to add here should probably be
 		 * in an on_proc_exit or on_shmem_exit callback instead.
 		 */
-		fflush(stdout);
-		fflush(stderr);
+		fflush(NULL);
 
 		/*
 		 * Let the cumulative stats system know. Only mark the session as
@@ -670,8 +669,7 @@ errfinish(const char *filename, int lineno, const char *funcname)
 		 * XXX: what if we are *in* the postmaster?  abort() won't kill our
 		 * children...
 		 */
-		fflush(stdout);
-		fflush(stderr);
+		fflush(NULL);
 		abort();
 	}
 
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 29c28b7315..8567c875fe 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -489,8 +489,7 @@ popen_check(const char *command, const char *mode)
 {
 	FILE	   *cmdfd;
 
-	fflush(stdout);
-	fflush(stderr);
+	fflush(NULL);
 	errno = 0;
 	cmdfd = popen(command, mode);
 	if (cmdfd == NULL)
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index 73e20081d1..be2af9f261 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -448,8 +448,7 @@ start_postmaster(void)
 	pgpid_t		pm_pid;
 
 	/* Flush stdio channels just before fork, to avoid double-output problems */
-	fflush(stdout);
-	fflush(stderr);
+	fflush(NULL);
 
 #ifdef EXEC_BACKEND
 	pg_disable_aslr();
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index 68c455f84b..d665b257c9 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -1578,8 +1578,7 @@ runPgDump(const char *dbname, const char *create_opts)
 
 	pg_log_info("running \"%s\"", cmd->data);
 
-	fflush(stdout);
-	fflush(stderr);
+	fflush(NULL);
 
 	ret = system(cmd->data);
 
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index e2086a07de..018cd310f7 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -123,8 +123,7 @@ get_control_data(ClusterInfo *cluster, bool live_check)
 		/* only pg_controldata outputs the cluster state */
 		snprintf(cmd, sizeof(cmd), "\"%s/pg_controldata\" \"%s\"",
  cluster->bindir, cluster->pgdata);
-		fflush(stdout);
-		fflush(stderr);
+		fflush(NULL);
 
 		if ((output = popen(cmd, "r")) == NULL)
 			pg_fatal("could not get control data using %s: %s",
@@ -191,8 +190,7 @@ get_control_data(ClusterInfo *cluster, bool live_check)
 			 cluster->bindir,
 			 live_check ? "pg_controldata\"" : resetwal_bin,
 			 cluster->pgdata);
-	fflush(stdout);
-	fflush(stderr);
+	fflush(NULL);
 
 	if ((output = popen(cmd, "r")) == NULL)
 		pg_fatal("could not get control data using

Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

2022-08-28 Thread Nathan Bossart

On Thu, Aug 25, 2022 at 10:12:00AM +0530, Bharath Rupireddy wrote:
> Please review the v12 patch attached.

LGTM.  I've marked this as ready-for-committer.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: CI and test improvements

2022-08-28 Thread Andres Freund

Hi,

On 2022-08-28 12:10:29 -0500, Justin Pryzby wrote:
> On Sun, Aug 28, 2022 at 09:07:52AM -0700, Andres Freund wrote:
> > > --- /dev/null
> > > +++ b/src/tools/ci/windows-compiler-warnings
> >
> > Wouldn't that be doable as something like
> > sh -c 'if test -s file; then cat file;exit 1; fi"
> > inside .cirrus.yml?
>
> I had written it inline in a couple ways, like
> - sh -exc 'f=msbuild.warn.log; if [ -s "$f" ]; then cat "$f"; exit 1; else 
> exit 0; fi'
>
> but then separated it out as you suggested in
> 20220227010908.vz2a7dmfzgwg7...@alap3.anarazel.de
>
> after I complained about cmd.exe requiring escaping for && and ||
> That makes writing any shell script a bit perilous and a separate script
> seems better.

I remember that I suggested it - but note that the way I wrote above doesn't
have anything needing escaping. Anyway, what do you think of the multiline
split I suggested?


> > > Subject: [PATCH 03/25] cirrus/ccache: disable compression and show stats
> > >
> > > ccache since 4.0 enables zstd compression by default.
> > >
> > > With default compression enabled 
> > > (https://cirrus-ci.com/task/6692342840164352):
> > > linux has 4.2; 99MB cirrus cache; cache_size_kibibyte 109616
> > > macos has 4.5.1: 47MB cirrus cache; cache_size_kibibyte   52500
> > > freebsd has 3.7.12: 42MB cirrus cache; cache_size_kibibyte134064
> > > windows has 4.6.1; 180MB cirrus cache; cache_size_kibibyte51179
> > > todo: compiler warnings
> > >
> > > With compression disabled (https://cirrus-ci.com/task/4614182514458624):
> > > linux: 91MB cirrus cache; cache_size_kibibyte 316136
> > > macos: 41MB cirrus cache; cache_size_kibibyte 118068
> > > windows: 42MB cirrus cache; cache_size_kibibyte   134064
> > > freebsd is the same
> > >
> > > The stats should either be shown or logged (or maybe run with 
> > > CCACHE_NOSTATS,
> > > to avoid re-uploading cache tarball in a 100% cached build, due only to
> > > updating ./**/stats).
> > >
> > > Note that ccache 4.4 supports CCACHE_STATSLOG, which seems ideal.
> >
> > I stared at this commit message for a while, trying to make sense of it, and
> > couldn't really. I assume you're saying that the cirrus compression is 
> > better
> > with ccache compression disabled, but it's extremely hard to parse out of 
> > it.
>
> Yes, because ccache uses zstd-1, and cirrus uses gzip, which it's going
> to use no matter what ccache does, and gzip's default -6 is better than
> ccache's zstd-1.
>
> > This does too much at once. Show stats, change cache sizes, disable
> > compression.
>
> The cache size change is related to the compression level change; ccache
> prunes based on the local size, which was compressed with zstd-1 and,
> with this patch, not compressed (so ~2x larger).  Also, it's more
> interesting to control the size uploaded to cirrus (after compression
> ith gzip-6).

That's what should have been in the commit message.


> > > From 6a6a97fc869fd1fd8b7ab5da5147f145581634f9 Mon Sep 17 00:00:00 2001
> > > From: Justin Pryzby 
> > > Date: Fri, 24 Jun 2022 00:09:12 -0500
> > > Subject: [PATCH 08/25] cirrus/freebsd: run with more CPUs+RAM and do not
> > >  repartitiion
> > >
> > > There was some historic problem where tests under freebsd took 8+ minutes 
> > > (and
> > > before 4a288a37f took 15 minutes).
> > >
> > > This reduces test time from 10min to 3min.
> > > 4 CPUs 4 tests https://cirrus-ci.com/task/4880240739614720
> > > 4 CPUs 6 tests https://cirrus-ci.com/task/4664440120410112 
> > > https://cirrus-ci.com/task/4586784884523008
> > > 4 CPUs 8 tests https://cirrus-ci.com/task/5001995491737600
> > >
> > > 6 CPUs https://cirrus-ci.com/task/6678321684545536
> > > 8 CPUs https://cirrus-ci.com/task/6264854121021440
> > >
> > > See also:
> > > https://www.postgresql.org/message-id/flat/20220310033347.hgxk4pyarzq4h...@alap3.anarazel.de#f36c0b17e33e31e7925e7e5812998686
> > > 8 jobs 7min https://cirrus-ci.com/task/6186376667332608
> > >
> > > xi-os-only: freebsd
> >
> > Typo.
>
> No - it's deliberate so I can switch to and from "everything" to "this
> only".

I don't see the point in posting patches to be applied if they contain lots of
such things that a potential committer would need to catch and include a lot
of of fixup patches.


> > > @@ -71,8 +69,6 @@ task:
> > >  fingerprint_key: ccache/freebsd
> > >  reupload_on_changes: true
> > >
> > > -  # Workaround around performance issues due to 32KB block size
> > > -  repartition_script: src/tools/ci/gcp_freebsd_repartition.sh
> > >create_user_script: |
> > >  pw useradd postgres
> > >  chown -R postgres:postgres .
> > > --
> >
> > What's the story there - at some point that was important for performance
> > because of the native block size triggering significant read-modify-write
> > cycles with postres' writes. You didn't comment on it in the commit message.
>
> Well, I don't know the history, but it seems to be unneeded now.

It's possible it was mainly needed for testing with aio +

Re: replacing role-level NOINHERIT with a grant-level option

2022-08-28 Thread Nathan Bossart

On Fri, Aug 26, 2022 at 02:46:59PM -0400, Robert Haas wrote:
> -   membership is via admin which has the 
> NOINHERIT
> -   attribute.  After:
> +   membership is via admin which was granted using
> +   WITH INHERIT FALSE After:

nitpick: I believe there should be a period before "After."

Otherwise, LGTM.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Slight refactoring of state check in pg_upgrade check_ function

2022-08-28 Thread Daniel Gustafsson

I noticed that the pg_upgrade check_ functions were determining failures found
in a few different ways.  Some keep a boolen flag variable, and some (like
check_for_incompatible_polymorphics) check the state of the script filehandle
which is guaranteed to be set (with the error message referring to the path of
said file).  Others like check_loadable_libraries only check the flag variable
and fclose the handle assuming it was opened.

The attached diff changes the functions to do it consistently in one way, by
checking the state of the filehandle.  Since we are referring to the file by
path in the printed error message it seemed the cleanest approach, and it saves
a few lines of code without IMO reducing readability.

There is no change in functionality, just code consistency.

--
Daniel Gustafsson   https://vmware.com/



v1-0001-Refactor-check_-functions-to-use-filehandle-for-s.patch
Description: Binary data

Re: [RFC] building postgres with meson - v12

2022-08-28 Thread Andres Freund

Hi,

On 2022-08-28 12:08:07 -0500, Justin Pryzby wrote:
> with_temp_install is repeated twice in prove_check:
>
> > Subject: [PATCH v12 02/15] Split TESTDIR into TESTLOGDIR and TESTDATADIR
> >
> > -   TESTDIR='$(CURDIR)/tmp_check' $(with_temp_install)
> > PGPORT='6$(DEF_PGPORT)' \
> > +   TESTLOGDIR='$(CURDIR)/tmp_check/log' $(with_temp_install) \
> > +   TESTDATADIR='$(CURDIR)/tmp_check' $(with_temp_install) \
> > +   PGPORT='6$(DEF_PGPORT)' \

Oops, must have screwed up resolving a conflict...

> Before running an individual test like "meson test recovery/017_shm",
> it's currently necessary to first manually run "meson test tmp_install".
> Is it possible to make that happen automatically ?

Not in a trivial way that I found. We don't want to reinstall all the time -
it's *quite* expensive on older machines. We could have a lock file in the
test setup so that the first test run installs it, with the others getting
stalled, but that has pretty obvious disadvantages too (like the test timing
being distorted).

Medium term I think we should consider simply not needing the temp install.

FWIW, if you can do the above as 'meson test tmp_install recovery/017_shm'.

> You're running tap tests via a python script.  There's no problem with
> that, but it's different from what's done by the existing makefiles.
> I was able to remove the python indirection - maybe that's better to
> talk about on the CI thread?  That moves some setup for TAP tests
> (TESTDIR, PATH, cd) from Makefile into the existing perl, which means
> less duplication.

I'm doubtful it's worth removing. You'd need to move removing the files from
the last run into both pg_regress and the tap test infrastructure. And I do
think it's nice to afterwards have markers which tests failed, so we can only
collect their logs.

Greetings,

Andres Freund

Re: effective_multixact_freeze_max_age issue

2022-08-28 Thread Peter Geoghegan

On Tue, Aug 2, 2022 at 4:12 PM Peter Geoghegan  wrote:
> That is, we only need to make sure that the "multiXactCutoff must
> always be <= oldestMxact" invariant holds once, by checking for it
> directly. The same thing happens with OldestXmin/FreezeLimit. That
> seems like a simpler foundation. It's also a lot more logical. Why
> should the cutoff for freezing be held back by a long running
> transaction, except to the extent that it is strictly necessary to do
> so to avoid wrong answers (wrong answers seen by the long running
> transaction)?

Anybody have any input on this? I'm hoping that this can be committed soon.

ISTM that the way that we currently derive FreezeLimit (by starting
with OldestXmin rather than starting with the same
ReadNextTransactionId() value that gets used for the aggressiveness
cutoffs) is just an accident of history. The "Routine vacuuming" docs
already describe this behavior in terms that sound closer to the
behavior with the patch than the actual current behavior:

"When VACUUM scans every page in the table that is not already
all-frozen, it should set age(relfrozenxid) to a value just a little
more than the vacuum_freeze_min_age setting that was used (more by the
number of transactions started since the VACUUM started)"

Besides, why should there be an idiosyncratic definition of "age" that
is only used with
vacuum_freeze_min_age/vacuum_multixact_freeze_min_age? Why would
anyone want to do less freezing in the presence of a long running
transaction? It simply makes no sense (unless we're forced to do so to
preserve basic guarantees needed for correctness, such as the
"FreezeLimit <= OldestXmin" invariant).

-- 
Peter Geoghegan

Re: Expand palloc/pg_malloc API

2022-08-28 Thread Peter Eisentraut


On 12.08.22 09:31, Peter Eisentraut wrote:
In talloc, the talloc() function itself allocates an object of a given 
type.  To allocate something of a specified size, you'd use 
talloc_size().  So those names won't map exactly.  I'm fine with 
palloc_object() if that is clearer.


I think the _ptrtype variant isn't that useful anyway, so if it's 
confusing we can leave it out.


I have updated this patch set to rename the _obj() functions to 
_object(), and I have dropped the _ptrtype() variants.


I have also split the patch to put the new API and the example uses into 
separate patches.From 250cc273063a6cd1cb7c7a73a59323844215260d Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sun, 28 Aug 2022 19:13:34 +0200
Subject: [PATCH v2 1/2] Expand palloc/pg_malloc API for more type safety

This adds additional variants of palloc, pg_malloc, etc. that
encapsulate common usage patterns and provide more type safety.
Inspired by the talloc library.

Discussion: 
https://www.postgresql.org/message-id/flat/bb755632-2a43-d523-36f8-a1e7a389a...@enterprisedb.com
---
 src/include/common/fe_memutils.h | 28 
 src/include/utils/palloc.h   | 22 ++
 2 files changed, 50 insertions(+)

diff --git a/src/include/common/fe_memutils.h b/src/include/common/fe_memutils.h
index 0a4c9126ea..63f2b6a802 100644
--- a/src/include/common/fe_memutils.h
+++ b/src/include/common/fe_memutils.h
@@ -29,6 +29,28 @@ extern void *pg_malloc_extended(size_t size, int flags);
 extern void *pg_realloc(void *pointer, size_t size);
 extern void pg_free(void *pointer);
 
+/*
+ * Variants with easier notation and more type safety
+ */
+
+/*
+ * Allocate space for one object of type "type"
+ */
+#define pg_malloc_object(type) ((type *) pg_malloc(sizeof(type)))
+#define pg_malloc0_object(type) ((type *) pg_malloc0(sizeof(type)))
+
+/*
+ * Allocate space for "count" objects of type "type"
+ */
+#define pg_malloc_array(type, count) ((type *) pg_malloc(sizeof(type) * 
(count)))
+#define pg_malloc0_array(type, count) ((type *) pg_malloc0(sizeof(type) * 
(count)))
+
+/*
+ * Change size of allocation pointed to by "pointer" to have space for "count"
+ * objects of type "type"
+ */
+#define pg_realloc_array(pointer, type, count) ((type *) pg_realloc(pointer, 
sizeof(type) * (count)))
+
 /* Equivalent functions, deliberately named the same as backend functions */
 extern char *pstrdup(const char *in);
 extern char *pnstrdup(const char *in, Size size);
@@ -38,6 +60,12 @@ extern void *palloc_extended(Size size, int flags);
 extern void *repalloc(void *pointer, Size size);
 extern void pfree(void *pointer);
 
+#define palloc_object(type) ((type *) palloc(sizeof(type)))
+#define palloc0_object(type) ((type *) palloc0(sizeof(type)))
+#define palloc_array(type, count) ((type *) palloc(sizeof(type) * (count)))
+#define palloc0_array(type, count) ((type *) palloc0(sizeof(type) * (count)))
+#define repalloc_array(pointer, type, count) ((type *) repalloc(pointer, 
sizeof(type) * (count)))
+
 /* sprintf into a palloc'd buffer --- these are in psprintf.c */
 extern char *psprintf(const char *fmt,...) pg_attribute_printf(1, 2);
 extern size_t pvsnprintf(char *buf, size_t len, const char *fmt, va_list args) 
pg_attribute_printf(3, 0);
diff --git a/src/include/utils/palloc.h b/src/include/utils/palloc.h
index 332575f518..a0b62aa7b0 100644
--- a/src/include/utils/palloc.h
+++ b/src/include/utils/palloc.h
@@ -80,6 +80,28 @@ extern void *palloc_extended(Size size, int flags);
 extern pg_nodiscard void *repalloc(void *pointer, Size size);
 extern void pfree(void *pointer);
 
+/*
+ * Variants with easier notation and more type safety
+ */
+
+/*
+ * Allocate space for one object of type "type"
+ */
+#define palloc_object(type) ((type *) palloc(sizeof(type)))
+#define palloc0_object(type) ((type *) palloc0(sizeof(type)))
+
+/*
+ * Allocate space for "count" objects of type "type"
+ */
+#define palloc_array(type, count) ((type *) palloc(sizeof(type) * (count)))
+#define palloc0_array(type, count) ((type *) palloc0(sizeof(type) * (count)))
+
+/*
+ * Change size of allocation pointed to by "pointer" to have space for "count"
+ * objects of type "type"
+ */
+#define repalloc_array(pointer, type, count) ((type *) repalloc(pointer, 
sizeof(type) * (count)))
+
 /*
  * The result of palloc() is always word-aligned, so we can skip testing
  * alignment of the pointer when deciding which MemSet variant to use.
-- 
2.37.1

From d22fe38b8d5dfee16e2794daeaa3834c8f5aa05b Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sun, 28 Aug 2022 19:17:49 +0200
Subject: [PATCH v2 2/2] Assorted examples of expanded type-safer
 palloc/pg_malloc API

---
 contrib/dblink/dblink.c  | 12 +++
 src/backend/access/brin/brin.c   | 14 
 src/backend/access/gin/ginfast.c | 17 --
 src/backend/commands/indexcmds.c | 42 +++
 src/backend/executor/nodeHash.c  | 57 +---
 src/bin/pg_dump/common.c

Re: CI and test improvements

2022-08-28 Thread Justin Pryzby

On Sun, Aug 28, 2022 at 09:07:52AM -0700, Andres Freund wrote:
> > --- /dev/null
> > +++ b/src/tools/ci/windows-compiler-warnings
> 
> Wouldn't that be doable as something like
> sh -c 'if test -s file; then cat file;exit 1; fi"
> inside .cirrus.yml?

I had written it inline in a couple ways, like
- sh -exc 'f=msbuild.warn.log; if [ -s "$f" ]; then cat "$f"; exit 1; else exit 
0; fi'

but then separated it out as you suggested in
20220227010908.vz2a7dmfzgwg7...@alap3.anarazel.de

after I complained about cmd.exe requiring escaping for && and ||
That makes writing any shell script a bit perilous and a separate script
seems better.

> > Subject: [PATCH 03/25] cirrus/ccache: disable compression and show stats
> >
> > ccache since 4.0 enables zstd compression by default.
> >
> > With default compression enabled 
> > (https://cirrus-ci.com/task/6692342840164352):
> > linux has 4.2; 99MB cirrus cache; cache_size_kibibyte   109616
> > macos has 4.5.1: 47MB cirrus cache; cache_size_kibibyte 52500
> > freebsd has 3.7.12: 42MB cirrus cache; cache_size_kibibyte  134064
> > windows has 4.6.1; 180MB cirrus cache; cache_size_kibibyte  51179
> > todo: compiler warnings
> >
> > With compression disabled (https://cirrus-ci.com/task/4614182514458624):
> > linux: 91MB cirrus cache; cache_size_kibibyte   316136
> > macos: 41MB cirrus cache; cache_size_kibibyte   118068
> > windows: 42MB cirrus cache; cache_size_kibibyte 134064
> > freebsd is the same
> >
> > The stats should either be shown or logged (or maybe run with 
> > CCACHE_NOSTATS,
> > to avoid re-uploading cache tarball in a 100% cached build, due only to
> > updating ./**/stats).
> >
> > Note that ccache 4.4 supports CCACHE_STATSLOG, which seems ideal.
> 
> I stared at this commit message for a while, trying to make sense of it, and
> couldn't really. I assume you're saying that the cirrus compression is better
> with ccache compression disabled, but it's extremely hard to parse out of it.

Yes, because ccache uses zstd-1, and cirrus uses gzip, which it's going
to use no matter what ccache does, and gzip's default -6 is better than
ccache's zstd-1.

> This does too much at once. Show stats, change cache sizes, disable
> compression.

The cache size change is related to the compression level change; ccache
prunes based on the local size, which was compressed with zstd-1 and,
with this patch, not compressed (so ~2x larger).  Also, it's more
interesting to control the size uploaded to cirrus (after compression
ith gzip-6).

> > From 01e9abd386a4e6cc0125b97617fb42e695898cbf Mon Sep 17 00:00:00 2001
> > From: Justin Pryzby 
> > Date: Tue, 26 Jul 2022 20:30:02 -0500
> > Subject: [PATCH 04/25] cirrus/ccache: add explicit cache keys..
> >
> > Since otherwise, building with ci-os-only will probably fail to use the 
> > normal
> > cache, since the cache key is computed using both the task name and its 
> > *index*
> > in the list of caches (internal/executor/cache.go:184).
> 
> Hm, perhaps worth confirming and/or reporting to cirrus rather?

I know because of reading their source.  Unfortunately, there's no
commit history indicating the intent or rationale.
https://github.com/cirruslabs/cirrus-ci-agent/blob/master/internal/executor/cache.go#L183

> > From 6a6a97fc869fd1fd8b7ab5da5147f145581634f9 Mon Sep 17 00:00:00 2001
> > From: Justin Pryzby 
> > Date: Fri, 24 Jun 2022 00:09:12 -0500
> > Subject: [PATCH 08/25] cirrus/freebsd: run with more CPUs+RAM and do not
> >  repartitiion
> >
> > There was some historic problem where tests under freebsd took 8+ minutes 
> > (and
> > before 4a288a37f took 15 minutes).
> >
> > This reduces test time from 10min to 3min.
> > 4 CPUs 4 tests https://cirrus-ci.com/task/4880240739614720
> > 4 CPUs 6 tests https://cirrus-ci.com/task/4664440120410112 
> > https://cirrus-ci.com/task/4586784884523008
> > 4 CPUs 8 tests https://cirrus-ci.com/task/5001995491737600
> >
> > 6 CPUs https://cirrus-ci.com/task/6678321684545536
> > 8 CPUs https://cirrus-ci.com/task/6264854121021440
> >
> > See also:
> > https://www.postgresql.org/message-id/flat/20220310033347.hgxk4pyarzq4h...@alap3.anarazel.de#f36c0b17e33e31e7925e7e5812998686
> > 8 jobs 7min https://cirrus-ci.com/task/6186376667332608
> >
> > xi-os-only: freebsd
> 
> Typo.

No - it's deliberate so I can switch to and from "everything" to "this
only".

> > @@ -71,8 +69,6 @@ task:
> >  fingerprint_key: ccache/freebsd
> >  reupload_on_changes: true
> >
> > -  # Workaround around performance issues due to 32KB block size
> > -  repartition_script: src/tools/ci/gcp_freebsd_repartition.sh
> >create_user_script: |
> >  pw useradd postgres
> >  chown -R postgres:postgres .
> > --
> 
> What's the story there - at some point that was important for performance
> because of the native block size triggering significant read-modify-write
> cycles with postres' writes. You didn't comment on it in the commit message.

Well, I don't know the history, but it seems to be unneeded now

Re: [RFC] building postgres with meson - v12

2022-08-28 Thread Justin Pryzby

with_temp_install is repeated twice in prove_check:

> Subject: [PATCH v12 02/15] Split TESTDIR into TESTLOGDIR and TESTDATADIR  
>
> 
> -   TESTDIR='$(CURDIR)/tmp_check' $(with_temp_install)
> PGPORT='6$(DEF_PGPORT)' \
> +   TESTLOGDIR='$(CURDIR)/tmp_check/log' $(with_temp_install) \
> +   TESTDATADIR='$(CURDIR)/tmp_check' $(with_temp_install) \
> +   PGPORT='6$(DEF_PGPORT)' \

Before running an individual test like "meson test recovery/017_shm",
it's currently necessary to first manually run "meson test tmp_install".
Is it possible to make that happen automatically ?

You're running tap tests via a python script.  There's no problem with
that, but it's different from what's done by the existing makefiles.
I was able to remove the python indirection - maybe that's better to
talk about on the CI thread?  That moves some setup for TAP tests
(TESTDIR, PATH, cd) from Makefile into the existing perl, which means
less duplication.

Re: CI and test improvements

2022-08-28 Thread Andres Freund

Hi,

On 2022-08-28 09:44:47 -0500, Justin Pryzby wrote:
> On Sat, May 28, 2022 at 10:37:41AM -0500, Justin Pryzby wrote:
> > I'm anticipating the need to further re-arrange the patch set - it's not 
> > clear
> > which patches should go first.  Maybe some patches should be dropped in 
> > favour
> > of the meson project.  I guess some patches will have to be re-implemented 
> > with
> > meson (msvc warnings).
>
> > Maybe the improvements to vcregress should go into v15 ?  CI should run all 
> > the
> > tests (which also serves to document *how* to run all the tests, even if 
> > there
> > isn't a literal check-world target).
>
> On Thu, Jun 23, 2022 at 02:31:25PM -0500, Justin Pryzby wrote:
> > Should any of the test changes go into v15 ?
>
> On Thu, Jul 07, 2022 at 07:22:32PM -0500, Justin Pryzby wrote:
> > Also, cirrus/freebsd task can run 3x faster with more CPUs.
>
> Checking if there's interest in any/none of these patches ?
> I have added several more.
>
> Do you have an idea when the meson branch might be merged ?

I hope to do that fairly soon, but it's of course dependant on review etc. The
plan was to merge it early and mature it in tree to some degree. There's only
so much we can do "from the outside"...


> Will vcregress remain for a while, or will it go away for v16 ?

The plan was for the windows stuff to go away fairly quickly.


> From 99ee0bef5054539aad0e23a24dd9c9cc9ccee788 Mon Sep 17 00:00:00 2001
> From: Justin Pryzby 
> Date: Wed, 25 May 2022 21:53:22 -0500
> Subject: [PATCH 01/25] cirrus/windows: add compiler_warnings_script

Looks good.

> -MSBFLAGS: -m -verbosity:minimal 
> "-consoleLoggerParameters:Summary;ForceNoAlign" /p:TrackFileAccess=false 
> -nologo
> +# -fileLoggerParameters1: write warnings to msbuild.warn.log.
> +MSBFLAGS: -m -verbosity:minimal 
> "-consoleLoggerParameters:Summary;ForceNoAlign" /p:TrackFileAccess=false 
> -nologo -fileLoggerParameters1:warningsonly;logfile=msbuild.warn.log

Except, I think it'd be good to split this line. What do you think about using
something like
MSBFLAGS: >-
  -nologo
  -m -verbosity:minimal
  /p:TrackFileAccess=false
  "-consoleLoggerParameters:Summary;ForceNoAlign"
  -fileLoggerParameters1:warningsonly;logfile=msbuild.warn.log

I think that should work?


>  # If tests hang forever, cirrus eventually times out. In that case log
>  # output etc is not uploaded, making the problem hard to debug. Of course
> @@ -450,6 +451,11 @@ task:
>  cd src/tools/msvc
>  %T_C% perl vcregress.pl ecpgcheck
>
> +  # These should be last, so all the important checks are always run
> +  always:
> +compiler_warnings_script:
> +  - sh src\tools\ci\windows-compiler-warnings msbuild.warn.log
> +
>on_failure:
>  <<: *on_failure
>  crashlog_artifacts:
> diff --git a/src/tools/ci/windows-compiler-warnings 
> b/src/tools/ci/windows-compiler-warnings
> new file mode 100755
> index 000..d6f9a1fc569
> --- /dev/null
> +++ b/src/tools/ci/windows-compiler-warnings
> @@ -0,0 +1,16 @@
> +#! /bin/sh
> +# Success if the given file doesn't exist or is empty, else fail
> +# This is a separate file only to avoid dealing with windows shell quoting 
> and escaping.
> +set -e
> +
> +fn=$1
> +
> +if [ -s "$fn" ]
> +then
> + # Display the file's content, then exit indicating failure
> + cat "$fn"
> + exit 1
> +else
> + # Success
> + exit 0
> +fi
> --
> 2.17.1

Wouldn't that be doable as something like
sh -c 'if test -s file; then cat file;exit 1; fi"
inside .cirrus.yml?



> From 1064a0794e85e06b3a0eca4ed3765f078795cb36 Mon Sep 17 00:00:00 2001
> From: Justin Pryzby 
> Date: Sun, 3 Apr 2022 00:10:20 -0500
> Subject: [PATCH 03/25] cirrus/ccache: disable compression and show stats
>
> ccache since 4.0 enables zstd compression by default.
>
> With default compression enabled 
> (https://cirrus-ci.com/task/6692342840164352):
> linux has 4.2; 99MB cirrus cache; cache_size_kibibyte 109616
> macos has 4.5.1: 47MB cirrus cache; cache_size_kibibyte   52500
> freebsd has 3.7.12: 42MB cirrus cache; cache_size_kibibyte134064
> windows has 4.6.1; 180MB cirrus cache; cache_size_kibibyte51179
> todo: compiler warnings
>
> With compression disabled (https://cirrus-ci.com/task/4614182514458624):
> linux: 91MB cirrus cache; cache_size_kibibyte 316136
> macos: 41MB cirrus cache; cache_size_kibibyte 118068
> windows: 42MB cirrus cache; cache_size_kibibyte   134064
> freebsd is the same
>
> The stats should either be shown or logged (or maybe run with CCACHE_NOSTATS,
> to avoid re-uploading cache tarball in a 100% cached build, due only to
> updating ./**/stats).
>
> Note that ccache 4.4 supports CCACHE_STATSLOG, which seems ideal.

I stared at this commit message for a while, trying to make sense of it, and
couldn't really. I assume you're saying that the cirrus compression is better
with ccache compression disabled, but it's extremely hard to parse out of it.

This does too much at once. Show stats, change

Convert GetDatum() and DatumGet() macros to inline functions

2022-08-28 Thread Peter Eisentraut



I once wrote code like this:

char *oid = get_from_somewhere();
...

values[i++] = ObjectIdGetDatum(oid);

This compiles cleanly and even appears to work in practice, except of 
course it doesn't.


The FooGetDatum() macros just cast whatever you give it to Datum, 
without checking whether the input was really foo.


To address this, I converted these macros to inline functions, which 
enables type checking of the input argument.  For symmetry, I also 
converted the corresponding DatumGetFoo() macros (but those are less 
likely to cover mistakes, since the input argument is always Datum). 
This is patch 0002.


(I left some of the DatumGet... of the varlena types in fmgr.h as 
macros.  These ultimately map to functions that do type checking, so 
there would be little more to be learnt from that.  But we could do 
those for consistency as well.)


This whole thing threw up a bunch of compiler warnings and errors, which 
revealed a number of existing misuses.  These are fixed in patch 0001. 
These include


- using FooGetDatum on things that are already Datum,

- using DatumGetPointer on things that are already pointers,

- using PG_RETURN_TYPE on things that are Datum,

- using PG_RETURN_TYPE of the wrong type,

and others, including my personal favorite:

- using PointerGetDatum where DatumGetPointer should be used.

(AFAICT, unlike my initial example, I don't think any of those would 
cause wrong behavior.)From 4371c74b2d01408653f3917bb838b051537afaa2 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sun, 28 Aug 2022 10:47:10 +0200
Subject: [PATCH 1/2] Fix incorrect uses of Datum conversion macros

---
 contrib/btree_gist/btree_utils_num.c  |  2 +-
 contrib/dblink/dblink.c   |  2 +-
 contrib/hstore/hstore_op.c|  2 +-
 contrib/pageinspect/heapfuncs.c   |  4 ++--
 src/backend/access/brin/brin_bloom.c  |  2 +-
 src/backend/access/brin/brin_minmax_multi.c   | 12 ++--
 src/backend/access/common/toast_compression.c |  2 +-
 src/backend/access/table/toast_helper.c   |  2 +-
 src/backend/access/transam/xlogfuncs.c|  2 +-
 src/backend/statistics/mcv.c  |  4 ++--
 src/backend/utils/adt/amutils.c   |  2 +-
 src/backend/utils/adt/multirangetypes.c   |  2 +-
 src/backend/utils/adt/pg_lsn.c|  4 ++--
 src/backend/utils/adt/pgstatfuncs.c   |  2 +-
 src/backend/utils/adt/rangetypes_spgist.c |  4 ++--
 src/backend/utils/adt/regexp.c|  2 +-
 src/backend/utils/adt/tsgistidx.c |  2 +-
 src/backend/utils/adt/tsquery_op.c|  2 +-
 src/backend/utils/adt/varlena.c   |  2 +-
 src/backend/utils/adt/xml.c   |  2 +-
 src/backend/utils/resowner/resowner.c |  6 +++---
 src/pl/plperl/plperl.c|  2 +-
 22 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/contrib/btree_gist/btree_utils_num.c 
b/contrib/btree_gist/btree_utils_num.c
index 5632ee0586..05c154afa3 100644
--- a/contrib/btree_gist/btree_utils_num.c
+++ b/contrib/btree_gist/btree_utils_num.c
@@ -153,7 +153,7 @@ gbt_num_fetch(GISTENTRY *entry, const gbtree_ninfo *tinfo)
datum = CashGetDatum(*(Cash *) entry->key);
break;
default:
-   datum = PointerGetDatum(entry->key);
+   datum = entry->key;
}
 
retval = palloc(sizeof(GISTENTRY));
diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index e323fdd0e6..7940387920 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -1336,7 +1336,7 @@ dblink_get_connections(PG_FUNCTION_ARGS)
}
 
if (astate)
-   PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate,
+   PG_RETURN_DATUM(makeArrayResult(astate,

  CurrentMemoryContext));
else
PG_RETURN_NULL();
diff --git a/contrib/hstore/hstore_op.c b/contrib/hstore/hstore_op.c
index 2f540d7ed6..0d4ec16d1e 100644
--- a/contrib/hstore/hstore_op.c
+++ b/contrib/hstore/hstore_op.c
@@ -1064,7 +1064,7 @@ hstore_each(PG_FUNCTION_ARGS)
tuple = heap_form_tuple(funcctx->tuple_desc, dvalues, nulls);
res = HeapTupleGetDatum(tuple);
 
-   SRF_RETURN_NEXT(funcctx, PointerGetDatum(res));
+   SRF_RETURN_NEXT(funcctx, res);
}
 
SRF_RETURN_DONE(funcctx);
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 2ff70405cf..aed2753253 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -383,7 +383,7 @@ tuple_data_split_internal(Oid relid, char *tupdata,
 errmsg("unexpected end of 
tuple data")));
 
if (attr->attlen == -1 && do_detoast)
-

Re: [PATCH] Add native windows on arm64 support

2022-08-28 Thread Tom Lane

Andres Freund  writes:
> On 2022-08-28 10:09:53 -0400, Tom Lane wrote:
>> -x | pre&deeppost
>> +x | pre&deeppost

> Pretty weird, agreed. But hard to see how it could be caused by the
> randomization change, except that perhaps it could highlight a preexisting
> bug?

I have no idea either.  I agree there *shouldn't* be any connection,
so if ASLR is somehow triggering this then whatever is failing is
almost certainly buggy on its own terms.  But there's a lot of
moving parts here (mumble libxml mumble).  I'm going to wait to see
if it reproduces before spending much effort.

regards, tom lane

Re: [PATCH] Add native windows on arm64 support

2022-08-28 Thread Andres Freund

Hi,

On 2022-08-28 10:09:53 -0400, Tom Lane wrote:
> Michael Paquier  writes:
> > True enough.  I have applied the patch to re-enable that on HEAD.
> > Let's now see what happens in the next couple of days.  Popcorn is
> > ready here.
>
> Hmm:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2022-08-28%2010%3A30%3A29
>
> Maybe that's just unrelated noise, but it sure looks weird: after
> passing the core regression tests in "make check", it failed in
> pg_upgrade's run with
>
> diff -w -U3 
> H:/prog/bf/root/HEAD/pgsql.build/src/test/regress/expected/xml.out 
> H:/prog/bf/root/HEAD/pgsql.build/src/bin/pg_upgrade/tmp_check/results/xml.out
> --- H:/prog/bf/root/HEAD/pgsql.build/src/test/regress/expected/xml.out
> Wed May 18 18:30:12 2022
> +++ 
> H:/prog/bf/root/HEAD/pgsql.build/src/bin/pg_upgrade/tmp_check/results/xml.out 
> Sun Aug 28 06:55:14 2022
> @@ -1557,7 +1557,7 @@
>  \\x
>  SELECT * FROM XMLTABLE('*' PASSING 'pre arg?>&deeppost' COLUMNS x xml PATH 
> 'node()', y xml PATH '/');
>  -[ RECORD 1 ]---
> -x | pre&deeppost
> +x | pre&deeppost
>  y | pre&deeppost+
>|

Pretty weird, agreed. But hard to see how it could be caused by the
randomization change, except that perhaps it could highlight a preexisting
bug?

Greetings,

Andres Freund

Re: CI and test improvements

2022-08-28 Thread Justin Pryzby

Resending with a problematic address removed...

On Sat, May 28, 2022 at 10:37:41AM -0500, Justin Pryzby wrote:
> I'm anticipating the need to further re-arrange the patch set - it's not clear
> which patches should go first.  Maybe some patches should be dropped in favour
> of the meson project.  I guess some patches will have to be re-implemented 
> with
> meson (msvc warnings).

> Maybe the improvements to vcregress should go into v15 ?  CI should run all 
> the
> tests (which also serves to document *how* to run all the tests, even if there
> isn't a literal check-world target).

On Thu, Jun 23, 2022 at 02:31:25PM -0500, Justin Pryzby wrote:
> Should any of the test changes go into v15 ?

On Thu, Jul 07, 2022 at 07:22:32PM -0500, Justin Pryzby wrote:
> Also, cirrus/freebsd task can run 3x faster with more CPUs.

Checking if there's interest in any/none of these patches ?
I have added several more.

Do you have an idea when the meson branch might be merged ?

Will vcregress remain for a while, or will it go away for v16 ?

-- 
Justin
>From 99ee0bef5054539aad0e23a24dd9c9cc9ccee788 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Wed, 25 May 2022 21:53:22 -0500
Subject: [PATCH 01/25] cirrus/windows: add compiler_warnings_script

I'm not sure how to write this test in windows shell; it's also not easy to
write it in posix sh, since windows shell is somehow interpretting && and ||...

https://www.postgresql.org/message-id/20220212212310.f645c6vw3njkgxka%40alap3.anarazel.de

See also: 8a1ce5e54f6d144e4f8e19af7c767b026ee0c956

ci-os-only: windows

https://cirrus-ci.com/task/6183879907213312
https://cirrus-ci.com/task/4876271443247104
---
 .cirrus.yml|  8 +++-
 src/tools/ci/windows-compiler-warnings | 16 
 2 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100755 src/tools/ci/windows-compiler-warnings

diff --git a/.cirrus.yml b/.cirrus.yml
index 81eb8a9996d..da16344341b 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -370,7 +370,8 @@ task:
 # ForceNoAlign prevents msbuild from introducing line-breaks for long lines
 # disable file tracker, we're never going to rebuild, and it slows down the
 #   build
-MSBFLAGS: -m -verbosity:minimal "-consoleLoggerParameters:Summary;ForceNoAlign" /p:TrackFileAccess=false -nologo
+# -fileLoggerParameters1: write warnings to msbuild.warn.log.
+MSBFLAGS: -m -verbosity:minimal "-consoleLoggerParameters:Summary;ForceNoAlign" /p:TrackFileAccess=false -nologo -fileLoggerParameters1:warningsonly;logfile=msbuild.warn.log
 
 # If tests hang forever, cirrus eventually times out. In that case log
 # output etc is not uploaded, making the problem hard to debug. Of course
@@ -450,6 +451,11 @@ task:
 cd src/tools/msvc
 %T_C% perl vcregress.pl ecpgcheck
 
+  # These should be last, so all the important checks are always run
+  always:
+compiler_warnings_script:
+  - sh src\tools\ci\windows-compiler-warnings msbuild.warn.log
+
   on_failure:
 <<: *on_failure
 crashlog_artifacts:
diff --git a/src/tools/ci/windows-compiler-warnings b/src/tools/ci/windows-compiler-warnings
new file mode 100755
index 000..d6f9a1fc569
--- /dev/null
+++ b/src/tools/ci/windows-compiler-warnings
@@ -0,0 +1,16 @@
+#! /bin/sh
+# Success if the given file doesn't exist or is empty, else fail
+# This is a separate file only to avoid dealing with windows shell quoting and escaping.
+set -e
+
+fn=$1
+
+if [ -s "$fn" ]
+then
+	# Display the file's content, then exit indicating failure
+	cat "$fn"
+	exit 1
+else
+	# Success
+	exit 0
+fi
-- 
2.17.1

>From 278c842aa7021c2c551102f0023adff3a7bbd495 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 9 Jan 2022 18:25:02 -0600
Subject: [PATCH 02/25] cirrus/vcregress: test modules/contrib with
 NO_INSTALLCHECK=1

--temp-config must be specified with an "=" because otherwise vcregress runs
pg_regress --temp-config test1 test2 [...],
..which means test1 gets eaten as the argument to --temp-config

https://www.postgresql.org/message-id/20220109191649.GL14051%40telsasoft.com
https://www.postgresql.org/message-id/CA%2BhUKGLneD%2Bq%2BE7upHGwn41KGvbxhsKbJ%2BM-y9nvv7_Xjv8Qog%40mail.gmail.com
---
 .cirrus.yml|  4 +-
 contrib/basic_archive/Makefile |  2 +-
 contrib/pg_stat_statements/Makefile|  2 +-
 contrib/test_decoding/Makefile |  2 +-
 src/test/modules/snapshot_too_old/Makefile |  2 +-
 src/test/modules/worker_spi/Makefile   |  2 +-
 src/tools/msvc/vcregress.pl| 46 +++---
 7 files changed, 48 insertions(+), 12 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index da16344341b..f2861176be2 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -428,9 +428,9 @@ task:
   test_isolation_script: |
 %T_C% perl src/tools/msvc/vcregress.pl isolationcheck
   test_modules_script: |
-%T_C% perl src/tools/msvc/vcregress.pl modulescheck
+%T_C% perl src/tools/msvc/vcre

Re: [PATCH] Add native windows on arm64 support

2022-08-28 Thread Tom Lane

Michael Paquier  writes:
> True enough.  I have applied the patch to re-enable that on HEAD.
> Let's now see what happens in the next couple of days.  Popcorn is
> ready here.

Hmm:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2022-08-28%2010%3A30%3A29

Maybe that's just unrelated noise, but it sure looks weird: after
passing the core regression tests in "make check", it failed in
pg_upgrade's run with

diff -w -U3 H:/prog/bf/root/HEAD/pgsql.build/src/test/regress/expected/xml.out 
H:/prog/bf/root/HEAD/pgsql.build/src/bin/pg_upgrade/tmp_check/results/xml.out
--- H:/prog/bf/root/HEAD/pgsql.build/src/test/regress/expected/xml.out  Wed May 
18 18:30:12 2022
+++ 
H:/prog/bf/root/HEAD/pgsql.build/src/bin/pg_upgrade/tmp_check/results/xml.out   
Sun Aug 28 06:55:14 2022
@@ -1557,7 +1557,7 @@
 \\x
 SELECT * FROM XMLTABLE('*' PASSING 'pre&deeppost' COLUMNS x xml PATH 'node()', 
y xml PATH '/');
 -[ RECORD 1 ]---
-x | pre&deeppost
+x | pre&deeppost
 y | pre&deeppost+
   | 
 

I don't recall ever seeing a failure quite like this.

regards, tom lane

Re: Reducing the chunk header sizes on all memory context types

2022-08-28 Thread David Rowley

On Fri, 26 Aug 2022 at 17:16, David Rowley  wrote:
> The CFbot just alerted me to the cplusplus check was failing with the
> v5 patch, so here's v6.

I'll try that one again...

David
From 90e48eef5709e8cdfb474f8798fa0aef1d0158b1 Mon Sep 17 00:00:00 2001
From: David Rowley 
Date: Thu, 9 Jun 2022 20:09:22 +1200
Subject: [PATCH v7] Improve performance of and reduce overheads of memory
 management

Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs.  This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().

For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk.  This made
the header 16 bytes in size.  This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.

For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.

The slab allocator had a 16-byte chunk header.

The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types.  For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.

Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to.  Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.

The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header.  Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits).  The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not.  External here means that the chunk header does not store the
30-bit value or the block offset.  The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them.  When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.

Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types.  This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.

With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on.  AllocSet
is able to find the context to which the chunk is owned by first obtaining
a reference to the block by subtracting the block offset as is stored in
the 'hdrmask' field and then referencing the block's 'aset' field.
The Generation context uses the same method, but GenerationBlock did not
have a field pointing back to the owning context, so one is added by this
commit.

In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks.  When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer.  The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that.  Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit.  For geneartion.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk

Re: Change pfree to accept NULL argument

2022-08-28 Thread Peter Eisentraut


On 22.08.22 20:30, Tom Lane wrote:

I'm not very convinced that the benefits of making pfree() more
like free() are worth those costs.

We could ameliorate the first objection if we wanted to back-patch
0002, I guess.

(FWIW, no objection to your 0001.  0004 and 0005 seem okay too;
they don't touch enough places to create much back-patching risk.)


To conclude this, I have committed those secondary patches and updated 
the utils/mmgr/README with some information from this discussion.

Re: [PATCH] Add native windows on arm64 support

2022-08-28 Thread Michael Paquier

On Sat, Aug 27, 2022 at 12:27:57PM -0700, Andres Freund wrote:
> I checked, and it looks like we didn't add --disable-dynamicbase, so ASLR
> wasn't disabled for mingw builds.

True enough.  I have applied the patch to re-enable that on HEAD.
Let's now see what happens in the next couple of days.  Popcorn is
ready here.
--
Michael


signature.asc
Description: PGP signature

44 matches

Mail list logo