Re: generic plans and "initial" pruning

2022-11-30 Thread Amit Langote
Hi Alvaro,

Thanks for looking at this one.

On Thu, Dec 1, 2022 at 3:12 AM Alvaro Herrera  wrote:
> Looking at 0001, I wonder if we should have a crosscheck that a
> PartitionPruneInfo you got from following an index is indeed constructed
> for the relation that you think it is: previously, you were always sure
> that the prune struct is for this node, because you followed a pointer
> that was set up in the node itself.  Now you only have an index, and you
> have to trust that the index is correct.

Yeah, a crosscheck sounds like a good idea.

> I'm not sure how to implement this, or even if it's doable at all.
> Keeping the OID of the partitioned table in the PartitionPruneInfo
> struct is easy, but I don't know how to check it in ExecInitMergeAppend
> and ExecInitAppend.

Hmm, how about keeping the [Merge]Append's parent relation's RT index
in the PartitionPruneInfo and passing it down to
ExecInitPartitionPruning() from ExecInit[Merge]Append() for
cross-checking?  Both Append and MergeAppend already have a
'apprelids' field that we can save a copy of in the
PartitionPruneInfo.  Tried that in the attached delta patch.
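For illustration, the check inside ExecInitPartitionPruning() could then
be as small as the sketch below; the "relids" field and the passed-down
"apprelids" argument are assumed names, not settled ones:

/*
 * pruneinfo was found by following part_prune_index; apprelids is what
 * ExecInit[Merge]Append() saw in its own plan node.
 */
if (!bms_equal(apprelids, pruneinfo->relids))
    elog(ERROR, "mismatching PartitionPruneInfo found at part_prune_index %d",
         part_prune_index);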

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


PartitionPruneInfo-relids.patch


Re: Perform streaming logical transactions by background workers and parallel apply

2022-11-30 Thread Masahiko Sawada
On Wed, Nov 30, 2022 at 10:51 PM houzj.f...@fujitsu.com
 wrote:
>
> On Wednesday, November 30, 2022 9:41 PM houzj.f...@fujitsu.com 
>  wrote:
> >
> > On Tuesday, November 29, 2022 8:34 PM Amit Kapila
> > > Review comments on v53-0001*
> >
> > Attach the new version patch set.
>
> Sorry, there were some mistakes in the previous patch set.
> Here is the correct V54 patch set. I also ran pgindent for the patch set.
>

Thank you for updating the patches. Here are random review comments
for 0001 and 0002 patches.

ereport(ERROR,
        (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
         errmsg("logical replication parallel apply worker exited abnormally"),
         errcontext("%s", edata.context)));
and

ereport(ERROR,
        (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
         errmsg("logical replication parallel apply worker exited because of subscription information change")));

I'm not sure ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE is appropriate
here. Given that parallel apply worker has already reported the error
message with the error code, I think we don't need to set the
errorcode for the logs from the leader process.

Also, I'm not sure the term "exited abnormally" is appropriate since
we use it when the server crashes for example. I think ERRORs reported
here don't mean that in general.

---
if (am_parallel_apply_worker() && on_subinfo_change)
{
    /*
     * If a parallel apply worker exits due to the subscription
     * information change, we notify the leader apply worker so that the
     * leader can report more meaningful message in time and restart the
     * logical replication.
     */
    pq_putmessage('X', NULL, 0);
}

and

ereport(ERROR,
        (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
         errmsg("logical replication parallel apply worker exited because of subscription information change")));

Do we really need an additional message in case of 'X'? When we call
apply_worker_clean_exit with on_subinfo_change = true, we have
reported the error message such as:

ereport(LOG,
        (errmsg("logical replication parallel apply worker for subscription \"%s\" will stop because of a parameter change",
                MySubscription->name)));

I think that reporting a similar message from the leader might not be
meaningful for users.

---
-    if (options->proto.logical.streaming &&
-        PQserverVersion(conn->streamConn) >= 14)
-        appendStringInfoString(&cmd, ", streaming 'on'");
+    if (options->proto.logical.streaming_str)
+        appendStringInfo(&cmd, ", streaming '%s'",
+                         options->proto.logical.streaming_str);

and

+    /*
+     * Assign the appropriate option value for streaming option according
+     * to the 'streaming' mode and the publisher's ability to support that
+     * mode.
+     */
+    if (server_version >= 16 &&
+        MySubscription->stream == SUBSTREAM_PARALLEL)
+    {
+        options.proto.logical.streaming_str = pstrdup("parallel");
+        MyLogicalRepWorker->parallel_apply = true;
+    }
+    else if (server_version >= 14 &&
+             MySubscription->stream != SUBSTREAM_OFF)
+    {
+        options.proto.logical.streaming_str = pstrdup("on");
+        MyLogicalRepWorker->parallel_apply = false;
+    }
+    else
+    {
+        options.proto.logical.streaming_str = NULL;
+        MyLogicalRepWorker->parallel_apply = false;
+    }

This change moves the code of adjustment of the streaming option based
on the publisher server version from libpqwalreceiver.c to worker.c.
On the other hand, the similar logic for other parameters such as
"two_phase" and "origin" are still done in libpqwalreceiver.c. How
about passing MySubscription->stream via WalRcvStreamOptions and
constructing a streaming option string in libpqrcv_startstreaming()?
In ApplyWorkerMain(), we just need to set
MyLogicalRepWorker->parallel_apply = true if (server_version >= 16
&& MySubscription->stream == SUBSTREAM_PARALLEL). We won't need
pstrdup for "parallel" and "on", and it's more consistent with other
parameters.
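
Concretely, something like this in libpqrcv_startstreaming(), assuming
WalRcvStreamOptions carries the subscription's stream mode (the field
name "streaming" below is an assumption, and the version numbers use
PQserverVersion()'s usual NNNNNN encoding):

if (options->proto.logical.streaming == SUBSTREAM_PARALLEL &&
    PQserverVersion(conn->streamConn) >= 160000)
    appendStringInfoString(&cmd, ", streaming 'parallel'");
else if (options->proto.logical.streaming != SUBSTREAM_OFF &&
         PQserverVersion(conn->streamConn) >= 140000)
    appendStringInfoString(&cmd, ", streaming 'on'");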

---
+ * We maintain a worker pool to avoid restarting workers for each streaming
+ * transaction. We maintain each worker's information in the

Do we need to describe the pool in the doc?

---
+ * in AccessExclusive mode at transaction finish commands (STREAM_COMMIT and
+ * STREAM_PREAPRE) and release it immediately.

typo, s/STREAM_PREAPRE/STREAM_PREPARE/

---
+/* Parallel apply workers hash table (initialized on first use). */
+static HTAB *ParallelApplyWorkersHash = NULL;
+
+/*
+ * A list to maintain the active parallel apply workers. The information for
+ * the new worker is added to the list after successfully launching it. The
+ * list entry is removed if there are already enough workers in the 

Re: SI-read predicate locks on materialized views

2022-11-30 Thread Yugo NAGATA
On Thu, 1 Dec 2022 15:48:21 +0900
Michael Paquier  wrote:

> On Tue, Oct 18, 2022 at 05:29:58PM +0900, Yugo NAGATA wrote:
> > Thank you for your review. I agree that an isolation test is required.
> > The attached patch contains the test using the scenario as explained in
> > the previous post.
> 
> Cool, thanks.  Sorry for my late reply here.  I have put my head on
> that for a few hours and could not see why we should not allow that.
> So committed the change after a few tweaks to the tests with the use
> of custom permutations, mainly.

Thanks!

> While looking at all that, I have looked at the past threads like [1],
> just to note that this has never been really mentioned.
> 
> [1]: 
> https://www.postgresql.org/message-id/1371225929.28496.yahoomail...@web162905.mail.bf1.yahoo.com
> --
> Michael


-- 
Yugo NAGATA 




Re: WIN32 pg_import_system_collations

2022-11-30 Thread Peter Eisentraut

On 10.11.22 11:08, Juan José Santamaría Flecha wrote:

This looks pretty good to me.  The refactoring of the non-Windows parts
makes sense.  The Windows parts look reasonable on manual inspection,
but again, I don't have access to Windows here, so someone else should
also look it over.

I was going to say that at least it is getting tested on the CI, but
I have found out that meson changes version(). That is fixed in this
version.


It is currently failing due to [1], so maybe we can leave this patch on
hold until that's addressed.


[1] 
https://www.postgresql.org/message-id/CAC%2BAXB1wJEqfKCuVcNpoH%3Dgxd61N%3D7c2fR3Ew6YRPpSfEUA%3DyQ%40mail.gmail.com 


What is the status of this now?  I think the other issue has been addressed?





Re: meson PGXS compatibility

2022-11-30 Thread Peter Eisentraut

On 13.10.22 07:23, Andres Freund wrote:

0005: meson: Add PGXS compatibility

The actual meson PGXS compatibility. Plenty more replacements to do, but
suffices to build common extensions on a few platforms.

What level of completeness do we want to require here?


I have tried this with a few extensions.  Seems to work alright.  I don't
think we need to overthink this.  The way it's set up, if someone needs
additional variables set, they can easily be added.


Yea, I am happy enough with it now that the bulk is out of src/meson.build and
strip isn't set to an outright wrong value.


How are you planning to proceed with this?  I thought it was ready, but 
it hasn't moved in a while.






Re: [PATCH] Add native windows on arm64 support

2022-11-30 Thread Michael Paquier
On Mon, Nov 07, 2022 at 03:58:13PM +0900, Michael Paquier wrote:
> The patch has been switched as waiting on author for now.

Seeing no replies after three weeks, I have marked the patch as
returned with feedback for now.
--
Michael




Re: psql - factor out echo code

2022-11-30 Thread Pavel Stehule
On Wed, 30 Nov 2022 at 10:43, Fabien COELHO wrote:

>
> >> Now some of the output generated by test_extdepend gets a bit
> >> confusing:
> >> +-- QUERY:
> >> +
> >> +
> >> +-- QUERY:
> >>
> >> That's not entirely this patch's fault.  Still that's not really
> >> intuitive to see the output of a query that's just a blank spot..
> >
> > Hmmm.
> >
> > What about adding an explicit \echo before these empty outputs to
> mitigate
> > the possible induced confusion?
>
> \echo is not possible.
>
> Attached an attempt to improve the situation by replacing empty lines with
> comments in this test.
>

I can confirm that all regression tests passed

Regards

Pavel


>
> --
> Fabien.


Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-30 Thread John Naylor
On Wed, Nov 30, 2022 at 2:28 PM Masahiko Sawada 
wrote:
>
> On Fri, Nov 25, 2022 at 5:00 PM John Naylor
>  wrote:

> > These hacks were my work, but I think we can improve that by having two
> > versions of NODE_HAS_FREE_SLOT -- one for fixed- and one for variable-sized
> > nodes. For that to work, in "init-node" we'd need a branch to set fanout to
> > zero for node256. That should be fine -- it already has to branch for
> > memset'ing node128's indexes to 0xFF.
>
> Since the node has fanout regardless of fixed-sized and
> variable-sized

As currently coded, yes. But that's not strictly necessary, I think.

>, only node256 is the special case where the fanout in
> the node doesn't match the actual fanout of the node. I think if we
> want to have two versions of NODE_HAS_FREE_SLOT, we can have one for
> node256 and one for other classes. Thoughts? In your idea, for
> NODE_HAS_FREE_SLOT for fixed-sized nodes, you meant like the
> following?
>
> #define FIXED_NODE_HAS_FREE_SLOT(node, class) \
>   (node->base.n.count < rt_size_class_info[class].fanout)

Right, and the other one could be VAR_NODE_...
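
A sketch of the pair, with the field and table names taken from the
snippets above (everything else assumed):

/* fixed-size classes: the capacity comes from the class info table */
#define FIXED_NODE_HAS_FREE_SLOT(node, class) \
    ((node)->base.n.count < rt_size_class_info[class].fanout)

/* variable-size classes: the capacity is stored in the node itself */
#define VAR_NODE_HAS_FREE_SLOT(node) \
    ((node)->base.n.count < (node)->base.n.fanout)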

--
John Naylor
EDB: http://www.enterprisedb.com


Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-30 Thread John Naylor
On Wed, Nov 30, 2022 at 11:09 PM Masahiko Sawada 
wrote:
>
> I've investigated this issue and have a question about using atomic
> variables on palloc'ed memory. In non-parallel vacuum cases,
> radix_tree_control is allocated via aset.c. IIUC in 32-bit machines,
> the memory allocated by aset.c is 4-bytes aligned so these atomic
> variables are not always 8-bytes aligned. Is there any way to enforce
> 8-bytes aligned memory allocations in 32-bit machines?

The bigger question in my mind is: Why is there an atomic variable in
backend-local memory?

--
John Naylor
EDB: http://www.enterprisedb.com


Re: SI-read predicate locks on materialized views

2022-11-30 Thread Michael Paquier
On Tue, Oct 18, 2022 at 05:29:58PM +0900, Yugo NAGATA wrote:
> Thank you for your review. I agree that an isolation test is required.
> The attached patch contains the test using the scenario as explained in
> the previous post.

Cool, thanks.  Sorry for my late reply here.  I have put my head on
that for a few hours and could not see why we should not allow that.
So committed the change after a few tweaks to the tests with the use
of custom permutations, mainly.

While looking at all that, I have looked at the past threads like [1],
just to note that this has never been really mentioned.

[1]: 
https://www.postgresql.org/message-id/1371225929.28496.yahoomail...@web162905.mail.bf1.yahoo.com
--
Michael




Re: Perform streaming logical transactions by background workers and parallel apply

2022-11-30 Thread Masahiko Sawada
On Wed, Nov 30, 2022 at 7:54 PM Amit Kapila  wrote:
>
> On Tue, Nov 29, 2022 at 10:18 AM houzj.f...@fujitsu.com
>  wrote:
> >
> > Attach the new version patch which addressed all comments.
> >
>
> Some comments on v53-0002*
> 
> 1. I think testing the scenario where the shm_mq buffer is full
> between the leader and parallel apply worker would require a large
> amount of data and then also there is no guarantee. How about having a
> developer GUC [1] force_apply_serialize which allows us to serialize
> the changes and only after commit the parallel apply worker would be
> allowed to apply it?

+1

The code coverage report shows that we don't cover the partial
serialization codes. This GUC would improve the code coverage.
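
For the record, a sketch of how such a developer GUC might look in
guc_tables.c; the name force_apply_serialize comes from the suggestion
above, while the context, flags, and wording are only assumptions:

{
    {"force_apply_serialize", PGC_SIGHUP, DEVELOPER_OPTIONS,
        gettext_noop("Serializes streamed transactions to disk instead of "
                     "sending them to the parallel apply worker."),
        NULL,
        GUC_NOT_IN_SAMPLE
    },
    &force_apply_serialize,    /* hypothetical bool variable */
    false,
    NULL, NULL, NULL
},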

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Improve performance of pg_strtointNN functions

2022-11-30 Thread David Rowley
On Thu, 1 Dec 2022 at 18:27, John Naylor  wrote:
> I don't see why the non-decimal literal patch needs to be "immediately" 
> faster? If doing this first leads to less code churn, that's another 
> consideration, but you haven't made that argument.

My view is that Peter wants to keep the code he's adding for the hex,
octal and binary parsing as similar to the existing code as possible.
I very much understand Peter's point of view on that. Consistency is
good. However, if we commit the hex literals patch first, people might
ask "why don't we use bit-wise operators to make the power-of-2 bases
faster?", which seems like a very legitimate question. I asked it,
anyway...  On the other hand, if Peter adds the bit-wise operators
then the problem of code inconsistency remains.

As an alternative to those 2 options, I'm proposing we commit this
first then the above dilemma disappears completely.
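
To make the power-of-2 point concrete, here is a hedged sketch (not
Peter's patch) of what a hex digit loop can look like once the per-digit
pg_mul_sNN_overflow() call is gone; "hexval" is a hypothetical helper
mapping a hex digit character to 0..15:

unsigned int acc = 0;

while (isxdigit((unsigned char) *ptr))
{
    /* if any of the top four bits are set, the shift would overflow */
    if (unlikely(acc & 0xF0000000))
        goto out_of_range;
    acc = (acc << 4) | hexval((unsigned char) *ptr++);
}

The compiler gets the shift for free, and any value that passes the loop
check but exceeds the int32 range is caught by the final sign-dependent
checks.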

If this was going to cause huge conflicts with Peter's patch then I
might think differently. I feel like it's a fairly trivial task to
rebase.

If the consensus is that we should fix this afterwards, then I'm happy to delay.

David




Re: Improve performance of pg_strtointNN functions

2022-11-30 Thread John Naylor
On Thu, Dec 1, 2022 at 6:42 AM David Rowley  wrote:
>
> I was thinking that we should likely apply this before doing the hex
> literals, which is the main focus of [1].  The reason being is so that
> that patch can immediately have faster conversions by allowing the
> compiler to use bit shifting instead of other means of multiplying by
> a power-of-2 number. I'm hoping this removes a barrier for Peter from
> the small gripe I raised on that thread about the patch having slower
> than required hex, octal and binary string parsing.

I don't see why the non-decimal literal patch needs to be "immediately"
faster? If doing this first leads to less code churn, that's another
consideration, but you haven't made that argument.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Prefetch the next tuple's memory during seqscans

2022-11-30 Thread John Naylor
On Wed, Nov 23, 2022 at 4:58 AM David Rowley  wrote:

> My current thoughts are that it might be best to go with 0005 to start
> with.

+1

> I know Melanie is working on making some changes in this area,
> so perhaps it's best to leave 0002 until that work is complete.

There seem to be some open questions about that one as well.

I reran the same test in [1] (except I don't have the ability to lock clock
speed or affect huge pages) on an older CPU from 2014 (Intel(R) Xeon(R) CPU
E5-2695 v3 @ 2.30GHz, kernel 3.10 gcc 4.8) with good results:

HEAD:

Testing a1
latency average = 965.462 ms
Testing a2
latency average = 1054.608 ms
Testing a3
latency average = 1078.263 ms
Testing a4
latency average = 1120.933 ms
Testing a5
latency average = 1162.753 ms
Testing a6
latency average = 1298.876 ms
Testing a7
latency average = 1228.775 ms
Testing a8
latency average = 1293.535 ms

0001+0005:

Testing a1
latency average = 791.224 ms
Testing a2
latency average = 876.421 ms
Testing a3
latency average = 911.039 ms
Testing a4
latency average = 981.693 ms
Testing a5
latency average = 998.176 ms
Testing a6
latency average = 979.954 ms
Testing a7
latency average = 1066.523 ms
Testing a8
latency average = 1030.235 ms

I then tested a Power8 machine (also kernel 3.10 gcc 4.8). Configure
reports "checking for __builtin_prefetch... yes", but I don't think it does
anything here, as the results are within noise level. A quick search didn't
turn up anything informative on this platform, and I'm not motivated to dig
deeper. In any case, it doesn't make things worse.

HEAD:

Testing a1
latency average = 1402.163 ms
Testing a2
latency average = 1442.971 ms
Testing a3
latency average = 1599.188 ms
Testing a4
latency average = 1664.397 ms
Testing a5
latency average = 1782.091 ms
Testing a6
latency average = 1860.655 ms
Testing a7
latency average = 1929.120 ms
Testing a8
latency average = 2021.100 ms

0001+0005:

Testing a1
latency average = 1433.080 ms
Testing a2
latency average = 1428.369 ms
Testing a3
latency average = 1542.406 ms
Testing a4
latency average = 1642.452 ms
Testing a5
latency average = 1737.173 ms
Testing a6
latency average = 1828.239 ms
Testing a7
latency average = 1920.909 ms
Testing a8
latency average = 2036.922 ms

[1]
https://www.postgresql.org/message-id/CAFBsxsHqmH_S%3D4apc5agKsJsF6xZ9f6NaH0Z83jUYv3EgySHfw%40mail.gmail.com

--
John Naylor
EDB: http://www.enterprisedb.com


Re: pgsql: Revoke PUBLIC CREATE from public schema, now owned by pg_databas

2022-11-30 Thread Robert Haas
On Wed, Nov 30, 2022 at 5:35 PM Tom Lane  wrote:
> Also, I'd like to structure things so that the first para covers what
> you need to know in a clean v15+ installation, and details that only
> apply in upgrade scenarios are in the second para.  The upgrade scenario
> is going to be interesting to fewer and fewer people over time, so let's
> not clutter the lede with it.

Right, that was my main feeling about this.

> So maybe about like this?
>
> Constrain ordinary users to user-private schemas.  To implement
> this pattern, for every user needing to create non-temporary
> objects, create a schema with the same name as that user.  (Recall
> that the default search path starts with $user, which resolves to
> the user name. Therefore, if each user has a separate schema, they
> access their own schemas by default.)  Also ensure that no other
> schemas have public CREATE privileges.  This pattern is a secure
> schema usage pattern unless an untrusted user is the database
> owner or holds the CREATEROLE privilege, in which case no secure
> schema usage pattern exists.
>
> In PostgreSQL 15 and later, the default configuration supports
> this usage pattern.  In prior versions, or when using a database
> that has been upgraded from a prior version, you will need to
> remove the public CREATE privilege from the public schema (issue
> REVOKE CREATE ON SCHEMA public FROM PUBLIC).  Then consider
> auditing the public schema for objects named like objects in
> schema pg_catalog.
>
> This is close to what Robert wrote, but not exactly the same,
> so probably it will make neither of you happy ;-)

I haven't looked at how it's different from what I wrote exactly, but
it seems fine to me.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Allow placeholders in ALTER ROLE w/o superuser

2022-11-30 Thread Alexander Korotkov
On Wed, Nov 23, 2022 at 1:53 AM Steve Chavez  wrote:
> So from my side this all looks good!

Thank you for your feedback.

The next revision of the patch is attached.  It contains code
improvements, comments and documentation.  I'm also going to write
some tests.  pg_db_role_setting doesn't seem to be well-covered with
tests.  I will probably need to write a new module in
src/test/modules to check how placeholders interact with dynamically
defined GUCs.

--
Regards,
Alexander Korotkov


0001-ALTER-ROLE-.-SET-.-TO-.-USER-SET-v2.patch


Re: Allow round() function to accept float and double precision

2022-11-30 Thread David Rowley
On Thu, 1 Dec 2022 at 15:41, David G. Johnston
 wrote:
> I don't get the point of adding a function here (or at least one called 
> round) - the type itself is inexact so, as you say, it is actually more of a 
> type conversion with an ability to specify precision, which is exactly what 
> you get today when you write 1.48373::numeric(20,3) - though it is a bit 
> annoying having to specify an arbitrary precision.

An additional problem with that which you might have missed is that
you'd need to know what to specify in the precision part of the
typemod.  You might start getting errors one day if you don't select a
value large enough. That problem does not exist with round().  Having
to specify 131072 each time does not sound like a great solution; it's
not exactly a very memorable number.

David




Re: Allow round() function to accept float and double precision

2022-11-30 Thread David G. Johnston
On Wed, Nov 30, 2022 at 6:45 PM Tom Lane  wrote:

> David Rowley  writes:
> > I'm unsure what the repercussions are of the fact that REAL and FLOAT8
> > are not represented as decimals.
>
> The main thing is that I think the output will still have to be
> NUMERIC, or you're going to get complaints about "inaccurate"
> results.  Before we got around to inventing infinities for NUMERIC,
> that choice would have been problematic, but now I think it's OK.
>
>
I don't get the point of adding a function here (or at least one called
round) - the type itself is inexact so, as you say, it is actually more of
a type conversion with an ability to specify precision, which is exactly
what you get today when you write 1.48373::numeric(20,3) - though it is a
bit annoying having to specify an arbitrary precision.

At present round does allow you to specify a negative position to round at
positions to the left of the decimal point (this is undocumented though...)
which the actual cast cannot do, but that seems like a marginal case.

Maybe call it: make_exact(numeric, integer) ?

I do get a feeling like I'm being too pedantic here though...

David J.


RE: Partial aggregates pushdown

2022-11-30 Thread fujii.y...@df.mitsubishielectric.co.jp
Hi Mr. Pyhalov.

> One more issue I started to think about - now we don't check
> partialagg_minversion for "simple" aggregates at all. Is it correct?
> It seems that, for example, we could try to pushdown bit_or(int8) to
> old servers, but it didn't exist, for example, in 8.4.  I think it's a
> broader issue (it would be also the case already if we push down
> aggregates) and shouldn't be fixed here. But there is an issue -
> is_shippable() is too optimistic.
I think it is correct for now.
F.38.7 of [1] says "A limitation however is that postgres_fdw generally
assumes that immutable built-in functions and operators are safe to send
to the remote server for execution, if they appear in a WHERE clause for
a foreign table." and notes that this limitation can be avoided by
rewriting the query.
It looks like postgres_fdw follows this policy in the case of
UPPERREL_GROUP_AGG aggregate pushdown.
If an aggregate's aggtranstype is not internal and it has no aggfinalfn,
its partialaggfn is equal to itself.
So I think it is adequate to follow the same policy for partial
aggregate pushdown of such aggregates.

> >> 2) Do we really have to look at pg_proc in partial_agg_ok() and
> >> deparseAggref()? Perhaps, looking at aggtranstype is enough?
> > You are right. I fixed according to your comment.
> >
> 
> partial_agg_ok() still looks at pg_proc.
Sorry for taking up your time. I fixed it.

[1] https://www.postgresql.org/docs/current/postgres-fdw.html

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center, Mitsubishi Electric Corporation


0001-Partial-aggregates-push-down-v15.patch


Re: Add LZ4 compression in pg_dump

2022-11-30 Thread Michael Paquier
On Wed, Nov 30, 2022 at 05:11:44PM +, gkokola...@pm.me wrote:
> Fair enough. The attached v11 does that. 0001 introduces compression
> specification and is using it throughout. 0002 paves the way to the
> new interface by homogenizing the use of cfp. 0003 introduces the new
> API and stores the compression algorithm in the custom format header
> instead of the compression level integer. Finally 0004 adds support for
> LZ4.

I have been looking at 0001, and..  Hmm.  I am really wondering
whether it would not be better to just nuke this warning into orbit.
This stuff enforces non-compression even if -Z has been used to a
non-default value.  This has been moved to its current location by
cae2bb1 as of this thread:
https://www.postgresql.org/message-id/20160526.185551.242041780.horiguchi.kyotaro%40lab.ntt.co.jp

However, this is only active if -Z is used when not building with
zlib.  At the end, it comes down to whether we want to prioritize the
portability of pg_dump commands specifying a -Z/--compress across
environments knowing that these may or may not be built with zlib,
vs the amount of simplification/uniformity we would get across the
binaries in the tree once we switch everything to use the compression
specifications.  Now that pg_basebackup and pg_receivewal are managed
by compression specifications, and that we'd want more compression
options for pg_dump, I would tend to do the latter and from now on
complain if attempting to do a pg_dump -Z under --without-zlib with a
compression level > 0.  zlib is also widely available, and we don't
document the fact that non-compression is enforced in this case,
either.  (Two TAP tests with the custom format had to be tweaked.)

As per the patch, it is true that we do not need to bump the format of
the dump archives, as we can still store only the compression level
and guess the method from it.  I have added some notes about that in
ReadHead and WriteHead to not forget.

Most of the changes are really straightforward, and it has resisted
my tests, so I think that this is in a rather-commitable shape as-is.
--
Michael
From a4fa522d0259e8969cde32798a917321cced0415 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 1 Dec 2022 11:03:41 +0900
Subject: [PATCH v12] Teach pg_dump about compress_spec and use it throughout.

Align pg_dump with the rest of the binaries which use common compression. It is
teaching pg_dump.c about the common compression definitions and interfaces. Then
it propagates those throughout the code.
---
 src/bin/pg_dump/compress_io.c   | 107 
 src/bin/pg_dump/compress_io.h   |  20 ++--
 src/bin/pg_dump/pg_backup.h |   7 +-
 src/bin/pg_dump/pg_backup_archiver.c|  76 +-
 src/bin/pg_dump/pg_backup_archiver.h|  10 +-
 src/bin/pg_dump/pg_backup_custom.c  |   6 +-
 src/bin/pg_dump/pg_backup_directory.c   |  13 ++-
 src/bin/pg_dump/pg_backup_tar.c |  11 +-
 src/bin/pg_dump/pg_dump.c   |  67 +++-
 src/bin/pg_dump/t/001_basic.pl  |  34 +--
 src/bin/pg_dump/t/002_pg_dump.pl|   3 +-
 src/test/modules/test_pg_dump/t/001_base.pl |  16 +++
 doc/src/sgml/ref/pg_dump.sgml   |  34 +--
 src/tools/pgindent/typedefs.list|   1 -
 14 files changed, 242 insertions(+), 163 deletions(-)

diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
index 62f940ff7a..8f0d6d6210 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_io.c
@@ -64,7 +64,7 @@
 /* typedef appears in compress_io.h */
 struct CompressorState
 {
-	CompressionAlgorithm comprAlg;
+	pg_compress_specification compression_spec;
 	WriteFunc	writeF;
 
 #ifdef HAVE_LIBZ
@@ -74,9 +74,6 @@ struct CompressorState
 #endif
 };
 
-static void ParseCompressionOption(int compression, CompressionAlgorithm *alg,
-   int *level);
-
 /* Routines that support zlib compressed data I/O */
 #ifdef HAVE_LIBZ
 static void InitCompressorZlib(CompressorState *cs, int level);
@@ -93,57 +90,30 @@ static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
 static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
    const char *data, size_t dLen);
 
-/*
- * Interprets a numeric 'compression' value. The algorithm implied by the
- * value (zlib or none at the moment), is returned in *alg, and the
- * zlib compression level in *level.
- */
-static void
-ParseCompressionOption(int compression, CompressionAlgorithm *alg, int *level)
-{
-	if (compression == Z_DEFAULT_COMPRESSION ||
-		(compression > 0 && compression <= 9))
-		*alg = COMPR_ALG_LIBZ;
-	else if (compression == 0)
-		*alg = COMPR_ALG_NONE;
-	else
-	{
-		pg_fatal("invalid compression code: %d", compression);
-		*alg = COMPR_ALG_NONE;	/* keep compiler quiet */
-	}
-
-	/* The level is just the passed-in value. */
-	if (level)
-		*level = compression;
-}
-
 /* Public interface routines */
 
 /* 

Re: Allow round() function to accept float and double precision

2022-11-30 Thread Tom Lane
David Rowley  writes:
> We do have some weirdness in some existing overloaded functions.
> pg_size_pretty() is an example.

> If you run: SELECT pg_size_pretty(1000); you get:
> ERROR:  function pg_size_pretty(integer) is not unique

Yeah, you have to be careful about that when proposing to overload
a function name.

> That occurs because we don't know if we should promote the INT into a
> BIGINT or into a NUMERIC. We have a pg_size_pretty() function for each
> of those.  I don't think the same polymorphic type resolution problem
> exists for REAL, FLOAT8 and NUMERIC.

I would counsel against bothering with a REAL version.  FLOAT8 will
cover that case just fine.

> I'm unsure what the repercussions are of the fact that REAL and FLOAT8
> are not represented as decimals.

The main thing is that I think the output will still have to be
NUMERIC, or you're going to get complaints about "inaccurate"
results.  Before we got around to inventing infinities for NUMERIC,
that choice would have been problematic, but now I think it's OK.

regards, tom lane




Re: Allow round() function to accept float and double precision

2022-11-30 Thread David Rowley
On Thu, 1 Dec 2022 at 07:39, Sayyid Ali Sajjad Rizavi
 wrote:
>
> Whenever rounding a number to a fixed number of decimal points in a 
> calculation, we need to cast the number into a numeric before using 
> round((col1/100.0)::numeric, 2).
>
> It would be convenient for everyone if round() also accepts float and double 
> precision.
>
> Is this something I could work with? And is that feasible?

I don't immediately see any issues with adding such a function.

We do have some weirdness in some existing overloaded functions.
pg_size_pretty() is an example.

If you run: SELECT pg_size_pretty(1000); you get:
ERROR:  function pg_size_pretty(integer) is not unique

That occurs because we don't know if we should promote the INT into a
BIGINT or into a NUMERIC. We have a pg_size_pretty() function for each
of those.  I don't think the same polymorphic type resolution problem
exists for REAL, FLOAT8 and NUMERIC. If a literal has a decimal point,
it's a NUMERIC, so it'll just use the numeric version of round().

I'm unsure what the repercussions are of the fact that REAL and FLOAT8
are not represented as decimals.  So I'm not quite sure what real
guarantees there are that the number is printed out with the number of
decimal places that you've rounded the number to.

Doing:

create function round(n float8, d int) returns float8 as $$ begin
return round(n::numeric, d)::float8; end; $$ language plpgsql;

and running things like:

select round(3.333::float8,10);

I'm not seeing any issues.

David




Re: Strange failure on mamba

2022-11-30 Thread Tom Lane
Andres Freund  writes:
> On 2022-11-30 18:33:06 -0500, Tom Lane wrote:
>> Even if somebody comes up with a rewrite to avoid doing interesting stuff in
>> the postmaster's signal handlers, we surely wouldn't risk back-patching it.

> Would that actually fix anything, given netbsd's brokenness? If we used a
> latch like mechanism, the signal handler would still use functions in libc. So
> postmaster could deadlock, at least during the first execution of a signal
> handler? So I think 8acd8f869 continues to be important...

I agree that "-z now" is a good idea for performance reasons, but
what we're seeing is that it's only a partial fix for netbsd's issue,
since it doesn't apply to shared libraries that the postmaster pulls
in.

I'm not sure about your thesis that things are fundamentally broken.
It does seem like if a signal handler does SetLatch then that could
require PLT resolution, and if it interrupts something else doing
PLT resolution then we have a problem.  But if it were a live
problem then we'd have seen instances outside of the postmaster's
select() wait, and we haven't.

I'm kind of inclined to band-aid that select() call as previously
suggested, and see where we end up.

regards, tom lane




Re: Strange failure on mamba

2022-11-30 Thread Andres Freund
Hi,

On 2022-11-30 18:33:06 -0500, Tom Lane wrote:
> Also, I dug into my stuck processes some more, and I have to take
> back the claim that this is happening later than postmaster startup.
> All the stuck children are ones that either are launched on request
> from the startup process, or are launched as soon as we get the
> termination report for the startup process.  So it's plausible that
> the problem is happening during the postmaster's first select()
> wait.  I then got dirty with the assembly code, and found out that
> where the stack trace stops is an attempt to resolve this call:
> 
>0xfd6f7a48 <__select50+76>:  bl  0xfd700ed0 
> <803c.got2.plt_pic32._sys___select50>
> 
> which is inside libpthread.so and is trying to call something in libc.so.
> So we successfully got to the select() function from PostmasterMain, but
> that has a non-prelinked call to someplace else, and kaboom.

This whole area just seems quite broken in netbsd :(.

We're clearly doing stuff in a signal handler that we really shouldn't, but
not being able to call any functions implemented in libc, even if they're
async signal safe (as e.g. select is) means signals are basically not
usable. Afaict this basically means that signals are *never* safe on netbsd,
as long as there's a single external function call in a signal handler.



> I've adjusted mamba to set LD_BIND_NOW=1 in its environment.
> I've verified that that causes the call inside __select50
> to get resolved before we reach main(), so I'm hopeful that
> it will cure the issue.  But it'll probably be a few weeks
> before we can be sure.
> 
> Don't have a good idea about a non-band-aid fix.

It's also a band aid, but perhaps a bit more reliable: We could link
statically to libc and libpthread.

Another approach could be to iterate over the loaded shared libraries during
postmaster startup and force symbols to be resolved. IIRC there's functions
that'd allow that. But it seems like a lot of work to work around an OS bug.
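
For the record, a sketch of that idea; whether re-dlopen()ing an
already-loaded object with RTLD_NOW actually upgrades its lazy PLT
entries is platform-dependent and would need verifying on NetBSD:

#include <dlfcn.h>
#include <link.h>

/* Walk every object mapped into the process and ask the runtime linker
 * to bind it immediately; RTLD_NOLOAD keeps this from loading anything
 * new.  Untested sketch only. */
static int
force_bind_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    if (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
        (void) dlopen(info->dlpi_name, RTLD_NOW | RTLD_NOLOAD);
    return 0;                   /* continue iterating */
}

void
resolve_all_plt_entries(void)
{
    dl_iterate_phdr(force_bind_cb, NULL);
}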


> Perhaps we should revert 8acd8f869 altogether, but then what?

FWIW, I think we should consider using those flags everywhere for the backend
- they make copy-on-write more effective and decrease connection overhead a
bit, because otherwise each backend process does the same symbol resolutions
again and again, dirtying memory post-fork.


> Even if somebody comes up with a rewrite to avoid doing interesting stuff in
> the postmaster's signal handlers, we surely wouldn't risk back-patching it.

Would that actually fix anything, given netbsd's brokenness? If we used a
latch like mechanism, the signal handler would still use functions in libc. So
postmaster could deadlock, at least during the first execution of a signal
handler? So I think 8acd8f869 continues to be important...


Greetings,

Andres Freund




Re: Tests for psql \g and \o

2022-11-30 Thread Michael Paquier
On Wed, Nov 30, 2022 at 12:33:59PM -0600, Justin Pryzby wrote:
> I think you could do that with a perl 0-liner.

Right.  And this could do something similar to
025_stuck_on_old_timeline.pl in terms of finding the binary for perl.
--
Michael




Improve performance of pg_strtointNN functions

2022-11-30 Thread David Rowley
Over on [1], Dean and I were discussing why both gcc and clang don't
seem to want to optimize the multiplication that we're doing in
pg_strtoint16, pg_strtoint32 and pg_strtoint64 into something more
efficient.  It seems that this is down to the overflow checks.
Removing seems to allow the compiler to better optimize the compiled
code.  See the use of LEA instead of IMUL in [2].

Instead of using the pg_mul_sNN_overflow functions, we can just
accumulate the number in an unsigned version of the type and do an
overflow check by checking if the accumulated value has gone beyond a
10th of the maximum *signed* value for the type.  We then just need to
do some final overflow checks after the accumulation is done.  What
those are depends on whether the number is negative or positive.
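
A sketch of those final checks (illustration only, assuming the digits
were accumulated into an unsigned 32-bit "acc"):

if (neg)
{
    /* -PG_INT32_MIN is representable as (uint32) PG_INT32_MAX + 1 */
    if (unlikely(acc > (uint32) PG_INT32_MAX + 1))
        goto out_of_range;
    /* implementation-defined in ISO C, but yields PG_INT32_MIN on the
     * two's-complement targets PostgreSQL supports */
    return (int32) -acc;
}

if (unlikely(acc > (uint32) PG_INT32_MAX))
    goto out_of_range;
return (int32) acc;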

I ran a few microbenchmarks with the attached str2int.c file and I see
about a 10-20% performance increase:

$ ./str2int -1 1
n = -1, e = 1
pg_strtoint32 in 3.207926 seconds
pg_strtoint32_v2 in 2.763062 seconds (16.100399% faster)

v2 is the updated version of the function

On Windows, the gains are generally a bit more. I think this must be
due to the lack of overflow intrinsic functions.

>str2int.exe -1 1
n = -1, e = 1
pg_strtoint32 in 9.416000 seconds
pg_strtoint32_v2 in 8.099000 seconds (16.261267% faster)

I was thinking that we should likely apply this before doing the hex
literals, which is the main focus of [1].  The reason being is so that
that patch can immediately have faster conversions by allowing the
compiler to use bit shifting instead of other means of multiplying by
a power-of-2 number. I'm hoping this removes a barrier for Peter from
the small gripe I raised on that thread about the patch having slower
than required hex, octal and binary string parsing.

David

[1] 
https://postgr.es/m/caaphdvrl6_+wkgpqrhr7gh_6xy3hxm6a3qcsz5forurjdff...@mail.gmail.com
[2] https://godbolt.org/z/7YoMT63q1
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <time.h>

#define PG_INT32_MIN    (-0x7FFFFFFF-1)
#define PG_INT32_MAX    (0x7FFFFFFF)

#define true 1
#define false 0

#ifndef _MSC_VER
#define HAVE__BUILTIN_CLZ
#define HAVE__BUILTIN_OP_OVERFLOW
#define likely(x)   __builtin_expect((x) != 0, 1)
#define unlikely(x) __builtin_expect((x) != 0, 0)
#else
#define likely(x)   x
#define unlikely(x) x
#endif

typedef char bool;
typedef unsigned int uint32;
typedef int int32;
typedef char int8;
typedef unsigned char uint8;
typedef long long int64;

static inline bool
pg_sub_s32_overflow(int32 a, int32 b, int32 *result)
{
#if defined(HAVE__BUILTIN_OP_OVERFLOW)
    return __builtin_sub_overflow(a, b, result);
#else
    int64       res = (int64) a - (int64) b;

    if (res > PG_INT32_MAX || res < PG_INT32_MIN)
    {
        *result = 0x5EED;       /* to avoid spurious warnings */
        return true;
    }
    *result = (int32) res;
    return false;
#endif
}

static inline bool
pg_mul_s32_overflow(int32 a, int32 b, int32 *result)
{
#if defined(HAVE__BUILTIN_OP_OVERFLOW)
    return __builtin_mul_overflow(a, b, result);
#else
    int64       res = (int64) a * (int64) b;

    if (res > PG_INT32_MAX || res < PG_INT32_MIN)
    {
        *result = 0x5EED;       /* to avoid spurious warnings */
        return true;
    }
    *result = (int32) res;
    return false;
#endif
}

int32
pg_strtoint32(const char *s)
{
    const char *ptr = s;
    int32       tmp = 0;
    bool        neg = false;

    /* skip leading spaces */
    while (likely(*ptr) && isspace((unsigned char) *ptr))
        ptr++;

    /* handle sign */
    if (*ptr == '-')
    {
        ptr++;
        neg = true;
    }
    else if (*ptr == '+')
        ptr++;

    /* require at least one digit */
    if (unlikely(!isdigit((unsigned char) *ptr)))
        goto invalid_syntax;

    while (*ptr && isdigit((unsigned char) *ptr))
    {
        int8        digit = (*ptr++ - '0');

        if (unlikely(pg_mul_s32_overflow(tmp, 10, &tmp)) ||
            unlikely(pg_sub_s32_overflow(tmp, digit, &tmp)))
            goto out_of_range;
    }

    /* allow trailing whitespace, but not other trailing chars */
    while (*ptr != '\0' && isspace((unsigned char) *ptr))
        ptr++;

    if (unlikely(*ptr != '\0'))
        goto invalid_syntax;

    if (!neg)
    {
        /* could fail if input is most negative number */
        if (unlikely(tmp == PG_INT32_MIN))
            goto out_of_range;
        tmp = -tmp;
    }

    return tmp;

out_of_range:
    fprintf(stderr, "value \"%s\" is out of range for type %s",
            s, "integer");

invalid_syntax:
    fprintf(stderr, "invalid 

Re: Non-decimal integer literals

2022-11-30 Thread Tom Lane
David Rowley  writes:
> I agree that it should be a separate patch.  But thinking about what
> Tom mentioned in [1], I had in mind this patch would need to wait
> until the new standard is out so that we have a more genuine reason
> for breaking existing queries.

Well, we already broke them in v15: that example now gives

regression=# select 0x42e;
ERROR:  trailing junk after numeric literal at or near "0x"
LINE 1: select 0x42e;
               ^

So there's probably no compatibility reason not to drop the
other shoe.

regards, tom lane




Re: Strange failure on mamba

2022-11-30 Thread Tom Lane
Andres Freund  writes:
> On 2022-11-30 00:55:42 -0500, Tom Lane wrote:
>> Googling LD_BIND_NOW suggests that that's a Linux thing; do you know that
>> it should have an effect on NetBSD?

> I'm not at all sure it does, but I did see it listed in
> https://man.netbsd.org/ld.elf_so.1
>  LD_BIND_NOW  If defined immediate binding of Procedure Link Table
>   (PLT) entries is performed instead of the default lazy
>   method.

I checked the source code, and learned that (1) yes, rtld does pay
attention to this, and (2) the documentation lies: it has to be not
only defined, but nonempty, to get any effect.

Also, I dug into my stuck processes some more, and I have to take
back the claim that this is happening later than postmaster startup.
All the stuck children are ones that either are launched on request
from the startup process, or are launched as soon as we get the
termination report for the startup process.  So it's plausible that
the problem is happening during the postmaster's first select()
wait.  I then got dirty with the assembly code, and found out that
where the stack trace stops is an attempt to resolve this call:

   0xfd6f7a48 <__select50+76>:  bl  0xfd700ed0 
<803c.got2.plt_pic32._sys___select50>

which is inside libpthread.so and is trying to call something in libc.so.
So we successfully got to the select() function from PostmasterMain, but
that has a non-prelinked call to someplace else, and kaboom.

In short, looks like Andres' theory is right.  It means that 8acd8f869
didn't actually fix anything, though it reduced the probability of the
failure by reducing the number of vulnerable PLT-indirect calls.

I've adjusted mamba to set LD_BIND_NOW=1 in its environment.
I've verified that that causes the call inside __select50
to get resolved before we reach main(), so I'm hopeful that
it will cure the issue.  But it'll probably be a few weeks
before we can be sure.

Don't have a good idea about a non-band-aid fix.  Perhaps we
should revert 8acd8f869 altogether, but then what?  Even if
somebody comes up with a rewrite to avoid doing interesting
stuff in the postmaster's signal handlers, we surely wouldn't
risk back-patching it.

It's possible that doing nothing is okay, at least in the
short term.  It's probably nigh impossible to hit this
issue on modern multi-CPU hardware.  Or perhaps we could revive
the idea of having postmaster.c do one dummy select() call
before it unblocks signals.
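
That band-aid could be as small as the following, placed in
PostmasterMain() before signals are unblocked (a sketch, untested on
NetBSD):

/* Throwaway call so the runtime linker resolves select()'s PLT entry
 * while no signal handler can interrupt the resolution. */
struct timeval timeout = {0, 0};

(void) select(0, NULL, NULL, NULL, &timeout);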

regards, tom lane




Re: Non-decimal integer literals

2022-11-30 Thread David Rowley
On Thu, 1 Dec 2022 at 00:34, Dean Rasheed  wrote:
> So something
> like:
>
> // Accumulate positive value using unsigned int, with approximate
> // overflow check. If acc >= 1 - INT_MIN / 10, then acc * 10 is
> // sure to exceed -INT_MIN.
> unsigned int cutoff = 1 - INT_MIN / 10;
> unsigned int acc = 0;
>
> while (*ptr && isdigit((unsigned char) *ptr))
> {
> if (unlikely(acc >= cutoff))
> goto out_of_range;
> acc = acc * 10 + (*ptr - '0');
> ptr++;
> }
>
> and similar for other bases, allowing the coding for all bases to be
> kept similar.

Seems like a good idea to me. Couldn't the cutoff check just be "acc >
INT_MAX / 10"?

> I think it's probably best to consider this as a follow-on patch
> though. It shouldn't delay getting the main feature committed.

I agree that it should be a separate patch.  But thinking about what
Tom mentioned in [1], I had in mind this patch would need to wait
until the new standard is out so that we have a more genuine reason
for breaking existing queries.

I've drafted up a full patch for improving the current base-10 code,
so I'll go post that on another thread.

David

[1] https://postgr.es/m/3260805.1631106...@sss.pgh.pa.us




Re: pgsql: Revoke PUBLIC CREATE from public schema, now owned by pg_databas

2022-11-30 Thread David G. Johnston
On Wed, Nov 30, 2022 at 3:35 PM Tom Lane  wrote:

>
> BTW, is "create a schema with the same name" sufficient detail?
> You have to either make it owned by that user, or explicitly
> grant CREATE permission on it.  I'm not sure if that detail
> belongs here, but it feels like maybe it does.
>
>
I'd mention the ownership variant and suggest using the AUTHORIZATION
clause, with an explicit example.

CREATE SCHEMA role_name AUTHORIZATION role_name;

David J.


Re: pgsql: Revoke PUBLIC CREATE from public schema, now owned by pg_databas

2022-11-30 Thread Isaac Morland
On Wed, 30 Nov 2022 at 17:35, Tom Lane  wrote:

BTW, is "create a schema with the same name" sufficient detail?
> You have to either make it owned by that user, or explicitly
> grant CREATE permission on it.  I'm not sure if that detail
> belongs here, but it feels like maybe it does.


It might be worth mentioning AUTHORIZATION. The easiest way to create an
appropriately named schema for a user is "CREATE SCHEMA AUTHORIZATION
username".


Re: heapgettup refactoring

2022-11-30 Thread Melanie Plageman
On Wed, Nov 16, 2022 at 10:49 AM Peter Eisentraut
 wrote:
>
> On 04.11.22 16:51, Melanie Plageman wrote:
> > Thanks for the review!
> > Attached is v2 with feedback addressed.
>
> Your 0001 had already been pushed.
>
> I have pushed your 0002.
>
> I have also pushed the renaming of page -> block, dp -> page separately.
>   This should reduce the size of your 0003 a bit.
>
> Please produce an updated version of the 0003 patch for further review.

Thanks for looking at this!
I have attached a patchset with only the code changes contained in the
previous patch 0003. I have broken the refactoring down into many
smaller pieces for ease of review.

- Melanie
From 2e63656e75b00b24d286533ae8c8ef291876d4a8 Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Wed, 30 Nov 2022 14:18:08 -0500
Subject: [PATCH v3 5/7] Add heapgettup() helpers

Add heapgettup_start_page() and heapgettup_continue_page(), helper
functions for heapgettup().
---
 src/backend/access/heap/heapam.c | 93 +---
 1 file changed, 61 insertions(+), 32 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index a4365beee1..f9b2d5cadb 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -575,6 +575,65 @@ heapgettup_initial_page(HeapScanDesc scan, ScanDirection dir)
 	return scan->rs_nblocks - 1;
 }
 
+static inline Page
+heapgettup_start_page(HeapScanDesc scan, BlockNumber block, ScanDirection dir,
+		int *linesleft, OffsetNumber *lineoff)
+{
+	Page page;
+
+	Assert(scan->rs_inited);
+	Assert(BufferIsValid(scan->rs_cbuf));
+
+	/* Caller is responsible for ensuring buffer is locked if needed */
+	page = BufferGetPage(scan->rs_cbuf);
+
+	TestForOldSnapshot(scan->rs_base.rs_snapshot, scan->rs_base.rs_rd, page);
+
+	*linesleft = PageGetMaxOffsetNumber((Page) page) - FirstOffsetNumber + 1;
+
+	if (ScanDirectionIsForward(dir))
+		*lineoff = FirstOffsetNumber;
+	else
+		*lineoff = (OffsetNumber) (*linesleft);
+
+	/* lineoff now references the physically previous or next tid */
+	return page;
+}
+
+static inline Page
+heapgettup_continue_page(HeapScanDesc scan, BlockNumber block, ScanDirection
+		dir, int *linesleft, OffsetNumber *lineoff)
+{
+	Page page;
+
+	Assert(scan->rs_inited);
+	Assert(BufferIsValid(scan->rs_cbuf));
+
+	/* Caller is responsible for ensuring buffer is locked if needed */
+	page = BufferGetPage(scan->rs_cbuf);
+
+	TestForOldSnapshot(scan->rs_base.rs_snapshot, scan->rs_base.rs_rd, page);
+
+	if (ScanDirectionIsForward(dir))
+	{
+		*lineoff = OffsetNumberNext(scan->rs_cindex);
+		*linesleft = PageGetMaxOffsetNumber(page) - (*lineoff) + 1;
+	}
+	else
+	{
+		/*
+		* The previous returned tuple may have been vacuumed since the
+		* previous scan when we use a non-MVCC snapshot, so we must
+		* re-establish the lineoff <= PageGetMaxOffsetNumber(page)
+		* invariant
+		*/
+		*lineoff = Min(PageGetMaxOffsetNumber(page), OffsetNumberPrev(scan->rs_cindex));
+		*linesleft = *lineoff;
+	}
+	/* block and lineoff now reference the physically next tid */
+	return page;
+}
+
 /* 
  *		heapgettup - fetch next heap tuple
  *
@@ -632,45 +691,15 @@ heapgettup(HeapScanDesc scan,
 		}
 
 		heapgetpage((TableScanDesc) scan, block);
-
 		LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
-		page = BufferGetPage(scan->rs_cbuf);
-		TestForOldSnapshot(snapshot, scan->rs_base.rs_rd, page);
-
-		linesleft = PageGetMaxOffsetNumber((Page) page) - FirstOffsetNumber + 1;
-
-		if (ScanDirectionIsForward(dir))
-			lineoff = FirstOffsetNumber;
-		else
-			lineoff = (OffsetNumber) linesleft;
+		page = heapgettup_start_page(scan, block, dir, , );
 	}
 	else
 	{
 		block = scan->rs_cblock; /* current page */
 
 		LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
-
-		page = BufferGetPage(scan->rs_cbuf);
-		TestForOldSnapshot(snapshot, scan->rs_base.rs_rd, page);
-
-		if (ScanDirectionIsForward(dir))
-		{
-			lineoff = OffsetNumberNext(scan->rs_cindex);
-			/* block and lineoff now reference the physically next tid */
-			linesleft = PageGetMaxOffsetNumber(page) - lineoff + 1;
-		}
-		else
-		{
-			/*
-			 * The previous returned tuple may have been vacuumed since the
-			 * previous scan when we use a non-MVCC snapshot, so we must
-			 * re-establish the lineoff <= PageGetMaxOffsetNumber(page)
-			 * invariant
-			 */
-			lineoff = Min(PageGetMaxOffsetNumber(page), OffsetNumberPrev(scan->rs_cindex));
-			/* block and lineoff now reference the physically previous tid */
-			linesleft = lineoff;
-		}
+		page = heapgettup_continue_page(scan, block, dir, , );
 	}
 
 	/*
-- 
2.34.1

From 8d4106f9f045999c5b3432fc19ee8f216b85df2d Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Wed, 30 Nov 2022 10:42:12 -0500
Subject: [PATCH v3 2/7] Push lpp variable closer to usage in heapgetpage()

---
 src/backend/access/heap/heapam.c | 41 
 1 file changed, 20 insertions(+), 21 deletions(-)

diff --git a/src/backend/access/heap/heapam.c 

Re: pgsql: Revoke PUBLIC CREATE from public schema, now owned by pg_databas

2022-11-30 Thread Tom Lane
Robert Haas  writes:
> On Wed, Nov 30, 2022 at 10:01 AM Noah Misch  wrote:
>> Could remove the paragraph about v14.  Could have that paragraph say
>> explicitly that the REVOKE is a no-op.  Would either of those be an
>> improvement?

> Well, I thought what I proposed was a nice improvement, but I guess if
> you don't like it I'm not inclined to spend a lot of time discussing
> other possibilities. If we get some opinions from more people that may
> make it clearer which direction to go; if I'm the only one that
> doesn't like the way it is now, it's probably not that important.

Hey, I'll step up to the plate ;-)

I agree that it's confusing to tell people to do a REVOKE that might do
nothing.  A parenthetical note explaining that might help, but the text
is pretty dense already, so really I'd rather have that info in a
separate para.

Also, I'd like to structure things so that the first para covers what
you need to know in a clean v15+ installation, and details that only
apply in upgrade scenarios are in the second para.  The upgrade scenario
is going to be interesting to fewer and fewer people over time, so let's
not clutter the lede with it.

So maybe about like this?

Constrain ordinary users to user-private schemas.  To implement
this pattern, for every user needing to create non-temporary
objects, create a schema with the same name as that user.  (Recall
that the default search path starts with $user, which resolves to
the user name. Therefore, if each user has a separate schema, they
access their own schemas by default.)  Also ensure that no other
schemas have public CREATE privileges.  This pattern is a secure
schema usage pattern unless an untrusted user is the database
owner or holds the CREATEROLE privilege, in which case no secure
schema usage pattern exists.

In PostgreSQL 15 and later, the default configuration supports
this usage pattern.  In prior versions, or when using a database
that has been upgraded from a prior version, you will need to
remove the public CREATE privilege from the public schema (issue
REVOKE CREATE ON SCHEMA public FROM PUBLIC).  Then consider
auditing the public schema for objects named like objects in
schema pg_catalog.

This is close to what Robert wrote, but not exactly the same,
so probably it will make neither of you happy ;-)

BTW, is "create a schema with the same name" sufficient detail?
You have to either make it owned by that user, or explicitly
grant CREATE permission on it.  I'm not sure if that detail
belongs here, but it feels like maybe it does.

regards, tom lane




Re: Patch: Global Unique Index

2022-11-30 Thread Greg Stark
On Tue, 29 Nov 2022 at 21:16, Tom Lane  wrote:
>
> I actually think that that problem should be soluble with a
> slightly different approach.  The thing that feels insoluble
> is that you can't do this without acquiring sufficient locks
> to prevent addition of new partitions while the insertion is
> in progress.  That will be expensive in itself, and it will
> turn ATTACH PARTITION into a performance disaster.

I think there's a lot of room to manoeuvre here. This is a new feature
that doesn't need to be 100% complete or satisfy any existing
standard. There are lots of options for compromises that leave room
for future improvements.

1) We could just say sure, ATTACH is slow if you're attaching a
non-empty partition
2) We could invent a concept like convalidated and let people attach a
partition without validating the uniqueness and then validate it later
concurrently
3) We could say ATTACH doesn't work now and come up with a better
strategy in the future

Also, don't I vaguely recall something in exclusion constraints about
having some kind of in-memory "intent" list where you declared that
you're about to insert a value, you validate it doesn't violate the
constraint and then you're free to insert it because anyone else will
see your intent in memory? There might be a need for some kind of
global object that only holds inserted keys long enough that other
sessions are guaranteed to see the key in the correct index. And that
could maybe even be in memory rather than on disk.

This isn't a simple project but I don't think it's impossible as long
as we keep an open mind about the requirements.



--
greg




Re: Enable pg_stat_statements extension for limited statements only

2022-11-30 Thread Sayyid Ali Sajjad Rizavi
Yes, I agree that infrequent statements don't need stats. Actually, I was
distracted by a use case I had in mind other than stats; maybe bringing
that up will help.

Someone might be interested in how frequently deletes are run on a
particular table, or in the exact query that ran; basically keeping track
of queries. Although now I'm less convinced that a considerable number of
people will be interested in this, but let me know what you think.


On Wed, Nov 30, 2022 at 10:15 AM Tom Lane  wrote:

> Sayyid Ali Sajjad Rizavi  writes:
> > Hi, I'd like to propose a change and get advice if I should work on it.
> > The extension pg_stat_statements is very helpful, but the downside is
> > that it will take up too much disk space when storing query stats if
> > it's enabled for all statements like SELECT, INSERT, UPDATE, DELETE.
>
> It will only take up a lot of disk space if you let it, by setting
> the pg_stat_statements.max parameter too high.
>
> > For example, deletes do not happen too frequently; so I'd like to be able
> > to enable pg_stat_statements only for the DELETE statement, maybe using
> > some flags.
>
> I'm a little skeptical of the value of that.  Why would you want stats
> only for infrequent statements?
>
> I'm not denying that there might be usefulness in filtering what
> pg_stat_statements will track, but it's not clear to me that
> this particular proposal will be useful to many people.
>
> I wonder whether there would be more use in filters expressed
> as regular expressions to match against the statement text.
> That would allow, for example, tracking statements that mention
> a particular table as well as statements with a particular
> head keyword.  I could see usefulness in both a positive filter
> (must match this to get tracked) and a negative one (must not
> match this to get tracked).
>
> regards, tom lane
>


Allow round() function to accept float and double precision

2022-11-30 Thread Sayyid Ali Sajjad Rizavi
Whenever rounding a number to a fixed number of decimal points in a
calculation, we need to cast the number into a numeric before using
round((col1/100.0)::numeric, 2).

It would be convenient for everyone if round() also accepts float and
double precision.

Is this something I could work on? And is it feasible?
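
For concreteness, here is a rough sketch of what the C-level function might
look like (everything below is hypothetical, not existing code; note that
binary floating point cannot represent most decimal fractions exactly, which
I expect would be the main objection):

#include "postgres.h"

#include <math.h>

#include "fmgr.h"

PG_FUNCTION_INFO_V1(dround_ndigits);

/* Hypothetical round(double precision, integer) -> double precision */
Datum
dround_ndigits(PG_FUNCTION_ARGS)
{
	float8		arg = PG_GETARG_FLOAT8(0);
	int32		ndigits = PG_GETARG_INT32(1);
	float8		mult = pow(10.0, ndigits);

	/* Inexact for values with no exact binary representation, e.g. 0.285 */
	PG_RETURN_FLOAT8(rint(arg * mult) / mult);
}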


Re: Tests for psql \g and \o

2022-11-30 Thread Justin Pryzby
On Wed, Nov 30, 2022 at 02:50:16PM +0900, Michael Paquier wrote:
> On Wed, Nov 23, 2022 at 09:18:57PM +0100, Daniel Verite wrote:
> > PFA a new patch addressing these issues.
> 
> Thanks, the tests part of the main regression test suite look good to
> me, so I have applied them after fixing a few typos and tweaking the
> style of the test.  Regarding the tests with pipes, I had cold feet
> with the dependencies on cat for non-WIN32 or findstr for WIN32.

I think you could do that with a perl 0-liner.

$ echo foo |perl -pe ''
foo

-- 
Justin




Re: Tests for psql \g and \o

2022-11-30 Thread Daniel Verite
Michael Paquier wrote:

> Thanks, the tests part of the main regression test suite look good to
> me, so I have applied them after fixing a few typos and tweaking the
> style of the test.

Thanks!

> Regarding the tests with pipes, I had cold feet with the
> dependencies on cat for non-WIN32 or findstr for WIN32.

OK. If the issue is that these programs might be missing, I guess
we could check that beforehand with IPC::Run and skip the
corresponding psql tests if they're not available or not working
as expected.


Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite




Re: Enable pg_stat_statements extension for limited statements only

2022-11-30 Thread Tom Lane
Sayyid Ali Sajjad Rizavi  writes:
> Hi, I'd like to propose a change and get advice if I should work on it.
> The extension pg_stat_statements is very helpful, but the downside is that
> it will take up too much disk space when storing query stats if it's
> enabled for all statements like SELECT, INSERT, UPDATE, DELETE.

It will only take up a lot of disk space if you let it, by setting
the pg_stat_statements.max parameter too high.

> For example, deletes do not happen too frequently; so I'd like to be able
> to enable pg_stat_statements only for the DELETE statement, maybe using
> some flags.

I'm a little skeptical of the value of that.  Why would you want stats
only for infrequent statements?

I'm not denying that there might be usefulness in filtering what
pg_stat_statements will track, but it's not clear to me that
this particular proposal will be useful to many people.

I wonder whether there would be more use in filters expressed
as regular expressions to match against the statement text.
That would allow, for example, tracking statements that mention
a particular table as well as statements with a particular
head keyword.  I could see usefulness in both a positive filter
(must match this to get tracked) and a negative one (must not
match this to get tracked).
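
For concreteness, such a filter could be checked at the point where
pg_stat_statements decides whether to track a statement.  A rough sketch,
with entirely hypothetical GUCs (and using POSIX <regex.h> for brevity;
the real thing would presumably use our own regex engine):

#include <regex.h>

/* Hypothetical: compiled at GUC assignment from
 * pg_stat_statements.track_regex / pg_stat_statements.skip_regex. */
static regex_t track_re;
static regex_t skip_re;
static bool track_re_set = false;
static bool skip_re_set = false;

static bool
pgss_query_passes_filters(const char *query_text)
{
	/* negative filter: statement must NOT match to get tracked */
	if (skip_re_set && regexec(&skip_re, query_text, 0, NULL, 0) == 0)
		return false;

	/* positive filter: statement must match to get tracked */
	if (track_re_set && regexec(&track_re, query_text, 0, NULL, 0) != 0)
		return false;

	return true;
}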

regards, tom lane




Re: generic plans and "initial" pruning

2022-11-30 Thread Alvaro Herrera
Looking at 0001, I wonder if we should have a crosscheck that a
PartitionPruneInfo you got from following an index is indeed constructed
for the relation that you think it is: previously, you were always sure
that the prune struct is for this node, because you followed a pointer
that was set up in the node itself.  Now you only have an index, and you
have to trust that the index is correct.

I'm not sure how to implement this, or even if it's doable at all.
Keeping the OID of the partitioned table in the PartitionPruneInfo
struct is easy, but I don't know how to check it in ExecInitMergeAppend
and ExecInitAppend.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Find a bug in a program, and fix it, and the program will work today.
Show the program how to find and fix a bug, and the program
will work forever" (Oliver Silfridge)




Enable pg_stat_statements extension for limited statements only

2022-11-30 Thread Sayyid Ali Sajjad Rizavi
Hi, I'd like to propose a change and get advice if I should work on it.

The extension pg_stat_statements is very helpful, but the downside is that
it will take up too much disk space when storing query stats if it's
enabled for all statements like SELECT, INSERT, UPDATE, DELETE.

For example, deletes do not happen too frequently; so I'd like to be able
to enable pg_stat_statements only for the DELETE statement, maybe using
some flags.

Another possibility is to limit the tables for which pg_stat_statements
logs results.


Re: New docs chapter on Transaction Management and related changes

2022-11-30 Thread Tom Lane
Alvaro Herrera  writes:
> I find it a bit shocking to have had it backpatched, even to 15 -- a
> whole chapter in the documentation?  I don't see why it wouldn't be
> treated like any other "major feature" patch, which we only consider for
> the development branch.  Also, this is a first cut -- presumably we'll
> want to copy-edit it before it becomes released material.

I think that last point is fairly convincing.  I've not read the
new material, but I didn't get further than the first line of
the new chapter file before noting a copy-and-paste error:

--- /dev/null
+++ b/doc/src/sgml/xact.sgml
@@ -0,0 +1,205 @@
+

That doesn't leave me with a warm feeling that it's ready to ship.
I too vote for reverting it out of the released branches.

regards, tom lane




Re: New docs chapter on Transaction Management and related changes

2022-11-30 Thread Alvaro Herrera
On 2022-Nov-30, Bruce Momjian wrote:

> On Wed, Nov 30, 2022 at 07:33:44AM +0100, Peter Eisentraut wrote:
> > On 30.11.22 02:51, Bruce Momjian wrote:
> > > Patch applied back to PG 11.  Thanks to Simon for getting this important
> > > information in our docs, and for the valuable feedback from others that
> > > made this even better.
> > 
> > I request that the backpatching of this be reverted.  We don't want to have
> > major chapter numbers in the documentation changing between minor releases.
> 
> Uh, how do others feel about this?  I wanted to get this information to
> all our supported releases as soon as possible, rather than waiting for
> PG 16 and for people to upgrade.

I find it a bit shocking to have had it backpatched, even to 15 -- a
whole chapter in the documentation?  I don't see why it wouldn't be
treated like any other "major feature" patch, which we only consider for
the development branch.  Also, this is a first cut -- presumably we'll
want to copy-edit it before it becomes released material.

Now, keep in mind that not having it backpatched does not mean that it
is invisible to users.  It is definitely visible, if they use the doc
URL with /devel/ in it.  And this information has been missing for 20+
years, how come it is so urgent to have it everywhere now?

I agree that it should be reverted from all branches other than master.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: pg_regress/pg_isolation_regress: Fix possible nullptr dereference.

2022-11-30 Thread Tom Lane
Xing Guo  writes:
> While playing with pg_regress and pg_isolation_regress, I noticed that
> there's a potential nullptr dereference in both of them.
> How to reproduce:
> Specify the `--dbname=` option without providing any database name.

Hmm, yeah, I see that too.

> Patch is attached.

This patch seems like a band-aid, though.  The reason nobody's
noticed this for decades is that it doesn't make a lot of sense
to allow tests to run in your default database: the odds of them
screwing up something valuable are high, and the odds that they'll
fail if started in a nonempty database are even higher.

I think the right answer is to treat it as an error if we end up
with an empty dblist (or even a zero-length name).
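
Something like this in the --dbname option handling, perhaps (sketch only;
the surrounding option parsing is abbreviated, and the message wording is
just a placeholder):

			case 1:				/* --dbname */
				free_stringlist(&dblist);
				split_to_stringlist(optarg, ",", &dblist);
				if (dblist == NULL || dblist->str[0] == '\0')
				{
					fprintf(stderr, _("%s: no database name specified for --dbname\n"),
							progname);
					exit(2);
				}
				break;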

regards, tom lane




Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-30 Thread Peter Geoghegan
On Wed, Nov 30, 2022 at 8:13 AM Robert Haas  wrote:
> I haven't checked the patches to see whether they look correct, and
> I'm concerned in particular about upgrade scenarios. But if there's a
> way we can get that part committed, I think it would be a clear win.

+1

-- 
Peter Geoghegan




Re: Add 64-bit XIDs into PostgreSQL 15

2022-11-30 Thread Robert Haas
On Tue, Nov 29, 2022 at 9:35 PM Chris Travers  wrote:
> My proposal would be to make the threshold configurable and start warning on 
> every transaction after that.  There are a couple reasons to do that.
>
> The first is that noisy warnings are extremely easy to see.  You get them in 
> cron emails, from psql, in the db logs etc.  Having them every million makes 
> them harder to catch.
>
> The point here is not to ensure there are no problems, but to make sure that 
> an existing layer in the current swiss cheese model of safety doesn't go 
> away.  Will it stop all problems?  No.  But the current warning strategy is 
> effective, given how many times we hear of cases of people having to take 
> drastic action to avoid impending xid wraparound.
>
> If someone has an insert-only database and maybe doesn't want to ever freeze, 
> they can set the threshold to -1 or something.  I would suggest keeping the 
> default as at 2 billion to be in line with existing limitations and 
> practices.  People can then adjust as they see fit.
>
> Warning text might be something like "XID Lag Threshold Exceeded.  Is 
> autovacuum clearing space and keeping up?"

None of this seems unreasonable to me. If we want to allow more
configurability, we could also let you choose the threshold and the
frequency of warnings (every N transactions).
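
For illustration, the warning itself could be as simple as this, placed
where we already examine XID age (names hypothetical; xidWarnLimit would be
derived from the GUC rather than hardcoded):

	/* Hypothetical GUC; -1 disables the warning entirely. */
	if (xid_warn_threshold != -1 &&
		TransactionIdFollowsOrEquals(xid, xidWarnLimit))
		ereport(WARNING,
				(errmsg("XID lag threshold exceeded"),
				 errhint("Is autovacuum clearing space and keeping up?")));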

But, I think we might be getting down a little bit in the weeds. It's
not clear that everybody's on board with the proposed page format
changes. I'm not completely opposed, but I'm also not wild about the
approach. It's probably not a good idea to spend all of our energy
debating the details of how to reform xidWrapLimit without having some
consensus on those points. It is, in a word, bikeshedding: on-disk
page format changes are hard, but everyone understands warning
messages.

Lest we miss the forest for the trees, there is an aspect of this
patch that I find to be an extremely good idea and think we should try
to get committed even if the rest of the patch set ends up in the
rubbish bin. Specifically, there are a couple of patches in here that
have to do with making SLRUs indexed by 64-bit integers rather than by
32-bit integers. We've had repeated bugs in the area of handling SLRU
wraparound in the past, some of which have caused data loss. Just by
chance, I ran across a situation just yesterday where an SLRU wrapped
around on disk for reasons that I don't really understand yet and
chaos ensued. Switching to an indexing system for SLRUs that does not
ever wrap around would probably enable us to get rid of a whole bunch
of crufty code, and would also likely improve the general reliability
of the system in situations where wraparound is threatened. It seems
like a really, really good idea.
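
To illustrate the kind of cruft that goes away: with 64-bit page numbers,
"which page is older?" becomes a plain comparison, while the 32-bit scheme
needs wraparound-aware arithmetic in every SLRU's PagePrecedes callback,
which has been a recurring source of bugs. Roughly:

/* 32-bit SLRU: ordering must be inferred modulo 2^32, roughly: */
static bool
SlruPagePrecedes32(int page1, int page2)
{
	return (int32) (page1 - page2) < 0;	/* wraparound-aware, fragile */
}

/* 64-bit SLRU: page numbers never wrap, so it's just a comparison: */
static bool
SlruPagePrecedes64(int64 page1, int64 page2)
{
	return page1 < page2;
}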

I haven't checked the patches to see whether they look correct, and
I'm concerned in particular about upgrade scenarios. But if there's a
way we can get that part committed, I think it would be a clear win.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-30 Thread Masahiko Sawada
On Wed, Nov 23, 2022 at 2:10 AM Andres Freund  wrote:
>
> On 2022-11-21 17:06:56 +0900, Masahiko Sawada wrote:
> > Sure. I've attached the v10 patches. 0004 is the pure refactoring
> > patch and 0005 patch introduces the pointer tagging.
>
> This failed on cfbot, with so many crashes that the VM ran out of disk for
> core dumps. It happened during testing with 32-bit, so there's probably
> something broken around that.
>
> https://cirrus-ci.com/task/4635135954386944
>
> A failure is e.g. at: 
> https://api.cirrus-ci.com/v1/artifact/task/4635135954386944/testrun/build-32/testrun/adminpack/regress/log/initdb.log
>
> performing post-bootstrap initialization ... 
> ../src/backend/lib/radixtree.c:1696:21: runtime error: member access within 
> misaligned address 0x590faf74 for type 'struct radix_tree_control', which 
> requires 8 byte alignment
> 0x590faf74: note: pointer points here
>   90 11 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  
> 00 00 00 00 00 00 00 00
>   ^

radix_tree_control struct has two pg_atomic_uint64 variables, and the
assertion check in pg_atomic_init_u64() failed:

static inline void
pg_atomic_init_u64(volatile pg_atomic_uint64 *ptr, uint64 val)
{
/*
 * Can't necessarily enforce alignment - and don't need it - when using
 * the spinlock based fallback implementation. Therefore only assert when
 * not using it.
 */
#ifndef PG_HAVE_ATOMIC_U64_SIMULATION
AssertPointerAlignment(ptr, 8);
#endif
pg_atomic_init_u64_impl(ptr, val);
}

I've investigated this issue and have a question about using atomic
variables on palloc'ed memory. In non-parallel vacuum cases,
radix_tree_control is allocated via aset.c. IIUC in 32-bit machines,
the memory allocated by aset.c is 4-bytes aligned so these atomic
variables are not always 8-bytes aligned. Is there any way to enforce
8-bytes aligned memory allocations in 32-bit machines?
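
The only workaround I can think of is to over-allocate and round the pointer
up, something like this sketch (note the rounded pointer can no longer be
passed to pfree() directly, so the raw pointer would have to be remembered
somewhere, which is part of why I'm asking):

static void *
palloc_aligned8(Size size)
{
	char	   *raw = palloc(size + 8);

	/* TYPEALIGN rounds the address up to the next 8-byte boundary. */
	return (void *) TYPEALIGN(8, raw);
}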

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Add sub-transaction overflow status in pg_stat_activity

2022-11-30 Thread Robert Haas
On Wed, Nov 23, 2022 at 3:56 PM Andres Freund  wrote:
> Indeed. This is why I was thinking that just alerting for overflowed xact
> isn't particularly helpful. You really want to see how much they overflow and
> how often.

I think if we just expose the is-overflowed field and the count, people
can poll. It works fine for wait events and I think it's fine here,
too.

> But even that might not be that helpful. Perhaps what we actually need is an
> aggregate measure showing the time spent doing subxact lookups due to
> overflowed snapshots? Seeing a substantial amount of time spent doing subxact
> lookups would be much more accurate call to action than seeing a that some
> sessions have a lot of subxacts.

That's not responsive to the need that I have. I need users to be able
to figure out which backend(s) are overflowing their snapshots -- and
perhaps how badly and how often --- not which backends are incurring
an expense as a result. There may well be a use case for the latter
thing but it's a different problem.

> I wonder if we could lower the impact of suboverflowed snapshots by improving
> the representation in PGPROC and SnapshotData. What if we
>
> a) Recorded the min and max assigned subxid in PGPROC
>
> b) Instead of giving up in GetSnapshotData() once we see a suboverflowed
>PGPROC, store the min/max subxid of the proc in SnapshotData. We could
>reliably "steal" space for that from ->subxip, as we won't need to store
>subxids for that proc.
>
> c) When determining visibility with a suboverflowed snapshot we use the
>ranges from b) to check whether we need to do a subtrans lookup. I think
>that'll often prevent subtrans lookups.

Wouldn't you basically need to take the union of all the ranges,
probably by keeping the lowest min and the highest max? I'm not sure
how much that would really help at that point.
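
For concreteness, a merged range would reduce the filter to something like
this (field names invented), and one long-lived overflowed session with a
wide subxid spread would make it pass for almost every XID:

	/* Invented fields: lowest min / highest max across all suboverflowed
	 * procs, captured at GetSnapshotData() time. */
	if (TransactionIdFollowsOrEquals(xid, snapshot->suboverflowed_min) &&
		TransactionIdPrecedesOrEquals(xid, snapshot->suboverflowed_max))
	{
		/* xid *might* be someone's subxact: pg_subtrans lookup needed */
	}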

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg_dump bugs reported as pg_upgrade bugs

2022-11-30 Thread Robert Haas
On Wed, Nov 30, 2022 at 12:23 AM Tom Lane  wrote:
> pg_dump scripts are *designed* to be tolerant of errors, mainly so
> that you can restore into a situation that's not exactly like where
> you dumped from, with the possible need to resolve errors or decide
> that they're not problems.  So your depiction of what happens in
> dump/restore is not showing a problem; it's about using those tools
> as they were intended to be used.
>
> Indeed, there's a bit of disconnect there with pg_upgrade, which would
> like to present a zero-user-involvement, nothing-to-see-here facade.

Yes. I think it's good that the pg_dump scripts are designed to be
tolerant of errors, but I also think that we've got to clearly
envisage that error-tolerance as the backup plan. Fifteen years ago,
it may have been acceptable to imagine that every dump-and-restore was
going to be performed by a human being who could make an intelligent
judgement about whether the errors that occurred were concerning or
not, but today, at the scale that PostgreSQL is being used, that's not
realistic. Unattended operation is common, and the number of instances
vastly outstrips the number of people who are truly knowledgeable
about the internals. The goalposts have moved because the project is
successful and widely adopted. All of this is true even apart from
pg_upgrade, but the existence of pg_upgrade and the fact that
pg_upgrade is the only way to perform a quick major version upgrade
exacerbates the problem quite a bit.

I don't know what consequences this has concretely, really. I have no
specific change to propose. I just think that we need to wrench
ourselves out of a mind-set where we imagine that some errors are OK
because the DBA will know how to fix things up. The DBA is a script.
If there's a knowledgeable person at all they have 10,000 instances to
look after and don't have time to fiddle with each one. The aspects of
PostgreSQL that tend to require manual fiddling (HA, backups,
upgrades, autovacuum) are huge barriers to wider adoption and
large-scale deployment in a way that probably just wasn't true when
the project wasn't as successful as it now is.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pgsql: Revoke PUBLIC CREATE from public schema, now owned by pg_databas

2022-11-30 Thread Robert Haas
On Wed, Nov 30, 2022 at 10:01 AM Noah Misch  wrote:
> On Wed, Nov 30, 2022 at 08:39:23AM -0500, Robert Haas wrote:
> > On Wed, Nov 30, 2022 at 2:07 AM Noah Misch  wrote:
> > > In general, the documentation should prefer simpler decision trees.
> >
> > True, but I found the current text confusing, which is also something
> > to consider.
>
> Could remove the paragraph about v14.  Could have that paragraph say
> explicitly that the REVOKE is a no-op.  Would either of those be an
> improvement?

Well, I thought what I proposed was a nice improvement, but I guess if
you don't like it I'm not inclined to spend a lot of time discussing
other possibilities. If we get some opinions from more people that may
make it clearer which direction to go; if I'm the only one that
doesn't like the way it is now, it's probably not that important.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: New docs chapter on Transaction Management and related changes

2022-11-30 Thread Bruce Momjian
On Wed, Nov 30, 2022 at 08:25:19AM -0700, David G. Johnston wrote:
> On Wed, Nov 30, 2022 at 8:02 AM Bruce Momjian  wrote:
> On Wed, Nov 30, 2022 at 07:10:35AM -0700, David G. Johnston wrote:
> If everyone agrees this new chapter is helpful, and as helpful to PG 11
> users as PG 16 users, why would we not give users this information in
> our docs now?  What is the downside?  Chapter numbers?  Translations?
> 
> Admittedly the policy is more about "we don't expend any effort to write back
> branch patches for this kind of material" rather than "we don't do back
> patching because it causes problems".  But it is an existing policy, applied
> consistently through the years, and treating the documentation like a book,
> even though it is published in a non-physical medium, is a reasonable 
> guideline
> to follow.

Well, we have not backpatched cosmetic or wording improvements into back
branches, usually, though unclear wording has been backpatched because
the value is more significant than the disruption.  I think we look at
doc backpatching on an individual basis, because there are limited
risks, unlike code changes.

> My desire to get it out in an official release early goes against this policy,
> and I'm fine waiting for v16 on that basis.  The only reason I'm good with
> updating v15 is that I basically consider anything in the first 3 point
> releases of a major version to be a "delta" release.

Well, yeah, I think limiting this to PG 15 and 16 is kind of an odd
approach, though I can see the value of doing that since we could say
anyone who cares about these details should be on the most recent major
release.  I think you are reinforcing my basic approach that doc changes
can't have a simple blanket policy, unlike code, because of the limited
risks and significant value.

> One back-patchable idea to consider would be adding a note at the top of the
> page(s) highlighting the fact that said material has been superseded by more
> current documentation, with a link.  But the idea of changing long-released
> (see my delta comment above) material doesn't sit well with me or the policy.

Uh, I don't share that concern, as long as it is mentioned in the minor
release notes.

> I assume this new chapter would be mentioned in the minor release notes.
> 
> We don't do release notes for documentation changes.

Uh, I certainly have done it for significant doc improvements in major
releases, so I don't see a problem in doing it for minor releases,
especially since this information has been needed in our docs for years.

What I am basically saying is that "we have always done it that way" is
an insufficient reason for me --- we should have some logic for _why_ we
have a policy, and I am not seeing that here.

This is obviously a bigger issue than this particular patch.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.




Re: New docs chapter on Transaction Management and related changes

2022-11-30 Thread David G. Johnston
On Wed, Nov 30, 2022 at 8:02 AM Bruce Momjian  wrote:

> On Wed, Nov 30, 2022 at 07:10:35AM -0700, David G. Johnston wrote:
> > On Wed, Nov 30, 2022 at 6:52 AM Bruce Momjian  wrote:
> > I'd maybe accept having it back-patched to v15 on that basis but not any
> > further.
> >
> > But I agree that our general behavior is to only apply this scope of
> update of
> > the documentation to HEAD.
>
> If everyone agrees this new chapter is helpful, and as helpful to PG 11
> users as PG 16 users, why would we not give users this information in
> our docs now?  What is the downside?  Chapter numbers?  Translations?
>

Admittedly the policy is more about "we don't expend any effort to write
back branch patches for this kind of material" rather than "we don't do
back patching because it causes problems".  But it is an existing policy,
applied consistently through the years, and treating the documentation like
a book, even though it is published in a non-physical medium, is a
reasonable guideline to follow.

My desire to get it out in an official release early goes against this
policy, and I'm fine waiting for v16 on that basis.  The only reason I'm
good with updating v15 is that I basically consider anything in the first 3
point releases of a major version to be a "delta" release.

One back-patchable idea to consider would be adding a note at the top of
the page(s) highlighting the fact that said material has been superseded by
more current documentation, with a link.  But the idea of changing
long-released (see my delta comment above) material doesn't sit well with
me or the policy.


> I assume this new chapter would be mentioned in the minor release notes.
>
>
We don't do release notes for documentation changes.

David J.


Re: New docs chapter on Transaction Management and related changes

2022-11-30 Thread Bruce Momjian
On Wed, Nov 30, 2022 at 07:10:35AM -0700, David G. Johnston wrote:
> On Wed, Nov 30, 2022 at 6:52 AM Bruce Momjian  wrote:
> I'd maybe accept having it back-patched to v15 on that basis but not any
> further.
> 
> But I agree that our general behavior is to only apply this scope of update of
> the documentation to HEAD.

If everyone agrees this new chapter is helpful, and as helpful to PG 11
users as PG 16 users, why would we not give users this information in
our docs now?  What is the downside?  Chapter numbers?  Translations?

I assume this new chapter would be mentioned in the minor release notes.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.




pg_regress/pg_isolation_regress: Fix possible nullptr dereference.

2022-11-30 Thread Xing Guo
Hi hackers,

While playing with pg_regress and pg_isolation_regress, I noticed that
there's a potential nullptr dereference in both of them.

How to reproduce:

Specify the `--dbname=` option without providing any database name.

./pg_regress --dbname=   foo
./pg_isolation_regress --dbname=   foo

Patch is attached.

-- 
Best Regards,
Xing
diff --git a/src/test/isolation/isolation_main.c b/src/test/isolation/isolation_main.c
index 31a0e6b709..a88e9da255 100644
--- a/src/test/isolation/isolation_main.c
+++ b/src/test/isolation/isolation_main.c
@@ -89,7 +89,7 @@ isolation_start_test(const char *testname,
 	offset += snprintf(psql_cmd + offset, sizeof(psql_cmd) - offset,
 					   "\"%s\" \"dbname=%s\" < \"%s\" > \"%s\" 2>&1",
 					   isolation_exec,
-					   dblist->str,
+					   dblist ? dblist->str : "",
 					   infile,
 					   outfile);
 	if (offset >= sizeof(psql_cmd))
diff --git a/src/test/regress/pg_regress_main.c b/src/test/regress/pg_regress_main.c
index a4b354c9e6..4e089b0f83 100644
--- a/src/test/regress/pg_regress_main.c
+++ b/src/test/regress/pg_regress_main.c
@@ -81,7 +81,7 @@ psql_start_test(const char *testname,
 					   "\"%s%spsql\" -X -a -q -d \"%s\" %s < \"%s\" > \"%s\" 2>&1",
 					   bindir ? bindir : "",
 					   bindir ? "/" : "",
-					   dblist->str,
+					   dblist ? dblist->str : "",
 					   "-v HIDE_TABLEAM=on -v HIDE_TOAST_COMPRESSION=on",
 					   infile,
 					   outfile);


Re: pgsql: Revoke PUBLIC CREATE from public schema, now owned by pg_databas

2022-11-30 Thread Noah Misch
On Wed, Nov 30, 2022 at 08:39:23AM -0500, Robert Haas wrote:
> On Wed, Nov 30, 2022 at 2:07 AM Noah Misch  wrote:
> > In general, the documentation should prefer simpler decision trees.
> 
> True, but I found the current text confusing, which is also something
> to consider.

Could remove the paragraph about v14.  Could have that paragraph say
explicitly that the REVOKE is a no-op.  Would either of those be an
improvement?




Re: Partial aggregates pushdown

2022-11-30 Thread Alexander Pyhalov

fujii.y...@df.mitsubishielectric.co.jp wrote on 2022-11-30 13:01:


2) Do we really have to look at pg_proc in partial_agg_ok() and
deparseAggref()? Perhaps, looking at aggtranstype is enough?

You are right. I fixed according to your comment.



partial_agg_ok() still looks at pg_proc.

--
Best regards,
Alexander Pyhalov,
Postgres Professional




Re: New docs chapter on Transaction Management and related changes

2022-11-30 Thread David G. Johnston
On Wed, Nov 30, 2022 at 6:52 AM Bruce Momjian  wrote:

> On Wed, Nov 30, 2022 at 07:33:44AM +0100, Peter Eisentraut wrote:
> > On 30.11.22 02:51, Bruce Momjian wrote:
> > > Patch applied back to PG 11.  Thanks to Simon for getting this
> important
> > > information in our docs, and for the valuable feedback from others that
> > > made this even better.
> >
> > I request that the backpatching of this be reverted.  We don't want to
> have
> > major chapter numbers in the documentation changing between minor
> releases.
>
> Uh, how do others feel about this?  I wanted to get this information to
> all our supported releases as soon as possible, rather than waiting for
> PG 16 and for people to upgrade.  To me it seems the chapter renumbering
> is worth that benefit.
>

I'd maybe accept having it back-patched to v15 on that basis but not any
further.

But I agree that our general behavior is to only apply this scope of update
of the documentation to HEAD.

David J.


Re: pg_dump bugs reported as pg_upgrade bugs

2022-11-30 Thread Bruce Momjian
On Wed, Nov 30, 2022 at 12:22:57AM -0500, Tom Lane wrote:
> Bruce Momjian  writes:
> > FYI, you might wonder why so many bugs reported on pg_upgrade eventually
> > are bugs in pg_dump.  Well, of course, partly is it because pg_upgrade
> > relies on pg_dump, but a bigger issue is that pg_upgrade will fail if
> > pg_dump or its restoration generate _any_ errors.  My guess is that many
> > people are using pg_dump and restore and just ignoring errors or fixing
> > them later, while this is not possible when using pg_upgrade.
> 
> pg_dump scripts are *designed* to be tolerant of errors, mainly so
> that you can restore into a situation that's not exactly like where
> you dumped from, with the possible need to resolve errors or decide
> that they're not problems.  So your depiction of what happens in
> dump/restore is not showing a problem; it's about using those tools
> as they were intended to be used.
> 
> Indeed, there's a bit of disconnect there with pg_upgrade, which would
> like to present a zero-user-involvement, nothing-to-see-here facade.

Agreed, a disconnect, plus if it is a table or index restore that fails,
pg_upgrade would fail later because there would be no system catalogs to
move the data into.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.




Re: New docs chapter on Transaction Management and related changes

2022-11-30 Thread Bruce Momjian
On Wed, Nov 30, 2022 at 07:33:44AM +0100, Peter Eisentraut wrote:
> On 30.11.22 02:51, Bruce Momjian wrote:
> > Patch applied back to PG 11.  Thanks to Simon for getting this important
> > information in our docs, and for the valuable feedback from others that
> > made this even better.
> 
> I request that the backpatching of this be reverted.  We don't want to have
> major chapter numbers in the documentation changing between minor releases.

Uh, how do others feel about this?  I wanted to get this information to
all our supported releases as soon as possible, rather than waiting for
PG 16 and for people to upgrade.  To me it seems the chapter renumbering
is worth that benefit.

I could put the new chapter inside an existing numbered section in the
back branches if people prefer that, but then PG 16 would have it in a
different place, which seems bad.

> More generally, major documentation additions shouldn't be backpatched, for
> the same reasons we don't backpatch features.

I don't see how documentation additions and feature additions are
comparable.  Feature additions change the behavior of the system, and
potentially introduce application breakage and bugs, while documentation
additions, if they apply to the supported version, have no such risks.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.




Re: pgsql: Revoke PUBLIC CREATE from public schema, now owned by pg_databas

2022-11-30 Thread Robert Haas
On Wed, Nov 30, 2022 at 2:07 AM Noah Misch  wrote:
> In general, the documentation should prefer simpler decision trees.

True, but I found the current text confusing, which is also something
to consider.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Partial aggregates pushdown

2022-11-30 Thread Alexander Pyhalov

fujii.y...@df.mitsubishielectric.co.jp wrote on 2022-11-30 13:01:

Hi Mr.Pyhalov.

1) In previous version of the patch aggregates, which had partialaggfn,
were ok to push down. And it was a definite sign that aggregate can be
pushed down. Now we allow pushing down an aggregate, which prorettype is
not internal and aggfinalfn is not defined. Is it safe for all
user-defined (or builtin) aggregates, even if they are generally
shippable? Aggcombinefn is executed locally and we check that aggregate
function itself is shippable. Is it enough? Perhaps, we could use
partialagg_minversion (like aggregates with partialagg_minversion == -1
should not be pushed down) or introduce separate explicit flag?

In what case is partial aggregate pushdown unsafe for an aggregate whose
aggtranstype is not internal and which has no aggfinalfn?
By reading [1], I believe that if the aggcombinefn of such an aggregate
receives the return values of the original aggregate function from each
remote, then it must produce the same value that would have resulted
from scanning all the input in a single operation.



One more issue I started to think about - now we don't check
partialagg_minversion for "simple" aggregates at all. Is that correct? It
seems that, for example, we could try to push down bit_or(int8) to old
servers, but it didn't exist, for example, in 8.4.  I think it's a
broader issue (it would also be the case already if we push down
aggregates) and shouldn't be fixed here. But there is an issue -
is_shippable() is too optimistic.


--
Best regards,
Alexander Pyhalov,
Postgres Professional




Re: daitch_mokotoff module

2022-11-30 Thread Ian Lawrence Barwick
Hi Dag

On Thu, Feb 3, 2022 at 23:27 Dag Lem wrote:
>
> Hi,
>
> Just some minor adjustments to the patch:
>
> * Removed call to locale-dependent toupper()
> * Cleaned up input normalization

This patch was marked as "Waiting on Author" in the CommitFest entry [1]
but I see you provided an updated version which hasn't received any feedback,
so I've moved this to the next CommitFest [2] and set it to "Needs Review".

[1] https://commitfest.postgresql.org/40/3451/
[2] https://commitfest.postgresql.org/41/3451/

> I have been asked to sign up to review a commitfest patch or patches -
> unfortunately I've been ill with COVID-19 and it's not until now that
> I feel well enough to have a look.
>
> Julien: I'll have a look at https://commitfest.postgresql.org/36/3468/
> as you suggested (https://commitfest.postgresql.org/36/3379/ seems to
> have been reviewed now).
>
> If there are other suggestions for a patch or patches to review for
> someone new to PostgreSQL internals, I'd be grateful for that.

I see you provided some feedback on https://commitfest.postgresql.org/36/3468/,
though the patch seems to have not been accepted (but not conclusively rejected
either). If you still have the chance to review another patch (or more) it would
be much appreciated, as there's quite a few piling up. Things like documentation
or small improvements to client applications are always a good place to start.
Reviews can be provided at any time, there's no need to wait for the next
CommitFest.

Regards

Ian Barwick




Re: Atomic rename feature for Windows.

2022-11-30 Thread Ian Lawrence Barwick
On Wed, Oct 19, 2022 at 6:06 Thomas Munro wrote:
>
> On Wed, Apr 6, 2022 at 10:40 AM Victor Spirin  wrote:
> > Updated patch: we use the posix semantic features in Windows build 17763
> > and up.
> > We found an issue with this feature on Windows Server 2016 without
> > updates (Windows 1607 Build 14393)
>
> Hi Victor,
>
> I rebased and simplified this, and added a lot of tests to be able to
> understand what it does.  I think all systems that didn't have this
> are now EOL and we don't need to support them in PG16, but perhaps our
> _WIN32_WINNT is not quite high enough (this requires Win10 RS1, which
> itself was EOL'd in 2019); the main question I have now is what
> happens when you run this on non-NTFS filesystems, and whether we want
> to *require* this to work because the non-POSIX support will probably
> finish up untested.  I posted all that over on a new thread where I am
> tidying up lots of related stuff, and I didn't want to repost the
> proposed testing framework in multiple threads...
>
> https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BajSQ_8eu2AogTncOnZ5me2D-Cn66iN_-wZnRjLN%2Bicg%40mail.gmail.com

Hi

As Thomas has incorporated this patch into another CommitFest entry
[1], we'll close
the entry for this thread [2]. If Victor or anyone else would like to
follow up, it would
probably be best to do that on the thread linked above.

[1] https://commitfest.postgresql.org/40/3951/
[2] https://commitfest.postgresql.org/40/3347/


Regards

Ian Barwick




Re: Patch: Global Unique Index

2022-11-30 Thread Laurenz Albe
On Wed, 2022-11-30 at 10:09 +0100, Vik Fearing wrote:
> On 11/29/22 17:29, Laurenz Albe wrote:
> > On Tue, 2022-11-29 at 13:58 +0100, Vik Fearing wrote:
> > > I disagree.  A user does not need to know that a table is partitionned,
> > > and if the user wants a unique constraint on the table then making them
> > > type an extra word to get it is just annoying.
> > 
> > Hmm.  But if I created a primary key without thinking too hard about it,
> > only to discover later that dropping old partitions has become a problem,
> > I would not be too happy either.
> 
> I have not looked at this patch, but my understanding of its design is 
> the "global" part of the index just makes sure to check a unique index 
> on each partition.  I don't see from that how dropping old partitions 
> would be a problem.

Right, I should have looked closer.  But, according to the parallel discussion,
ATTACH PARTITION might be a problem.  A global index is likely to be a footgun
one way or the other, so I think it should at least have a safety on
(CREATE PARTITIONED GLOBAL INDEX or something).

Yours,
Laurenz Albe




Re: O(n) tasks cause lengthy startups and checkpoints

2022-11-30 Thread Bharath Rupireddy
On Wed, Nov 30, 2022 at 4:52 PM Bharath Rupireddy
 wrote:
>
> On Wed, Nov 30, 2022 at 10:48 AM Nathan Bossart
>  wrote:
> >
> >
> > cfbot is not happy with v16.  AFAICT this is just due to poor placement, so
> > here is another attempt with the tests moved to a new location.  Apologies
> > for the noise.
>
> Thanks for the patches. I spent some time on reviewing v17 patch set
> and here are my comments:
>
> 0001:
> 1. I think the custodian process needs documentation - it needs a
> definition in glossary.sgml and perhaps a dedicated page describing
> what tasks it takes care of.
>
> 2.
> +LWLockReleaseAll();
> +ConditionVariableCancelSleep();
> +AbortBufferIO();
> +UnlockBuffers();
> +ReleaseAuxProcessResources(false);
> +AtEOXact_Buffers(false);
> +AtEOXact_SMgr();
> +AtEOXact_Files(false);
> +AtEOXact_HashTables(false);
> Do we need all of these in the exit path? Isn't the stuff that
> ShutdownAuxiliaryProcess() does enough for the custodian process?
> AFAICS, the custodian process uses LWLocks (which the
> ShutdownAuxiliaryProcess() takes care of) and it doesn't access shared
> buffers and so on.
> Having said that, I'm fine to keep them for future use and all of
> those cleanup functions exit if nothing related occurs.
>
> 3.
> + * Advertise out latch that backends can use to wake us up while we're
> Typo - %s/out/our
>
> 4. Is it a good idea to add log messages in the DoCustodianTasks()
> loop? Maybe at a debug level? The log message can say the current task
> the custodian is processing. And/Or setting the custodian's status on
> the ps display is also a good idea IMO.
>
> 0002 and 0003:
> 1.
> +CHECKPOINT;
> +DO $$
> I think we need to ensure that there are some snapshot files before
> the checkpoint. Otherwise, it may happen that the above test case
> exits without the custodian process doing anything.
>
> 2. I think the best way to test the custodian process code is by
> adding a TAP test module to see actually the custodian process kicks
> in. Perhaps, add elog(DEBUGX,...) messages to various custodian
> process functions and see if we see the logs in server logs.
>
> 0004:
> I think the 0004 patch can be merged into 0001, 0002 and 0003 patches.
> Otherwise the patch LGTM.
>
> Few thoughts:
> 1. I think we can trivially extend the custodian process to remove any
> future WAL files on the old timeline, something like the attached
> 0001-Move-removal-of-future-WAL-files-on-the-old-timeline.text file).
> While this offloads the recovery a bit, the server may archive such
> WAL files before the custodian removes them. We can do a bit more to
> stop the server from archiving such WAL files, but that needs more
> coding. I don't think we need to do all that now, perhaps, we can give
> it a try once the basic custodian stuff gets in.
> 2. Moving RemovePgTempFiles() to the custodian can bring the server up
> sooner. The idea is that the postmaster just renames the temp
> directories and informs the custodian so that it can go delete such
> temp files and directories. I have personally seen cases where the
> server spent a good amount of time cleaning up temp files. We can park
> it for later.
> 3. Moving RemoveOldXlogFiles() to the custodian can make checkpoints faster.
> 4. PreallocXlogFiles() - if we ever have plans to make pre-allocation
> more aggressive (pre-allocate more than 1 WAL file), perhaps letting
> custodian do that is a good idea. Again, too many tasks for a single
> process.

Another comment:
IIUC, there's no custodian_delay GUC as we want to avoid unnecessary
wakeups for power savings (being discussed in the other thread).
However, can it happen that the custodian misses SetLatch() wakeups
from other backends? In other words, can the custodian process
be sleeping when there's work to do?
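
I believe the standard latch idiom below avoids lost wakeups as long as the
latch is reset before re-checking for work, so I mainly want to confirm the
patch follows it (the custodian-specific names here are invented):

for (;;)
{
	ResetLatch(MyLatch);

	/* re-check only after resetting, so no SetLatch() can be lost */
	while (CustodianHasPendingWork())	/* invented */
		DoPendingCustodianTask();		/* invented */

	(void) WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH,
					 -1L, WAIT_EVENT_CUSTODIAN_MAIN);	/* invented event */
}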

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Introduce a new view for checkpointer related stats

2022-11-30 Thread Bharath Rupireddy
On Wed, Nov 30, 2022 at 12:44 PM Drouvot, Bertrand
 wrote:
>
> +CREATE VIEW pg_stat_checkpointer AS
> +SELECT
> +pg_stat_get_timed_checkpoints() AS checkpoints_timed,
> +pg_stat_get_requested_checkpoints() AS checkpoints_req,
> +pg_stat_get_checkpoint_write_time() AS checkpoint_write_time,
> +pg_stat_get_checkpoint_sync_time() AS checkpoint_sync_time,
> +pg_stat_get_buf_written_checkpoints() AS buffers_checkpoint,
> +pg_stat_get_buf_written_backend() AS buffers_backend,
> +pg_stat_get_buf_fsync_backend() AS buffers_backend_fsync,
> +pg_stat_get_checkpointer_stat_reset_time() AS stats_reset;
> +
>
> I still think that having checkpoints_ prefix in a pg_stat_checkpointer view 
> sounds "weird" (made sense when they were part of pg_stat_bgwriter)
>
> maybe we could have something like this instead?
>
> +CREATE VIEW pg_stat_checkpointer AS
> +SELECT
> +pg_stat_get_timed_checkpoints() AS num_timed,
> +pg_stat_get_requested_checkpoints() AS num_req,
> +pg_stat_get_checkpoint_write_time() AS total_write_time,
> +pg_stat_get_checkpoint_sync_time() AS total_sync_time,
> +pg_stat_get_buf_written_checkpoints() AS buffers_checkpoint,
> +pg_stat_get_buf_written_backend() AS buffers_backend,
> +pg_stat_get_buf_fsync_backend() AS buffers_backend_fsync,
> +pg_stat_get_checkpointer_stat_reset_time() AS stats_reset;
> +

I don't have a strong opinion about changing column names. However, if
we were to change it, I prefer to use names that
PgStat_CheckpointerStats has. BTW, that's what
PgStat_BgWriterStats/pg_stat_bgwriter and
PgStat_ArchiverStats/pg_stat_archiver use.
typedef struct PgStat_CheckpointerStats
{
PgStat_Counter timed_checkpoints;
PgStat_Counter requested_checkpoints;
PgStat_Counter checkpoint_write_time;   /* times in milliseconds */
PgStat_Counter checkpoint_sync_time;
PgStat_Counter buf_written_checkpoints;
PgStat_Counter buf_written_backend;
PgStat_Counter buf_fsync_backend;
TimestampTz stat_reset_timestamp;
} PgStat_CheckpointerStats;

> That's a nit in any case and the patch LGTM.

Thanks.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




RE: Perform streaming logical transactions by background workers and parallel apply

2022-11-30 Thread Hayato Kuroda (Fujitsu)
Dear hackers,

> 1. I think testing the scenario where the shm_mq buffer is full
> between the leader and parallel apply worker would require a large
> amount of data and then also there is no guarantee. How about having a
> developer GUC [1] force_apply_serialize which allows us to serialize
> the changes and only after commit the parallel apply worker would be
> allowed to apply it?
> 
> I am not sure if we can reliably test the serialization of partial
> changes (like some changes have been already sent to parallel apply
> worker and then serialization happens) but at least we can test the
> serialization of complete xacts and their execution via parallel apply
> worker.

I agree with adding the developer option, because the part where the LA
serializes changes and the PAs read and apply them might be complex. I have
reported some bugs around there.

One idea: a threshold (integer) could be introduced as the developer GUC.
The LA stops sending data to the PA via shm_mq_send() and jumps to the
serialization path once it has sent more than (threshold) messages. This
may make it possible to test the partial-serialization case.
The default (-1) means no-op, and 0 means all changes must be serialized.
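
A rough sketch of what I mean (all names are invented; nothing below exists
in the patch set):

/*
 * -1 disables the forced serialization, 0 serializes every change, and
 * N > 0 starts serializing after N messages have been sent.
 */
int			debug_parallel_apply_serialize_threshold = -1;

static bool
pa_force_serialize_next_change(void)
{
	static int	nsent = 0;

	if (debug_parallel_apply_serialize_threshold < 0)
		return false;			/* feature disabled */

	/* Pretend the shm_mq is full once the threshold is crossed. */
	return nsent++ >= debug_parallel_apply_serialize_threshold;
}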

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



Re: Non-decimal integer literals

2022-11-30 Thread Dean Rasheed
On Wed, 30 Nov 2022 at 05:50, David Rowley  wrote:
>
> I spent a bit more time trying to figure out why the compiler does
> imul instead of bit shifting and it just seems to be down to a
> combination of signed-ness plus the overflow check. See [1]. Neither
> of the two compilers I tested could use bit shifting with a signed
> type when overflow checking is done, which is what we're doing in the
> new code.
>

Ah, interesting. That makes me think that it might be possible to get
some performance gains for all bases (including 10) by separating the
overflow check from the multiplication, and giving the compiler the
best chance to decide on the optimal way to do the multiplication. For
example, on my Intel box, GCC prefers a pair of LEA instructions over
an IMUL, to multiply by 10.

I like your previous idea of using an unsigned integer for the
accumulator, because then the overflow check in the loop doesn't need
to be exact, as long as an exact check is done later. That way, there
are fewer conditional branches in the loop, and the possibility for
the compiler to choose the fastest multiplication method. So something
like:

// Accumulate positive value using unsigned int, with approximate
// overflow check. If acc >= 1 - INT_MIN / 10, then acc * 10 is
// sure to exceed -INT_MIN.
unsigned int cutoff = 1 - INT_MIN / 10;
unsigned int acc = 0;

while (*ptr && isdigit((unsigned char) *ptr))
{
if (unlikely(acc >= cutoff))
goto out_of_range;
acc = acc * 10 + (*ptr - '0');
ptr++;
}

and similar for other bases, allowing the coding for all bases to be
kept similar.
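
The later exact check then only needs the sign information, e.g. something
like this (assuming a "neg" flag from sign parsing; the final conversion
relies on the usual two's-complement behaviour):

    if (!neg)
    {
        if (acc > (unsigned int) INT_MAX)
            goto out_of_range;
        result = (int) acc;
    }
    else
    {
        /* -INT_MIN is representable in unsigned int, so allow one extra */
        if (acc > (unsigned int) INT_MAX + 1)
            goto out_of_range;
        result = (int) (-acc);      /* unsigned negate, then convert */
    }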

I think it's probably best to consider this as a follow-on patch
though. It shouldn't delay getting the main feature committed.

Regards,
Dean




Re: O(n) tasks cause lengthy startups and checkpoints

2022-11-30 Thread Bharath Rupireddy
On Wed, Nov 30, 2022 at 10:48 AM Nathan Bossart
 wrote:
>
>
> cfbot is not happy with v16.  AFAICT this is just due to poor placement, so
> here is another attempt with the tests moved to a new location.  Apologies
> for the noise.

Thanks for the patches. I spent some time on reviewing v17 patch set
and here are my comments:

0001:
1. I think the custodian process needs documentation - it needs a
definition in glossary.sgml and perhaps a dedicated page describing
what tasks it takes care of.

2.
+LWLockReleaseAll();
+ConditionVariableCancelSleep();
+AbortBufferIO();
+UnlockBuffers();
+ReleaseAuxProcessResources(false);
+AtEOXact_Buffers(false);
+AtEOXact_SMgr();
+AtEOXact_Files(false);
+AtEOXact_HashTables(false);
Do we need all of these in the exit path? Isn't the stuff that
ShutdownAuxiliaryProcess() does enough for the custodian process?
AFAICS, the custodian process uses LWLocks (which the
ShutdownAuxiliaryProcess() takes care of) and it doesn't access shared
buffers and so on.
Having said that, I'm fine to keep them for future use and all of
those cleanup functions exit if nothing related occurs.

3.
+ * Advertise out latch that backends can use to wake us up while we're
Typo - %s/out/our

4. Is it a good idea to add log messages in the DoCustodianTasks()
loop? Maybe at a debug level? The log message can say the current task
the custodian is processing. And/Or setting the custodian's status on
the ps display is also a good idea IMO.

0002 and 0003:
1.
+CHECKPOINT;
+DO $$
I think we need to ensure that there are some snapshot files before
the checkpoint. Otherwise, it may happen that the above test case
exits without the custodian process doing anything.

2. I think the best way to test the custodian process code is by
adding a TAP test module to see actually the custodian process kicks
in. Perhaps, add elog(DEBUGX,...) messages to various custodian
process functions and see if we see the logs in server logs.

0004:
I think the 0004 patch can be merged into 0001, 0002 and 0003 patches.
Otherwise the patch LGTM.

Few thoughts:
1. I think we can trivially extend the custodian process to remove any
future WAL files on the old timeline, something like the attached
0001-Move-removal-of-future-WAL-files-on-the-old-timeline.text file).
While this offloads the recovery a bit, the server may archive such
WAL files before the custodian removes them. We can do a bit more to
stop the server from archiving such WAL files, but that needs more
coding. I don't think we need to do all that now, perhaps, we can give
it a try once the basic custodian stuff gets in.
2. Moving RemovePgTempFiles() to the custodian can bring the server up
sooner. The idea is that the postmaster just renames the temp
directories and informs the custodian so that it can go delete such
temp files and directories. I have personally seen cases where the
server spent a good amount of time cleaning up temp files. We can park
it for later.
3. Moving RemoveOldXlogFiles() to the custodian can make checkpoints faster.
4. PreallocXlogFiles() - if we ever have plans to make pre-allocation
more aggressive (pre-allocate more than 1 WAL file), perhaps letting
custodian do that is a good idea. Again, too many tasks for a single
process.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Perform streaming logical transactions by background workers and parallel apply

2022-11-30 Thread Amit Kapila
On Tue, Nov 29, 2022 at 10:18 AM houzj.f...@fujitsu.com
 wrote:
>
> Attach the new version patch which addressed all comments.
>

Some comments on v53-0002*

1. I think testing the scenario where the shm_mq buffer is full
between the leader and parallel apply worker would require a large
amount of data and then also there is no guarantee. How about having a
developer GUC [1] force_apply_serialize which allows us to serialize
the changes and only after commit the parallel apply worker would be
allowed to apply it?

I am not sure if we can reliably test the serialization of partial
changes (like some changes have been already sent to parallel apply
worker and then serialization happens) but at least we can test the
serialization of complete xacts and their execution via parallel apply
worker.

2.
+ /*
+ * The stream lock is released when processing changes in a
+ * streaming block, so the leader needs to acquire the lock here
+ * before entering PARTIAL_SERIALIZE mode to ensure that the
+ * parallel apply worker will wait for the leader to release the
+ * stream lock.
+ */
+ if (in_streamed_transaction &&
+ action != LOGICAL_REP_MSG_STREAM_STOP)
+ {
+ pa_lock_stream(winfo->shared->xid, AccessExclusiveLock);

This comment is not completely correct because we can even acquire the
lock for the very first streaming chunk. This check will work but doesn't
appear future-proof or at least not very easy to understand though I
don't have a better suggestion at this stage. Can we think of a better
check here?

3. I have modified a few comments in v53-0002* patch and the
incremental patch for the same is attached.

[1] - https://www.postgresql.org/docs/devel/runtime-config-developer.html

-- 
With Regards,
Amit Kapila.


changes_amit_v53_0002.patch
Description: Binary data


Re: HOT chain validation in verify_heapam()

2022-11-30 Thread Himanshu Upadhyaya
On Thu, Nov 10, 2022 at 3:38 AM Andres Freund  wrote:

>
> > + }
> > +
> > + /*
> > +  * Loop over offset and populate predecessor array from all entries
> > +  * that are present in successor array.
> > +  */
> > + ctx.attnum = -1;
> > + for (ctx.offnum = FirstOffsetNumber; ctx.offnum <= maxoff;
> > +  ctx.offnum = OffsetNumberNext(ctx.offnum))
> > + {
> > + ItemId  curr_lp;
> > + ItemId  next_lp;
> > + HeapTupleHeader curr_htup;
> > + HeapTupleHeader next_htup;
> > + TransactionId curr_xmax;
> > + TransactionId next_xmin;
> > +
> > + OffsetNumber nextoffnum = successor[ctx.offnum];
> > +
> > + curr_lp = PageGetItemId(ctx.page, ctx.offnum);
>
> Why do we get the item when nextoffnum is 0?
>

Fixed by moving the PageGetItemId() call after the 'if' check.


> > + if (nextoffnum == 0 || !lp_valid[ctx.offnum] || !lp_valid[nextoffnum])
> > + {
> > + /*
> > +  * This is either the last updated tuple in the chain or a
> > +  * corruption raised for this tuple.
> > +  */
>
> "or a corruption raised" isn't quite right grammatically.
>

done.

>
> > + continue;
> > + }
> > + if (ItemIdIsRedirected(curr_lp))
> > + {
> > + next_lp = PageGetItemId(ctx.page, nextoffnum);
> > + if (ItemIdIsRedirected(next_lp))
> > + {
> > + report_corruption(&ctx,
> > + psprintf("redirected line pointer pointing to another redirected line pointer at offset %u",
> > + (unsigned) nextoffnum));
> > + continue;
> > + }
> > + next_htup = (HeapTupleHeader) PageGetItem(ctx.page, next_lp);
> > + if (!HeapTupleHeaderIsHeapOnly(next_htup))
> > + {
> > + report_corruption(&ctx,
> > + psprintf("redirected tuple at line pointer offset %u is not heap only tuple",
> > + (unsigned) nextoffnum));
> > + }
> > + if ((next_htup->t_infomask & HEAP_UPDATED) == 0)
> > + {
> > + report_corruption(&ctx,
> > + psprintf("redirected tuple at line pointer offset %u is not heap updated tuple",
> > + (unsigned) nextoffnum));
> > + }
> > + continue;
> > + }
> > +
> > + /*
> > +  * Add a line pointer offset to the predecessor array if xmax is
> > +  * matching with xmin of next tuple (reaching via its t_ctid).
> > +  * Prior to PostgreSQL 9.4, we actually changed the xmin to
> > +  * FrozenTransactionId
>
> I'm doubtful it's a good idea to try to validate the 9.4 case. The
> likelihood
> of getting that right seems low and I don't see us gaining much by even
> trying.
>
>
>
Removed the code regarding frozen tuple checks.

> so we must add offset to predecessor
> > +  * array (irrespective of xmax-xmin matching) if updated tuple xmin
> > +  * is frozen, so that we can later do validation related to frozen
> > +  * xmin. Raise corruption if we have two tuples having the same
> > +  * predecessor.
> > +  * We add the offset to the predecessor array irrespective of the
> > +  * transaction (t_xmin) status. We will do validation related to
> > +  * the transaction status (and also all other validations) when we
> > +  * loop over the predecessor array.
> > +  */
> > + curr_htup = (HeapTupleHeader) PageGetItem(ctx.page, curr_lp);
> > + curr_xmax = HeapTupleHeaderGetUpdateXid(curr_htup);
> > + next_lp = PageGetItemId(ctx.page, nextoffnum);
> > + next_htup = (HeapTupleHeader) PageGetItem(ctx.page, next_lp);
> > + next_xmin = HeapTupleHeaderGetXmin(next_htup);
> > + if (TransactionIdIsValid(curr_xmax) &&
> > +     (TransactionIdEquals(curr_xmax, next_xmin) ||
> > +      next_xmin == FrozenTransactionId))
> > + {
> > 

RE: Partial aggregates pushdown

2022-11-30 Thread fujii.y...@df.mitsubishielectric.co.jp
Hi Mr.Pyhalov.

> 1) In previous version of the patch aggregates, which had partialaggfn,
> were ok to push down. And it was a definite sign that aggregate can be
> pushed down. Now we allow pushing down an aggregate, which prorettype is
> not internal and aggfinalfn is not defined. Is it safe for all
> user-defined (or builtin) aggregates,
> even if they are generally shippable? Aggcombinefn is executed locally and we
> check that aggregate function itself is shippable. Is it enough? Perhaps, we
> could use partialagg_minversion (like aggregates with partialagg_minversion
> == -1 should not be pushed down) or introduce separate explicit flag?
In what case is partial aggregate pushdown unsafe for an aggregate whose
aggtranstype is not internal and which has no aggfinalfn?
By reading [1], I believe that if the aggcombinefn of such an aggregate
receives the return values of the original aggregate function from each
remote server, then it must produce the same value that would have resulted
from scanning all the input in a single operation.
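
For illustration, here is a minimal sketch of the check I have in mind.
This is my reading of the thread, not the patch's actual code, so the
function body below is only an assumption:

/* Sketch only: is it safe to ask the remote for a partial aggregate? */
#include "postgres.h"
#include "access/htup_details.h"
#include "catalog/pg_aggregate.h"
#include "catalog/pg_type.h"
#include "nodes/primnodes.h"
#include "utils/syscache.h"

static bool
partial_agg_ok(Aggref *agg)
{
	HeapTuple	aggtup;
	Form_pg_aggregate aggform;
	bool		ok;

	/* An INTERNAL transition value cannot be transmitted from the remote. */
	if (agg->aggtranstype == INTERNALOID)
		return false;

	aggtup = SearchSysCache1(AGGFNOID, ObjectIdGetDatum(agg->aggfnoid));
	if (!HeapTupleIsValid(aggtup))
		elog(ERROR, "cache lookup failed for aggregate %u", agg->aggfnoid);
	aggform = (Form_pg_aggregate) GETSTRUCT(aggtup);

	/* With no finalfn, the transition value is itself the partial result. */
	ok = !OidIsValid(aggform->aggfinalfn);

	ReleaseSysCache(aggtup);
	return ok;
}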

> 2) Do we really have to look at pg_proc in partial_agg_ok() and
> deparseAggref()? Perhaps, looking at aggtranstype is enough?
You are right. I fixed according to your comment.

> 3) I'm not sure if CREATE AGGREGATE tests with invalid
> PARTIALAGGFUNC/PARTIALAGG_MINVERSION should be in postgres_fdw
> tests or better should be moved to src/test/regress/sql/create_aggregate.sql,
> as they are not specific to postgres_fdw
Thank you. I moved these tests to src/test/regress/sql/create_aggregate.sql.

[1] https://www.postgresql.org/docs/15/xaggr.html#XAGGR-PARTIAL-AGGREGATES

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center, Mitsubishi Electric Corporation


0001-Partial-aggregates-push-down-v14.patch
Description: 0001-Partial-aggregates-push-down-v14.patch


Re: psql - factor out echo code

2022-11-30 Thread Fabien COELHO



Now some of the output generated by test_extdepend gets a bit
confusing:
+-- QUERY:
+
+
+-- QUERY:

That's not entirely this patch's fault.  Still that's not really
intuitive to see the output of a query that's just a blank spot..


Hmmm.

What about adding an explicit \echo before these empty outputs to mitigate 
the possible induced confusion?


\echo is not possible.

Attached an attempt to improve the situation by replacing empty lines with 
comments in this test.


--
Fabien.

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 7672ed9e9d..16873aff62 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -5284,18 +5284,9 @@ echo_hidden_command(const char *query)
 {
 	if (pset.echo_hidden != PSQL_ECHO_HIDDEN_OFF)
 	{
-		printf(_("********* QUERY **********\n"
-				 "%s\n"
-				 "**************************\n\n"), query);
-		fflush(stdout);
+		echoQuery(stdout, true, query);
 		if (pset.logfile)
-		{
-			fprintf(pset.logfile,
-					_("********* QUERY **********\n"
-					  "%s\n"
-					  "**************************\n\n"), query);
-			fflush(pset.logfile);
-		}
+			echoQuery(pset.logfile, true, query);
 
 		if (pset.echo_hidden == PSQL_ECHO_HIDDEN_NOEXEC)
 			return false;
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index b989d792aa..30fd5bcda1 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -543,6 +543,18 @@ PrintTiming(double elapsed_msec)
 		   elapsed_msec, days, (int) hours, (int) minutes, seconds);
 }
 
+/*
+ * Echo user query
+ */
+void
+echoQuery(FILE *out, bool is_internal, const char *query)
+{
+	if (is_internal)
+		fprintf(out, _("-- INTERNAL QUERY:\n%s\n\n"), query);
+	else
+		fprintf(out, _("-- QUERY:\n%s\n\n"), query);
+	fflush(out);
+}
 
 /*
  * PSQLexec
@@ -569,18 +581,9 @@ PSQLexec(const char *query)
 
 	if (pset.echo_hidden != PSQL_ECHO_HIDDEN_OFF)
 	{
-		printf(_("********* QUERY **********\n"
-				 "%s\n"
-				 "**************************\n\n"), query);
-		fflush(stdout);
+		echoQuery(stdout, true, query);
 		if (pset.logfile)
-		{
-			fprintf(pset.logfile,
-					_("********* QUERY **********\n"
-					  "%s\n"
-					  "**************************\n\n"), query);
-			fflush(pset.logfile);
-		}
+			echoQuery(pset.logfile, true, query);
 
 		if (pset.echo_hidden == PSQL_ECHO_HIDDEN_NOEXEC)
 			return NULL;
@@ -796,10 +799,7 @@ ExecQueryTuples(const PGresult *result)
  * assumes that MainLoop did that, so we have to do it here.
  */
 if (pset.echo == PSQL_ECHO_ALL && !pset.singlestep)
-{
-	puts(query);
-	fflush(stdout);
-}
+	echoQuery(stdout, false, query);
 
 if (!SendQuery(query))
 {
@@ -1058,13 +1058,7 @@ SendQuery(const char *query)
 	}
 
 	if (pset.logfile)
-	{
-		fprintf(pset.logfile,
-_("* QUERY **\n"
-  "%s\n"
-  "**\n\n"), query);
-		fflush(pset.logfile);
-	}
+		echoQuery(pset.logfile, false, query);
 
 	SetCancelConn(pset.db);
 
diff --git a/src/bin/psql/common.h b/src/bin/psql/common.h
index f0820dd7d5..359c48143e 100644
--- a/src/bin/psql/common.h
+++ b/src/bin/psql/common.h
@@ -32,6 +32,7 @@ extern void psql_setup_cancel_handler(void);
 extern PGresult *PSQLexec(const char *query);
 extern int	PSQLexecWatch(const char *query, const printQueryOpt *opt, FILE *printQueryFout);
 
+extern void echoQuery(FILE *out, bool is_internal, const char *query);
 extern bool SendQuery(const char *query);
 
 extern bool is_superuser(void);
diff --git a/src/test/modules/test_extensions/expected/test_extdepend.out b/src/test/modules/test_extensions/expected/test_extdepend.out
index 0b62015d18..a08ef8d19d 100644
--- a/src/test/modules/test_extensions/expected/test_extdepend.out
+++ b/src/test/modules/test_extensions/expected/test_extdepend.out
@@ -11,17 +11,17 @@ SELECT * FROM test_extdep_commands;
   CREATE EXTENSION test_ext5 SCHEMA test_ext
   SET search_path TO test_ext
   CREATE TABLE a (a1 int)
- 
+  -- function b
   CREATE FUNCTION b() RETURNS TRIGGER LANGUAGE plpgsql AS   +
 $$ BEGIN NEW.a1 := NEW.a1 + 42; RETURN NEW; END; $$
   ALTER FUNCTION b() DEPENDS ON EXTENSION test_ext5
- 
+  -- trigger c
   CREATE TRIGGER c BEFORE INSERT ON a FOR EACH ROW EXECUTE PROCEDURE b()
   ALTER TRIGGER c ON a DEPENDS ON EXTENSION test_ext5
- 
+  -- materialized view d
   CREATE MATERIALIZED VIEW d AS SELECT * FROM a
   ALTER MATERIALIZED VIEW d DEPENDS ON EXTENSION test_ext5
- 
+  -- index e
   CREATE INDEX e ON a (a1)
   ALTER INDEX e DEPENDS ON EXTENSION test_ext5
   RESET search_path
@@ -29,24 +29,58 @@ SELECT * FROM test_extdep_commands;
 
 -- First, test that dependent objects go away when the extension is dropped.
 SELECT * FROM test_extdep_commands \gexec
+-- QUERY:
  CREATE SCHEMA test_ext
+
+-- QUERY:
  CREATE EXTENSION test_ext5 SCHEMA test_ext
+
+-- QUERY:
  SET search_path TO test_ext
+
+-- QUERY:
  CREATE TABLE a (a1 int)
 
+-- QUERY:
+ -- function b
+
+-- QUERY:
  CREATE FUNCTION b() RETURNS TRIGGER LANGUAGE plpgsql AS
$$ 

Re: psql - factor out echo code

2022-11-30 Thread Fabien COELHO




Hmm.  The refactoring is worth it as much as the differentiation
between QUERY and INTERNAL QUERY as the same pattern is repeated 5
times.

Now some of the output generated by test_extdepend gets a bit
confusing:
+-- QUERY:
+
+
+-- QUERY:

That's not entirely this patch's fault.  Still that's not really
intuitive to see the output of a query that's just a blank spot..


Hmmm.

What about adding an explicit \echo before these empty outputs to mitigate 
the possible induced confusion?


--
Fabien.




Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2022-11-30 Thread Jelte Fennema
> This version of the patch no longer applies, a rebased version is needed.

Attached is a patch that applies cleanly again and is also changed
to use the recently introduced libpq_append_conn_error.

I also attached a patch that runs pgindent after the introduction of
libpq_append_conn_error. I noticed that this hadn't happened when
trying to run pgindent on my own changes.

v9-0001-libpq-Run-pgindent-after-a9e9a9f32b3.patch
Description: v9-0001-libpq-Run-pgindent-after-a9e9a9f32b3.patch


v9-0002-Add-non-blocking-version-of-PQcancel.patch
Description: v9-0002-Add-non-blocking-version-of-PQcancel.patch


Re: Patch: Global Unique Index

2022-11-30 Thread Vik Fearing

On 11/29/22 17:29, Laurenz Albe wrote:

On Tue, 2022-11-29 at 13:58 +0100, Vik Fearing wrote:

I disagree.  A user does not need to know that a table is partitionned,
and if the user wants a unique constraint on the table then making them
type an extra word to get it is just annoying.


Hmm.  But if I created a primary key without thinking too hard about it,
only to discover later that dropping old partitions has become a problem,
I would not be too happy either.


I have not looked at this patch, but my understanding of its design is 
that the "global" part of the index just makes sure to check a unique 
index on each partition.  I don't see from that how dropping old 
partitions would be a problem.

--
Vik Fearing





Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-30 Thread Masahiko Sawada
On Tue, Nov 29, 2022 at 1:36 PM John Naylor wrote:
>
> While creating a benchmark for inserting into node128-inner, I found a bug. 
> If a caller deletes from a node128, the slot index is set to invalid, but the 
> child pointer is still valid. Do that a few times, and every child pointer is 
> valid, even if no slot index points to it. When the next inserter comes 
> along, something surprising happens. This function:
>
> /* Return an unused slot in node-128 */
> static int
> node_inner_128_find_unused_slot(rt_node_inner_128 *node, uint8 chunk)
> {
>   int slotpos = 0;
>
>   Assert(!NODE_IS_LEAF(node));
>   while (node_inner_128_is_slot_used(node, slotpos))
>   slotpos++;
>
>   return slotpos;
> }
>
> ...passes an integer to this function, whose parameter is a uint8:
>
> /* Is the slot in the node used? */
> static inline bool
> node_inner_128_is_slot_used(rt_node_inner_128 *node, uint8 slot)
> {
>   Assert(!NODE_IS_LEAF(node));
>   return (node->children[slot] != NULL);
> }
>
> ...so instead of growing the node unnecessarily or segfaulting, it enters an 
> infinite loop doing this:
>
> add eax, 1
> movzx   ecx, al
> cmp QWORD PTR [rbx+264+rcx*8], 0
> jne .L147
>
> The fix is easy enough -- set the child pointer to null upon deletion,

Good catch!

> but I'm somewhat astonished that the regression tests didn't hit this. I do 
> still intend to replace this code with something faster, but before I do so 
> the tests should probably exercise the deletion paths more. Since VACUUM

Indeed, there are some tests for deletion, but all of them delete all
keys in the node, so we end up deleting the node itself. I've added tests
that repeatedly delete and insert, as well as additional assertions.
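
To be concrete, the fix amounts to something like the sketch below. The
helper and field names (slot_idxs, RT_NODE_128_INVALID_IDX) follow the
style of the quoted code and are illustrative rather than the exact patch:

/* Delete the child for 'chunk' from a node-128 inner node. */
static void
node_inner_128_delete(rt_node_inner_128 *node, uint8 chunk)
{
	int			slotpos = node->slot_idxs[chunk];

	Assert(!NODE_IS_LEAF(node));

	/* Clear the child pointer too; leaving it set caused the infinite loop. */
	node->children[slotpos] = NULL;
	node->slot_idxs[chunk] = RT_NODE_128_INVALID_IDX;
}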

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Logical Replication Custom Column Expression

2022-11-30 Thread Stavros Koureas
On Tue, Nov 29, 2022 at 3:27 PM, Ashutosh Bapat <ashutosh.bapat@gmail.com> wrote:
> That would be too restrictive - not necessarily in your application
> but generally. There could be some tables where consolidating rows
> with same PK from different publishers into a single row in subscriber
> would be desirable. I think we need to enable the property for every
> subscriber that intends to add publisher column to the desired and
> subscribed tables. But there should be another option per table which
> will indicate that receiver should add publisher when INSERTING row to
> that table.

So we are discussing the scope of this property: whether it will be
implemented at the subscriber level or per subscriber table.
I am not sure how the per-table variant would be implemented, as currently
Postgres subscribers can have multiple tables streamed from a single
publisher.
In that case we may have additional syntax on the subscriber, for example:

CREATE SUBSCRIPTION sub1 CONNECTION 'host=localhost port=5432 user=postgres
password=XX dbname=publisher1' PUBLICATION pub1 with (enabled = false,
create_slot = false, slot_name = NONE, tables = {tableA:union, tableB:none,
});

Something like this?

> And yes, probably you need to change the way you reply to email on
> this list. Top-posting is generally avoided. See
> https://wiki.postgresql.org/wiki/Mailing_Lists.

Thanks for bringing this into the discussion :)


Re: ExecRTCheckPerms() and many prunable partitions

2022-11-30 Thread Alvaro Herrera
Hello

On 2022-Nov-29, Amit Langote wrote:

> Thanks for taking a look and all the fixup patches.  Was working on
> that test I said we should add and then was spending some time
> cleaning things up and breaking some things out into their patches,
> mainly for the ease of review.

Right, excellent.  Thanks for this new version.  It looks pretty good to
me now.

> On Tue, Nov 29, 2022 at 6:27 PM Alvaro Herrera wrote:

> > The other changes are cosmetic.
> 
> Thanks, I've merged all.  I do wonder that it is only in PlannedStmt
> that the list is called something that is not "rtepermlist", but I'm
> fine with it if you prefer that.

I was unsure about that one myself; I just changed it because that
struct uses camelCaseNaming, which the others do not, so it seemed fine
in the other places but not there.  As for changing "list" to "infos",
it seems to me we tend to avoid naming a list as "list", so.  (Maybe I
would change the others to be foo_rteperminfos.  Unless these naming
choices were already bikeshedded to its present form upthread and I
missed it?)

> As I mentioned above, I've broken a couple of other changes out into
> their own patches that I've put before the main patch.  0001 adds
> ExecGetRootToChildMap().  I thought it would be better to write in the
> commit message why the new map is necessary for the main patch.

I was thinking about this one and it seemed too closely tied to
ExecGetInsertedCols to be committed separately.  Notice how there is a
comment that mentions that function in your 0001, but that function
itself still uses ri_RootToPartitionMap, so before your 0003 the comment
is bogus.  And there's now quite some duplicity between
ri_RootToPartitionMap and ri_RootToChildMap, which I think it would be
better to reduce.  I mean, rather than add a new field it would be
better to repurpose the old one:

- ExecGetRootToChildMap should return TupleConversionMap *
- every place that accesses ri_RootToPartitionMap directly should be
  using ExecGetRootToChildMap() instead
- ExecGetRootToChildMap passes
  !resultRelInfo->ri_RelationDesc->rd_rel->relispartition
  as third argument to build_attrmap_by_name_if_req (rather than
  constant true), so that we keep the tuple compatibility checking we
  have there currently.  A sketch follows.
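
Roughly, I'm imagining something like this (untested sketch; it assumes the
ri_RootToChildMap/ri_RootToChildMapValid fields from your 0001, the
three-argument build_attrmap_by_name_if_req from that patch, and a
TupleConversionMap built from its indesc/outdesc/attrMap fields the way
convert_tuples_by_name does):

TupleConversionMap *
ExecGetRootToChildMap(ResultRelInfo *resultRelInfo, EState *estate)
{
	if (!resultRelInfo->ri_RootToChildMapValid)
	{
		/* Assumes ri_RootResultRelInfo is set for this result relation. */
		Relation	childrel = resultRelInfo->ri_RelationDesc;
		Relation	rootrel = resultRelInfo->ri_RootResultRelInfo->ri_RelationDesc;
		AttrMap    *attrMap;

		/*
		 * Request the tuple compatibility check only for plain-inheritance
		 * children; partitions are already known to be compatible.
		 */
		attrMap = build_attrmap_by_name_if_req(RelationGetDescr(rootrel),
											   RelationGetDescr(childrel),
											   !childrel->rd_rel->relispartition);
		if (attrMap != NULL)
		{
			TupleConversionMap *map = palloc(sizeof(TupleConversionMap));

			map->indesc = RelationGetDescr(rootrel);
			map->outdesc = RelationGetDescr(childrel);
			map->attrMap = attrMap;
			resultRelInfo->ri_RootToChildMap = map;
		}
		resultRelInfo->ri_RootToChildMapValid = true;
	}

	return resultRelInfo->ri_RootToChildMap;
}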


> 0002 contains changes that has to do with changing how we access
> checkAsUser in some foreign table planning/execution code sites.
> Thought it might be better to describe it separately too.

I'll get this one pushed soon, it seems good to me.  (I'll edit to not
use Oid as boolean.)

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: Partial aggregates pushdown

2022-11-30 Thread Alexander Pyhalov

Hi, Yuki.

1) In the previous version of the patch, aggregates which had partialaggfn
were OK to push down, and that was a definite sign that an aggregate could
be pushed down. Now we allow pushing down an aggregate whose prorettype is
not internal and whose aggfinalfn is not defined. Is it safe for all
user-defined (or builtin) aggregates, even if they are generally
shippable? Aggcombinefn is executed locally, and we check that the
aggregate function itself is shippable. Is that enough? Perhaps we could
use partialagg_minversion (e.g., aggregates with partialagg_minversion ==
-1 should not be pushed down) or introduce a separate explicit flag?


2) Do we really have to look at pg_proc in partial_agg_ok() and 
deparseAggref()? Perhaps looking at aggtranstype is enough?


3) I'm not sure if CREATE AGGREGATE tests with invalid 
PARTIALAGGFUNC/PARTIALAGG_MINVERSION should be in postgres_fdw tests or 
better should be moved to src/test/regress/sql/create_aggregate.sql, as 
they are not specific to postgres_fdw


--
Best regards,
Alexander Pyhalov,
Postgres Professional




Re: pg_dump: Remove "blob" terminology

2022-11-30 Thread Daniel Gustafsson
> On 30 Nov 2022, at 08:04, Peter Eisentraut wrote:
> 
> For historical reasons, pg_dump refers to large objects as "BLOBs". This term 
> is not used anywhere else in PostgreSQL, and it also means something 
> different in the SQL standard and other SQL systems.
> 
> This patch renames internal functinos, code comments, documentation, etc. to 
> use the "large object" or "LO" terminology instead.  There is no 
> functionality change, so the archive format still uses the name "BLOB" for 
> the archive entry.  Additional long command-line options are added with the 
> new naming.

+1 on doing this.  No pointy bits stood out when reading, just a few small
comments:

The commit message contains a typo: functinos

  * called for both BLOB and TABLE data; it is the responsibility of
- * the format to manage each kind of data using StartBlob/StartData.
+ * the format to manage each kind of data using StartLO/StartData.

Should BLOB be changed to BLOBS here (and in similar comments) to make it
clearer that it refers to the archive entry and the concept of a binary large
object in general?

There's an additional mention in src/test/modules/test_pg_dump/t/001_base.pl:

# Tests which are considered 'full' dumps by pg_dump, but there.
# are flags used to exclude specific items (ACLs, blobs, etc).

--
Daniel Gustafsson   https://vmware.com/