Re: meson: pkgconfig difference

2023-01-31 Thread Andres Freund
Hi, 

On January 31, 2023 11:40:52 PM PST, Peter Eisentraut 
 wrote:
>I think there is a tiny typo in src/interfaces/ecpg/ecpglib/meson.build:
>
>diff --git a/src/interfaces/ecpg/ecpglib/meson.build 
>b/src/interfaces/ecpg/ecpglib/meson.build
>index dba9e3c3d9..da8d304f54 100644
>--- a/src/interfaces/ecpg/ecpglib/meson.build
>+++ b/src/interfaces/ecpg/ecpglib/meson.build
>@@ -57,7 +57,7 @@ pkgconfig.generate(
>   description: 'PostgreSQL libecpg library',
>   url: pg_url,
>   libraries: ecpglib_so,
>-  libraries_private: [frontend_shlib_code, thread_dep],
>+  libraries_private: [frontend_stlib_code, thread_dep],
>   requires_private: ['libpgtypes', 'libpq'],
> )
>
>This makes it match the other libraries.
>
>Without this change, we get
>
>Libs.private: ... -lpgport_shlib -lpgcommon_shlib
>
>which seems wrong.

Ugh, yes, that's wrong. Do you want me to apply the fix?

Regards,

Andres

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




meson: pkgconfig difference

2023-01-31 Thread Peter Eisentraut

I think there is a tiny typo in src/interfaces/ecpg/ecpglib/meson.build:

diff --git a/src/interfaces/ecpg/ecpglib/meson.build 
b/src/interfaces/ecpg/ecpglib/meson.build

index dba9e3c3d9..da8d304f54 100644
--- a/src/interfaces/ecpg/ecpglib/meson.build
+++ b/src/interfaces/ecpg/ecpglib/meson.build
@@ -57,7 +57,7 @@ pkgconfig.generate(
   description: 'PostgreSQL libecpg library',
   url: pg_url,
   libraries: ecpglib_so,
-  libraries_private: [frontend_shlib_code, thread_dep],
+  libraries_private: [frontend_stlib_code, thread_dep],
   requires_private: ['libpgtypes', 'libpq'],
 )

This makes it match the other libraries.

Without this change, we get

Libs.private: ... -lpgport_shlib -lpgcommon_shlib

which seems wrong.
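
For comparison, with frontend_stlib_code the generated file should instead
reference the regular static helper libraries, along the lines of:

Libs.private: ... -lpgport -lpgcommon

(the exact set of libraries depends on the build options involved).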




Re: Progress report of CREATE INDEX for nested partitioned tables

2023-01-31 Thread Ilya Gladyshev


> On 1 Feb 2023, at 08:29, Justin Pryzby  wrote:
> 
> On Tue, Jan 31, 2023 at 07:32:20PM +0400, Ilya Gladyshev wrote:
>>> On 17 Jan 2023, at 23:44, Tomas Vondra  wrote:
>>> Do we actually need the new parts_done field? I mean, we already do
>>> track the value - at PROGRESS_CREATEIDX_PARTITIONS_DONE index in the
>>> st_progress_param array. Can't we simply read it from there? Then we
>>> would not have ABI issues with the new field added to IndexStmt.
>> 
>> I think it’s a good approach and it could be useful outside of scope of this 
>> patch too. So I have attached a patch, that introduces 
>> pgstat_progress_incr_param function for this purpose. There’s one thing I am 
>> not sure about, IIUC, we can assume that the only process that can write 
>> into MyBEEntry of the current backend is the current backend itself, 
>> therefore looping to get consistent reads from this array is not required. 
>> Please correct me, if I am wrong here.
> 
> Thanks for the updated patch.
> 
> I think you're right - pgstat_begin_read_activity() is used for cases
> involving other backends.  It ought to be safe for a process to read its
> own status bits, since we know they're not also being written.
> 
> You changed DefineIndex() to update progress for the leaf indexes' when
> called recursively.  The caller updates the progress for "attached"
> indexes, but not created ones.  That allows providing fine-granularity
> progress updates when using intermediate partitions, right ?  (Rather
> than updating the progress by more than one at a time in the case of
> intermediate partitioning).
> 
> If my understanding is right, that's subtle, and adds a bit of
> complexity to the current code, so could use careful commentary.  I
> suggest:
> 
> * If the index was attached, update progress for all its direct and
> * indirect leaf indexes all at once.  If the index was built by calling
> * DefineIndex() recursively, the called function is responsible for
> * updating the progress report for built indexes.
> 
> ...
> 
> * If this is the top-level index, we're done. When called recursively
> * for child tables, the done partition counter is incremented now,
> * rather than in the caller.

Yes, you are correct about the intended behavior, I added your comments to the 
patch.

> I guess you know that there were compiler warnings (related to your
> question).
> https://cirrus-ci.com/task/6571212386598912
> 
> pgstat_progress_incr_param() could call pgstat_progress_update_param()
> rather than using its own Assert() and WRITE_ACTIVITY calls.  I'm not
> sure which I prefer, though.
> 
> Also, there are whitespace/tab/style issues in
> pgstat_progress_incr_param().
> 
> -- 
> Justin

Thank you for the review, I fixed the aforementioned issues in the v2.


v2-0001-create-index-progress-increment.patch
Description: Binary data



Re: Question regarding "Make archiver process an auxiliary process. commit"

2023-01-31 Thread Michael Paquier
On Tue, Jan 31, 2023 at 08:30:13PM -0800, Nathan Bossart wrote:
> I'm not sure why I thought time.h was no longer needed.  time() is clearly
> used elsewhere in this file.  Here's a new version with that added back.

Ah, I see.  The key point is that curtime and last_copy_time will most
likely be the same value as time() is second-based, so the timeout is
basically always PGARCH_AUTOWAKE_INTERVAL.  There is no need to care
about time_to_stop, as we just go through and exit if it happens to be
switched to true.  Applied v3, keeping time_to_stop as it is in v2 and
v3 so that we don't loop again on postmaster death.
--
Michael


signature.asc
Description: PGP signature


RE: pub/sub - specifying optional parameters without values.

2023-01-31 Thread houzj.f...@fujitsu.com
On Tuesday, January 31, 2023 10:49 PM Tom Lane  wrote:

Hi,

> Amit Kapila  writes:
> > On Tue, Jan 31, 2023 at 4:25 AM Tom Lane  wrote:
> >> Hmph.  I generally think that options defined like this (it's a
> >> boolean, except it isn't) are a bad idea, and would prefer to see
> >> that API rethought while we still can.
> 
> > We have discussed this during development and considered using a
> > separate option like parallel = on (or say parallel_workers = n) but
> > there were challenges with the same. See discussion in email [1]. We
> > also checked that we have various other places using something similar
> > for options. For example COPY commands option: HEADER [ boolean |
> > MATCH ].
> 
> Yeah, and it's bad experiences with the existing cases that make me not want 
> to
> add more.  Every one of those was somebody taking the easy way out.  It
> generally leads to parsing oddities, such as not accepting all the same 
> spellings
> of "boolean" as before.

I understand the worry about parsing oddities. I think we have tried to make
the streaming option keep accepting all the same spellings of boolean (e.g.
the option still accepts 1/0/true/false...). And this is similar to some
other options like the COPY HEADER option, which accepts all the boolean
values plus the 'match' value. Some other GUCs like wal_compression also
behave similarly: 0/1/true/false/on/off/lz4/pglz are all valid values.

Best Regards,
Hou zj





Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"

2023-01-31 Thread Thomas Munro
On Wed, Feb 1, 2023 at 5:04 PM Anton A. Melnikov  wrote:
> On 31.01.2023 14:38, Thomas Munro wrote:
> > Here's an experimental patch for that alternative.  I wonder if
> > someone would want to be able to turn it off for some reason -- maybe
> > some NFS problem?  It's less back-patchable, but maybe more
> > principled?
>
> It looks very strange to me that there may be cases where the cluster data
> is stored in NFS. Can't figure out how this can be useful.

Heh.  There are many interesting failure modes, but people do it.  I
guess my more general question when introducing any new system call
into this code is how some unusual system I can't even access is going
to break.  Maybe some obscure filesystem will fail with EOPNOTSUPP, or
take 5 seconds and then fail because there is no lock server
configured or whatever, so that's why I don't think we can back-patch
it, and we probably need a way to turn it off.

> i think this variant of the patch is a normal solution
> of the problem, not workaround. Found no problems on Linux.
> +1 for this variant.

I prefer it too.

> Might add a custom error message for EDEADLK
> since it absent in errcode_for_file_access()?

Ah, good thought.  I think it shouldn't happen™, so it's OK that
errcode_for_file_access() would classify it as ERRCODE_INTERNAL_ERROR.

Other interesting errors are:

ENOLCK: system limits exceeded; PANIC seems reasonable
EOPNOTSUPP: this file doesn't support locking (seen on FreeBSD man
pages, not on POSIX)
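
For readers following the discussion, the POSIX file locking being considered
here is done with fcntl(); a minimal sketch of taking a whole-file exclusive
lock (illustrative only, not the proposed patch):

#include <fcntl.h>
#include <string.h>

/* Block until an exclusive lock covering the whole file is acquired. */
static int
lock_whole_file(int fd)
{
	struct flock fl;

	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;		/* exclusive lock */
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;				/* 0 means "to end of file" */

	/*
	 * F_SETLKW waits; on failure it returns -1 with errno set, e.g. to
	 * ENOLCK or EOPNOTSUPP as mentioned above.  POSIX record locks are
	 * released automatically when the holding process exits.
	 */
	return fcntl(fd, F_SETLKW, &fl);
}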

> > I don't know if Windows suffers from this type of problem.
> > Unfortunately its equivalent functionality LockFile() looks non-ideal
> > for this purpose: if your program crashes, the documentation is very
> > vague on when exactly it is released by the OS, but it's not
> > immediately on process exit.  That seems non-ideal for a control file
> > you might want to access again very soon after a crash, to be able to
> > recover.
>
> Unfortunately i've not had time to reproduce the problem and apply this patch 
> on
> Windows yet but i'm going to do it soon on windows 10. If a crc error
> will occur there, then we might use the workaround from the first
> variant of the patch.

Thank you for investigating.  I am afraid to read your results.

One idea would be to proceed with LockFile() for Windows, with a note
suggesting you file a bug with your OS vendor if you ever need it to
get unstuck.  Googling this subject, I read that MongoDB used to
suffer from stuck lock files, until an OS bug report led to recent
versions releasing locks more promptly.  I find that sufficiently
scary that I would want to default the feature to off on Windows, even
if your testing shows that it does really need it.

> > A thought in passing: if UpdateMinRecoveryPoint() performance is an
> > issue, maybe we should figure out how to use fdatasync() instead of
> > fsync().
>
> May be choose it in accordance with GUC wal_sync_method?

Here's a patch like that.  I don't know if it's a good idea for
wal_sync_method to affect other kinds of files or not, but, then, it
already does (fsync_writethough changes this behaviour).   I would
only want to consider this if we also stop choosing "open_datasync" as
a default on at least Windows.
From f55c98daefc4eec7f05413edf92b3b1e7070e1ef Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Tue, 31 Jan 2023 21:08:12 +1300
Subject: [PATCH] Apply wal_sync_method to pg_control file.

When writing out the control file frequently on a standby, we can go a
little faster on some systems by using fdatasync() instead of fsync(),
with no loss of safety.  We already respected the special
"wal_sync_method=fsync_writethrough" via a circuitous route, but we
might as well support the full range of methods.
---
 src/backend/access/transam/xlog.c  | 19 -
 src/common/controldata_utils.c | 59 --
 src/include/common/controldata_utils.h |  7 ++-
 3 files changed, 70 insertions(+), 15 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index fb4c860bde..d0a7a3d7eb 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4165,7 +4165,24 @@ ReadControlFile(void)
 static void
 UpdateControlFile(void)
 {
-	update_controlfile(DataDir, ControlFile, true);
+	int			sync_op;
+
+	switch (sync_method)
+	{
+		case SYNC_METHOD_FDATASYNC:
+			sync_op = UPDATE_CONTROLFILE_FDATASYNC;
+			break;
+		case SYNC_METHOD_OPEN:
+			sync_op = UPDATE_CONTROLFILE_O_SYNC;
+			break;
+		case SYNC_METHOD_OPEN_DSYNC:
+			sync_op = UPDATE_CONTROLFILE_O_DSYNC;
+			break;
+		default:
+			sync_op = UPDATE_CONTROLFILE_FSYNC;
+	}
+
+	update_controlfile(DataDir, ControlFile, sync_op);
 }
 
 /*
diff --git a/src/common/controldata_utils.c b/src/common/controldata_utils.c
index 9723587466..1bc1c6f8d4 100644
--- a/src/common/controldata_utils.c
+++ b/src/common/controldata_utils.c
@@ -136,18 +136,39 @@ get_controlfile(const char *DataDir, bool *crc_ok_p)
  * 

Re: Allow an extention to be updated without a script

2023-01-31 Thread Julien Rouhaud
Hi,

On Tue, Jan 31, 2023 at 11:17:22AM +0900, Yugo NAGATA wrote:
>
> On Mon, 30 Jan 2023 16:05:52 -0500
> Tom Lane  wrote:
>
> > If you have no update script, why call it a new version?  The point
> > of extension versions is to distinguish different states of the
> > extension's SQL objects.  We do not consider mods in underlying C code
> > to justify a new version.
>
> Although, as you say, the extension manager doesn't track changes in C code
> functions, a new version could be released with changes in only in C
> functions that implement a new feature or fix some bugs because it looks
> a new version from user's view.  Actually, there are several extensions
> that provide empty update scripts in order to allow user to install such
> new versions, for example;
>
> [...]
> - hypopg
>  (https://github.com/HypoPG/hypopg/blob/REL1_STABLE/hypopg--1.3.1--1.3.2.sql)
> [...]

Indeed, almost all users don't really understand the difference between the
module / C code and the extension, and that gets worse when
shared_preload_libraries gets in the way.

I personally choose to ship "empty" extension versions so that people can
upgrade it if they want to have e.g. the OS level version match the SQL level
version.  I know some extensions that chose a different approach: keep the
first 2 digits for anything that involves extension changes and have a 3rd
digit for C level bugfix only.  But they get very frequent bug reports about
version mismatch any time a C bugfix is released, so I will again personally
keep shipping those useless versions.  That being said, I agree with Tom here
and we shouldn't add hacks for that.




Re: [Commitfest 2023-01] has started

2023-01-31 Thread Michael Paquier
On Wed, Feb 01, 2023 at 01:54:50PM +0800, Julien Rouhaud wrote:
> On Wed, Feb 1, 2023 at 10:44 AM vignesh C  wrote:
>> I don't have permissions to close the commitfest, could one of them
>> help in closing the commitfest.

Wow.  Thanks for looking at all these entries!
--
Michael


signature.asc
Description: PGP signature


Re: heapgettup refactoring

2023-01-31 Thread David Rowley
On Tue, 31 Jan 2023 at 12:18, Melanie Plageman
 wrote:
>
> On Fri, Jan 27, 2023 at 10:34 PM David Rowley  wrote:
> > 4. I think it might be a good idea to use unlikely() in if
> > (!scan->rs_inited). The idea is to help coax the compiler into moving
> > that code off to a cold path. That's likely especially important if
> > heapgettup_initial_block is inlined, which I see it is marked as.
>
> I've gone ahead and added unlikely. However, should I perhaps skip
> inlining the heapgettup_initial_block() function?

I'm not sure of the exact best combination of functions to mark as
inline. I did try the v7 patchset from 0002 to 0006 on top of c2891175
and I found that the performance is slightly better after removing
inline from all 4 of the helper functions. However, I think if we do
unlikely() and the function is moved into the cold path then it
matters less if it's inlined.
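
For context, unlikely() in PostgreSQL's c.h is just a branch-prediction hint
wrapping the GCC builtin, roughly:

#ifdef __GNUC__
#define likely(x)	__builtin_expect((x) != 0, 1)
#define unlikely(x)	__builtin_expect((x) != 0, 0)
#else
#define likely(x)	((x) != 0)
#define unlikely(x)	((x) != 0)
#endif

so it only nudges the compiler to lay the !scan->rs_inited initialization
code out off the hot path; whether the helper functions are additionally
inlined is a separate decision, which is what the numbers below compare.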

create table a (a int);
insert into a select x from generate_series(1,100)x;
vacuum freeze a;

$ cat seqscan.sql
select * from a where a = 0;
$ cat countall.sql
select count(*) from a;

seqscan.sql filters out all rows and countall.sql returns all rows and
does an aggregate so we don't have to return all those in the query.

max_parallel_workers_per_gather=0;

master
$ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in
{1..3}; do pgbench -n -f seqscan.sql -M prepared -T 10 postgres | grep
tps; done
tps = 25.464091 (without initial connection time)
tps = 25.117001 (without initial connection time)
tps = 25.141646 (without initial connection time)

$ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in
{1..3}; do pgbench -n -f countall.sql -M prepared -T 10 postgres |
grep tps; done
tps = 27.906307 (without initial connection time)
tps = 27.527580 (without initial connection time)
tps = 27.563035 (without initial connection time)

master + v7
$ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in
{1..3}; do pgbench -n -f seqscan.sql -M prepared -T 10 postgres | grep
tps; done
tps = 25.920370 (without initial connection time)
tps = 25.680052 (without initial connection time)
tps = 24.988895 (without initial connection time)

$ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in
{1..3}; do pgbench -n -f countall.sql -M prepared -T 10 postgres |
grep tps; done
tps = 33.783122 (without initial connection time)
tps = 33.248571 (without initial connection time)
tps = 33.512984 (without initial connection time)

master + v7 + inline removed
$ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in
{1..3}; do pgbench -n -f seqscan.sql -M prepared -T 10 postgres | grep
tps; done
tps = 27.680115 (without initial connection time)
tps = 26.418562 (without initial connection time)
tps = 26.166800 (without initial connection time)

$ psql -c "select pg_prewarm('a')" postgres > /dev/null && for i in
{1..3}; do pgbench -n -f countall.sql -M prepared -T 10 postgres |
grep tps; done
tps = 33.948588 (without initial connection time)
tps = 33.684966 (without initial connection time)
tps = 33.946700 (without initial connection time)

You can see that v7 helps countall.sql quite a bit. It seems to also
help a little bit with seqscan.sql. v7 + inline removed makes
seqscan.sql a decent amount faster than both master and master + v7.

David




Re: [Commitfest 2023-01] has started

2023-01-31 Thread Julien Rouhaud
On Wed, Feb 1, 2023 at 10:44 AM vignesh C  wrote:
>
> I don't have permissions to close the commitfest, could one of them
> help in closing the commitfest.

It's technically 17:53 Anywhere on Earth, so we usually wait for
the day to be over before doing so.  But since you already took care of
triaging all CF entries, I closed the CF.

Thanks for being the CFM!




Re: Assertion failure in SnapBuildInitialSnapshot()

2023-01-31 Thread Amit Kapila
On Tue, Jan 31, 2023 at 6:08 PM Masahiko Sawada  wrote:
>
> Attached updated patches.
>

Thanks, Andres, others, do you see a better way to fix this problem? I
have reproduced it manually and the steps are shared at [1] and
Sawada-San also reproduced it, see [2].

[1] - 
https://www.postgresql.org/message-id/CAA4eK1KDFeh%3DZbvSWPx%3Dir2QOXBxJbH0K8YqifDtG3xJENLR%2Bw%40mail.gmail.com
[2] - 
https://www.postgresql.org/message-id/CAD21AoDKJBB6p4X-%2B057Vz44Xyc-zDFbWJ%2Bg9FL6qAF5PC2iFg%40mail.gmail.com

-- 
With Regards,
Amit Kapila.




Re: Fix GUC_NO_SHOW_ALL test scenario in 003_check_guc.pl

2023-01-31 Thread Michael Paquier
On Mon, Jan 30, 2023 at 05:12:27PM +0530, Nitin Jadhav wrote:
> Thanks Michael for identifying a new mistake. I am a little confused
> here. I don't understand why GUC_NO_SHOW_ALL depends on other GUCs
> like GUC_NOT_IN_SAMPLE or GUC_NO_RESET_ALL. Looks like the dependency
> between GUC_NO_RESET_ALL and GUC_NO_SHOW_ALL is removed in the above
> patch since we have these combinations now.

pg_settings would be unable to show something marked as NO_SHOW_ALL,
so the SQL check that looked for (NO_SHOW_ALL && !NO_RESET_ALL) is
a no-op.  Postgres will likely gain more parameters that are kept
around for compatibility reasons, and forcing a NO_RESET_ALL in such
cases could impact applications using RESET on such GUCs, meaning
potential compatibility breakages.

> Similarly why can't we
> have a GUC marked as GUC_NO_SHOW_ALL but not GUC_NOT_IN_CONFIG. For me
> it makes sense if a GUC is marked as NO_SHOW_ALL and it can be present
> in a sample file. It's up to the author/developer to choose which all
> flags are applicable to the newly introduced GUCs. Please share your
> thoughts.

As also mentioned upthread by Tom, I am not sure that this combination
makes much sense, actually, because I don't see why one would not want
to know the effective value loaded for a parameter stored in a file
when he/she has the permission to see it.  This could be changed via
ALTER SYSTEM, postgresql.conf or even an included file, and the value
can only be read if permission to see it is given to the role querying
SHOW or pg_settings.  This combination of flags is not a practice to
encourage.
--
Michael


signature.asc
Description: PGP signature


Re: Support for dumping extended statistics

2023-01-31 Thread Hari krishna Maddileti

+ pgsql-hackers

Hi Justin,
Thanks for the update. I have attached the updated patch with meson
compatibility, and addressed the warnings from the make file too.


On 15/01/23, 2:27 AM, "Justin Pryzby"  wrote:


On Tue, Jan 10, 2023 at 11:28:36AM +, Hari krishna Maddileti wrote:
> Thanks Team for showing interest.
>
> Please find the attached patch, which uses the same approach as mentioned in 
> previous email to implement input functions to parse pg_distinct, 
> pg_dependency and pg_mcv_list strings.

The patch is failing ; you need to make the corresponding update to
meson as you did for make.

http://cfbot.cputube.org/david-kimura.html
https://wiki.postgresql.org/wiki/Meson_for_patch_authors
https://wiki.postgresql.org/wiki/Meson

But actually, it also fails to compile with "make".

--
Justin



Re: recovery modules

2023-01-31 Thread Michael Paquier
On Tue, Jan 31, 2023 at 03:30:13PM -0800, Nathan Bossart wrote:
> Okay, here is a new patch set with the aforementioned adjustments and
> documentation updates.

So, it looks like you have addressed the feedback received here, as
of:
- Rename of Context to Callback.
- Move of the definition into their own header. 
- Introduction of a callback for the startup initialization.
- Pass down a private state to each callback.

I have a few minor comments.

+/*
+ * Since the logic for archiving via a shell command is in the core server
+ * and does not need to be loaded via a shared library, it has a special
+ * initialization function.
+ */
+extern const ArchiveModuleCallbacks *shell_archive_init(void);
Storing that in archive_module.h is not incorrect, still feels a bit
unnatural.  I would have used a separate header for clarity.  It may
not sound like a big deal, but we may want this separation if
archive_module.h is used in some frontend code in the future.  Perhaps
that will never be the case, but I've seen many fancy (as in useful)
proposals in the past when it comes to such things.

 static bool
-shell_archive_configured(void)
+shell_archive_configured(void *private_state)
 {
return XLogArchiveCommand[0] != '\0';
Maybe check that private_state is NULL in this context?  The other two
callbacks could use an assert as well.
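
i.e. something along these lines (a sketch of the suggestion, not committed
code):

static bool
shell_archive_configured(void *private_state)
{
	/* The shell archiver keeps no private state. */
	Assert(private_state == NULL);

	return XLogArchiveCommand[0] != '\0';
}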

-   _PG_archive_module_init.  This function is passed a
-   struct that needs to be filled with the callback function pointers for
-   individual actions.
+   _PG_archive_module_init.  This function must return a
+   struct filled with the callback function pointers for individual actions.

Worth mentioning the name of the structure, as in "This function must
return a structure ArchiveModuleCallbacks filled with.."

+The startup_cb callback is called shortly after the
+module is loaded.  This callback can be used to perform any additional
+initialization required.  If the archive module needs a state, it should
+return a pointer to the state.  This pointer will be passed to each of the
+module's other callbacks via the void *private_state
+argument.

Not sure about the complexity of two sentences here.  This could
simply be:
This function can return a pointer to an area of memory dedicated to
the state of the archive module loaded.  This pointer is passed to
each of the module's other callbacks as the argument
private_state.
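
To illustrate, a hypothetical module's startup callback might do something
like the following (all names here are invented for the example; only the
private_state flow comes from the documentation under discussion):

#include "postgres.h"

typedef struct DemoArchiveState
{
	int			archived_count;	/* example counter */
} DemoArchiveState;

/* Hypothetical startup callback: allocate the module's state once. */
static void *
demo_archive_startup(void)
{
	DemoArchiveState *state = palloc0(sizeof(DemoArchiveState));

	/* This pointer later comes back as the private_state argument. */
	return state;
}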

Side note: it looks like there is nothing in archive-modules.sgml
telling that these modules are only loaded by the archiver process.
--
Michael


signature.asc
Description: PGP signature


Re: Support for dumping extended statistics

2023-01-31 Thread Hari krishna Maddileti


Hi Justin,
Thanks for the update. I have attached the updated patch with meson
compatibility, and addressed the warnings from the make file too.


On 15/01/23, 2:27 AM, "Justin Pryzby"  wrote:


On Tue, Jan 10, 2023 at 11:28:36AM +, Hari krishna Maddileti wrote:
> Thanks Team for showing interest.
>
> Please find the attached patch, which uses the same approach as mentioned in 
> previous email to implement input functions to parse pg_distinct, 
> pg_dependency and pg_mcv_list strings.

The patch is failing ; you need to make the corresponding update to
meson as you did for make.

http://cfbot.cputube.org/david-kimura.html
https://wiki.postgresql.org/wiki/Meson_for_patch_authors
https://wiki.postgresql.org/wiki/Meson

But actually, it also fails to compile with "make".

--
Justin



v2-0001-Implement-input-functions-for-extended-statistics.patch
Description:  v2-0001-Implement-input-functions-for-extended-statistics.patch


Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-01-31 Thread Peter Smith
Here are my review comments for the patch v25-0001.

==
Commit Message

1.
The other possibility is to apply the delay at the end of the parallel
apply transaction but that would cause issues related to resource
bloat and locks being held for a long time.

~

SUGGESTION
We chose not to apply the delay at the end of the parallel apply
transaction because that would cause issues related to resource bloat
and locks being held for a long time.

==
doc/src/sgml/config.sgml

2.
+  
+   For time-delayed logical replication, the apply worker sends a feedback
+   message to the publisher every
+   wal_receiver_status_interval milliseconds. Make sure
+   to set wal_receiver_status_interval less than the
+   wal_sender_timeout on the publisher, otherwise, the
+   walsender will repeatedly terminate due to timeout
+   error. Note that if wal_receiver_status_interval is
+   set to zero, the apply worker sends no feedback messages during the
+   min_apply_delay period.
+  

2a.
"due to timeout error." --> "due to timeout errors."

~

2b.
Shouldn't this also cross-ref to CREATE SUBSCRIPTION docs? Because the
above mentions 'min_apply_delay' but that is not defined on this page.

==
doc/src/sgml/ref/create_subscription.sgml

3.
+ 
+  By default, the subscriber applies changes as soon as possible. This
+  parameter allows the user to delay the application of changes by a
+  given time period. If the value is specified without units, it is
+  taken as milliseconds. The default is zero (no delay). See
+   for details on the
+  available valid time unites.
+ 

Typo: "unites"

~~~

4.
+ 
+  Any delay becomes effective after all initial table synchronization
+  has finished and occurs before each transaction starts to get applied
+  on the subscriber. The delay is calculated as the difference between
+  the WAL timestamp as written on the publisher and the current time on
+  the subscriber. Any overhead of time spent in logical decoding and in
+  transferring the transaction may reduce the actual wait time. It is
+  also possible that the overhead already exceeds the requested
+  min_apply_delay value, in which case no delay is
+  applied. If the system clocks on publisher and subscriber are not
+  synchronized, this may lead to apply changes earlier than expected,
+  but this is not a major issue because this parameter is typically
+  much larger than the time deviations between servers. Note that if
+  this parameter is set to a long delay, the replication will stop if
+  the replication slot falls behind the current LSN by more than
+  max_slot_wal_keep_size.
+ 

"Any delay becomes effective after all initial table
synchronization..." --> "Any delay becomes effective only after all
initial table synchronization..."

~~~

5.
+ 
+   
+Delaying the replication means there is a much longer time between
+making a change on the publisher, and that change being committed
+on the subscriber. This can impact the performance of synchronous
+replication. See 
+parameter.
+   
+ 


I'm not sure why this text was changed to say "means there is a much
longer time" instead of "can mean there is a much longer time".

IMO the previous wording was better because this current text makes an
assumption about what the user has configured -- e.g. if they
configured only 1ms delay then the warning text is not really
relevant.

~~~

6.
Why was the example (it existed when I last looked at patch v19)
removed? Personally, I found that example to be a useful reminder that
the min_apply_delay can specify units other than just 'ms'.

==
src/backend/commands/subscriptioncmds.c

7. parse_subscription_options

+ /*
+ * The combination of parallel streaming mode and min_apply_delay is not
+ * allowed. This is because we start applying the transaction stream as
+ * soon as the first change arrives without knowing the transaction's
+ * prepare/commit time. This means we cannot calculate the underlying
+ * network/decoding lag between publisher and subscriber, and so always
+ * waiting for the full 'min_apply_delay' period might include unnecessary
+ * delay.
+ *
+ * The other possibility is to apply the delay at the end of the parallel
+ * apply transaction but that would cause issues related to resource bloat
+ * and locks being held for a long time.
+ */

I think the 2nd paragraph should be changed slightly as follows (like
review comment #1)

SUGGESTION
Note - we chose not to apply the delay at the end of the parallel
apply transaction because that would cause issues related to resource
bloat and locks being held for a long time.

~~~

8.
+ if (IsSet(supported_opts, SUBOPT_MIN_APPLY_DELAY) &&
+ 

Re: Logical replication timeout problem

2023-01-31 Thread Amit Kapila
On Tue, Jan 31, 2023 at 8:24 PM Ashutosh Bapat
 wrote:
>
> On Tue, Jan 31, 2023 at 5:12 PM Amit Kapila  wrote:
> >
> > On Tue, Jan 31, 2023 at 5:03 PM Ashutosh Bapat
> >  wrote:
> > >
> > > On Tue, Jan 31, 2023 at 4:58 PM Amit Kapila  
> > > wrote:
> > >
> > > > Thanks, the patch looks good to me. I have slightly adjusted one of
> > > > the comments and ran pgindent. See attached. As mentioned in the
> > > > commit message, we shouldn't backpatch this as this requires a new
> > > > callback and moreover, users can increase the wal_sender_timeout and
> > > > wal_receiver_timeout to avoid this problem. What do you think?
> > >
> > > The callback and the implementation is all in core. What's the risk
> > > you see in backpatching it?
> > >
> >
> > Because we are changing the exposed structure and which can break
> > existing extensions using it.
>
> Is that because we are adding the new member in the middle of the
> structure?
>

Not only that but this changes the size of the structure and we want
to avoid that as well in stable branches. See email [1] (you can't
change the struct size either ...). As per my understanding, our usual
practice is to not change the exposed structure's size/definition in
stable branches.


[1] - https://www.postgresql.org/message-id/2358496.1649168259%40sss.pgh.pa.us

-- 
With Regards,
Amit Kapila.




Re: Question regarding "Make archiver process an auxiliary process. commit"

2023-01-31 Thread Nathan Bossart
On Fri, Jan 20, 2023 at 11:39:56AM -0800, Nathan Bossart wrote:
> I noticed that time.h is no longer needed by the archiver, so I removed
> that and fixed an indentation nitpick in the attached v2.  I'm going to set
> the commitfest entry to ready-for-committer shortly after sending this
> message.

I'm not sure why I thought time.h was no longer needed.  time() is clearly
used elsewhere in this file.  Here's a new version with that added back.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 3313989a30b0821ed7014527b0342e7c63b59169 Mon Sep 17 00:00:00 2001
From: Sravan Velagandula 
Date: Tue, 6 Dec 2022 06:21:38 -0500
Subject: [PATCH v3 1/1] simplify wait loop in the archiver

---
 src/backend/postmaster/pgarch.c | 28 +---
 1 file changed, 9 insertions(+), 19 deletions(-)

diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 8ecdb9ca23..51d882c17a 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -297,7 +297,6 @@ pgarch_waken_stop(SIGNAL_ARGS)
 static void
 pgarch_MainLoop(void)
 {
-	pg_time_t	last_copy_time = 0;
 	bool		time_to_stop;
 
 	/*
@@ -335,30 +334,21 @@ pgarch_MainLoop(void)
 
 		/* Do what we're here for */
 		pgarch_ArchiverCopyLoop();
-		last_copy_time = time(NULL);
 
 		/*
 		 * Sleep until a signal is received, or until a poll is forced by
-		 * PGARCH_AUTOWAKE_INTERVAL having passed since last_copy_time, or
-		 * until postmaster dies.
+		 * PGARCH_AUTOWAKE_INTERVAL, or until postmaster dies.
 		 */
 		if (!time_to_stop)		/* Don't wait during last iteration */
 		{
-			pg_time_t	curtime = (pg_time_t) time(NULL);
-			int			timeout;
-
-			timeout = PGARCH_AUTOWAKE_INTERVAL - (curtime - last_copy_time);
-			if (timeout > 0)
-			{
-int			rc;
-
-rc = WaitLatch(MyLatch,
-			   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
-			   timeout * 1000L,
-			   WAIT_EVENT_ARCHIVER_MAIN);
-if (rc & WL_POSTMASTER_DEATH)
-	time_to_stop = true;
-			}
+			int			rc;
+
+			rc = WaitLatch(MyLatch,
+		   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+		   PGARCH_AUTOWAKE_INTERVAL * 1000L,
+		   WAIT_EVENT_ARCHIVER_MAIN);
+			if (rc & WL_POSTMASTER_DEATH)
+time_to_stop = true;
 		}
 
 		/*
-- 
2.25.1



Re: Progress report of CREATE INDEX for nested partitioned tables

2023-01-31 Thread Justin Pryzby
On Tue, Jan 31, 2023 at 07:32:20PM +0400, Ilya Gladyshev wrote:
> > On 17 Jan 2023, at 23:44, Tomas Vondra  wrote:
> > Do we actually need the new parts_done field? I mean, we already do
> > track the value - at PROGRESS_CREATEIDX_PARTITIONS_DONE index in the
> > st_progress_param array. Can't we simply read it from there? Then we
> > would not have ABI issues with the new field added to IndexStmt.
> 
> I think it’s a good approach and it could be useful outside of scope of this 
> patch too. So I have attached a patch, that introduces 
> pgstat_progress_incr_param function for this purpose. There’s one thing I am 
> not sure about, IIUC, we can assume that the only process that can write into 
> MyBEEntry of the current backend is the current backend itself, therefore 
> looping to get consistent reads from this array is not required. Please 
> correct me, if I am wrong here.

Thanks for the updated patch.

I think you're right - pgstat_begin_read_activity() is used for cases
involving other backends.  It ought to be safe for a process to read its
own status bits, since we know they're not also being written.

You changed DefineIndex() to update progress for the leaf indexes when
called recursively.  The caller updates the progress for "attached"
indexes, but not created ones.  That allows providing fine-granularity
progress updates when using intermediate partitions, right?  (Rather
than updating the progress by more than one at a time in the case of
intermediate partitioning).

If my understanding is right, that's subtle, and adds a bit of
complexity to the current code, so could use careful commentary.  I
suggest:

* If the index was attached, update progress for all its direct and
* indirect leaf indexes all at once.  If the index was built by calling
* DefineIndex() recursively, the called function is responsible for
* updating the progress report for built indexes.

...

* If this is the top-level index, we're done. When called recursively
* for child tables, the done partition counter is incremented now,
* rather than in the caller.

I guess you know that there were compiler warnings (related to your
question).
https://cirrus-ci.com/task/6571212386598912

pgstat_progress_incr_param() could call pgstat_progress_update_param()
rather than using its own Assert() and WRITE_ACTIVITY calls.  I'm not
sure which I prefer, though.
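
For reference, the simpler variant suggested above (incrementing by reusing
pgstat_progress_update_param()) might look roughly like this; a sketch only,
not the submitted patch:

void
pgstat_progress_incr_param(int index, int64 incr)
{
	volatile PgBackendStatus *beentry = MyBEEntry;

	if (!beentry || !pgstat_track_activities)
		return;

	/*
	 * Only this backend ever writes its own progress array, so reading the
	 * current value without the changecount protocol is safe here.
	 */
	pgstat_progress_update_param(index,
								 beentry->st_progress_param[index] + incr);
}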

Also, there are whitespace/tab/style issues in
pgstat_progress_incr_param().

-- 
Justin




Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"

2023-01-31 Thread Anton A. Melnikov

Hi, Thomas!

There are two variants of the patch now.

1) As for the first workaround:

On 31.01.2023 07:09, Thomas Munro wrote:


Maybe it's unlikely that two samples will match while running that
torture test, because it's overwriting the file as fast as it can.
But the idea is that a real system isn't writing the control file most
of the time.




Yeah, I was thinking that we should also put a limit on the loop, just
to be cautious.


At first I didn't understand that the equality condition between the
previously calculated CRC and the current one on the second and later
attempts was intended for the case when the pg_control file is really
corrupted.

Indeed, by adding a few debugging variables and running the torture test,
I found that there were ~4000 CRC errors (~0.0003%) in ~125 million reads
from the file, and there was no case when a CRC error appeared twice in a
row. So the second, and even more so the third, successive occurrence of a
CRC error can be neglected, and a simpler and maybe clearer algorithm seems
possible for this workaround.
For instance:

for(try = 0 ; try < 3; try++)
{
   open, read from and close pg_control;
   calculate crc;

   *crc_ok_p = EQ_CRC32C(crc, ControlFile->crc);

   if(*crc_ok_p)
  break;
}

2) As for the second variant of the patch with POSIX locks:

On 31.01.2023 14:38, Thomas Munro wrote:

On Tue, Jan 31, 2023 at 5:09 PM Thomas Munro  wrote:

Clearly there is an element of speculation or superstition here.  I
don't know what else to do if both PostgreSQL and ext4 decided not to
add interlocking.  Maybe we should rethink that.  How bad would it
really be if control file access used POSIX file locking?  I mean, the
writer is going to *fsync* the file, so it's not like one more wafer
thin system call is going to hurt too much.


Here's an experimental patch for that alternative.  I wonder if
someone would want to be able to turn it off for some reason -- maybe
some NFS problem?  It's less back-patchable, but maybe more
principled?


It looks very strange to me that there may be cases where the cluster data
is stored in NFS. Can't figure out how this can be useful.

I think this variant of the patch is a proper solution to the problem, not a
workaround. Found no problems on Linux.
+1 for this variant.

Might we add a custom error message for EDEADLK,
since it is absent from errcode_for_file_access()?


I don't know if Windows suffers from this type of problem.
Unfortunately its equivalent functionality LockFile() looks non-ideal
for this purpose: if your program crashes, the documentation is very
vague on when exactly it is released by the OS, but it's not
immediately on process exit.  That seems non-ideal for a control file
you might want to access again very soon after a crash, to be able to
recover.


Unfortunately I've not had time to reproduce the problem and apply this patch
on Windows yet, but I'm going to do it soon on Windows 10. If a CRC error
occurs there, then we might use the workaround from the first variant of the
patch.


A thought in passing: if UpdateMinRecoveryPoint() performance is an
issue, maybe we should figure out how to use fdatasync() instead of
fsync().


May be choose it in accordance with GUC wal_sync_method?


Sincerely yours,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Generating "Subplan Removed" in EXPLAIN

2023-01-31 Thread David Rowley
On Wed, 1 Feb 2023 at 15:53, Yugo NAGATA  wrote:
> Maybe, you missed to set plan_cache_mode to force_generic_plan.
> "Subplan Removed" doesn't appear when using a custom plan.

I wouldn't say that's 100% true. The planner is only able to prune
using values which are known during planning. Constant folding is
going to evaluate any immutable functions during planning, but nothing
more.

Partition pruning might be delayed until execution time if some
expression that's being compared to the partition key is stable. e.g:

create table rp (t timestamp not null) partition by range(t);
create table rp2022 partition of rp for values from ('2022-01-01') to
('2023-01-01');
create table rp2023 partition of rp for values from ('2023-01-01') to
('2024-01-01');

explain select * from rp where t >= now();

 Append  (cost=0.00..95.33 rows=1506 width=8)
   Subplans Removed: 1
   ->  Seq Scan on rp2023 rp_1  (cost=0.00..43.90 rows=753 width=8)
 Filter: (t >= now())

David




Re: Logical replication timeout problem

2023-01-31 Thread Amit Kapila
On Wed, Feb 1, 2023 at 4:43 AM Peter Smith  wrote:
>
> Here are my review comments for v13-1.
>
> ==
> Commit message
>
> 1.
> The DDLs like Refresh Materialized views that generate lots of temporary
> data due to rewrite rules may not be processed by output plugins (for
> example pgoutput). So, we won't send keep-alive messages for a long time
> while processing such commands and that can lead the subscriber side to
> timeout.
>
> ~
>
> SUGGESTION (minor rearranged way to say the same thing)
>
> For DDLs that generate lots of temporary data due to rewrite rules
> (e.g. REFRESH MATERIALIZED VIEW) the output plugins (e.g. pgoutput)
> may not be processed for a long time. Since we don't send keep-alive
> messages while processing such commands that can lead the subscriber
> side to timeout.
>

Hmm, this makes it less clear and in fact changed the meaning.

> ~~~
>
> 2.
> The commit message says what the problem is, but it doesn’t seem to
> describe what this patch does to fix the problem.
>

I thought it was apparent and the code comments made it clear.

>
> 4.
> +#define CHANGES_THRESHOLD 100
> +
> + if (++changes_count >= CHANGES_THRESHOLD)
> + {
> + rb->update_progress_txn(rb, txn, change->lsn);
> + changes_count = 0;
> + }
>
> I was wondering if it would have been simpler to write this code like below.
>
> Also, by doing it this way the 'changes_count' variable name makes
> more sense IMO, otherwise (for current code) maybe it should be called
> something like 'changes_since_last_keepalive'
>
> SUGGESTION
> if (++changes_count % CHANGES_THRESHOLD == 0)
> rb->update_progress_txn(rb, txn, change->lsn);
>

I find the current code in the patch clear and easy to understand.

-- 
With Regards,
Amit Kapila.




Re: Weird failure with latches in curculio on v15

2023-01-31 Thread Thomas Munro
My database of assertion failures has four like that, all on 15 and master:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=curculio=2023-02-01%2001:05:17
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=curculio=2023-01-11%2011:16:40
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=curculio=2022-11-22%2012:19:21
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=conchuela=2022-11-17%2021:47:02

It's always in proc_exit() in StartupProcShutdownHandler(), a SIGTERM
handler which is allowed to call that while in_restore_command is
true.

Here's a different one, some kind of latch corruption in the WAL
writer under 017_shm:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae=2022-01-20%2016:26:54




Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-01-31 Thread Amit Kapila
On Wed, Feb 1, 2023 at 8:13 AM Kyotaro Horiguchi
 wrote:
>
> At Tue, 31 Jan 2023 15:12:14 +0530, Amit Kapila  
> wrote in
> > On Tue, Jan 31, 2023 at 1:40 PM Kyotaro Horiguchi
> >  wrote:
> > >
> > > Hi, Kuroda-san, Thanks for the detailed study.
> > >
> > > At Tue, 31 Jan 2023 07:06:40 +, "Hayato Kuroda (Fujitsu)" 
> > >  wrote in
> > > > Therefore, I think we can say that modern platforms that are supported 
> > > > by PostgreSQL define int as 32-bit.
> > > > It satisfies the condition sizeof(int) <= sizeof(int32), so we can keep 
> > > > to use INT_MAX.
> > >
> > > Yeah, I know that that's practically correct.  Just I wanted to make
> > > clear is whether we (always) assume int == int32. I don't want to do
> > > that just because that works. Even though we cannot be perfect, in
> > > this particular case the destination space is explicitly made as
> > > int32.
> > >
> >
> > So, shall we check if the result of parse_int is in the range 0 and
> > PG_INT32_MAX to ameliorate this concern?
>
> Yeah, it is exactly what I wanted to suggest.
>
> > If this works then we need to
> > probably change the return value of defGetMinApplyDelay() to int32.
>
> I didn't thought doing that, int can store all values in the valid
> range (I'm assuming we implicitly assume int >= int32 in bit width)
> and it is the natural integer in C.  Either will do for me but I
> slightly prefer to use int there.
>

I think it would be clear to use int32 because the parameter where we
store the return value is also int32.

-- 
With Regards,
Amit Kapila.




RE: Allow logical replication to copy tables in binary format

2023-01-31 Thread Takamichi Osumi (Fujitsu)
On Monday, January 30, 2023 7:50 PM Melih Mutlu  wrote:
> Thanks for reviewing. 
...
> Please see [1] and you'll get the following error in your case:
> "ERROR:  incorrect binary data format in logical replication column 1"
Hi, thanks for sharing v7.

(1) general comment

I wondered if the addition of the new option/parameter could introduce some
confusion for users.

case 1. When binary = true and copy_format = text, the table sync is conducted 
by text.
case 2. When binary = false and copy_format = binary, the table sync is 
conducted by binary.
(Note that the case 2 can be accepted after addressing the 3rd comment of 
Bharath-san in [1].
I agree with the 3rd comment by itself.)

The name of the new subscription parameter looks confusing.
How about "init_sync_format" or something ?

(2) The commit message doesn't get updated.

The commit message needs to mention the new subscription option.

(3) whitespace errors.

$ git am v7-0001-Allow-logical-replication-to-copy-table-in-binary.patch
Applying: Allow logical replication to copy table in binary
.git/rebase-apply/patch:95: trailing whitespace.
  copied to the subscriber. Supported formats are
.git/rebase-apply/patch:101: trailing whitespace.
  that data will not be copied if copy_data = false.
warning: 2 lines add whitespace errors.

(4) pg_dump.c

if (fout->remoteVersion >= 16)
-   appendPQExpBufferStr(query, " s.suborigin\n");
+   appendPQExpBufferStr(query, " s.suborigin,\n");
else
-   appendPQExpBuffer(query, " '%s' AS suborigin\n", 
LOGICALREP_ORIGIN_ANY);
+   appendPQExpBuffer(query, " '%s' AS suborigin,\n", 
LOGICALREP_ORIGIN_ANY);
+
+   if (fout->remoteVersion >= 16)
+   appendPQExpBufferStr(query, " s.subcopyformat\n");
+   else
+   appendPQExpBuffer(query, " '%c' AS subcopyformat\n", 
LOGICALREP_COPY_AS_TEXT);

This new branch for v16 could be merged with the previous check on the same
condition.
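
That is, the two version checks could be folded into one, along these lines
(a sketch based on the fragment quoted above; subcopyformat and
LOGICALREP_COPY_AS_TEXT come from the patch under review):

	if (fout->remoteVersion >= 16)
	{
		appendPQExpBufferStr(query, " s.suborigin,\n");
		appendPQExpBufferStr(query, " s.subcopyformat\n");
	}
	else
	{
		appendPQExpBuffer(query, " '%s' AS suborigin,\n",
						  LOGICALREP_ORIGIN_ANY);
		appendPQExpBuffer(query, " '%c' AS subcopyformat\n",
						  LOGICALREP_COPY_AS_TEXT);
	}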

(5) describe.c

+
+   /* Copy format is only supported in v16 and higher */
+   if (pset.sversion >= 16)
+   appendPQExpBuffer(,
+ ", subcopyformat AS 
\"%s\"\n",
+ gettext_noop("Copy 
Format"));


Similar to (4), this creates a new branch for v16. Please see the code just
above this part.

(6) 

+ * Extract the copy format value from a DefElem.
+ */
+char
+defGetCopyFormat(DefElem *def)

Shouldn't this function be static, removing the need for the change to
subscriptioncmds.h?

(7) catalogs.sgml

The subcopyformat should be mentioned here and the current description for 
subbinary
should be removed.

(8) create_subscription.sgml

+  text.
+
+  binary format can be selected only if

Unnecessary blank line.

[1] - 
https://www.postgresql.org/message-id/CALj2ACW5Oa7_v25iZb326UXvtM_tjQfw0Tc3hPJ8zN4FZqc9cw%40mail.gmail.com


Best Regards,
Takamichi Osumi





Re: Generating "Subplan Removed" in EXPLAIN

2023-01-31 Thread Bruce Momjian
On Wed, Feb  1, 2023 at 11:53:34AM +0900, Yugo NAGATA wrote:
> On Tue, 31 Jan 2023 20:38:21 -0600
> Justin Pryzby  wrote:
> 
> > 
> > To: Bruce Momjian 
> > Cc: pgsql-hack...@postgresql.org
> > Subject: Re: Generating "Subplan Removed" in EXPLAIN
> > Date: Tue, 31 Jan 2023 20:38:21 -0600
> > User-Agent: Mutt/1.9.4 (2018-02-28)
> > 
> > On Tue, Jan 31, 2023 at 08:59:57PM -0500, Bruce Momjian wrote:
> > > Does anyone know how to generate this?  Thanks.
> > 
> > The regression tests know:
> > 
> > $ git grep -c 'Subplans Removed' ./src/test/regress/
> > src/test/regr
> 
> Maybe, you missed to set plan_cache_mode to force_generic_plan.
> "Subplan Removed" doesn't appear when using a custom plan.

Yes, that is exactly what I was missing.  Thank you!

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.




Re: Generating "Subplan Removed" in EXPLAIN

2023-01-31 Thread Yugo NAGATA
On Tue, 31 Jan 2023 20:38:21 -0600
Justin Pryzby  wrote:

> 
> To: Bruce Momjian 
> Cc: pgsql-hack...@postgresql.org
> Subject: Re: Generating "Subplan Removed" in EXPLAIN
> Date: Tue, 31 Jan 2023 20:38:21 -0600
> User-Agent: Mutt/1.9.4 (2018-02-28)
> 
> On Tue, Jan 31, 2023 at 08:59:57PM -0500, Bruce Momjian wrote:
> > Does anyone know how to generate this?  Thanks.
> 
> The regression tests know:
> 
> $ git grep -c 'Subplans Removed' ./src/test/regress/
> src/test/regr

Maybe, you missed to set plan_cache_mode to force_generic_plan.
"Subplan Removed" doesn't appear when using a custom plan.

postgres=# set enable_indexonlyscan = off; 
SET
postgres=# prepare ab_q1 (int, int, int) as
select * from ab where a between $1 and $2 and b <= $3;
PREPARE
postgres=# explain (analyze, costs off, summary off, timing off) execute ab_q1 
(2, 2, 3);
   QUERY PLAN
-
 Append (actual rows=0 loops=1)
   ->  Seq Scan on ab_a2_b1 ab_1 (actual rows=0 loops=1)
 Filter: ((a >= 2) AND (a <= 2) AND (b <= 3))
   ->  Seq Scan on ab_a2_b2 ab_2 (actual rows=0 loops=1)
 Filter: ((a >= 2) AND (a <= 2) AND (b <= 3))
   ->  Seq Scan on ab_a2_b3 ab_3 (actual rows=0 loops=1)
 Filter: ((a >= 2) AND (a <= 2) AND (b <= 3))
(7 rows)

postgres=# show plan_cache_mode ;
 plan_cache_mode 
-
 auto
(1 row)

postgres=# set plan_cache_mode to force_generic_plan;
SET
postgres=# explain (analyze, costs off, summary off, timing off) execute ab_q1 
(2, 2, 3);
   QUERY PLAN
-
 Append (actual rows=0 loops=1)
   Subplans Removed: 6
   ->  Seq Scan on ab_a2_b1 ab_1 (actual rows=0 loops=1)
 Filter: ((a >= $1) AND (a <= $2) AND (b <= $3))
   ->  Seq Scan on ab_a2_b2 ab_2 (actual rows=0 loops=1)
 Filter: ((a >= $1) AND (a <= $2) AND (b <= $3))
   ->  Seq Scan on ab_a2_b3 ab_3 (actual rows=0 loops=1)
 Filter: ((a >= $1) AND (a <= $2) AND (b <= $3))
(8 rows)

Regards,
Yugo Nagata

-- 
Yugo NAGATA 




Re: [Commitfest 2023-01] has started

2023-01-31 Thread vignesh C
On Sun, 29 Jan 2023 at 22:05, vignesh C  wrote:
>
> On Sun, 22 Jan 2023 at 22:00, vignesh C  wrote:
> >
> > On Sun, 15 Jan 2023 at 23:02, vignesh C  wrote:
> > >
> > > On Sun, 8 Jan 2023 at 21:00, vignesh C  wrote:
> > > >
> > > > On Tue, 3 Jan 2023 at 13:13, vignesh C  wrote:
> > > > >
> > > > > Hi All,
> > > > >
> > > > > Just a reminder that Commitfest 2023-01 has started.
> > > > > There are many patches based on the latest run from [1] which require
> > > > > a) Rebased on top of head b) Fix compilation failures c) Fix test
> > > > > failure, please have a look and rebase it so that it is easy for the
> > > > > reviewers and committers:
> > > > > 1. TAP output format for pg_regress
> > > > > 2. Add BufFileRead variants with short read and EOF detection
> > > > > 3. Add SHELL_EXIT_CODE variable to psql
> > > > > 4. Add foreign-server health checks infrastructure
> > > > > 5. Add last_vacuum_index_scans in pg_stat_all_tables
> > > > > 6. Add index scan progress to pg_stat_progress_vacuum
> > > > > 7. Add the ability to limit the amount of memory that can be allocated
> > > > > to backends.
> > > > > 8. Add tracking of backend memory allocated to pg_stat_activity
> > > > > 9. CAST( ... ON DEFAULT)
> > > > > 10. CF App: add "Returned: Needs more interest" close status
> > > > > 11. CI and test improvements
> > > > > 12. Cygwin cleanup
> > > > > 13. Expand character set for ltree labels
> > > > > 14. Fix tab completion MERGE
> > > > > 15. Force streaming every change in logical decoding
> > > > > 16. More scalable multixacts buffers and locking
> > > > > 17. Move SLRU data into the regular buffer pool
> > > > > 18. Move extraUpdatedCols out of RangeTblEntry
> > > > > 19.New [relation] options engine
> > > > > 20. Optimizing Node Files Support
> > > > > 21. PGDOCS - Stats views and functions not in order?
> > > > > 22. POC: Lock updated tuples in tuple_update() and tuple_delete()
> > > > > 23. Parallelize correlated subqueries that execute within each worker
> > > > > 24. Pluggable toaster
> > > > > 25. Prefetch the next tuple's memory during seqscans
> > > > > 26. Pulling up direct-correlated ANY_SUBLINK
> > > > > 27. Push aggregation down to base relations and joins
> > > > > 28. Reduce timing overhead of EXPLAIN ANALYZE using rdtsc
> > > > > 29. Refactor relation extension, faster COPY
> > > > > 30. Remove NEW placeholder entry from stored view query range table
> > > > > 31. TDE key management patches
> > > > > 32. Use AF_UNIX for tests on Windows (ie drop fallback TCP code)
> > > > > 33. Windows filesystem support improvements
> > > > > 34. making relfilenodes 56 bit
> > > > > 35. postgres_fdw: commit remote (sub)transactions in parallel during 
> > > > > pre-commit
> > > > > 36.recovery modules
> > > > >
> > > > > Commitfest status as of now:
> > > > > Needs review:177
> > > > > Waiting on Author:   47
> > > > > Ready for Committer:  20
> > > > > Committed:  31
> > > > > Withdrawn:4
> > > > > Rejected:   0
> > > > > Returned with Feedback:  0
> > > > > Total:  279
> > > > >
> > > > > We will be needing more members to actively review the patches to get
> > > > > more patches to the committed state. I would like to remind you that
> > > > > each patch submitter is expected to review at least one patch from
> > > > > another submitter during the CommitFest, those members who have not
> > > > > picked up patch for review please pick someone else's patch to review
> > > > > as soon as you can.
> > > > > I'll send out reminders this week to get your patches rebased and
> > > > > update the status of the patch accordingly.
> > > > >
> > > > > [1] - http://cfbot.cputube.org/
> > > >
> > > > Hi Hackers,
> > > >
> > > > Here's a quick status report after the first week (I think only about
> > > > 9 commits happened during the week, the rest were pre-CF activity):
> > > >
> > > > status   |   3rd Jan |  w1
> > > > -+---+-
> > > > Needs review:|   177 | 149
> > > > Waiting on Author:   |47 |  60
> > > > Ready for Committer: |20 |  23
> > > > Committed:   |31 |  40
> > > > Withdrawn:   | 4 |   7
> > > > Rejected:| 0 |   0
> > > > Returned with Feedback:  | 0 |   0
> > > > Total:   |   279 | 279
> > > >
> > > > Here is a list of "Needs review" entries for which there has not been
> > > > much communication on the thread and needs help in proceeding further.
> > > > Please pick one of these and help us on how to proceed further:
> > > > pgbench: using prepared BEGIN statement in a pipeline could cause an
> > > > error | Yugo Nagata
> > > > Fix dsa_free() to re-bin segment | Dongming Liu
> > > > pg_rewind: warn when checkpoint hasn't happened after promotion | James 
> > > > Coleman
> > > 

Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-01-31 Thread Kyotaro Horiguchi
At Tue, 31 Jan 2023 15:12:14 +0530, Amit Kapila  wrote 
in 
> On Tue, Jan 31, 2023 at 1:40 PM Kyotaro Horiguchi
>  wrote:
> >
> > Hi, Kuroda-san, Thanks for the detailed study.
> >
> > At Tue, 31 Jan 2023 07:06:40 +, "Hayato Kuroda (Fujitsu)" 
> >  wrote in
> > > Therefore, I think we can say that modern platforms that are supported by 
> > > PostgreSQL define int as 32-bit.
> > > It satisfies the condition sizeof(int) <= sizeof(int32), so we can keep 
> > > to use INT_MAX.
> >
> > Yeah, I know that that's practically correct.  Just I wanted to make
> > clear is whether we (always) assume int == int32. I don't want to do
> > that just because that works. Even though we cannot be perfect, in
> > this particular case the destination space is explicitly made as
> > int32.
> >
> 
> So, shall we check if the result of parse_int is in the range 0 and
> PG_INT32_MAX to ameliorate this concern?

Yeah, it is exactly what I wanted to suggest.

> If this works then we need to
> probably change the return value of defGetMinApplyDelay() to int32.

I hadn't thought of doing that; int can store all values in the valid
range (I'm assuming we implicitly assume int >= int32 in bit width)
and it is the natural integer type in C.  Either will do for me but I
slightly prefer to use int there.

As the result I'd like to propose the following change.


diff --git a/src/backend/commands/subscriptioncmds.c 
b/src/backend/commands/subscriptioncmds.c
index 489eae85ee..9de2745623 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -2293,16 +2293,16 @@ defGetMinApplyDelay(DefElem *def)
 hintmsg ? errhint("%s", _(hintmsg)) : 0));
 
 	/*
-	 * Check lower bound. parse_int() has already been confirmed that result
-	 * is less than or equal to INT_MAX.
+	 * Check both boundaries. Although parse_int() checked the result against
+	 * INT_MAX, this value is to be stored in a catalog column of int32.
 	 */
-	if (result < 0)
+	if (result < 0 || result > PG_INT32_MAX)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
 						result,
 						"min_apply_delay",
-						0, INT_MAX)));
+						0, PG_INT32_MAX)));
 
 	return result;
 }


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Generating "Subplan Removed" in EXPLAIN

2023-01-31 Thread Justin Pryzby
On Tue, Jan 31, 2023 at 08:59:57PM -0500, Bruce Momjian wrote:
> Does anyone know how to generate this?  Thanks.

The regression tests know:

$ git grep -c 'Subplans Removed' ./src/test/regress/
src/test/regress/expected/partition_prune.out:29
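
For reference, what that test relies on is pruning of a generic plan at
executor startup.  A minimal sketch along the same lines (not the exact
tables from partition_prune.sql) would be:

create table ab (a int not null) partition by list (a);
create table ab_a1 partition of ab for values in (1);
create table ab_a2 partition of ab for values in (2);
create table ab_a3 partition of ab for values in (3);
prepare ab_q1 as select * from ab where a = $1;
set plan_cache_mode = force_generic_plan;
explain (analyze, costs off, summary off, timing off) execute ab_q1 (1);

The Append node in that output should report something like
"Subplans Removed: 2".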

-- 
Justin




Re: Generating "Subplan Removed" in EXPLAIN

2023-01-31 Thread Bruce Momjian
On Tue, Jan 31, 2023 at 08:59:57PM -0500, Bruce Momjian wrote:
> Our document states that EXPLAIN can generate "Subplan Removed":
> 
>   
> https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITION-PRUNING
> 
>   It is possible to determine the number of partitions which were removed
>   during this phase by observing the “Subplans Removed” property in the
>   EXPLAIN output.

Sorry, here is the full paragraph:

During initialization of the query plan. Partition pruning can
be performed here for parameter values which are known during
the initialization phase of execution. Partitions which are
pruned during this stage will not show up in the query's EXPLAIN
or EXPLAIN ANALYZE.  It is possible to determine the number of
partitions which were removed during this phase by observing the
“Subplans Removed” property in the EXPLAIN output.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.




Re: postgres_fdw uninterruptible during connection establishment / ProcSignalBarrier

2023-01-31 Thread Andres Freund
Hi,

On 2023-01-30 12:00:55 -0800, Nathan Bossart wrote:
> On Mon, Jan 30, 2023 at 11:49:37AM -0800, Andres Freund wrote:
> > Why don't the dblink tests catch this?  Any chance you or Robins could 
> > prepare
> > a patch with fix and test, given that you know how to trigger this?
> 
> It's trivially reproducible by calling 1-argument dblink_connect() multiple
> times and then calling dblink_disconnect().  Here's a patch.

Thanks for the quick patch and for the find. Pushed.

Greetings,

Andres




Re: Weird failure with latches in curculio on v15

2023-01-31 Thread Andres Freund
Hi,

On 2023-02-01 10:53:17 +0900, Michael Paquier wrote:
> While browsing the buildfarm, I have noticed this failure on curculio:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=curculio&dt=2023-02-01%2001%3A05%3A17
> 
> The test that has reported a failure is the check on the archive
> module callback:
> #   Failed test 'check shutdown callback of shell archive module'
> #   at t/020_archive_status.pl line 248.
> # Looks like you failed 1 test of 17.
> [02:28:06] t/020_archive_status.pl .. 
> Dubious, test returned 1 (wstat 256, 0x100)
> Failed 1/17 subtests 
> 
> Looking closer, this is a result of an assertion failure in the latch
> code:
> 2023-02-01 02:28:05.615 CET [6961:8] LOG:  received fast shutdown request
> 2023-02-01 02:28:05.615 CET [6961:9] LOG:  aborting any active transactions
> 2023-02-01 02:28:05.616 CET [30681:9] LOG:  process 30681 releasing 
> ProcSignal slot 33, but it contains 0
> TRAP: FailedAssertion("latch->owner_pid == MyProcPid", File: "latch.c", Line: 
> 451, PID: 30681)

Given the ProcSignal LOG message before it, I don't think this is about
latches.


> The information available in standby2.log shows that 30681 is the
> startup process.  I am not sure what all that means, yet.
>
> Thoughts or comments welcome.

Perhaps a wild write overwriting shared memory state?

Greetings,

Andres Freund




Re: Worth using personality(ADDR_NO_RANDOMIZE) for EXEC_BACKEND on linux?

2023-01-31 Thread Thomas Munro
On Wed, Feb 1, 2023 at 2:58 PM Michael Paquier  wrote:
> On Tue, Jan 31, 2023 at 08:37:29PM -0500, Tom Lane wrote:
> > That's aged long enough now that it seems like a pretty safe
> > thing to do.
>
> Thanks.  I'll wait for a few days before doing something for my
> buildfarm stuff, in case somebody thinks this is a bad idea..

+1, go for it.  It shouldn't affect Unix build releases, and on
Windows the function does nothing.




Generating "Subplan Removed" in EXPLAIN

2023-01-31 Thread Bruce Momjian
Our document states that EXPLAIN can generate "Subplan Removed":


https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITION-PRUNING

It is possible to determine the number of partitions which were removed
during this phase by observing the “Subplans Removed” property in the
EXPLAIN output.

However, I can't figure out how to generate that string in EXPLAIN
output.  I tried many examples and searched the web for examples but I
can't generate it in queries using git master.

For example, this website:

https://gist.github.com/amitlan/cd13271142bb2d26ae46b69afb675a31

has several EXPLAIN examples that show "Subplan Removed" but running the
queries in git master doesn't generate it for me.

Does anyone know how to generate this?  Thanks.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.




Re: Worth using personality(ADDR_NO_RANDOMIZE) for EXEC_BACKEND on linux?

2023-01-31 Thread Michael Paquier
On Tue, Jan 31, 2023 at 08:37:29PM -0500, Tom Lane wrote:
> That's aged long enough now that it seems like a pretty safe
> thing to do.

Thanks.  I'll wait for a few days before doing something for my
buildfarm stuff, in case somebody thinks this is a bad idea..
--
Michael


signature.asc
Description: PGP signature


Weird failure with latches in curculio on v15

2023-01-31 Thread Michael Paquier
Hi all,

While browsing the buildfarm, I have noticed this failure on curculio:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=curculio&dt=2023-02-01%2001%3A05%3A17

The test that has reported a failure is the check on the archive
module callback:
#   Failed test 'check shutdown callback of shell archive module'
#   at t/020_archive_status.pl line 248.
# Looks like you failed 1 test of 17.
[02:28:06] t/020_archive_status.pl .. 
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/17 subtests 

Looking closer, this is a result of an assertion failure in the latch
code:
2023-02-01 02:28:05.615 CET [6961:8] LOG:  received fast shutdown request
2023-02-01 02:28:05.615 CET [6961:9] LOG:  aborting any active transactions
2023-02-01 02:28:05.616 CET [30681:9] LOG:  process 30681 releasing ProcSignal 
slot 33, but it contains 0
TRAP: FailedAssertion("latch->owner_pid == MyProcPid", File: "latch.c", Line: 
451, PID: 30681)

The information available in standby2.log shows that 30681 is the
startup process.  I am not sure what all that means, yet.

Thoughts or comments welcome.
--
Michael


signature.asc
Description: PGP signature


Re: pg_upgrade test failure

2023-01-31 Thread Thomas Munro
On Wed, Feb 1, 2023 at 10:08 AM Thomas Munro  wrote:
> On Wed, Feb 1, 2023 at 10:04 AM Andres Freund  wrote:
> > Maybe we should just handle it by sleeping and retrying, if on windows? Sad 
> > to even propose...
>
> Yeah, that's what that code I posted would do automatically, though
> it's a bit hidden.  The second attempt to unlink() would see delete
> already pending, and activate its secret internal sleep/retry loop.

OK, I pushed that.  Third time lucky?

I tracked down the discussion of that existing comment about pg_ctl,
which comes from the 9.2 days:

https://www.postgresql.org/message-id/flat/5044DE59.5020500%40dunslane.net

I guess maybe back then fopen() was Windows' own fopen() that wouldn't
allow two handles to a file at the same time?  These days we redirect
it to a wrapper with the magic "shared" flags, so the kluge installed
by commit f8c81c5dde2 may not even be needed anymore.  It does
demonstrate that there are long standing timing races around log
files, process exit and wait-for-shutdown logic, though.

Someone who develops for Windows could probably chase this right down,
and make sure that we do certain things in the right order, and/or
find better kernel facilities; at a wild guess, something like
OpenProcess() before you initiate shutdown, so you can then wait on
its handle, for example.  The docs for ExitProcess() make it clear
that handles are synchronously closed, so I think it's probably just
that our tests for when processes have exited are too fuzzy.




Re: Worth using personality(ADDR_NO_RANDOMIZE) for EXEC_BACKEND on linux?

2023-01-31 Thread Tom Lane
Michael Paquier  writes:
> Could it be worth back-patching f3e7806?

That's aged long enough now that it seems like a pretty safe
thing to do.

regards, tom lane




Re: Worth using personality(ADDR_NO_RANDOMIZE) for EXEC_BACKEND on linux?

2023-01-31 Thread Michael Paquier
On Tue, Jan 11, 2022 at 12:29:44AM +1300, Thomas Munro wrote:
> Thanks for testing.  Tidied and pushed, to master only for now.

I have noticed the following failure for v11~14 on one of my hosts
that compiles with -DEXEC_BACKEND, and Nathan has redirected me here:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gokiburi&dt=2023-01-31%2012%3A07%3A32
FATAL:  could not reattach to shared memory (key=1050468, addr=0x97eb2000): 
Invalid argument

Could it be worth back-patching f3e7806?  I don't mind changing this
animal setup by switching the kernel configuration or reducing the
branch scope, but this solution is less invasive because it would not
influence parallel runs.

Thoughts?
--
Michael


signature.asc
Description: PGP signature


Re: Add progress reporting to pg_verifybackup

2023-01-31 Thread Michael Paquier
On Fri, Jan 06, 2023 at 04:28:42PM +0900, Masahiko Sawada wrote:
> I've attached the simple patch to add the progress reporting option to
> pg_verifybackup. The progress information is displayed with --progress
> option only during the checksum verification, which is the most time
> consuming task. It cannot be used together with --quiet option.

That looks helpful, particularly when a backup has many relation
files.  Calculating the total size when browsing the file list looks
fine.

+   /* Complain if the specified arguments conflict */
+   if (show_progress && quiet)
+   pg_fatal("cannot specify both --progress and --quiet");

Nothing on HEAD proposes --progress and --quiet at the same time from
what I can see, so just disabling the combination is fine by me.  For
the error message, I would recommend to switch to a more generic
pattern, as of:
"cannot specify both %s and %s", "-P/--progress", "-q/--quiet"

Could you add a check based on command_fails_like() in 004_options.pl,
at least?  A second test to check after the output of --progress would
be a nice bonus, for example by sticking --progress into one of the
existing commands doing a command_like().
--
Michael


signature.asc
Description: PGP signature


Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE

2023-01-31 Thread Matthias van de Meent
On Tue, 31 Jan 2023 at 23:48, Andres Freund  wrote:
>
> Hi,
>
> On 2023-01-31 15:05:17 +0100, Matthias van de Meent wrote:
> > If TransactionIdRetreatSafely will be exposed outside procarray.c,
> > then I think the xid pointer should be replaced with normal
> > arguments/returns; both for parity with TransactionIdRetreatedBy
>
> That's why I named one version *Retreat the other Retreated :)

That part I noticed too :) I don't mind either way, I was just
concerned with exposing the function as a prototype, not as an inline
static.

> I think it'll make the code easier to read in the other places too, the
> variable names / function names in this space are uncomfortably long to
> fit into 78chars..., particularly when there's two references to the
> same variable in the same line.

I guess that's true, and once inlined there should indeed be no extra
runtime overhead.

> 78 chars
Didn't we use 80 columns/chars? How did you get to 78? Not that I
can't think of any ways, but none of them stand out to me as obviously
correct.

> > and to remove this memory store dependency in this hot code path.
>
> I doubt that matters much here and the places it's going to be used
> in.

I thought that this was executed while still in ProcArrayLock, but
instead we've released that lock already by the time we're trying to
adjust the horizons, so the 'hot code path' concern is mostly
relieved.

> And presumably the compiler will inline it anyway. I'd probably make
> it a static inline in the header too.

Yes, my concern was based on an extern prototype with private
implementation, as that does prohibit inlining and thus would have a
requirement to push the data to memory (probably only L1, but still
memory).

> What's making me hesitate about exposing it is that it's quite easy to
> get things wrong by using a wrong fxid or such.

I'm less concerned about that when the function is well-documented.

> > > +/*
> > > + * FIXME, doubtful this is the best fix.
> > > + *
> > > + * Can't represent the 32bit xid as a 64bit xid, as it's before 
> > > fxid
> > > + * 0. Represent it as an xid from the future instead.
> > > + */
> > > +if (epoch == 0)
> > > +return FullTransactionIdFromEpochAndXid(0, xid);
> >
> > Shouldn't this be an error condition instead, as this XID should not
> > be able to appear?
>
> If you mean error in the sense of ERROR, no, I don't think so. That code
> tries hard to be able to check many tuples in a row. And if we were to
> error out here, we'd not be able to do that. We should still report those
> tuples as corrupt, fwiw.
>
> The reason this path is hit is that a test intentionally corrupts some
> xids. So the path is reachable and we need to cope somehow.

I see.

Kind regards,

Matthias van de Meent




Re: MacOS: xsltproc fails with "warning: failed to load external entity"

2023-01-31 Thread Andres Freund
Hi,

On 2023-01-31 18:54:31 -0500, Tom Lane wrote:
> 1. I have not tested the meson changes.

Works here.


> 2. As this is written, you can't override the --nonet options very
> easily in the Makefile build (you could do so at runtime by setting
> XSLTPROC, but not at configure time); and you can't override them at
> all in the meson build.  Given the lack of evidence that it's still
> useful to allow net access, I'm untroubled by that.  I did intentionally
> skip using "override" in the Makefile, though, to allow that case.

I'm not troubled by this either.


I wonder if we should provide a build target to download the stylesheets
ourselves. The number of packages our instructions download is quite
substantial. We could perhaps trim them a bit, but we intentionally are
including things to build pdfs etc as well, which does make sense...


Greetings,

Andres Freund




Re: [PoC] Let libpq reject unexpected authentication requests

2023-01-31 Thread Michael Paquier
On Tue, Jan 31, 2023 at 02:03:54PM +0300, Aleksander Alekseev wrote:
>> What is the status of this patch set?  Michael had registered himself as
>> committer and then removed himself again.  So I hadn't been paying much
>> attention myself.  Was there anything left to discuss?

Yes, sorry about not following up on that.  I was registered as such
for a few weeks, but I have not been able to follow up.  It did not
seem fair for this patch to wait on only me, which is why I have
removed my name, at least temporarily, so that somebody may be able to
come back to it before me.  I am not completely sure whether I will be
able to come back and dive deeply into this thread soon, TBH :/

> Previously I marked the patch as RfC. Although it's been a few months
> ago and I don't recall all the details, it should have been in good
> shape (in my personal opinion at least). The commits a9e9a9f and
> 0873b2d Michael referred to are message refactorings so I doubt Jacob
> had serious problems with them.
> 
> Of course, I'll take another fresh look and let you know my findings in a bit.

(There were a few things around certificate handling that need careful
consideration, at least that was my impression.)
--
Michael


signature.asc
Description: PGP signature


Re: suppressing useless wakeups in logical/worker.c

2023-01-31 Thread Nathan Bossart
On Sat, Jan 28, 2023 at 10:26:25AM +0530, Amit Kapila wrote:
> On Fri, Jan 27, 2023 at 4:07 AM Tom Lane  wrote:
>> Returning to the prior patch ... I don't much care for this:
>>
>> +/* Maybe there will be a free slot in a second... */
>> +retry_time = TimestampTzPlusSeconds(now, 1);
>> +LogRepWorkerUpdateSyncStartWakeup(retry_time);
>>
>> We're not moving the goalposts very far on unnecessary wakeups if
>> we have to do that.  Do we need to get a wakeup on sync slot free?
>> Although having to send that to every worker seems ugly.  Maybe this
>> is being done in the wrong place and we need to find a way to get
>> the launcher to handle it.

It might be feasible to set up a before_shmem_exit() callback that wakes up
the apply worker (like is already done for the launcher).  I think the
apply worker is ordinarily notified via the tablesync worker's notify_pid,
but AFAICT there's no guarantee that the apply worker hasn't restarted with
a different PID.

> + /*
> + * Since process_syncing_tables() is called conditionally, the
> + * tablesync worker start wakeup time might be in the past, and we
> + * can't know for sure when it will be updated again.  Rather than
> + * spinning in a tight loop in this case, bump this wakeup time by
> + * a second.
> + */
> + now = GetCurrentTimestamp();
> + if (wakeup[LRW_WAKEUP_SYNC_START] < now)
> + wakeup[LRW_WAKEUP_SYNC_START] =
> TimestampTzPlusSeconds(wakeup[LRW_WAKEUP_SYNC_START], 1);
> 
> Do we see unnecessary wakeups without this, or delay in sync?

I haven't looked too closely to see whether busy loops are likely in
practice.

> BTW, do we need to do something about wakeups in
> wait_for_relation_state_change()?

... and wait_for_worker_state_change(), and copy_read_data().  From a quick
glance, it looks like fixing these would be a more invasive change.  TBH
I'm beginning to wonder whether all this is really worth it to prevent
waking up once per second.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Non-superuser subscription owners

2023-01-31 Thread Andres Freund
Hi,

On 2023-01-30 10:44:29 -0500, Robert Haas wrote:
> On a technical level, I think that the idea of having a separate
> objection for the connection string vs. the subscription itself is
> perfectly sound, and to repeat what I said earlier, if someone wants
> to implement that, cool. I also agree that it has the advantage that
> you specify, namely, that someone can have rights to modify one of
> those objects but not the other. What that lets you do is define a
> short list of known systems and say, hey, you can replicate whatever
> tables you want with whatever options you want, but only between these
> systems. I'm not quite sure what problem that solves, though.

That does seem somewhat useful, but also fairly limited, at least as
long as it's really just a single connection, rather than a "pattern" of
safe connections.


> Unfortunately, I have even less of an idea about what the run-as
> concept is supposed to accomplish. I mean, at one level, I see it
> quite clearly: the user creating the subscription wants replication to
> have restricted privileges when it's running, and so they make the
> run-as user some role with fewer privileges than their own. Brilliant.
> But then I get stuck: against what kind of attack does that actually
> protect us? If I'm a high privilege user, perhaps even a superuser,
> and it's not safe to have logical replication running as me, then it
> seems like the security model of logical replication is fundamentally
> busted and we need to fix that.

I don't really understand that - the run-as approach seems like a
necessary piece of improving the security model.

I think it's perfectly reasonable to want to replicate from one system
into another, but to not want to allow logical replication to insert into
pg_class or whatnot. So not using superuser to execute the replication
makes sense.

This is particularly the case if you're just replicating a small part of
the tables from one system to another. E.g. in a sharded setup, you may
want to replicate metadata to servers.

Even if all the systems are operated by people you trust (including
possibly even yourself, if you want to go that far), you may want to
reduce the blast radius of privilege escalation, or even just bugs, to a
smaller amount of data.


I think we'll need two things to improve upon the current situation:

1) run-as user, to reduce the scope of potential danger

2) Option to run the database inserts as the owner of the table, with a
   check that the run-as is actually allowed to perform work as the
   owning role. That prevents the table owner (who could add
   default expressions etc) from getting the privs of the
   run-as/replication owner.


I think it makes sense for 1) to be a fairly privileged user, but I
think it's good practice for that user to not be allowed to change the
system configuration etc.


> It can't be right to say that if you have 263 users in a database and
> you want to replicate the whole database to some other node, you need
> 263 different subscriptions with a different run-as user for each. You
> need to be able to run all of that logical replication as the
> superuser or some other high-privilege user and not end up with a
> security compromise.

I'm not quite following along here - are you thinking of 263 tables
owned by 263 users? If yes, that's why I am thinking that we need the
option to perform each table modification as the owner of that table
(with the same security restrictions we use for REINDEX etc).


> And if we suppose that that already works and is safe, well then
> what's the case where I do need a run-as user?

It's not at all safe today, IMO. You need to trust that nothing bad will
be replicated, otherwise the owner of the subscription has to be
considered compromised.

Greetings,

Andres Freund




Re: [PoC] Let libpq reject unexpected authentication requests

2023-01-31 Thread Jacob Champion
On Tue, Jan 31, 2023 at 5:20 AM Aleksander Alekseev
 wrote:
> To my knowledge there are no open questions left. I think the
> patch is as good as it will ever get.

A committer will need to decide whether they're willing to maintain
0003 or not, as mentioned with the v11 post. Which I suppose is the
last open question, but not one I can answer from here.

Thanks!
--Jacob




Re: MacOS: xsltproc fails with "warning: failed to load external entity"

2023-01-31 Thread Tom Lane
I wrote:
> It's worse than that: I find that
>   export XML_CATALOG_FILES=/dev/null
> breaks the docs build on RHEL8 and Fedora 37 (latest) too, with the
> same "failed to load external entity" symptom.  I conclude from this
> that there is no version of xsltproc anywhere that can still download
> the required files automatically.  So we need to take out the advice
> that says you can rely on auto-download for everybody, not just macOS.

> If this is indeed the case, perhaps we ought to start inserting --nonet
> into the invocations.  There's not much use in allowing these tools to
> perform internet access when the best-case scenario is that they fail.

Concretely, I'm thinking something like the attached.  Notes:

1. I have not tested the meson changes.

2. As this is written, you can't override the --nonet options very
easily in the Makefile build (you could do so at runtime by setting
XSLTPROC, but not at configure time); and you can't override them at
all in the meson build.  Given the lack of evidence that it's still
useful to allow net access, I'm untroubled by that.  I did intentionally
skip using "override" in the Makefile, though, to allow that case.

3. For consistency with the directions for other platforms, I made
the package lists for macOS just mention libxslt.  That should
be enough to pull in libxml2 as well.

4. Use of --nonet changes the error message you get if xsltproc
can't find the DTDs.  I copied the error I get from MacPorts'
version of xsltproc, but can you confirm it's the same on Homebrew?

regards, tom lane

diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile
index 4f0e39223c..b96c7cbf22 100644
--- a/doc/src/sgml/Makefile
+++ b/doc/src/sgml/Makefile
@@ -41,11 +41,15 @@ endif
 
 XMLINCLUDE = --path .
 
-ifndef XMLLINT
+ifdef XMLLINT
+XMLLINT := $(XMLLINT) --nonet
+else
 XMLLINT = $(missing) xmllint
 endif
 
-ifndef XSLTPROC
+ifdef XSLTPROC
+XSLTPROC := $(XSLTPROC) --nonet
+else
 XSLTPROC = $(missing) xsltproc
 endif
 
diff --git a/doc/src/sgml/docguide.sgml b/doc/src/sgml/docguide.sgml
index 787caef70d..cf9d4d4389 100644
--- a/doc/src/sgml/docguide.sgml
+++ b/doc/src/sgml/docguide.sgml
@@ -151,18 +151,6 @@
here.
   
 
-  
-   You can get away with not installing DocBook XML and the DocBook XSLT
-   stylesheets locally, because the required files will be downloaded from the
-   Internet and cached locally.  This may in fact be the preferred solution if
-   your operating system packages provide only an old version of these files,
-   or if no packages are available at all.
-   If you want to prevent any attempt to access the Internet while building
-   the documentation, you need to pass the --nonet option
-   to xmllint and xsltproc; see below
-   for an example.
-  
-
   
Installation on Fedora, RHEL, and Derivatives
 
@@ -207,22 +195,38 @@ apt-get install docbook-xml docbook-xsl fop libxml2-utils xsltproc
   
macOS
 
-   
-On macOS, you can build the HTML and man documentation without installing
-anything extra.  If you want to build PDFs or want to install a local copy
-of DocBook, you can get those from your preferred package manager.
-   
-

 If you use MacPorts, the following will get you set up:
 
-sudo port install docbook-xml-4.5 docbook-xsl fop
+sudo port install docbook-xml docbook-xsl-nons fop libxslt
 
 If you use Homebrew, use this:
 
-brew install docbook docbook-xsl fop
+brew install docbook docbook-xsl fop libxslt
 

+
+   
+The Homebrew-supplied programs require the following environment variable
+to be set:
+
+export XML_CATALOG_FILES=/usr/local/etc/xml/catalog
+
+Without it, xsltproc will throw errors like this:
+
+I/O error : Attempt to load network entity http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd
+postgres.sgml:21: warning: failed to load external entity "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
+...
+
+   
+
+   
+While it is possible to use the Apple-provided versions
+of xmllint and xsltproc
+instead of those from MacPorts or Homebrew, you'll still need
+to install the DocBook DTD and stylesheets, and set up a catalog
+file that points to them.
+   
   
 
   
@@ -253,12 +257,6 @@ checking for dbtoepub... dbtoepub
these programs, for example
 
 ./configure ... XMLLINT=/opt/local/bin/xmllint ...
-
-   Also, if you want to ensure that xmllint
-   and xsltproc will not perform any network access,
-   you can do something like
-
-./configure ... XMLLINT="xmllint --nonet" XSLTPROC="xsltproc --nonet" ...
 
   
   
diff --git a/doc/src/sgml/images/Makefile b/doc/src/sgml/images/Makefile
index f9e356348b..645519095d 100644
--- a/doc/src/sgml/images/Makefile
+++ b/doc/src/sgml/images/Makefile
@@ -9,7 +9,7 @@ ALL_IMAGES = \
 
 DITAA = ditaa
 DOT = dot
-XSLTPROC = xsltproc
+XSLTPROC = xsltproc --nonet
 
 all: $(ALL_IMAGES)
 
diff --git a/doc/src/sgml/meson.build b/doc/src/sgml/meson.build
index b9f4c6a05b..524a4e00e4 100644
--- 

Re: Rework of collation code, extensibility

2023-01-31 Thread Jeff Davis
On Tue, 2023-01-31 at 11:40 +0100, Peter Eisentraut wrote:
> I don't know to what extent this depends on the abbreviated key GUC 
> discussion.  Does the rest of this patch set depend on this?

The overall refactoring is not dependent logically on the GUC patch. It
may require some trivial fixup if you eliminate the GUC patch.

I left it there because it makes exploring/testing easier (at least for
me), but the GUC patch doesn't need to be committed if there's no
consensus.

Regards,
Jeff Davis






Re: recovery modules

2023-01-31 Thread Nathan Bossart
On Tue, Jan 31, 2023 at 08:13:11AM +0900, Michael Paquier wrote:
> On Mon, Jan 30, 2023 at 12:04:22PM -0800, Nathan Bossart wrote:
>> On Mon, Jan 30, 2023 at 11:48:10AM -0800, Andres Freund wrote:
>>> I don't think _PG_archive_module_init() should actually allocate a memory
>>> context and do other similar initializations. Instead it should just return
>>> 'const ArchiveModuleCallbacks*', typically a single line.
>>> 
>>> Allocations etc should happen in one of the callbacks. That way we can
>>> actually have multiple instances of a module.
>> 
>> I think we'd need to invent a startup callback for archive modules for this
>> to work, but that's easy enough.
> 
> If you don't return some (void *) pointing to a private area that
> would be stored by the backend, allocated as part of the loading path,
> I agree that an extra callback is what makes the most sense,
> presumably called around the beginning of PgArchiverMain().  Doing
> this kind of one-time action in the file callback woud be weird..

Okay, here is a new patch set with the aforementioned adjustments and
documentation updates.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From f07a2c65439ed1f1b38741f8796563e62adef6bb Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Fri, 27 Jan 2023 21:01:22 -0800
Subject: [PATCH v2 1/3] s/ArchiveContext/ArchiveCallbacks

---
 src/backend/postmaster/pgarch.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 8ecdb9ca23..36800127e8 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -97,7 +97,7 @@ char	   *XLogArchiveLibrary = "";
  */
 static time_t last_sigterm_time = 0;
 static PgArchData *PgArch = NULL;
-static ArchiveModuleCallbacks ArchiveContext;
+static ArchiveModuleCallbacks ArchiveCallbacks;
 
 
 /*
@@ -416,8 +416,8 @@ pgarch_ArchiverCopyLoop(void)
 			HandlePgArchInterrupts();
 
 			/* can't do anything if not configured ... */
-			if (ArchiveContext.check_configured_cb != NULL &&
-!ArchiveContext.check_configured_cb())
+			if (ArchiveCallbacks.check_configured_cb != NULL &&
+!ArchiveCallbacks.check_configured_cb())
 			{
 ereport(WARNING,
 		(errmsg("archive_mode enabled, yet archiving is not configured")));
@@ -518,7 +518,7 @@ pgarch_archiveXlog(char *xlog)
 	snprintf(activitymsg, sizeof(activitymsg), "archiving %s", xlog);
 	set_ps_display(activitymsg);
 
-	ret = ArchiveContext.archive_file_cb(xlog, pathname);
+	ret = ArchiveCallbacks.archive_file_cb(xlog, pathname);
 	if (ret)
 		snprintf(activitymsg, sizeof(activitymsg), "last was %s", xlog);
 	else
@@ -824,7 +824,7 @@ HandlePgArchInterrupts(void)
 /*
  * LoadArchiveLibrary
  *
- * Loads the archiving callbacks into our local ArchiveContext.
+ * Loads the archiving callbacks into our local ArchiveCallbacks.
  */
 static void
 LoadArchiveLibrary(void)
@@ -837,7 +837,7 @@ LoadArchiveLibrary(void)
  errmsg("both archive_command and archive_library set"),
  errdetail("Only one of archive_command, archive_library may be set.")));
 
-	memset(&ArchiveContext, 0, sizeof(ArchiveModuleCallbacks));
+	memset(&ArchiveCallbacks, 0, sizeof(ArchiveModuleCallbacks));
 
 	/*
 	 * If shell archiving is enabled, use our special initialization function.
@@ -854,9 +854,9 @@ LoadArchiveLibrary(void)
 		ereport(ERROR,
 (errmsg("archive modules have to define the symbol %s", "_PG_archive_module_init")));
 
-	(*archive_init) (&ArchiveContext);
+	(*archive_init) (&ArchiveCallbacks);
 
-	if (ArchiveContext.archive_file_cb == NULL)
+	if (ArchiveCallbacks.archive_file_cb == NULL)
 		ereport(ERROR,
 (errmsg("archive modules must register an archive callback")));
 
@@ -869,6 +869,6 @@ LoadArchiveLibrary(void)
 static void
 pgarch_call_module_shutdown_cb(int code, Datum arg)
 {
-	if (ArchiveContext.shutdown_cb != NULL)
-		ArchiveContext.shutdown_cb();
+	if (ArchiveCallbacks.shutdown_cb != NULL)
+		ArchiveCallbacks.shutdown_cb();
 }
-- 
2.25.1

>From ea03c90a99ed07a4df8537dfa4916d7858decef9 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Fri, 27 Jan 2023 21:16:34 -0800
Subject: [PATCH v2 2/3] move archive module exports to dedicated header

---
 contrib/basic_archive/basic_archive.c   |  2 +-
 src/backend/postmaster/pgarch.c |  1 +
 src/backend/postmaster/shell_archive.c  |  2 +-
 src/backend/utils/misc/guc_tables.c |  1 +
 src/include/postmaster/archive_module.h | 54 +
 src/include/postmaster/pgarch.h | 39 --
 6 files changed, 58 insertions(+), 41 deletions(-)
 create mode 100644 src/include/postmaster/archive_module.h

diff --git a/contrib/basic_archive/basic_archive.c b/contrib/basic_archive/basic_archive.c
index 3d29711a31..87bbb2174d 100644
--- a/contrib/basic_archive/basic_archive.c
+++ b/contrib/basic_archive/basic_archive.c
@@ -32,7 +32,7 @@
 
 #include "common/int.h"
 #include "miscadmin.h"
-#include "postmaster/pgarch.h"
+#include "postmaster/archive_module.h"
 #include 

Re: Show various offset arrays for heap WAL records

2023-01-31 Thread Peter Geoghegan
On Tue, Jan 31, 2023 at 1:52 PM Peter Geoghegan  wrote:
> Obviously what you're doing here will lead to a significant increase
> in the verbosity of the output for affected WAL records. I don't feel
> too bad about that, though. It's really an existing problem, and one
> that should be fixed either way. You kind of have to deal with this
> already, by having a good psql pager, since record types such as
> COMMIT_PREPARED, INVALIDATIONS, and RUNNING_XACTS are already very
> verbose in roughly the same way. You only need to have one of these
> record types output by a function like pg_get_wal_records_info() to
> get absurdly wide output -- it hardly matters that most individual WAL
> record types have terse output at that point.

Actually the really wide output comes from COMMIT records. After I run
the regression tests, and execute some of my own custom pg_walinspect
queries, I see that some individual COMMIT records have a
length(description) of over 10,000 bytes/characters. There is even one
particular COMMIT record whose length(description) is about 46,000
bytes/characters. So *ludicrously* verbose GetRmgr() strings are not
uncommon today. The worst case (or even particularly bad cases) won't
be made any worse by this patch, because there are obviously limits on
the width of the arrays that it outputs detailed descriptions of, limits
that don't apply to these COMMIT records.
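
For reference, a query along these lines reproduces that kind of
measurement (a sketch only; adjust the LSN range to whatever WAL is
actually present, column names as in the PG 15 pg_walinspect docs):

SELECT record_type, max(length(description)) AS max_desc_len
FROM pg_get_wal_records_info('0/1000000', pg_current_wal_lsn())
GROUP BY record_type
ORDER BY max_desc_len DESC
LIMIT 5;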

-- 
Peter Geoghegan




Re: Logical replication timeout problem

2023-01-31 Thread Peter Smith
Here are my review comments for v13-1.

==
Commit message

1.
The DDLs like Refresh Materialized views that generate lots of temporary
data due to rewrite rules may not be processed by output plugins (for
example pgoutput). So, we won't send keep-alive messages for a long time
while processing such commands and that can lead the subscriber side to
timeout.

~

SUGGESTION (minor rearranged way to say the same thing)

For DDLs that generate lots of temporary data due to rewrite rules
(e.g. REFRESH MATERIALIZED VIEW), the data may not be processed by the
output plugins (e.g. pgoutput) for a long time. Since we don't send
keep-alive messages while processing such commands, that can lead the
subscriber side to timeout.

~~~

2.
The commit message says what the problem is, but it doesn’t seem to
describe what this patch does to fix the problem.

==
src/backend/replication/logical/reorderbuffer.c

3.
+ /*
+ * It is possible that the data is not sent to downstream for a
+ * long time either because the output plugin filtered it or there
+ * is a DDL that generates a lot of data that is not processed by
+ * the plugin. So, in such cases, the downstream can timeout. To
+ * avoid that we try to send a keepalive message if required.
+ * Trying to send a keepalive message after every change has some
+ * overhead, but testing showed there is no noticeable overhead if
+ * we do it after every ~100 changes.
+ */


3a.
"data is not sent to downstream" --> "data is not sent downstream" (?)

~

3b.
"So, in such cases," --> "In such cases,"

~~~

4.
+#define CHANGES_THRESHOLD 100
+
+ if (++changes_count >= CHANGES_THRESHOLD)
+ {
+ rb->update_progress_txn(rb, txn, change->lsn);
+ changes_count = 0;
+ }

I was wondering if it would have been simpler to write this code like below.

Also, by doing it this way the 'changes_count' variable name makes
more sense IMO, otherwise (for current code) maybe it should be called
something like 'changes_since_last_keepalive'

SUGGESTION
if (++changes_count % CHANGES_THRESHOLD == 0)
rb->update_progress_txn(rb, txn, change->lsn);


--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE

2023-01-31 Thread Andres Freund
Hi,

On 2023-01-31 15:05:17 +0100, Matthias van de Meent wrote:
> On Mon, 30 Jan 2023 at 21:19, Andres Freund  wrote:
> > In an earlier, not posted, version I had an vacuum_defer_cleanup_age 
> > specific
> > helper function for this, but it seems likely we'll need it in other places
> > too.  So I named it TransactionIdRetreatSafely(). I made it accept the xid 
> > by
> > pointer, as the line lengths / repetition otherwise end up making it hard to
> > read the code.  For now I have TransactionIdRetreatSafely() be private to
> > procarray.c, but I expect we'll have to change that eventually.
> 
> If TransactionIdRetreatSafely will be exposed outside procarray.c,
> then I think the xid pointer should be replaced with normal
> arguments/returns; both for parity with TransactionIdRetreatedBy

That's why I named one version *Retreat the other Retreated :)

I think it'll make the code easier to read in the other places too, the
variable names / function names in this space are uncomfortably long to
fit into 78chars..., particularly when there's two references to the
same variable in the same line.


> and to remove this memory store dependency in this hot code path.

I doubt that matters much here and the places it's going to be used
in. And presumably the compiler will inline it anyway. I'd probably make
it a static inline in the header too.

What's making me hesitate about exposing it is that it's quite easy to
get things wrong by using a wrong fxid or such.


> > +/*
> > + * FIXME, doubtful this is the best fix.
> > + *
> > + * Can't represent the 32bit xid as a 64bit xid, as it's before 
> > fxid
> > + * 0. Represent it as an xid from the future instead.
> > + */
> > +if (epoch == 0)
> > +return FullTransactionIdFromEpochAndXid(0, xid);
> 
> Shouldn't this be an error condition instead, as this XID should not
> be able to appear?

If you mean error in the sense of ERROR, no, I don't think so. That code
tries hard to be able to check many tuples in a row. And if we were to
error out here, we'd not be able to do that. We should still report those
tuples as corrupt, fwiw.

The reason this path is hit is that a test intentionally corrupts some
xids. So the path is reachable and we need to cope somehow.

I'm not really satisfied with this fix either - I mostly wanted to
include something sufficient to prevent assertion failures.

I had hoped that Mark would look at the amcheck bits and come up with
more complete fixes.


> on 0004:
> 
> > -   '0x'::xid8,
> > -   '-1'::xid8;
> > +   '0xefff'::xid8,
> > +   '0'::xid8;
> 
> The 0xFF... usages were replaced with "0xEFFF...". Shouldn't we also
> test on 0x_fffE__ to test for input of our actual max
> value?

Probably a good idea.


> > @@ -326,7 +329,11 @@ parse_snapshot(const char *str, Node *escontext)
> > while (*str != '\0')
> > {
> > /* read next value */
> > -val = FullTransactionIdFromU64(strtou64(str, , 10));
> > +raw_fxid = strtou64(str, , 10);
> > +
> > +val = FullTransactionIdFromU64(raw_fxid);
> > +if (!InFullTransactionIdRange(raw_fxid))
> > +goto bad_format;
> 
> With assertions enabled FullTransactionIdFromU64 will assert the
> InFullTransactionIdRange condition, meaning we wouldn't hit the branch
> into bad_format.
> I think these operations should be swapped, as parsing a snapshot
> shouldn't run into assertion failures like this if it can error
> normally.

Yep.


> Maybe this can be added to tests as well?

I'll check. I thought for a bit it'd not work because we'd perform range
checks on the xids, but we don't...

Greetings,

Andres Freund




Re: Show various offset arrays for heap WAL records

2023-01-31 Thread Peter Geoghegan
On Tue, Jan 31, 2023 at 1:52 PM Peter Geoghegan  wrote:
> > I would also like to see functions like XLogRecGetBlockRefInfo() pass
> > something more useful than a stringinfo buffer so that we could easily
> > extract out the relfilenode in pgwalinspect.
>
> That does seem particularly important. It's a pain to do this from
> SQL. In general I'm okay with focussing on pg_walinspect over
> pg_waldump, since it'll become more important over time. Obviously
> pg_waldump needs to still work, but I think it's okay to care less
> about pg_waldump usability.

I just realized why you mentioned XLogRecGetBlockRefInfo() -- it
probably shouldn't even be used by pg_walinspect at all (just by
pg_waldump). Using something like XLogRecGetBlockRefInfo() within
pg_walinspect misses out on the opportunity to output information in a
more descriptive tuple format, with real data types. It's not just the
relfilenode, either -- it's the block numbers themselves. And the fork
number.

In other words, I suspect that this is out of scope for this patch,
strictly speaking. We simply shouldn't be using
XLogRecGetBlockRefInfo() in pg_walinspect in the first place. Rather,
pg_walinspect should be calling some other function that ultimately
allows the user to work with (say) an array of int8 from SQL for the
block numbers. There is no great reason not to, AFAICT, since this
information is completely generic -- it's not like the rmgr-specific
output from GetRmgr(), where fine grained type information is just a
nice-to-have, with usability issues of its own (on account of the
details being record type specific).

I've been managing this problem within my own custom pg_walinspect
queries by using my own custom ICU collation. I use ICU's natural sort
order to order based on block_ref, or based on a substring()
expression that extracts something interesting from block_ref, such as
relfilenode. You can create a custom collation for this like so, per
the docs:

CREATE COLLATION IF NOT EXISTS numeric (provider = icu, locale =
'en-u-kn-true');

Obviously this hack of mine works, but hardly anybody else would be
willing to take the time to figure something like this out. Plus it's
error prone when it doesn't really have to be. And it suggests that
the block_ref field isn't record type generic -- that's sort of
misleading IMV.
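
For illustration, the kind of query where that collation helps looks
roughly like this (a sketch only; the LSN range needs adjusting, and
block_ref is the text column returned by pg_walinspect's
pg_get_wal_records_info()):

SELECT start_lsn, record_type, block_ref
FROM pg_get_wal_records_info('0/1000000', pg_current_wal_lsn())
ORDER BY block_ref COLLATE numeric, start_lsn;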

-- 
Peter Geoghegan




Re: [PATCH] Fix old thinko in formula to compute sweight in numeric_sqrt().

2023-01-31 Thread Joel Jacobson
On Tue, Jan 31, 2023, at 20:25, Dean Rasheed wrote:
> That seems a bit wordy, given the context of this comment. I think
> it's sufficient to just give the formula, and note that it simplifies
> when DEC_DIGITS is even (not just 4):
>
> /*
>  * Assume the input was normalized, so arg.weight is accurate.  The result
>  * then has at least sweight = floor(arg.weight * DEC_DIGITS / 2 + 1)
>  * digits before the decimal point.  When DEC_DIGITS is even, we can save
>  * a few cycles, since the division is exact and there is no need to
>  *  round down.
>  */
> #if DEC_DIGITS == ((DEC_DIGITS / 2) * 2)
> sweight = arg.weight * DEC_DIGITS / 2 + 1;
> #else
> if (arg.weight >= 0)
> sweight = arg.weight * DEC_DIGITS / 2 + 1;
> else
> sweight = 1 - (1 - arg.weight * DEC_DIGITS) / 2;
> #endif

Nice, you managed to simplify it even further.
I think the comment and the code now are crystal clear together.
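
To spell out why the negative branch is needed (a worked example, not
from the patch itself: DEC_DIGITS = 1, arg.weight = -3):

  exact value:     floor(-3 * 1 / 2 + 1) = floor(-0.5) = -1
  naive C version: (-3 * 1) / 2 + 1 = -1 + 1 = 0   (truncates toward zero, off by one)
  negative branch: 1 - (1 - (-3) * 1) / 2 = 1 - 4 / 2 = -1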

I've tested it successfully, test report attached. In summary:
DEC_DIGITS=1 now produces 16 sig. figs. in the range sqrt(2e-31) .. sqrt(2e+32),
which before had a mix of 17 and 18 sig. figs in the result.
DEC_DIGITS=2 now produces 16 sig. figs. in the range sqrt(2e-31) .. sqrt(2e+31),
which before always had 17 sig. figs in the result.
DEC_DIGITS=4 is unchanged.

Exact tested patch attached, code copy/pasted verbatim from your email.

Test 

fix-sweight-v4.patch
Description: Binary data
--
-- Comparison of sig_figs(sqrt(n)) for HEAD vs fix-sweight-v4
-- when DEC_DIGITS == 1
--
 DEC_DIGITS |  n   | HEAD | fix-sweight-v4 | diff
+--+--++--
  1 | 2e-100   |   51 | 51 |0
  1 | 2e-99|   50 | 50 |0
  1 | 2e-98|   50 | 50 |0
  1 | 2e-97|   49 | 49 |0
  1 | 2e-96|   49 | 49 |0
  1 | 2e-95|   48 | 48 |0
  1 | 2e-94|   48 | 48 |0
  1 | 2e-93|   47 | 47 |0
  1 | 2e-92|   47 | 47 |0
  1 | 2e-91|   46 | 46 |0
  1 | 2e-90|   46 | 46 |0
  1 | 2e-89|   45 | 45 |0
  1 | 2e-88|   45 | 45 |0
  1 | 2e-87|   44 | 44 |0
  1 | 2e-86|   44 | 44 |0
  1 | 2e-85|   43 | 43 |0
  1 | 2e-84|   43 | 43 |0
  1 | 2e-83|   42 | 42 |0
  1 | 2e-82|   42 | 42 |0
  1 | 2e-81|   41 | 41 |0
  1 | 2e-80|   41 | 41 |0
  1 | 2e-79|   40 | 40 |0
  1 | 2e-78|   40 | 40 |0
  1 | 2e-77|   39 | 39 |0
  1 | 2e-76|   39 | 39 |0
  1 | 2e-75|   38 | 38 |0
  1 | 2e-74|   38 | 38 |0
  1 | 2e-73|   37 | 37 |0
  1 | 2e-72|   37 | 37 |0
  1 | 2e-71|   36 | 36 |0
  1 | 2e-70|   36 | 36 |0
  1 | 2e-69|   35 | 35 |0
  1 | 2e-68|   35 | 35 |0
  1 | 2e-67|   34 | 34 |0
  1 | 2e-66|   34 | 34 |0
  1 | 2e-65|   33 | 33 |0
  1 | 2e-64|   33 | 33 |0
  1 | 2e-63|   32 | 32 |0
  1 | 2e-62|   32 | 32 |0
  1 | 2e-61|   31 | 31 |0
  1 | 2e-60|   31 | 31 |0
  1 | 2e-59|   30 | 30 |0
  1 | 2e-58|   30 | 30 |0
  1 | 2e-57|   29 | 29 |0
  1 | 2e-56|   29 | 29 |0
  1 | 2e-55|   28 | 28 |0
  1 | 2e-54|   28 | 28 |0
  1 | 2e-53|   27 | 27 |0
  1 | 2e-52|   27 | 27 |0
  1 | 2e-51|   26 | 26 |0
  1 | 2e-50|   26 | 26 |0
  1 | 2e-49|   25 | 25 |0
  1 | 2e-48|   25 | 25 |0
  1 | 2e-47|   24 | 24 |0
  1 | 2e-46|   24 | 24 |0
  1 | 2e-45|   23 | 23 |0
  1 | 2e-44|   23 | 23 |0
  1 | 2e-43|   

Re: heapgettup() with NoMovementScanDirection unused in core?

2023-01-31 Thread David Rowley
On Wed, 1 Feb 2023 at 03:02, Melanie Plageman  wrote:
>
> On Tue, Jan 31, 2023 at 11:46:05PM +1300, David Rowley wrote:
> > Both can be easily fixed, so no need to submit another patch as far as
> > I'm concerned.
>
> I realized I forgot a commit message in the second version. Patch v1 has
> one.

I made a couple of other adjustments to the Asserts and comments and
pushed the result.

On further looking, I felt the Assert was better off done in
create_indexscan_plan() rather than make_index[only]scan(). I also put
the asserts in tableam.h and removed the heapam.c ones.  The rest was
just comment adjustments.

David




Re: Show various offset arrays for heap WAL records

2023-01-31 Thread Peter Geoghegan
On Fri, Jan 27, 2023 at 9:24 AM Melanie Plageman
 wrote:
> I have taken a stab at doing some of the tasks listed in this email.

Cool.

> I have made the new files rmgr_utils.c/h.
>
> I have come up with a standard format that I like for the output and
> used it in all the heap record types.
>
> Examples below:

That seems like a reasonable approach.

> I started documenting it in the rmgr_utils.h header file in a comment,
> however it may be worth a README?
>
> I haven't polished this description of the format (or added examples,
> etc) or used it in the btree-related functions because I assume the
> format and helper function API will need more discussion.

I think that standardization is good, but ISTM that we need clarity on
what the scope is -- what is *not* being standardized? It may or may
not be useful to call the end result an API. Or it may not make sense
to do so in the first committed version, even though we may ultimately
end up as something that deserves to be called an API. The obligation
to not break tools that are scraping the output in whatever way seems
kind of onerous right now -- just not having any gratuitous
inconsistencies (e.g., fixing totally inconsistent punctuation, making
the names for fields across WAL records consistent when they serve
exactly the same purpose) would be a big improvement.

As I mentioned in passing already, I actually don't think that the
B-Tree WAL records are all that special, as far as this stuff goes.
For example, the DELETE Btree record type is very similar to heapam's
PRUNE record type, and practically identical to Btree's VACUUM record
type. All of these record types use the same basic conventions, like a
snapshotConflictHorizon field for recovery conflicts (which is
generated in a very similar way during original execution, and
processed in precisely the same way during REDO), and arrays of page
offset numbers sorted in ascending order.

There are some remaining details where things from an index AM WAL
record aren't directly analogous (or pretty much identical) to some
other heapam WAL records, such as the way that the DELETE Btree record
type deals with deleting a subset of TIDs from a posting list index
tuple (generated by B-Tree deduplication). But even these exceptions
don't require all that much discussion. You could either choose to
only display the array of deleted index tuple page offset numbers, as
well as the similar array of "updated" index tuple page offset numbers
from xl_btree_delete, in which case you just display two arrays of
page offset numbers, in the same standard way. You may or may not want
to also show each individual xl_btree_update entry -- doing so would
be kinda like showing the details of individual freeze plans, except
that you'd probably display something very similar to the page offset
number display here too (even though these aren't page offset numbers,
they're 0-based offsets into the posting list's item pointer data
array).

BTW, there is also a tendency for non-btree index AM WAL records to be
fairly similar or even near-identical to the B-Tree WAL records. While
Hash indexes are very different to B-Tree indexes at a high level, it
is nevertheless the case that xl_hash_vacuum_one_page is directly
based on xl_btree_delete/xl_btree_vacuum, and that xl_hash_insert is
directly based on xl_btree_insert. There are some other WAL record
types that are completely different across hash and B-Tree, which is a
reflection of the fact that the index grows using a totally different
approach in each AM -- but that doesn't seem like something that
throws up any roadblocks for you (these can all be displayed as simple
structs anyway).

Speaking with my B-Tree hat on, I'd just be happy to be able to see
both of the page offset number arrays (the deleted + updated offset
number arrays from xl_btree_delete/xl_btree_vacuum), without also
being able to see output for each individual xl_btree_update
item-pointer-array-offset array -- just seeing that much is already a
huge improvement. That's why I'm a bit hesitant to use the term API
just yet, because an obligation to be consistent in whatever way seems
like it might block incremental progress.

> Perhaps there should also be example output of the offset arrays in
> pgwalinspect docs?

That would definitely make sense.

> I've changed the array format helper functions that Andres added to be a
> single function with an additional layer of indirection so that any
> record with an array can use it regardless of type and format of the
> individual elements. The signature is based somewhat off of qsort_r()
> and allows the user to pass a function with the the desired format of
> the elements.

That's handy.

> Personally, I like having the infomasks for the freeze plans. If we
> someday have a more structured input to rmgr_desc, we could then easily
> have them in their own column and use functions like
> heap_tuple_infomask_flags() on them.

I agree, in general, though long term the 

Re: pg_upgrade test failure

2023-01-31 Thread Thomas Munro
On Wed, Feb 1, 2023 at 9:54 AM Thomas Munro  wrote:
> ... I have one more idea ...

I also had a second idea, barely good enough to mention and probably
just paranoia.  In a nearby thread I learned that process exit does
not release Windows advisory file locks synchronously, which surprised
this Unix hacker; it made me wonder what else might be released lazily
after process exit.  Handles?!  However, as previously mentioned, it's
possible that even with fully Unix-like resource cleanup on process
exit, we could be confused if we are using "the process that was on
the end of this pipe has closed it" as a proxy for "the process is
gone, *all* its handles are closed".  In any case, the previous kluge
should help wallpaper over any of that too, for this test anyway.




Re: pg_upgrade test failure

2023-01-31 Thread Thomas Munro
On Wed, Feb 1, 2023 at 10:04 AM Andres Freund  wrote:
> On January 31, 2023 12:54:42 PM PST, Thomas Munro  
> wrote:
> >I'm not sure about anything, but if that's what's happening here, then
> >maybe the attached would help.  In short, it would make the previous
> >theory true (the idea of a second unlink() saving the day).
>
> Maybe we should just handle it by sleeping and retrying, if on windows? Sad 
> to even propose...

Yeah, that's what that code I posted would do automatically, though
it's a bit hidden.  The second attempt to unlink() would see delete
already pending, and activate its secret internal sleep/retry loop.




Re: pg_upgrade test failure

2023-01-31 Thread Andres Freund
Hi, 

On January 31, 2023 12:54:42 PM PST, Thomas Munro  
wrote:
>On Wed, Feb 1, 2023 at 6:28 AM Justin Pryzby  wrote:
>> > I pushed the rmtree() change.  Let's see if that helps, or tells us
>> > something new.
>>
>> I found a few failures since then:
>>
>> https://api.cirrus-ci.com/v1/artifact/task/6696942420361216/testrun/build/testrun/pg_upgrade/002_pg_upgrade/log/regress_log_002_pg_upgrade
>>
>> pg_upgrade: warning: could not remove directory 
>> "C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/t_002_pg_upgrade_new_node_data/pgdata/pg_upgrade_output.d/20230131T134931.720/log":
>>  Directory not empty
>> pg_upgrade: warning: could not remove directory 
>> "C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/t_002_pg_upgrade_new_node_data/pgdata/pg_upgrade_output.d/20230131T134931.720":
>>  Directory not empty
>
>So no change: we didn't see "could not unlink file ...".  So I think
>that means that it was rmtree() that unlinked the file for the *first*
>time, but someone else has it open.
>
>Even though Windows is at this point eroding my love of computers and
>making me consider a new career in, I dunno, carrot farming or
>something, I have one more idea.  Check out this kluge in
>src/bin/pg_upgrade/exec.c:
>
>/*
> * "pg_ctl -w stop" might have reported that the server has stopped
> * because the postmaster.pid file has been removed, but "pg_ctl -w
> * start" might still be in the process of closing and might still be
> * holding its stdout and -l log file descriptors open.  Therefore,
> * try to open the log file a few more times.
> */
>
>I'm not sure about anything, but if that's what's happening here, then
>maybe the attached would help.  In short, it would make the previous
>theory true (the idea of a second unlink() saving the day).


Maybe we should just handle it by sleeping and retrying, if on windows? Sad to 
even propose... 
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: pg_upgrade test failure

2023-01-31 Thread Thomas Munro
On Wed, Feb 1, 2023 at 6:28 AM Justin Pryzby  wrote:
> > I pushed the rmtree() change.  Let's see if that helps, or tells us
> > something new.
>
> I found a few failures since then:
>
> https://api.cirrus-ci.com/v1/artifact/task/6696942420361216/testrun/build/testrun/pg_upgrade/002_pg_upgrade/log/regress_log_002_pg_upgrade
>
> pg_upgrade: warning: could not remove directory 
> "C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/t_002_pg_upgrade_new_node_data/pgdata/pg_upgrade_output.d/20230131T134931.720/log":
>  Directory not empty
> pg_upgrade: warning: could not remove directory 
> "C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/t_002_pg_upgrade_new_node_data/pgdata/pg_upgrade_output.d/20230131T134931.720":
>  Directory not empty

So no change: we didn't see "could not unlink file ...".  So I think
that means that it was rmtree() that unlinked the file for the *first*
time, but someone else has it open.

Even though Windows is at this point eroding my love of computers and
making me consider a new career in, I dunno, carrot farming or
something, I have one more idea.  Check out this kluge in
src/bin/pg_upgrade/exec.c:

/*
 * "pg_ctl -w stop" might have reported that the server has stopped
 * because the postmaster.pid file has been removed, but "pg_ctl -w
 * start" might still be in the process of closing and might still be
 * holding its stdout and -l log file descriptors open.  Therefore,
 * try to open the log file a few more times.
 */

I'm not sure about anything, but if that's what's happening here, then
maybe the attached would help.  In short, it would make the previous
theory true (the idea of a second unlink() saving the day).
diff --git a/src/bin/pg_upgrade/util.c b/src/bin/pg_upgrade/util.c
index 42dcbfc5b5..f214b7737f 100644
--- a/src/bin/pg_upgrade/util.c
+++ b/src/bin/pg_upgrade/util.c
@@ -68,7 +68,12 @@ cleanup_output_dirs(void)
 	if (log_opts.retain)
 		return;
 
-	(void) rmtree(log_opts.basedir, true);
+	/*
+	 * Try twice.  The second time might wait for files to be concurrently
+	 * closed on Windows.
+	 */
+	if (!rmtree(log_opts.basedir, true))
+		rmtree(log_opts.basedir, true);
 
 	/* Remove pg_upgrade_output.d only if empty */
 	switch (pg_check_dir(log_opts.rootdir))
@@ -80,7 +85,12 @@ cleanup_output_dirs(void)
 
 		case 1:	/* exists and empty */
 		case 2:	/* exists and contains only dot files */
-			(void) rmtree(log_opts.rootdir, true);
+			/*
+			 * Try twice.  The second time might wait for files to be
+			 * concurrently closed on Windows.
+			 */
+			if (!rmtree(log_opts.rootdir, true))
+rmtree(log_opts.rootdir, true);
 			break;
 
 		case 4:	/* exists */


Re: heapgettup() with NoMovementScanDirection unused in core?

2023-01-31 Thread Tom Lane
David Rowley  writes:
> On Wed, 1 Feb 2023 at 03:02, Melanie Plageman  
> wrote:
>> I previously had the asserts here, but I thought perhaps we shouldn't
>> restrict table AMs from using NoMovementScanDirection in whatever way
>> they'd like. We care about protecting heapgettup() and
>> heapgettup_pagemode(). We could put a comment in the table AM API about
>> NoMovementScanDirection not necessarily making sense for a next() type
>> function and informing table AMs that they need not support it.

> hmm, but the recent discovery is that we'll never call ExecutePlan()
> with NoMovementScanDirection, so what exactly is going to call
> table_scan_getnextslot() and table_scan_getnextslot_tidrange() with
> NoMovementScanDirection?

Yeah.  This is not an AM-local API.

regards, tom lane




Re: heapgettup() with NoMovementScanDirection unused in core?

2023-01-31 Thread David Rowley
On Wed, 1 Feb 2023 at 03:02, Melanie Plageman  wrote:
>
> On Tue, Jan 31, 2023 at 11:46:05PM +1300, David Rowley wrote:
> > My thoughts were that we might want to put them
> > table_scan_getnextslot() and table_scan_getnextslot_tidrange(). My
> > rationale for that was that it makes it more clear to table AM devs
> > that they don't need to handle NoMovementScanDirection.
>
> I previously had the asserts here, but I thought perhaps we shouldn't
> restrict table AMs from using NoMovementScanDirection in whatever way
> they'd like. We care about protecting heapgettup() and
> heapgettup_pagemode(). We could put a comment in the table AM API about
> NoMovementScanDirection not necessarily making sense for a next() type
> function and informing table AMs that they need not support it.

hmm, but the recent discovery is that we'll never call ExecutePlan()
with NoMovementScanDirection, so what exactly is going to call
table_scan_getnextslot() and table_scan_getnextslot_tidrange() with
NoMovementScanDirection?

> So, I would favor having both asserts check that the direction is one of
> forward or backward.

That sounds fine to me.
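
Something like this at the top of heapgettup() and heapgettup_pagemode(),
then (just a sketch of where the check would land):

    /* NoMovementScanDirection is not supported here */
    Assert(ScanDirectionIsForward(dir) || ScanDirectionIsBackward(dir));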

David




Re: MacOS: xsltproc fails with "warning: failed to load external entity"

2023-01-31 Thread Tom Lane
Aleksander Alekseev  writes:
>> For either sets of tools, the automatic download option doesn't appear
>> to work anymore.  This probably has to do with either the https or the
>> redirects that have been mentioned.

> Peter, thanks for reporting this. I got the same results: neither
> tools work without setting XML_CATALOG_FILES and setting this
> environment variable work for both Homebrew and macOS versions.

> Here is the summary of our findings. PFA the updated patch v2.

It's worse than that: I find that

export XML_CATALOG_FILES=/dev/null

breaks the docs build on RHEL8 and Fedora 37 (latest) too, with the
same "failed to load external entity" symptom.  I conclude from this
that there is no version of xsltproc anywhere that can still download
the required files automatically.  So we need to take out the advice
that says you can rely on auto-download for everybody, not just macOS.

If this is indeed the case, perhaps we ought to start inserting --nonet
into the invocations.  There's not much use in allowing these tools to
perform internet access when the best-case scenario is that they fail.
(Worst-case, you could end up getting hacked, perhaps?)

regards, tom lane




Re: [PATCH] Fix old thinko in formula to compute sweight in numeric_sqrt().

2023-01-31 Thread Dean Rasheed
On Tue, 31 Jan 2023 at 15:05, Joel Jacobson  wrote:
>
> I also think the performance impact no matter how small isn't worth it,
> but a comment based on your comments would be very valuable IMO.
>
> Below is an attempt at summarising your text, and to avoid the performance 
> impact,
> maybe an #if so we get the general correct formula for DEC_DIGITS 1 or 2,
> and the reduced hand-optimised form for DEC_DIGITS 4?
> That could also improve readabilty, since readers perhaps more easily would 
> see
> the relation between sweight and arg.weight, for the only DEC_DIGITS case we 
> care about.
>

That seems a bit wordy, given the context of this comment. I think
it's sufficient to just give the formula, and note that it simplifies
when DEC_DIGITS is even (not just 4):

/*
 * Assume the input was normalized, so arg.weight is accurate.  The result
 * then has at least sweight = floor(arg.weight * DEC_DIGITS / 2 + 1)
 * digits before the decimal point.  When DEC_DIGITS is even, we can save
 * a few cycles, since the division is exact and there is no need to
 *  round down.
 */
#if DEC_DIGITS == ((DEC_DIGITS / 2) * 2)
sweight = arg.weight * DEC_DIGITS / 2 + 1;
#else
if (arg.weight >= 0)
sweight = arg.weight * DEC_DIGITS / 2 + 1;
else
sweight = 1 - (1 - arg.weight * DEC_DIGITS) / 2;
#endif
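
As a quick sanity check (a throwaway standalone program, not part of the
patch), the general formula and the DEC_DIGITS == 4 shortcut agree over a
range of weights:

#include <stdio.h>

#define DEC_DIGITS 4

int
main(void)
{
	for (int w = -20; w <= 20; w++)
	{
		int		general = (w >= 0) ? w * DEC_DIGITS / 2 + 1
								   : 1 - (1 - w * DEC_DIGITS) / 2;
		int		reduced = 2 * w + 1;

		if (general != reduced)
			printf("mismatch at weight %d: %d vs %d\n", w, general, reduced);
	}
	printf("done\n");
	return 0;
}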

Regards,
Dean




Re: Timeline ID hexadecimal format

2023-01-31 Thread Greg Stark
I actually find it kind of annoying that we use hex strings for a lot
of things where they don't add any value. Namely Transaction ID and
LSNs. As a result it's always a bit of a pain to ingest these in other
tools or do arithmetic on them. Neither is referring to memory or
anything where powers of 2 are significant so it really doesn't buy
anything in making them easier to interpret either.

I don't see any advantage in converting every place where we refer to
timelines into hex and then having to refer to things like timeline
1A. It doesn't seem any more intuitive to someone understanding what's
going on than referring to timeline 26.

The fact that the *filename* has it encoded in hex is an
implementation detail and really gets exposed here because it's giving
you the underlying system error that caused the problem. The confusion
only arises when the two are juxtaposed. A hint or something just in
that case might be enough?




Re: Clarify deleting comments and security labels in synopsis

2023-01-31 Thread Tom Lane
=?utf-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?=  writes:
> Tom Lane  writes:
>> Agreed; as-is, the syntax summary is not just confusing but outright
>> wrong.
>> 
>> I think we could go further and split the entry under Parameters
>> to match:

> Makes sense. Something like the attached v2?

WFM, will push.

regards, tom lane




Re: Make EXPLAIN generate a generic plan for a parameterized query

2023-01-31 Thread Tom Lane
Laurenz Albe  writes:
> [ 0001-Add-EXPLAIN-option-GENERIC_PLAN.v4.patch ]

I took a closer look at this patch, and didn't like the implementation
much.  You're not matching the behavior of PREPARE at all: for example,
this patch is content to let $1 be resolved with different types in
different places.  We should be using the existing infrastructure that
parse_analyze_varparams uses.

Also, I believe that in contexts such as plpgsql, it is possible that
there's an external source of $N definitions, which we should probably
continue to honor even with GENERIC_PLAN.

So that leads me to think the code should be more like this.  I'm not
sure if it's worth spending documentation and testing effort on the
case where we don't override an existing p_paramref_hook.

regards, tom lane

diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index d4895b9d7d..58350624e7 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -40,6 +40,7 @@ EXPLAIN [ ANALYZE ] [ VERBOSE ] statement
 VERBOSE [ boolean ]
 COSTS [ boolean ]
 SETTINGS [ boolean ]
+GENERIC_PLAN [ boolean ]
 BUFFERS [ boolean ]
 WAL [ boolean ]
 TIMING [ boolean ]
@@ -167,6 +168,20 @@ ROLLBACK;
 

 
+   
+GENERIC_PLAN
+
+ 
+  Generate a generic plan for the statement (see 
+  for details about generic plans).  The statement can contain parameter
+  placeholders like $1, if it is a statement that can
+  use parameters.  This option cannot be used together with
+  ANALYZE, since a statement with unknown parameters
+  cannot be executed.
+ 
+
+   
+

 BUFFERS
 
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 35c23bd27d..ab21a67862 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -190,6 +190,8 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 			es->wal = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "settings") == 0)
 			es->settings = defGetBoolean(opt);
+		else if (strcmp(opt->defname, "generic_plan") == 0)
+			es->generic = defGetBoolean(opt);
 		else if (strcmp(opt->defname, "timing") == 0)
 		{
 			timing_set = true;
@@ -227,6 +229,13 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 	 parser_errposition(pstate, opt->location)));
 	}
 
+	/* check that GENERIC_PLAN is not used with EXPLAIN ANALYZE */
+	if (es->generic && es->analyze)
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("EXPLAIN ANALYZE cannot be used with GENERIC_PLAN")));
+
+	/* check that WAL is used with EXPLAIN ANALYZE */
 	if (es->wal && !es->analyze)
 		ereport(ERROR,
 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index e892df9819..9143964e78 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -27,6 +27,7 @@
 #include "access/sysattr.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
+#include "commands/defrem.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -2906,10 +2907,38 @@ static Query *
 transformExplainStmt(ParseState *pstate, ExplainStmt *stmt)
 {
 	Query	   *result;
+	bool		generic_plan = false;
+	Oid		   *paramTypes = NULL;
+	int			numParams = 0;
+
+	/*
+	 * If we have no external source of parameter definitions, and the
+	 * GENERIC_PLAN option is specified, then accept variable parameter
+	 * definitions (as occurs in PREPARE, for example).
+	 */
+	if (pstate->p_paramref_hook == NULL)
+	{
+		ListCell   *lc;
+
+		foreach(lc, stmt->options)
+		{
+			DefElem*opt = (DefElem *) lfirst(lc);
+
+			if (strcmp(opt->defname, "generic_plan") == 0)
+generic_plan = defGetBoolean(opt);
+			/* don't "break", as we want the last value */
+		}
+		if (generic_plan)
+			setup_parse_variable_parameters(pstate, &paramTypes, &numParams);
+	}
 
 	/* transform contained query, allowing SELECT INTO */
 	stmt->query = (Node *) transformOptionalSelectInto(pstate, stmt->query);
 
+	/* make sure all is well with parameter types */
+	if (generic_plan)
+		check_variable_parameters(pstate, (Query *) stmt->query);
+
 	/* represent the command as a utility Query */
 	result = makeNode(Query);
 	result->commandType = CMD_UTILITY;
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 7c1071ddd1..3d3e632a0c 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -46,6 +46,7 @@ typedef struct ExplainState
 	bool		timing;			/* print detailed node timing */
 	bool		summary;		/* print total planning and execution timing */
 	bool		settings;		/* print modified settings */
+	bool		generic;		/* generate a generic plan */
 	ExplainFormat format;		/* output format */
 	/* state for output formatting --- not reset for each new plan tree */
 	int			indent;			/* current indentation level */
diff --git a/src/test/regress/expected/explain.out 

Re: Clarify deleting comments and security labels in synopsis

2023-01-31 Thread Dagfinn Ilmari Mannsåker
Tom Lane  writes:

> =?utf-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?=  writes:
>> A user on IRC was confused about how to delete a security label using
>> the `SECURITY LABLEL ON … IS …` command, and looking at the docs I can
>> see why.
>
>> The synopsis just says `IS 'label'`, which implies that it can only be a
>> string. It's not until you read the description for `label` that you
>> see "or `NULL` to drop the security label."  I propose making the
>> synopsis say `IS { 'label' | NULL }` to make it clear that it can be
>> NULL as well.  The same applies to `COMMENT ON … IS …`, which I've also
>> changed similarly in the attached.
>
> Agreed; as-is, the syntax summary is not just confusing but outright
> wrong.
>
> I think we could go further and split the entry under Parameters
> to match:
>
>   'text'
>   The new comment (must be a simple string literal,
>   not an expression).
>
>   NULL
>   Write NULL to drop the comment.

Makes sense. Something like the attached v2?

>   regards, tom lane

- ilmari

>From 0fb27fc8a5f484b87df6068076987719f0fb79bf Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Tue, 31 Jan 2023 16:05:20 +
Subject: [PATCH v2] Clarify that COMMENT and SECURITY LABEL can be set to NULL
 in the synopses

This was only mentioned in the description of the text/label, which
are marked as being in quotes in the synopsis, which can cause
confusion (as witnessed on IRC).

Also separate the literal and NULL case in the parameter list, per
suggestion from Tom Lane.
---
 doc/src/sgml/ref/comment.sgml| 16 
 doc/src/sgml/ref/security_label.sgml | 16 
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/doc/src/sgml/ref/comment.sgml b/doc/src/sgml/ref/comment.sgml
index 7499da1d62..470a6d0b5c 100644
--- a/doc/src/sgml/ref/comment.sgml
+++ b/doc/src/sgml/ref/comment.sgml
@@ -66,7 +66,7 @@
   TRIGGER trigger_name ON table_name |
   TYPE object_name |
   VIEW object_name
-} IS 'text'
+} IS { 'text' | NULL }
 
 where aggregate_signature is:
 
@@ -263,11 +263,19 @@
 
 

-text
+'text'
 
  
-  The new comment, written as a string literal; or NULL
-  to drop the comment.
+  The new comment, written as a string literal.
+ 
+
+   
+
+   
+NULL
+
+ 
+  Write NULL to drop the comment.
  
 

diff --git a/doc/src/sgml/ref/security_label.sgml b/doc/src/sgml/ref/security_label.sgml
index 20a839ff0c..e4eee77932 100644
--- a/doc/src/sgml/ref/security_label.sgml
+++ b/doc/src/sgml/ref/security_label.sgml
@@ -44,7 +44,7 @@
   TABLESPACE object_name |
   TYPE object_name |
   VIEW object_name
-} IS 'label'
+} IS { 'label' | NULL }
 
 where aggregate_signature is:
 
@@ -178,11 +178,19 @@
 
 

-label
+'label'
 
  
-  The new security label, written as a string literal; or NULL
-  to drop the security label.
+  The new security label, written as a string literal.
+ 
+
+   
+
+   
+NULL
+
+ 
+  Write NULL to drop the security label.
  
 

-- 
2.34.1



Re: Fix database creation during installchecks for ICU cluster

2023-01-31 Thread vignesh C
On Tue, 17 Jan 2023 at 17:17, vignesh C  wrote:
>
> On Tue, 29 Nov 2022 at 20:24, Nazir Bilal Yavuz  wrote:
> >
> > Hi,
> >
> > Thanks for the patch!
> >
> >
> > On 10/29/22 12:54, Marina Polyakova wrote:
> > >
> > > 1) The ECPG tests fail because they use the SQL_ASCII encoding [2],
> > > the database template0 uses the ICU locale provider and SQL_ASCII is
> > > not supported by ICU:
> > >
> > > $ make -C src/interfaces/ecpg/ installcheck
> > > ...
> > > == creating database "ecpg1_regression" ==
> > > ERROR:  encoding "SQL_ASCII" is not supported with ICU provider
> > > ERROR:  database "ecpg1_regression" does not exist
> > > command failed: "/home/marina/postgresql/master/my/inst/bin/psql" -X
> > > -c "CREATE DATABASE \"ecpg1_regression\" TEMPLATE=template0
> > > ENCODING='SQL_ASCII'" -c "ALTER DATABASE \"ecpg1_regression\" SET
> > > lc_messages TO 'C';ALTER DATABASE \"ecpg1_regression\" SET lc_monetary
> > > TO 'C';ALTER DATABASE \"ecpg1_regression\" SET lc_numeric TO 'C';ALTER
> > > DATABASE \"ecpg1_regression\" SET lc_time TO 'C';ALTER DATABASE
> > > \"ecpg1_regression\" SET bytea_output TO 'hex';ALTER DATABASE
> > > \"ecpg1_regression\" SET timezone_abbreviations TO 'Default';" "postgres"
> > >
> >
> > I can confirm that same error happens on my end and your patch fixes the
> > issue. But, do ECPG tests really require SQL_ASCII encoding? I removed
> > ECPG tests' encoding line [1], rebuilt it and 'make -C
> > src/interfaces/ecpg/ installcheck' passed without applying your patch.
> >
> >
> > >
> > > 2) The option --no-locale in pg_regress is described as "use C locale"
> > > [3]. But in this case the created databases actually use the ICU
> > > locale provider with the ICU cluster locale from template0 (see
> > > diff_check_backend_used_provider.txt):
> > >
> > > $ make NO_LOCALE=1 installcheck
> >
> >
> > This works on my end without applying your patch. Commands I used are:
> >
> > $ ./configure --with-icu --prefix=$PWD/build_dir
> > $ make && make install && export PATH=$PWD/build_dir/bin:$PATH
> > $ initdb --locale-provider icu --icu-locale en-US -D data && pg_ctl -D
> > data -l logfile start
> > $ make NO_LOCALE=1 installcheck
>
> Hi Marina Polyakova,
>
> Since it is working without your patch, Is this patch required for any
> other scenarios?

There has been no updates on this thread for some time, so this has
been switched as Returned with Feedback. Feel free to open it in the
next commitfest if you plan to continue on this.

Regards,
Vignesh




Re: vacuumlo: add test to vacuumlo for test coverage

2023-01-31 Thread vignesh C
On Tue, 17 Jan 2023 at 17:10, vignesh C  wrote:
>
> On Wed, 16 Nov 2022 at 10:18, Ian Lawrence Barwick  wrote:
> >
> > > Sat, 3 Sep 2022 at 17:28, Dong Wook Lee :
> > >
> > > Hi hackers,
> > > I write a tiny patch about vacuumlo to improve test coverage.
> > > I hope my work is meaningful.
> >
> > Hi
> >
> > While reviewing the patch backlog, we have determined that this patch adds
> > one or more TAP tests but has not added the test to the "meson.build" file.
> >
> > To do this, locate the relevant "meson.build" file for each test and add it
> > in the 'tests' dictionary, which will look something like this:
> >
> >   'tap': {
> > 'tests': [
> >   't/001_basic.pl',
> > ],
> >   },
> >
> > For some additional details please see this Wiki article:
> >
> >   https://wiki.postgresql.org/wiki/Meson_for_patch_authors
> >
> > For more information on the meson build system for PostgreSQL see:
> >
> >   https://wiki.postgresql.org/wiki/Meson
>
> Hi DongWook Lee,
>
> Please plan to work on the comment and provide a patch. As CommitFest
> 2023-01 is currently underway, this would be an excellent time to
> update the patch and get the patch in a better shape.

There has been no updates on this thread for some time, so this has
been switched as Returned with Feedback. Feel free to open it in the
next commitfest if you plan to continue on this.

Regards,
Vignesh




Re: xml2: add test for coverage

2023-01-31 Thread vignesh C
On Tue, 17 Jan 2023 at 17:06, vignesh C  wrote:
>
> On Fri, 25 Nov 2022 at 18:08, Peter Eisentraut
>  wrote:
> >
> > On 23.08.22 03:38, Dong Wook Lee wrote:
> > > I made a small patch for xml2 to improve test coverage.
> > > However, there was a problem using the functions below.
> > >
> > > - xpath_number
> > > - xpath_bool
> > > - xpath_nodeset
> > > - xpath_list
> > >
> > > Do you have any advice on how to use this function correctly?
> > > It would also be good to add an example of using the function to the 
> > > document.
> >
> > I can confirm that these functions could use more tests and more
> > documentation and examples.  But given that you registered a patch in
> > the commit fest, it should be you who provides a patch to solve those
> > issues.  Are you still working on this, or were you just looking for
> > help on how to solve this?
>
> Hi DongWook Lee,
>
> Are you planning to work on this and provide an updated patch, if you
> are not planning to work on it, we can update the commitfest entry
> accordingly.

There has been no updates on this thread for some time, so this has
been switched as Returned with Feedback. Feel free to open it in the
next commitfest if you plan to continue on this.

Regards,
Vignesh




Re: [Proposal] Page Compression for OLTP

2023-01-31 Thread vignesh C
On Fri, 4 Nov 2022 at 07:02, Ian Lawrence Barwick  wrote:
>
> Wed, 27 Jul 2022 at 2:47, chenhj :
> >
> > Hi hackers,
> >
> > I have rebase this patch and made some improvements.
> >
> >
> > 1. A header is added to each chunk in the pcd file, which records the chunk 
> > of which block the chunk belongs to, and the checksum of the chunk.
> >
> >   Accordingly, all pages in a compressed relation are stored in compressed 
> > format, even if the compressed page is larger than BLCKSZ.
> >
> >   The maximum space occupied by a compressed page is BLCKSZ + chunk_size 
> > (exceeding this range will report an error when writing the page).
> >
> > 2. Repair the pca file through the information recorded in the pcd when 
> > recovering from a crash
> >
> > 3. For compressed relation, do not release the free blocks at the end of 
> > the relation (just like what old_snapshot_threshold does), reducing the 
> > risk of data inconsistency between pcd and pca file.
> >
> > 4. During backup, only check the checksum in the chunk header for the pcd 
> > file, and avoid assembling and decompressing chunks into the original page.
> >
> > 5. bugfix, doc, code style and so on
> >
> >
> > And see src/backend/storage/smgr/README.compression for detail
> >
> >
> > Other
> >
> > 1. remove support of default compression option in tablespace, I'm not sure 
> > about the necessity of this feature, so don't support it for now.
> >
> > 2. pg_rewind currently does not support copying only changed blocks from 
> > pcd file. This feature is relatively independent and could be implemented 
> > later.
>
> Hi
>
> cfbot reports the patch no longer applies.  As CommitFest 2022-11 is
> currently underway, this would be an excellent time to update the patch.

There has been no updates on this thread for some time, so this has
been switched as Returned with Feedback. Feel free to open it in the
next commitfest if you plan to continue on this.

Regards,
Vignesh




Re: Adding CommandID to heap xlog records

2023-01-31 Thread vignesh C
On Mon, 16 Jan 2023 at 19:56, vignesh C  wrote:
>
> On Thu, 3 Nov 2022 at 15:06, Ian Lawrence Barwick  wrote:
> >
> > > Fri, 30 Sep 2022 at 1:04, Matthias van de Meent :
> > >
> > > On Wed, 28 Sept 2022 at 19:40, Bruce Momjian  wrote:
> > > >
> > > > On Thu, Sep 22, 2022 at 11:12:32PM +0200, Matthias van de Meent wrote:
> > > > > On Thu, 8 Sept 2022 at 23:24, Tom Lane  wrote:
> > > > > >
> > > > > > Matthias van de Meent  writes:
> > > > > > > Please find attached a patch that adds the CommandId of the 
> > > > > > > inserting
> > > > > > > transaction to heap (batch)insert, update and delete records. It 
> > > > > > > is
> > > > > > > based on the changes we made in the fork we maintain for Neon.
> > > > > >
> > > > > > This seems like a very significant cost increment with returns
> > > > > > to only a minuscule number of users.  We certainly cannot consider
> > > > > > it unless you provide some evidence that that impression is wrong.
> > > > >
> > > > > Attached a proposed set of patches to reduce overhead of the inital 
> > > > > patch.
> > > >
> > > > This might be obvious to some, but the patch got a lot larger.  :-(
> > >
> > > Sorry for that, but updating the field from which the redo manager
> > > should pull its information does indeed touch a lot of files because
> > > most users of xl_info are only interested in the 4 bits reserved for
> > > the redo-manager. Most of 0001 is therefore updates to point code to
> > > the new field in XLogRecord, and renaming the variables and arguments
> > > from info to rminfo.
> > >
> > > [tangent] With that refactoring, I also clean up a lot of code that
> > > was using a wrong macro/constant for rmgr flags; `info &
> > > ~XLR_INFO_MASK` may have the same value as `info &
> > > XLR_RMGR_INFO_MASK`, but that's only guaranteed by the documentation;
> > > and would require the same significant rework if new bits were
> > > assigned to non-XLR_INFO_MASK and non-XLR_RMGR_INFO_MASK. [/tangent]
> > >
> > > 0002 grew a bit as well, but not to a degree that I think is worrying
> > > or otherwise impossible to review.
> >
> > Hi
> >
> > This entry was marked as "Needs review" in the CommitFest app but cfbot
> > reports the patch no longer applies.
> >
> > We've marked it as "Waiting on Author". As CommitFest 2022-11 is
> > currently underway, this would be an excellent time update the patch.
> >
> > Once you think the patchset is ready for review again, you (or any
> > interested party) can  move the patch entry forward by visiting
> >
> > https://commitfest.postgresql.org/40/3882/
> >
> > and changing the status to "Needs review".
>
> I was not sure if you will be planning to post an updated version of
> patch as the patch has been awaiting your attention from last
> commitfest, please post an updated version for it soon or update the
> commitfest entry accordingly.

There has been no updates on this thread for some time, so this has
been switched as Returned with Feedback. Feel free to open it in the
next commitfest if you plan to continue on this.

Regards,
Vignesh




Re: [PATCH] New [relation] option engine

2023-01-31 Thread vignesh C
On Tue, 3 Jan 2023 at 18:38, vignesh C  wrote:
>
> On Sun, 20 Nov 2022 at 11:42, Nikolay Shaplov  wrote:
> >
> > In a message of Sunday, 6 November 2022 19:22:09 MSK, Nikolay
> > Shaplov wrote:
> >
> > > > > > cfbot reports the patch no longer applies.  As CommitFest 2022-11 is
> > > > > > currently underway, this would be an excellent time to update the
> > > > > > patch.
> > > > >
> > > > > Oups! I should have done it before...
> > > > > Fixed
> > > >
> > > > Trying to fix meson build
> > >
> > > Trying to fix compiler warnings.
> >
> > Patch rebased. Imported changes from commit 4f981df8 (Report a more useful
> > error for reloptions on a partitioned table)
>
> The patch does not apply on top of HEAD as in [1], please post a rebased 
> patch:
> === Applying patches on top of PostgreSQL commit ID
> 92957ed98c5c565362ce665266132a7f08f6b0c0 ===
> === applying patch ./new_options_take_two_v03f.patch
> patching file src/include/access/reloptions.h
> Hunk #1 FAILED at 1.
> 1 out of 4 hunks FAILED -- saving rejects to file
> src/include/access/reloptions.h.rej

There has been no updates on this thread for some time, so this has
been switched as Returned with Feedback. Feel free to open it in the
next commitfest if you plan to continue on this.

Regards,
Vignesh




Re: A new strategy for pull-up correlated ANY_SUBLINK

2023-01-31 Thread vignesh C
On Fri, 6 Jan 2023 at 11:46, vignesh C  wrote:
>
> On Sun, 13 Nov 2022 at 04:15, Tom Lane  wrote:
> >
> > Andy Fan  writes:
> > > In the past we pull-up the ANY-sublink with 2 steps, the first step is to
> > > pull up the sublink as a subquery, and the next step is to pull up the
> > > subquery if it is allowed.  The benefits of this method are obvious,
> > > pulling up the subquery has more requirements, even if we can just finish
> > > the first step, we still get huge benefits. However the bad stuff happens
> > > if varlevelsup = 1 involves, things fail at step 1.
> >
> > > convert_ANY_sublink_to_join ...
> >
> > > if (contain_vars_of_level((Node *) subselect, 1))
> > > return NULL;
> >
> > > In this patch we distinguish the above case and try to pull-up it within
> > > one step if it is helpful, It looks to me that what we need to do is just
> > > transform it to EXIST-SUBLINK.
> >
> > This patch seems awfully messy to me.  The fact that you're having to
> > duplicate stuff done elsewhere suggests at the least that you've not
> > plugged the code into the best place.
> >
> > Looking again at that contain_vars_of_level restriction, I think the
> > reason for it was just to avoid making a FROM subquery that has outer
> > references, and the reason we needed to avoid that was merely that we
> > didn't have LATERAL at the time.  So I experimented with the attached.
> > It seems to work, in that we don't get wrong answers from any of the
> > small number of places that are affected.  (I wonder though whether
> > those test cases still test what they were intended to, particularly
> > the postgres_fdw one.  We might have to try to hack them some more
> > to not get affected by this optimization.)  Could do with more test
> > cases, no doubt.
> >
> > One thing I'm not at all clear about is whether we need to restrict
> > the optimization so that it doesn't occur if the subquery contains
> > outer references falling outside available_rels.  I think that that
> > case is covered by is_simple_subquery() deciding later to not pull up
> > the subquery based on LATERAL restrictions, but maybe that misses
> > something.
> >
> > I'm also wondering whether the similar restriction in
> > convert_EXISTS_sublink_to_join could be removed similarly.
> > In this light it was a mistake for convert_EXISTS_sublink_to_join
> > to manage the pullup itself rather than doing it in the two-step
> > fashion that convert_ANY_sublink_to_join does it.
>
> The patch does not apply on top of HEAD as in [1], please post a rebased 
> patch:
> === Applying patches on top of PostgreSQL commit ID
> b82557ecc2ebbf649142740a1c5ce8d19089f620 ===
> === applying patch ./v2-0001-use-LATERAL-for-ANY_SUBLINK.patch
> patching file contrib/postgres_fdw/expected/postgres_fdw.out
> ...
> Hunk #2 FAILED at 6074.
> Hunk #3 FAILED at 6087.
> 2 out of 3 hunks FAILED -- saving rejects to file
> src/test/regress/expected/join.out.rej

There has been no updates on this thread for some time, so this has
been switched as Returned with Feedback. Feel free to open it in the
next commitfest if you plan to continue on this.

Regards,
Vignesh




Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)

2023-01-31 Thread vignesh C
On Thu, 8 Dec 2022 at 00:33, Andres Freund  wrote:
>
> Hi,
>
> On 2022-11-04 09:25:52 +0100, Drouvot, Bertrand wrote:
> > Please find attached a rebase in v7.
>
> cfbot complains that the docs don't build:
> https://cirrus-ci.com/task/6694349031866368?logs=docs_build#L296
>
> [03:24:27.317] ref/checkpoint.sgml:66: element para: validity error : Element 
> para is not declared in para list of possible children
>
> I've marked the patch as waitin-on-author for now.
>
>
> There's been a bunch of architectural feedback too, but tbh, I don't know if
> we came to any conclusion on that front...

There has been no updates on this thread for some time, so this has
been switched as Returned with Feedback. Feel free to open it in the
next commitfest if you plan to continue on this.

Regards,
Vignesh




Re: [PATCH] Add <> support to sepgsql_restorecon

2023-01-31 Thread vignesh C
On Wed, 18 Jan 2023 at 23:57, Joe Conway  wrote:
>
> The following review has been posted through the commitfest application:
> make installcheck-world:  not tested
> Implements feature:   not tested
> Spec compliant:   not tested
> Documentation:not tested
>
> This needs regression test support for the feature and some minimal 
> documentation that shows how to make use of it.
>
> The new status of this patch is: Waiting on Author

By mistake instead of setting the patch to "Moved to Next CF", I had
selected "Returned with Feedback". Sorry about that.
I have recreated the entry for this patch in the 03/23 commitfest:
https://commitfest.postgresql.org/42/4158/

Regards,
Vignesh




Re: Clarify deleting comments and security labels in synopsis

2023-01-31 Thread Tom Lane
=?utf-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?=  writes:
> A user on IRC was confused about how to delete a security label using
> the `SECURITY LABLEL ON … IS …` command, and looking at the docs I can
> see why.

> The synopsis just says `IS 'label'`, which implies that it can only be a
> string. It's not until you read the description for `label` that you
> see "or `NULL` to drop the security label."  I propose making the
> synopsis say `IS { 'label' | NULL }` to make it clear that it can be
> NULL as well.  The same applies to `COMMENT ON … IS …`, which I've also
> changed similarly in the attached.

Agreed; as-is, the syntax summary is not just confusing but outright
wrong.

I think we could go further and split the entry under Parameters
to match:

'text'
The new comment (must be a simple string literal,
not an expression).

NULL
Write NULL to drop the comment.


regards, tom lane




Re: [PATCH] Completed unaccent dictionary with many missing characters

2023-01-31 Thread vignesh C
On Mon, 16 Jan 2023 at 20:07, vignesh C  wrote:
>
> On Fri, 4 Nov 2022 at 04:59, Ian Lawrence Barwick  wrote:
> >
> > > Wed, 13 Jul 2022 at 19:13, Przemysław Sztoch :
> > >
> > > Dear Michael P.,
> > >
> > > 3. The matter is not that simple. When I change priorities (ie 
> > > Latin-ASCII.xml is less important than Unicode decomposition),
> > > then "U + 33D7" changes not to pH but to PH.
> > > In the end, I left it like it was before ...
> > >
> > > If you decide what to do with point 3, I will correct it and send new 
> > > patches.
> > >
> > > What is your decision?
> > > Option 1: We leave x as in Latin-ASCII.xml and we also have full 
> > > compatibility with previous PostgreSQL versions.
> > > If they fix Latin-ASCII.xml at Unicode, it will fix itself.
> > >
> > > Option 2:  We choose a lower priority for entries in Latin-ASCII.xml
> > >
> > > I would choose option 1.
> > >
> > > P.S. I will be going on vacation and it would be nice to close this patch 
> > > soon. TIA.
> >
> > Hi
> >
> > This entry was marked as "Needs review" in the CommitFest app but cfbot
> > reports the patch no longer applies.
> >
> > We've marked it as "Waiting on Author". As CommitFest 2022-11 is
> > currently underway, this would be an excellent time update the patch.
> >
> > Once you think the patchset is ready for review again, you (or any
> > interested party) can  move the patch entry forward by visiting
> >
> > https://commitfest.postgresql.org/40/3631/
> >
> > and changing the status to "Needs review".
>
> I was not sure if you will be planning to post an updated version of
> patch as the patch has been awaiting your attention from last
> commitfest, please post an updated version for it soon or update the
> commitfest entry accordingly.  As CommitFest 2023-01 is currently
> underway, this would be an excellent time to update the patch.

There has been no updates on this thread for some time, so this has
been switched as Returned with Feedback. Feel free to open it in the
next commitfest if you plan to continue on this.

Regards,
Vignesh




Re: [PATCH] pgbench: add multiconnect option

2023-01-31 Thread vignesh C
On Wed, 11 Jan 2023 at 22:17, vignesh C  wrote:
>
> On Tue, 8 Nov 2022 at 02:16, Fabien COELHO  wrote:
> >
> >
> > Hello Ian,
> >
> > > cfbot reports the patch no longer applies.  As CommitFest 2022-11 is
> > > currently underway, this would be an excellent time to update the patch.
> >
> > Attached a v5 which is just a rebase.
>
> The patch does not apply on top of HEAD as in [1], please post a rebased 
> patch:
> === Applying patches on top of PostgreSQL commit ID
> 3c6fc58209f24b959ee18f5d19ef96403d08f15c ===
> === applying patch ./pgbench-multi-connect-conninfo-5.patch
> (Stripping trailing CRs from patch; use --binary to disable.)
> patching file doc/src/sgml/ref/pgbench.sgml
> Hunk #3 FAILED at 921.
> 1 out of 3 hunks FAILED -- saving rejects to file
> doc/src/sgml/ref/pgbench.sgml.rej

There has been no updates on this thread for some time, so this has
been switched as Returned with Feedback. Feel free to change it open
in the next commitfest if you plan to continue on this.

Regards,
Vignesh




Re: pg_upgrade test failure

2023-01-31 Thread Justin Pryzby
On Tue, Jan 31, 2023 at 02:00:05PM +1300, Thomas Munro wrote:
> On Thu, Jan 5, 2023 at 4:11 PM Thomas Munro  wrote:
> > On Wed, Dec 7, 2022 at 7:15 AM Andres Freund  wrote:
> > > On 2022-11-08 01:16:09 +1300, Thomas Munro wrote:
> > > > So [1] on its own didn't fix this.  My next guess is that the attached
> > > > might help.
> >
> > > What is our plan here? This afaict is the most common "false positive" for
> > > cfbot in the last weeks.
> 
> I pushed the rmtree() change.  Let's see if that helps, or tells us
> something new.

I found a few failures since then:

https://api.cirrus-ci.com/v1/artifact/task/6696942420361216/testrun/build/testrun/pg_upgrade/002_pg_upgrade/log/regress_log_002_pg_upgrade

pg_upgrade: warning: could not remove directory 
"C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/t_002_pg_upgrade_new_node_data/pgdata/pg_upgrade_output.d/20230131T134931.720/log":
 Directory not empty
pg_upgrade: warning: could not remove directory 
"C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/t_002_pg_upgrade_new_node_data/pgdata/pg_upgrade_output.d/20230131T134931.720":
 Directory not empty

https://api.cirrus-ci.com/v1/artifact/task/5119776607961088/testrun/build/testrun/pg_upgrade/002_pg_upgrade/log/regress_log_002_pg_upgrade
same

I verified that those both include your 54e72b66e, which is pretty
strange, since the patch passed tests 10s of times on CI until it was
merged, when it started/kept failing.

-- 
Justin




Clarify deleting comments and security labels in synopsis

2023-01-31 Thread Dagfinn Ilmari Mannsåker
Hi Hackers,

A user on IRC was confused about how to delete a security label using
the `SECURITY LABLEL ON … IS …` command, and looking at the docs I can
see why.

The synopsis just says `IS 'label'`, which implies that it can only be a
string. It's not until you read the description for `label` that you
see "or `NULL` to drop the security label."  I propose making the
synopsis say `IS { 'label' | NULL }` to make it clear that it can be
NULL as well.  The same applies to `COMMENT ON … IS …`, which I've also
changed similarly in the attached.

- ilmari


>From cb02bd9aae85a48355dede9dd13db809f6ada35b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Tue, 31 Jan 2023 16:05:20 +
Subject: [PATCH] Clarify that COMMENT and SECURITY LABEL can be set to NULL in
 the synopses

This was only mentioned in the description of the text/label, which
are marked as being in quotes in the synopsis, which can cause
confusion (as witnessed on IRC).
---
 doc/src/sgml/ref/comment.sgml| 2 +-
 doc/src/sgml/ref/security_label.sgml | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/comment.sgml b/doc/src/sgml/ref/comment.sgml
index 7499da1d62..f5daea7a7e 100644
--- a/doc/src/sgml/ref/comment.sgml
+++ b/doc/src/sgml/ref/comment.sgml
@@ -66,7 +66,7 @@
   TRIGGER trigger_name ON table_name |
   TYPE object_name |
   VIEW object_name
-} IS 'text'
+} IS { 'text' | NULL }
 
 where aggregate_signature is:
 
diff --git a/doc/src/sgml/ref/security_label.sgml b/doc/src/sgml/ref/security_label.sgml
index 20a839ff0c..3df5ed2e9c 100644
--- a/doc/src/sgml/ref/security_label.sgml
+++ b/doc/src/sgml/ref/security_label.sgml
@@ -44,7 +44,7 @@
   TABLESPACE object_name |
   TYPE object_name |
   VIEW object_name
-} IS 'label'
+} IS { 'label' | NULL }
 
 where aggregate_signature is:
 
-- 
2.34.1



Re: Allow tailoring of ICU locales with custom rules

2023-01-31 Thread Laurenz Albe
On Mon, 2023-01-16 at 12:18 +0100, Peter Eisentraut wrote:
> Updated patch attached.

I like that patch.  It applies and passes regression tests.

I played with it:

  CREATE COLLATION german_phone (LOCALE = 'de-AT', PROVIDER = icu, RULES = '&oe < ö');

  SELECT * FROM (VALUES ('od'), ('oe'), ('of'), ('p'), ('ö')) AS q(c)
  ORDER BY c COLLATE german_phone;

   c  
  ════
   od
   oe
   ö
   of
   p
  (5 rows)

Cool so far.  Now I created a database with that locale:

  CREATE DATABASE teutsch LOCALE_PROVIDER icu ICU_LOCALE german_phone
 LOCALE "de_AT.utf8" TEMPLATE template0;

Now the rules are not in "pg_database":

  SELECT datcollate, daticulocale, daticurules FROM pg_database WHERE datname = 
'teutsch';

   datcollate │ daticulocale │ daticurules 
  ════════════╪══════════════╪═════════════
   de_AT.utf8 │ german_phone │ ∅
  (1 row)

I connect to the database and try:

  SELECT * FROM (VALUES ('od'), ('oe'), ('of'), ('p'), ('ö')) AS q(c)
  ORDER BY c COLLATE german_phone;

  ERROR:  collation "german_phone" for encoding "UTF8" does not exist
  LINE 1: ... ('oe'), ('of'), ('p'), ('ö')) AS q(c) ORDER BY c COLLATE ge...
   ^

Indeed, the collation isn't there...

I guess that it is not the fault of this patch that the collation isn't there,
but I think it is surprising.  What good is a database collation that does not
exist in the database?

What might be the fault of this patch, however, is that "daticurules" is not
set in "pg_database".  Looking at the code, that column seems to be copied
from the template database, but cannot be overridden.

Perhaps this only needs more documentation, but I am confused.

Yours,
Laurenz Albe




Re: Underscores in numeric literals

2023-01-31 Thread Dean Rasheed
On Tue, 31 Jan 2023 at 15:28, Peter Eisentraut
 wrote:
>
> Did you have any thoughts about what to do with the float types?  I
> guess we could handle those in a separate patch?
>

I was assuming that we'd do nothing for float types, because anything
we did would necessarily impact their performance.

Regards,
Dean




Re: [BUG] Autovacuum not dynamically decreasing cost_limit and cost_delay

2023-01-31 Thread Scott Mead
On Mon, Jan 23, 2023 at 12:33 PM Alvaro Herrera 
wrote:

> On 2021-Feb-08, Mead, Scott wrote:
>
> > Hello,
> >I recently looked at what it would take to make a running autovacuum
> > pick-up a change to either cost_delay or cost_limit.  Users frequently
> > will have a conservative value set, and then wish to change it when
> > autovacuum initiates a freeze on a relation.  Most users end up
> > finding out they are in ‘to prevent wraparound’ after it has happened,
> > this means that if they want the vacuum to take advantage of more I/O,
> > they need to stop and then restart the currently running vacuum (after
> > reloading the GUCs).
>
> Hello, I think this has been overlooked, right?  I can't find a relevant
> commit, but maybe I just didn't look hard enough.  I have a feeling that
> this is something that we should address.  If you still have the cycles,
> please consider posting an updated patch and creating a commitfest
> entry.
>

Thanks!  Yeah, I should be able to get this together next week.


>
> Thanks
>
> --
> Álvaro Herrera PostgreSQL Developer  —
> https://www.EnterpriseDB.com/
> "Someone said that it is at least an order of magnitude more work to do
> production software than a prototype. I think he is wrong by at least
> an order of magnitude."  (Brian Kernighan)
>


-- 
--
Scott Mead
*sc...@meads.us *


Re: Progress report of CREATE INDEX for nested partitioned tables

2023-01-31 Thread Ilya Gladyshev

> 17 янв. 2023 г., в 23:44, Tomas Vondra  
> написал(а):
> Do we actually need the new parts_done field? I mean, we already do
> track the value - at PROGRESS_CREATEIDX_PARTITIONS_DONE index in the
> st_progress_param array. Can't we simply read it from there? Then we
> would not have ABI issues with the new field added to IndexStmt.

I think it’s a good approach and it could be useful outside of scope of this 
patch too. So I have attached a patch, that introduces 
pgstat_progress_incr_param function for this purpose. There’s one thing I am 
not sure about, IIUC, we can assume that the only process that can write into 
MyBEEntry of the current backend is the current backend itself, therefore 
looping to get consistent reads from this array is not required. Please correct 
me, if I am wrong here.
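
A minimal sketch of such a function (the attached patch may differ in
details) would be:

void
pgstat_progress_incr_param(int index, int64 incr)
{
	volatile PgBackendStatus *beentry = MyBEEntry;

	if (!beentry || !pgstat_track_activities)
		return;

	/* Only the owning backend ever writes its own progress array. */
	PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
	beentry->st_progress_param[index] += incr;
	PGSTAT_END_WRITE_ACTIVITY(beentry);
}

Since the owning backend is the only writer, it can also read
st_progress_param[index] back directly, without the begin/end read retry
loop that other backends need.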



0001-create-index-progress-increment.patch
Description: Binary data


Re: Underscores in numeric literals

2023-01-31 Thread Peter Eisentraut

On 23.01.23 21:45, Dean Rasheed wrote:

> On Wed, 4 Jan 2023 at 09:28, Dean Rasheed  wrote:
>>
>> In addition, I think that strip_underscores() could then go away if
>> numeric_in() were made to handle underscores.
>>
>> Essentially then, that would move all responsibility for parsing
>> underscores and non-decimal integers to the datatype input functions,
>> or their support routines, rather than having it distributed.
>
> Here's an update with those changes.


This looks good to me.

Did you have any thoughts about what to do with the float types?  I 
guess we could handle those in a separate patch?






Re: [PATCH] Fix old thinko in formula to compute sweight in numeric_sqrt().

2023-01-31 Thread Joel Jacobson
Hi,

On Tue, Jan 31, 2023, at 14:40, Dean Rasheed wrote:
> That's still not right. If you want a proper mathematically justified
> formula, it's fairly easy to derive.
...
> or equivalently, in code with truncated integer division:
>
> if (arg.weight >= 0)
> sweight = arg.weight * DEC_DIGITS / 2 + 1;
> else
> sweight = 1 - (1 - arg.weight * DEC_DIGITS) / 2;

Beautiful! Thank you for magnificent analysis and extraordinary good 
explanation, now I finally get it.

> When DEC_DIGITS = 4, this formula also reduces to sweight = 2 *
> arg.weight + 1, but neither gcc nor clang is smart enough to spot that
> (clang doesn't simplify your formula either, BTW).

Oh, that's a shame. :-(

> So even though I
> believe that the above is mathematically correct, and won't change any
> results for DEC_DIGITS = 4, I'm still hesitant to use it, because it
> will have a (small) performance impact, and I don't believe it does
> anything to improve code readability (and certainly not without an
> explanatory comment).

I also think the performance impact no matter how small isn't worth it,
but a comment based on your comments would be very valuable IMO.

Below is an attempt at summarising your text, and to avoid the performance 
impact,
maybe an #if so we get the general correct formula for DEC_DIGITS 1 or 2,
and the reduced hand-optimised form for DEC_DIGITS 4?
That could also improve readability, since readers perhaps more easily would see
the relation between sweight and arg.weight, for the only DEC_DIGITS case we 
care about.

Suggestion:

/*
 * Here we approximate the decimal weight of the square root (sweight),
 * given the NBASE-weight (arg.weight) of the input argument.
 *
 * The lower bound of the decimal weight of the input argument is used to
 * calculate the decimal weight of the square root, with integer division
 * being truncated.
 * 
 * The general formula is:
 *
 * sweight = floor((n+1) / 2)
 * 
 * In our case, since the base is NBASE, not 10, and since we only
 * require an approximation, we don't take the trouble to compute n
 * exactly, we just use the fact that it lies in the range
 * 
 * arg.weight * DEC_DIGITS + 1 <= n <= (arg.weight + 1) * DEC_DIGITS
 *
 * Since we want to ensure at least a certain number of significant
 * digits in the result, we're only interested in the lower bound.
 * Plugging that into the formula above gives:
 * 
 * sweight >= floor(arg.weight * DEC_DIGITS / 2 + 1)
 *
 * Which leads us to the formula below with truncated integer division.
 */
#if DEC_DIGITS == 1 || DEC_DIGITS == 2

if (arg.weight >= 0)
sweight = arg.weight * DEC_DIGITS / 2 + 1;
else
sweight = 1 - (1 - arg.weight * DEC_DIGITS) / 2;

#elif DEC_DIGITS == 4

/*
 * Neither gcc nor clang is smart enough to spot that
 * the formula above neatly reduces to the below
 * when DEC_DIGITS == 4.
 */
sweight = 2 * arg.weight + 1;

#else
#error unsupported NBASE
#endif

/Joel




Re: Logical replication timeout problem

2023-01-31 Thread Ashutosh Bapat
On Tue, Jan 31, 2023 at 5:12 PM Amit Kapila  wrote:
>
> On Tue, Jan 31, 2023 at 5:03 PM Ashutosh Bapat
>  wrote:
> >
> > On Tue, Jan 31, 2023 at 4:58 PM Amit Kapila  wrote:
> >
> > > Thanks, the patch looks good to me. I have slightly adjusted one of
> > > the comments and ran pgindent. See attached. As mentioned in the
> > > commit message, we shouldn't backpatch this as this requires a new
> > > callback and moreover, users can increase the wal_sender_timeout and
> > > wal_receiver_timeout to avoid this problem. What do you think?
> >
> > The callback and the implementation is all in core. What's the risk
> > you see in backpatching it?
> >
>
> Because we are changing the exposed structure and which can break
> existing extensions using it.

Is that because we are adding the new member in the middle of the
structure? Shouldn't extensions provide new libraries with each
maintenance release of PG?

>
> > Customers can adjust the timeouts, but only after the receiver has
> > timed out a few times. Replication remains broekn till they notice it
> > and adjust timeouts. By that time WAL has piled up. It also takes a
> > few attempts to increase timeouts since the time taken by a
> > transaction to decode can not be estimated beforehand. All that makes
> > it worth back-patching if it's possible. We had a customer who piled
> > up GBs of WAL before realising that this is the problem. Their system
> > almost came to a halt due to that.
> >
>
> Which version are they using? If they are at >=14, using "streaming =
> on" for a subscription should also avoid this problem.

13.

-- 
Best Wishes,
Ashutosh Bapat




Re: pub/sub - specifying optional parameters without values.

2023-01-31 Thread Tom Lane
Amit Kapila  writes:
> On Tue, Jan 31, 2023 at 4:25 AM Tom Lane  wrote:
>> Hmph.  I generally think that options defined like this (it's a boolean,
>> except it isn't) are a bad idea, and would prefer to see that API
>> rethought while we still can.

> We have discussed this during development and considered using a
> separate option like parallel = on (or say parallel_workers = n) but
> there were challenges with the same. See discussion in email [1]. We
> also checked that we have various other places using something similar
> for options. For example COPY commands option: HEADER [ boolean |
> MATCH ].

Yeah, and it's bad experiences with the existing cases that make me
not want to add more.  Every one of those was somebody taking the
easy way out.  It generally leads to parsing oddities, such as
not accepting all the same spellings of "boolean" as before.

regards, tom lane




Re: Assert fcinfo has enough args before allowing parameter access (was: Re: generate_series for timestamptz and time zone problem)

2023-01-31 Thread Tom Lane
Gurjeet Singh  writes:
> Please see attached the patch that ensures we don't accidentally
> access more parameters than are passed to a SQL-callable
> function.

I'm unexcited by that.  It'd add a pretty substantial amount
of code to catch an error that hardly anyone ever makes.

regards, tom lane




[PATCH] Report the query string that caused a memory error under Valgrind

2023-01-31 Thread Onur Tirtir
We use Valgrind --together with the suppression file provided in the Postgres 
repo-- to test the Citus extension for memory errors.
We replace the /bin/postgres executable with a simple bash script that executes 
the original postgres executable under Valgrind, and then we run our usual 
regression tests.
However, it is quite hard to tell which query caused a memory error from the 
stack traces dumped into the valgrind logfile.

For this reason, we propose the attached patch to allow Valgrind to report the 
query string that caused a memory error right after the relevant stack trace.
I believe this would not only be useful for Citus but also for Postgres and 
other extensions in their valgrind-testing process.

An example piece of valgrind test output for a memory error found in Citus is 
as follows:

==67222== VALGRINDERROR-BEGIN
==67222== Invalid write of size 8
==67222==at 0x7A6F040: dlist_delete 
(home/pguser/postgres-installation/include/postgresql/server/lib/ilist.h:360)
==67222==by 0x7A6F040: ResetRemoteTransaction 
(home/pguser/citus/src/backend/distributed/transaction/remote_transaction.c:872)
==67222==by 0x79CF606: AfterXactHostConnectionHandling 
(home/pguser/citus/src/backend/distributed/connection/connection_management.c:1468)
==67222==by 0x79CF65E: AfterXactConnectionHandling 
(home/pguser/citus/src/backend/distributed/connection/connection_management.c:175)
==67222==by 0x7A6FEDA: CoordinatedTransactionCallback 
(home/pguser/citus/src/backend/distributed/transaction/transaction_management.c:309)
==67222==by 0x544F30: CallXactCallbacks 
(home/pguser/postgres-source/postgresql-15.1/src/backend/access/transam/xact.c:3661)
==67222==by 0x548E12: CommitTransaction 
(home/pguser/postgres-source/postgresql-15.1/src/backend/access/transam/xact.c:2298)
==67222==by 0x549BBC: CommitTransactionCommand 
(home/pguser/postgres-source/postgresql-15.1/src/backend/access/transam/xact.c:3048)
==67222==by 0x832C30: finish_xact_command 
(home/pguser/postgres-source/postgresql-15.1/src/backend/tcop/postgres.c:2750)
==67222==by 0x8352AF: exec_simple_query 
(home/pguser/postgres-source/postgresql-15.1/src/backend/tcop/postgres.c:1279)
==67222==by 0x837312: PostgresMain 
(home/pguser/postgres-source/postgresql-15.1/src/backend/tcop/postgres.c:4595)
==67222==by 0x79F7B5: BackendRun 
(home/pguser/postgres-source/postgresql-15.1/src/backend/postmaster/postmaster.c:4504)
==67222==by 0x7A24E6: BackendStartup 
(home/pguser/postgres-source/postgresql-15.1/src/backend/postmaster/postmaster.c:4232)
==67222==  Address 0x7486378 is 3,512 bytes inside a recently re-allocated 
block of size 8,192 alloc'd
==67222==at 0x484486F: malloc 
(builddir/build/BUILD/valgrind-3.19.0/coregrind/m_replacemalloc/vg_replace_malloc.c:381)
==67222==by 0x98B6EB: AllocSetContextCreateInternal 
(home/pguser/postgres-source/postgresql-15.1/src/backend/utils/mmgr/aset.c:469)
==67222==by 0x79CEABA: InitializeConnectionManagement 
(home/pguser/citus/src/backend/distributed/connection/connection_management.c:107)
==67222==by 0x799FE9F: _PG_init 
(home/pguser/citus/src/backend/distributed/shared_library_init.c:464)
==67222==by 0x96AE6B: internal_load_library 
(home/pguser/postgres-source/postgresql-15.1/src/backend/utils/fmgr/dfmgr.c:289)
==67222==by 0x96B09A: load_file 
(home/pguser/postgres-source/postgresql-15.1/src/backend/utils/fmgr/dfmgr.c:156)
==67222==by 0x973122: load_libraries 
(home/pguser/postgres-source/postgresql-15.1/src/backend/utils/init/miscinit.c:1668)
==67222==by 0x974680: process_shared_preload_libraries 
(home/pguser/postgres-source/postgresql-15.1/src/backend/utils/init/miscinit.c:1686)
==67222==by 0x7A336A: PostmasterMain 
(home/pguser/postgres-source/postgresql-15.1/src/backend/postmaster/postmaster.c:1026)
==67222==by 0x6F303C: main 
(home/pguser/postgres-source/postgresql-15.1/src/backend/main/main.c:202)
==67222==
==67222== VALGRINDERROR-END
**67222** The query for which valgrind reported a memory error was: REFRESH 
MATERIALIZED VIEW other_schema.mat_view;
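
For illustration, here is a rough sketch (not the attached patch) of how such
reporting could be wired up with Valgrind's client-request macros: if the error
count grew while a query ran, echo the query text into the Valgrind log so it
lands next to the stack traces. The wrapper name and the exec callback are made
up for illustration; only the two macros come from valgrind/valgrind.h.

#include <valgrind/valgrind.h>

typedef void (*query_exec_fn) (const char *query_string);

static void
exec_query_reporting_valgrind_errors(query_exec_fn exec, const char *query_string)
{
	/* snapshot the error count before running the query */
	unsigned long errors_before = VALGRIND_COUNT_ERRORS;

	exec(query_string);

	/* if Valgrind found new errors, attribute them to this query */
	if (VALGRIND_COUNT_ERRORS > errors_before)
		VALGRIND_PRINTF("The query for which valgrind reported a memory "
						"error was: %s\n", query_string);
}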


v1-0001-Report-the-query-string-that-caused-a-mem-error.patch
Description:  v1-0001-Report-the-query-string-that-caused-a-mem-error.patch


Re: Speed up transaction completion faster after many relations are accessed in a transaction

2023-01-31 Thread Amit Langote
Hi David,

On Tue, Jan 24, 2023 at 12:58 PM David Rowley  wrote:
> On Fri, 20 Jan 2023 at 00:26, vignesh C  wrote:
> > CFBot shows some compilation errors as in [1], please post an updated
> > version for the same:
>
> I've attached a rebased patch.

Thanks for the new patch.

Maybe you're planning to do it once this patch is past the PoC phase
(isn't it?), but it would be helpful to have commentary on all the new
dlist fields.

In particular, I think the following warrants a bit more explanation than the others:

-   /* We can remember up to MAX_RESOWNER_LOCKS references to local locks. */
-   int nlocks; /* number of owned locks */
-   LOCALLOCK  *locks[MAX_RESOWNER_LOCKS];  /* list of owned locks */
+   dlist_head  locks;  /* dlist of owned locks */

This seems to replace what was a cache with an upper limit on the number of
cached locks with something that has no limit on how many per-owner locks
are remembered.  I wonder whether, in some cases, we'd be doing additional
work with the new no-limit implementation that wasn't being done before
(where the owner's locks array would have overflowed), or maybe not much,
because the new implementation of ResourceOwner{Remember|Forget}Lock() is a
simple push/delete of a dlist node on the owner's dlist?
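
(To check that I'm reading it right, here is a minimal sketch of what I imagine
the new functions amount to, assuming LOCALLOCK carries a dlist_node field,
here called resowner_node; the names are mine, not the patch's:)

static void
ResourceOwnerRememberLockSketch(ResourceOwner owner, LOCALLOCK *locallock)
{
	/* O(1) append to the owner's list of held locks */
	dlist_push_tail(&owner->locks, &locallock->resowner_node);
}

static void
ResourceOwnerForgetLockSketch(ResourceOwner owner, LOCALLOCK *locallock)
{
	/* O(1) unlink; no search through a fixed-size array anymore */
	dlist_delete(&locallock->resowner_node);
}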

The following comment is now obsolete:

/*
 * LockReassignCurrentOwner
 *  Reassign all locks belonging to CurrentResourceOwner to belong
 *  to its parent resource owner.
 *
 * If the caller knows what those locks are, it can pass them as an array.
 * That speeds up the call significantly, when a lot of locks are held
 * (e.g pg_dump with a large schema).  Otherwise, pass NULL for locallocks,
 * and we'll traverse through our hash table to find them.
 */

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com




Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE

2023-01-31 Thread Matthias van de Meent
On Mon, 30 Jan 2023 at 21:19, Andres Freund  wrote:
> On 2023-01-10 21:32:54 +0100, Matthias van de Meent wrote:
> > On Tue, 10 Jan 2023 at 20:14, Andres Freund  wrote:
> > > On 2023-01-10 15:03:42 +0100, Matthias van de Meent wrote:
> > > What precisely do you mean with "skew" here? Do you just mean that it'd 
> > > take a
> > > long time until vacuum_defer_cleanup_age takes effect? Somehow it sounds 
> > > like
> > > you might mean more than that?
> >
> > h->oldest_considered_running can be extremely old due to the global
> > nature of the value and the potential existence of a snapshot in
> > another database that started in parallel to a very old running
> > transaction.
>
> Here's a version that, I think, does not have that issue.

Thanks!

> In an earlier, not posted, version I had an vacuum_defer_cleanup_age specific
> helper function for this, but it seems likely we'll need it in other places
> too.  So I named it TransactionIdRetreatSafely(). I made it accept the xid by
> pointer, as the line lengths / repetition otherwise end up making it hard to
> read the code.  For now I have TransactionIdRetreatSafely() be private to
> procarray.c, but I expect we'll have to change that eventually.

If TransactionIdRetreatSafely is going to be exposed outside procarray.c,
then I think the xid pointer should be replaced with normal
arguments/returns, both for parity with TransactionIdRetreatedBy and
to remove this memory-store dependency in this hot code path.

> Not sure I like TransactionIdRetreatSafely() as a name. Maybe
> TransactionIdRetreatClamped() is better?

I think the 'safely' version is fine.

> I've been working on a test for vacuum_defer_cleanup_age. It does catch the
> corruption at hand, but not much more.  It's quite painful to write, right
> now.  Some of the reasons:
> https://postgr.es/m/20230130194350.zj5v467x4jgqt3d6%40awork3.anarazel.de
>
>
>
> > > I'm tempted to go with reinterpreting 64bit xids as signed. Except that it
> > > seems like a mighty invasive change to backpatch.
> >
> > I'm not sure either. Protecting against underflow by halving the
> > effective valid value space is quite the intervention, but if it is
> > necessary to make this work in a performant manner, it would be worth
> > it. Maybe someone else with more experience can provide their opinion
> > here.
>
> The attached assertions just removes 1/2**32'ths of the space, by reserving
> the xid range with the upper 32bit set as something that shouldn't be
> reachable.

I think that is acceptable.

> Still requires us to change the input routines to reject that range, but I
> think that's a worthy tradeoff.

Agreed.

> I didn't find the existing limits for the
> type to be documented anywhere.
>
> Obviously something like that could only go into HEAD.

Yeah.

Comments on 0003:

> +/*
> + * FIXME, doubtful this is the best fix.
> + *
> + * Can't represent the 32bit xid as a 64bit xid, as it's before fxid
> + * 0. Represent it as an xid from the future instead.
> + */
> +if (epoch == 0)
> +return FullTransactionIdFromEpochAndXid(0, xid);

Shouldn't this be an error condition instead, as this XID should not
be able to appear?

on 0004:

> -   '0xffffffffffffffff'::xid8,
> -   '-1'::xid8;
> +   '0xefffffffffffffff'::xid8,
> +   '0'::xid8;

The 0xFF... usages were replaced with "0xEFFF...". Shouldn't we also
test on 0xffff_fffE_ffff_ffff to test for input of our actual max
value?

> @@ -326,7 +329,11 @@ parse_snapshot(const char *str, Node *escontext)
> while (*str != '\0')
> {
> /* read next value */
> -		val = FullTransactionIdFromU64(strtou64(str, &endp, 10));
> +		raw_fxid = strtou64(str, &endp, 10);
> +
> +		val = FullTransactionIdFromU64(raw_fxid);
> +		if (!InFullTransactionIdRange(raw_fxid))
> +			goto bad_format;

With assertions enabled, FullTransactionIdFromU64 will assert the
InFullTransactionIdRange condition, meaning we wouldn't reach the branch
to bad_format.
I think these operations should be swapped, as parsing a snapshot
shouldn't run into assertion failures like this when it can error out
normally. Maybe this can be added to the tests as well?
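
That is, something along these lines (sketch only, reusing the variable names
from the quoted hunk):

		/*
		 * Validate the raw value first, so bad input reaches the error
		 * path instead of tripping the assertion inside
		 * FullTransactionIdFromU64() in assert-enabled builds.
		 */
		raw_fxid = strtou64(str, &endp, 10);
		if (!InFullTransactionIdRange(raw_fxid))
			goto bad_format;
		val = FullTransactionIdFromU64(raw_fxid);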

Kind regards,

Matthias van de Meent




Re: heapgettup() with NoMovementScanDirection unused in core?

2023-01-31 Thread Melanie Plageman
On Tue, Jan 31, 2023 at 11:46:05PM +1300, David Rowley wrote:
> On Tue, 31 Jan 2023 at 09:57, Melanie Plageman
>  wrote:
> > As for the asserts, I was at a bit of a loss as to where to put an
> > assert which will make it clear that heapgettup() and
> > heapgettup_pagemode() do not handle NoMovementScanDirection but was
> > at a higher level of the executor.
> 
> My thoughts were that we might want to put them
> table_scan_getnextslot() and table_scan_getnextslot_tidrange(). My
> rationale for that was that it makes it more clear to table AM devs
> that they don't need to handle NoMovementScanDirection.

I previously had the asserts here, but I thought perhaps we shouldn't
restrict table AMs from using NoMovementScanDirection in whatever way
they'd like. We care about protecting heapgettup() and
heapgettup_pagemode(). We could put a comment in the table AM API about
NoMovementScanDirection not necessarily making sense for a next() type
function and informing table AMs that they need not support it.

> 
> > Do we not have to accommodate the
> > direction changing from tuple to tuple? If we don't expect the plan node
> > direction to change during execution, then why recalculate
> > estate->es_direction for each invocation of Index/SeqNext()?
> 
> Yeah, this needs to be handled.  FETCH can fetch forwards or backwards
> from a cursor.  The code you have looks fine to me.
> 
> > As such, in this version I've put the asserts in heapgettup() and
> > heapgettup_pagemode().
> >
> > I also realized that it doesn't really make sense to assert about the
> > index scan direction in ExecInitIndexOnlyScan() and ExecInitIndexScan()
> > -- so I've moved the assertion to planner when we make the index plan
> > from the path. I'm not sure if it is needed.
> 
> That's probably slightly better.
> 
> The only thing I really have on this is my thoughts on the Asserts
> going in tableam.h plus the following comment:
> 
> /*
>  * These enum values were originally int8 values. Using -1, 0, and 1 as their
>  * values conveniently mirrors their semantic value when used during 
> execution.
>  */
> 
> I don't really see any reason to keep the historical note here. I
> think something like the following might be better:
> 
> /*
>  * Defines the direction for scanning a table or an index.  Scans are never
>  * invoked using NoMovementScanDirectionScans.  For convenience, we use the
>  * values -1 and 1 for backward and forward scans.  This allows us to perform
>  * a few mathematical tricks such as what is done in ScanDirectionCombine.
>  */

This comment looks good to me.

> Also, a nitpick around the inconsistency with the Asserts. In
> make_indexscan() and make_indexonlyscan() you're checking you're
> getting a forward and backward value, but in heapgettup() and
> heapgettup_pagemode() you're checking you don't get
> NoMovementScanDirection. I think the != NoMovementScanDirection is
> fine for both cases.

Yes, I thought about it being weird that they are different. Perhaps we
should check in both places that it is forward or backward. In
heapgettup[_pagemode()] there is an if/else, so if the assert only checks
for NoMovementScanDirection and a new scan direction is added, it would
fall through to the else.

In the planner, it is not that we are not "handling"
NoMovementScanDirection (like in heapgettup), but rather that we only pass
Forward and Backward scan directions when creating the path nodes, so the
Assert would mainly serve to remind developers that if they create a plan
with a different scan direction, they should be intentional about
it.

So, I would favor having both asserts check that the direction is one of
forward or backward.
 
> Both can be easily fixed, so no need to submit another patch as far as
> I'm concerned.

I realized I forgot a commit message in the second version. Patch v1 has
one.

- Melanie




Re: HOT chain validation in verify_heapam()

2023-01-31 Thread Robert Haas
On Mon, Jan 30, 2023 at 8:24 AM Himanshu Upadhyaya
 wrote:
> Before this we stop the node by "$node->stop;" and then only we progress to
> manual corruption. This will abort all running/in-progress transactions.
> So, if we create an in-progress transaction and comment "$node->stop;"
> then somehow all the code that we have for manual corruption does not work.
>
> I think it is required to stop the server and then only proceed for manual 
> corruption?
> If this is the case then please suggest if there is a way to get an 
> in-progress transaction
> that we can use for manual corruption.

How about using a prepared transaction?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Fix old thinko in formula to compute sweight in numeric_sqrt().

2023-01-31 Thread Dean Rasheed
On Tue, 31 Jan 2023 at 08:00, Joel Jacobson  wrote:
>
> I think this is what we want:
>
> if (arg.weight < 0)
> sweight = (arg.weight + 1) * DEC_DIGITS / 2 - 1;
> else
> sweight = arg.weight * DEC_DIGITS / 2 + 1;
>

That's still not right. If you want a proper mathematically justified
formula, it's fairly easy to derive.

Let "n" be the decimal weight of the input, taken to be the number of
decimal digits before the decimal point (or minus the number of zeros
after the decimal point, for inputs with no digits before the decimal
point).

Similarly, let "sweight" be the decimal weight of the square root.
Then the relationship between sweight and n can be seen from a few
simple examples (to 4 significant digits):

  n   arg                  sqrt(arg)            sweight
 -3   0.0001 .. 0.0009999  0.01 .. 0.03162        -1
 -2   0.001 .. 0.009999    0.03162 .. 0.09999     -1
 -1   0.01 .. 0.09999      0.1 .. 0.3162           0
  0   0.1 .. 0.9999        0.3162 .. 0.9999        0
  1   1 .. 9.999           1 .. 3.162              1
  2   10 .. 99.99          3.162 .. 9.999          1
  3   100 .. 999.9         10 .. 31.62             2
  4   1000 .. 9999         31.62 .. 99.99          2

and the general formula is:

sweight = floor((n+1) / 2)
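
(As a quick sanity check, this throwaway program reproduces the sweight column
of the table above from that formula; the helper just gives floor division for
negative n, and none of this is meant for the actual patch:)

#include <stdio.h>

static int
floor_div2(int x)
{
	/* integer division rounding toward negative infinity */
	return (x >= 0) ? x / 2 : -((-x + 1) / 2);
}

int
main(void)
{
	for (int n = -3; n <= 4; n++)
		printf("n = %2d  sweight = %d\n", n, floor_div2(n + 1));
	return 0;
}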

In our case, since the base is NBASE, not 10, and since we only
require an approximation, we don't take the trouble to compute n
exactly, we just use the fact that it lies in the range

arg.weight * DEC_DIGITS + 1 <= n <= (arg.weight + 1) * DEC_DIGITS

Since we want to ensure at least a certain number of significant
digits in the result, we're only interested in the lower bound.
Plugging that into the formula above gives:

sweight >= floor(arg.weight * DEC_DIGITS / 2 + 1)

or equivalently, in code with truncated integer division:

    if (arg.weight >= 0)
        sweight = arg.weight * DEC_DIGITS / 2 + 1;
    else
        sweight = 1 - (1 - arg.weight * DEC_DIGITS) / 2;

This is not the same as your formula. For example, when DEC_DIGITS = 1
and arg.weight = -1, yours gives sweight = -1 which isn't right, it
should be 0.

When DEC_DIGITS = 4, this formula also reduces to sweight = 2 *
arg.weight + 1, but neither gcc nor clang is smart enough to spot that
(clang doesn't simplify your formula either, BTW). So even though I
believe that the above is mathematically correct, and won't change any
results for DEC_DIGITS = 4, I'm still hesitant to use it, because it
will have a (small) performance impact, and I don't believe it does
anything to improve code readability (and certainly not without an
explanatory comment).

When DEC_DIGITS = 1, it does guarantee that the result has exactly 16
significant digits (or more if the input scale is larger), but that's
only really of theoretical interest to anyone.

As I noted above, when DEC_DIGITS > 1, this formula is only an
approximation, since it's not using the exact input decimal weight. So
my inclination is to leave the code as-is. It does guarantee that the
result has at least 16 significant digits, which is the intention.

Regards,
Dean




Re: Transparent column encryption

2023-01-31 Thread Peter Eisentraut

On 30.01.23 23:30, Jacob Champion wrote:

The column encryption algorithm is set per-column -- but isn't it
tightly coupled to the CEK, since the key length has to match? From a
layperson perspective, using the same key to encrypt the same plaintext
under two different algorithms (if they happen to have the same key
length) seems like it might be cryptographically risky. Is there a
reason I should be encouraged to do that?


Not really.  I was also initially confused by this setup, but that's how
other similar systems are set up, so I thought it would be confusing to
do it differently.


Which systems let you mix and match keys and algorithms this way? I'd
like to take a look at them.


See here for example: 
https://learn.microsoft.com/en-us/sql/relational-databases/security/encryption/always-encrypted-database-engine?view=sql-server-ver15



With the loss of \gencr it looks like we also lost a potential way to
force encryption from within psql. Any plans to add that for v1?


\gencr didn't do that either.  We could do it.  The libpq API supports
it.  We just need to come up with some syntax for psql.


Do you think people would rather set encryption for all parameters at
once -- something like \encbind -- or have the ability to mix
encrypted and unencrypted parameters?


For pg_dump, I'd like a mode that makes all values parameters of an 
INSERT statement.  But obviously not all of those will be encrypted.  So 
I think we'd want a per-parameter syntax.



More concretely: should psql allow you to push arbitrary text into an
encrypted \bind parameter, like it does now?


We don't have any data type awareness like that in psql or libpq today, and 
it would be quite a change to start now.  How that would deal with data type 
extensibility is an obvious question to start with.  I don't know.





Re: [PoC] Let libpq reject unexpected authentication requests

2023-01-31 Thread Aleksander Alekseev
Hi Peter,

> > What is the status of this patch set?  Michael had registered himself as
> > committer and then removed himself again.  So I hadn't been paying much
> > attention myself.  Was there anything left to discuss?
>
> Previously I marked the patch as RfC. Although it's been a few months
> ago and I don't recall all the details, it should have been in good
> shape (in my personal opinion at least). The commits a9e9a9f and
> 0873b2d Michael referred to are message refactorings so I doubt Jacob
> had serious problems with them.
>
> Of course, I'll take another fresh look and let you know my findings in a bit.

The code is well written, documented and test-covered. All the tests
pass. To my knowledge there are no open questions left. I think the
patch is as good as it will ever get.

-- 
Best regards,
Aleksander Alekseev



