RE: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-22 Thread Hayato Kuroda (Fujitsu)
Dear Bharath,

Thank you for reviewing! PSA new version.

> 1. A nit:
> +
> +/*
> + * We also skip decoding in 'fast_forward' mode. In passing set the
> + * 'processing_required' flag to indicate, were it not for this mode,
> + * processing *would* have been required.
> + */
> How about "We also skip decoding in fast_forward mode. In passing set
> the processing_required flag to indicate that if it were not for
> fast_forward mode, processing would have been required."?

Fixed.

> 2. Don't we need InvalidateSystemCaches() after FreeDecodingContext()?
> 
> +/* Clean up */
> +FreeDecodingContext(ctx);

Right. Stale system caches should be thrown away here before the upcoming pg_dump.

> 3. Don't we need to put CreateDecodingContext in PG_TRY-PG_CATCH with
> InvalidateSystemCaches() in PG_CATCH block? I think we need to clear
> all timetravel entries with InvalidateSystemCaches(), no?

Added.
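The control flow agreed on in points 2 and 3 can be sketched in a language-neutral way (a hedged Python analogue, not the actual backend C): cache invalidation must run on both the success path and the error path around the decoding context.

```python
def run_with_decoding_context(create_ctx, use_ctx, invalidate_caches):
    """Hedged analogue of the PG_TRY/PG_CATCH pattern discussed above:
    whether decoding succeeds or raises, stale (timetravel) cache state
    is thrown away before control leaves this function."""
    ctx = None
    try:
        ctx = create_ctx()          # CreateDecodingContext() inside PG_TRY
        return use_ctx(ctx)
    finally:
        # Runs on both the normal and the error path, mirroring
        # InvalidateSystemCaches() after FreeDecodingContext() and in PG_CATCH.
        if ctx is not None:
            ctx.close()             # FreeDecodingContext()
        invalidate_caches()         # InvalidateSystemCaches()
```

This is only a model of the ordering; the real code re-throws from PG_CATCH after invalidating.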

> 4. The following assertion better be an error? Or we ensure that
> binary_upgrade_slot_has_caught_up isn't called for an invalidated slot
> at all?
> +
> +/* Slots must be valid as otherwise we won't be able to scan the WAL */
> +Assert(MyReplicationSlot->data.invalidated == RS_INVAL_NONE);

I kept the Assert() because pg_upgrade won't call this function for invalidated
slots.

> 5. This better be an error instead of returning false? IMO, null value
> for slot name is an error.
> +/* Quick exit if the input is NULL */
> +if (PG_ARGISNULL(0))
> +PG_RETURN_BOOL(false);

Hmm, OK, changed to elog(ERROR).
If the previous style were kept and a NULL were passed in, an empty string could be
reported as the slot name in invalid_logical_replication_slots.txt, which would be
quite strange. Note again that such input is not expected.

> 6. A nit: how about is_decodable_txn or is_decodable_change or some
> other instead of just a plain name processing_required?
> +/* Do we need to process any change in 'fast_forward' mode? */
> +boolprocessing_required;

I prefer the current name, because it is not only decodable transactions that
must be processed: non-transactional changes and empty transactions must be
processed as well.
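The naming point above is easier to see in a sketch (hedged Python, not the backend C): in fast_forward mode nothing is actually decoded, but processing_required is set for any record that would have needed work.

```python
def decode_stream(records, fast_forward=False):
    """Hedged sketch of the behavior discussed above: in fast_forward mode
    no change is emitted, but processing_required is set whenever any
    record (a decodable transaction, a non-transactional change, or even
    an empty transaction) would have required processing."""
    output = []
    processing_required = False
    for rec in records:
        if fast_forward:
            # Skip decoding; in passing note that work *would* have been
            # required, which is what pg_upgrade checks for.
            processing_required = True
            continue
        output.append(f"decoded {rec}")
    return output, processing_required
```

pg_upgrade would then treat `processing_required == True` as "the slot has not caught up".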

> 7. Can the following pg_fatal message be consistent and start with
> lowercase letter something like "expected 0 logical replication slots
> "?
> +pg_fatal("Expected 0 logical replication slots but found %d.",
> + nslots_on_new);

Note that the upper/lower-case rule is already broken in this file. Upper case was
used here because I regarded this sentence as a hint message. Please see the
previous posts [1] [2].


> 8. s/problem/problematic - "A list of problematic slots is in the file:\n"
> + "A list of the problem slots is in the file:\n"

Fixed.

> 9. IMO, binary_upgrade_logical_replication_slot_has_caught_up seems
> better, meaningful and consistent despite a bit long than just
> binary_upgrade_slot_has_caught_up.

Fixed.

> 10. How about an assert that the passed-in replication slot is logical
> in binary_upgrade_slot_has_caught_up?

Fixed.

> 11. How about adding CheckLogicalDecodingRequirements too in
> binary_upgrade_slot_has_caught_up after CheckSlotPermissions just in
> case?

Not added. CheckLogicalDecodingRequirements() ensures that WAL can be decoded
and that the changes can be applied, but neither is needed in fast_forward
mode. Also, the pre-existing function pg_logical_replication_slot_advance()
does not call it.

> 12. Not necessary but adding ReplicationSlotValidateName(slot_name,
> ERROR); for the passed-in slotname in
> binary_upgrade_slot_has_caught_up may be a good idea, at least in
> assert builds to help with input validations.

Not added, because ReplicationSlotAcquire() already reports an error if an
invalid name is passed. Also, the pre-existing function
pg_logical_replication_slot_advance() does not call it.

> 13. Can the functionality of LogicalReplicationSlotHasPendingWal be
> moved to binary_upgrade_slot_has_caught_up and get rid of a separate
> function LogicalReplicationSlotHasPendingWal? Or is it that the
> function exists in logical.c to avoid extra dependencies between
> logical.c and pg_upgrade_support.c?

I kept the current style. I think upgrade functions should be short, with the
actual work done elsewhere. SetAttrMissing() is called only from an upgrade
function, so we have no policy against dividing up a function this way.
Also, LogicalDecodingProcessRecord() is called only from files in
src/backend/replication, so we can keep them there.

> 14. I think it's better to check if the old cluster contains the
> necessary function binary_upgrade_slot_has_caught_up instead of just
> relying on major version.
> +/* Logical slots can be migrated since PG17. */
> +if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
> +return;

I kept the current style because I could not see a merit in that approach. If the
patch is committed, PG17.X will surely have
binary_upgrade_logical_replication_slot_has_caught_up().
Also, other upgrade functions are not checked from the

RE: pg_upgrade's interaction with pg_resetwal seems confusing

2023-10-22 Thread Hayato Kuroda (Fujitsu)
Dear Vignesh,

Thank you for reviewing! The new patch is available in [1].

> We can update the commit message with the details of the same, it will
> help to understand that it is intentionally done.

Both the comments and the commit message were updated accordingly.

> There are couple of typos with the new patch:
> 1) "uprade logical replication slot" should be "upgrade logical
> replication slot":
> Previously, the OID counter is restored by invoking pg_resetwal with the -o
> option, at the end of upgrade. This is not problematic for now, but WAL 
> removals
> are not happy if we want to uprade logical replication slot. Therefore, a new
> upgrade function is introduced to reset next OID.

Fixed.

> 2) "becasue the value" should be "because the value":
> Note that we only update the on-memory data to avoid concurrent update of
> control with the chekpointer. It is harmless becasue the value would be set at
> shutdown checkpoint.

Fixed.

[1]: 
https://www.postgresql.org/message-id/TYAPR01MB5866DAFE000F8677C49ACD66F5D8A%40TYAPR01MB5866.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



RE: pg_upgrade's interaction with pg_resetwal seems confusing

2023-10-22 Thread Hayato Kuroda (Fujitsu)
Dear Alvaro,

Thank you for updating! PSA new version.

> Note that this patch falsifies the comment in SetNextObjectId that
> taking the lock is pro forma only -- it no longer is, since in upgrade
> mode there can be multiple subprocesses running -- so I think it should
> be updated.

Indeed, some comments were updated.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED




v3-0001-pg_upgrade-use-upgrade-function-to-restore-OID.patch
Description:  v3-0001-pg_upgrade-use-upgrade-function-to-restore-OID.patch


Re: Server crash on RHEL 9/s390x platform against PG16

2023-10-22 Thread Suraj Kharage
On Sat, Oct 21, 2023 at 5:17 AM Andres Freund  wrote:

> Hi,
>
> On 2023-09-12 15:27:21 +0530, Suraj Kharage wrote:
> > *[edb@9428da9d2137 postgres]$ cat /etc/redhat-release AlmaLinux release
> 9.2
> > (Turquoise Kodkod)[edb@9428da9d2137 postgres]$ lscpuArchitecture:
> > s390x  CPU op-mode(s):   32-bit, 64-bit  Address sizes:39
> bits
>
> Can you provide the rest of the lscpu output?  There have been issues with
> Z14
> vs Z15:
> https://github.com/llvm/llvm-project/issues/53009
>
> You're apparently not hitting that, but given that fact, you either are on
> a
> slightly older CPU, or you have applied a patch to work around it. Because
> otherwise your build instructions below would hit that problem, I think.
>
>
> > physical, 48 bits virtual  Byte Order:   Big Endian*
> > *Configure command:*
> > ./configure --prefix=/home/edb/postgres/ --with-lz4 --with-zstd
> --with-llvm
> > --with-perl --with-python --with-tcl --with-openssl --enable-nls
> > --with-libxml --with-libxslt --with-systemd --with-libcurl --without-icu
> > --enable-debug --enable-cassert --with-pgport=5414
>
> Hm, based on "--with-libcurl" this isn't upstream postgres, correct? Have
> you
> verified the issue reproduces on upstream postgres?
>

Yes, I can reproduce this on upstream postgres master and v16 branch.

Here are details:

./configure --prefix=/home/edb/postgres/ --with-zstd --with-llvm
--with-perl --with-python --with-tcl --with-openssl --enable-nls
--with-libxml --with-libxslt --with-systemd --without-icu --enable-debug
--enable-cassert --with-pgport=5414 CFLAGS="-g -O0"



[edb@9428da9d2137 postgres]$ cat /etc/redhat-release
AlmaLinux release 9.2 (Turquoise Kodkod)

[edb@9428da9d2137 edbas]$ lscpu
Architecture:   s390x
  CPU op-mode(s):   32-bit, 64-bit
  Address sizes:39 bits physical, 48 bits virtual
  Byte Order:   Big Endian
CPU(s): 9
  On-line CPU(s) list:  0-8
Vendor ID:  GenuineIntel
  Model name:   Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
CPU family: 6
Model:  158
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s):  9
Stepping:   10
BogoMIPS:   5200.00
Flags:  fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht pbe syscall nx pdpe1gb lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid pni pclmulqdq dtes64 ds_cpl ssse3 sdbg fma cx16 xtpr pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch fsgsbase bmi1 avx2 bmi2 erms xsaveopt arat
Caches (sum of all):
  L1d:  288 KiB (9 instances)
  L1i:  288 KiB (9 instances)
  L2:   2.3 MiB (9 instances)
  L3:   108 MiB (9 instances)
Vulnerabilities:
  Itlb multihit:KVM: Mitigation: VMX unsupported
  L1tf: Mitigation; PTE Inversion
  Mds:  Vulnerable; SMT Host state unknown
  Meltdown: Vulnerable
  Mmio stale data:  Vulnerable
  Spec store bypass:Vulnerable
  Spectre v1:   Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
  Spectre v2:   Vulnerable, STIBP: disabled
  Srbds:Unknown: Dependent on hypervisor status
  Tsx async abort:  Not affected

[edb@9428da9d2137 postgres]$ clang --version
clang version 15.0.7 (Red Hat 15.0.7-2.el9)
Target: s390x-ibm-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

[edb@9428da9d2137 postgres]$ rpm -qa | grep llvm
llvm-libs-15.0.7-1.el9.s390x
llvm-15.0.7-1.el9.s390x
llvm-test-15.0.7-1.el9.s390x
llvm-static-15.0.7-1.el9.s390x
llvm-devel-15.0.7-1.el9.s390x

Please let me know if any further information is required.


> >
> > *Test case:*
> > CREATE TABLE rm32044_t1
> > (
> > pkey   integer,
> > val  text
> > );
> > CREATE TABLE rm32044_t2
> > (
> > pkey   integer,
> > label  text,
> > hidden boolean
> > );
> > CREATE TABLE rm32044_t3
> > (
> > pkey integer,
> > val integer
> > );
> > CREATE TABLE rm32044_t4
> > (
> > pkey integer
> > );
> > insert into rm32044_t1 values ( 1 , 'row1');
> > insert into rm32044_t1 values ( 2 , 'row2');
> > insert into rm32044_t2 values ( 1 , 'hidden', true);
> > insert into rm32044_t2 values ( 2 , 'visible', false);
> > insert into rm32044_t3 values (1 , 1);
> > insert into rm32044_t3 values (2 , 1);
> >
> > postgres=# SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON
> rm32044_t1.pkey
> > = rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON rm32044_t3.pkey =
> > rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;
>
> > server closed the connection unexpectedly
> > This probably means the server terminated abnormally
> > before or while processing the request.
> > The connection to the server was lost. Attempting reset: Failed.
> > 

Re: Fix output of zero privileges in psql

2023-10-22 Thread David G. Johnston
On Fri, Oct 20, 2023 at 7:29 PM Erik Wienhold  wrote:

> On 2023-10-20 22:35 +0200, David G. Johnston wrote:
> > In short, I don't want default privileges to start to obey \pset null
> when
> > it never has before and is documented as displaying the empty string.  I
> do
> > want the empty string produced by empty privileges to change to (none) so
> > that it no longer is indistinguishable from our choice of presentation
> for
> > the default privilege case.
>
> I haven't thought of this yet.  The attached v3 of my initial patch
> does that.  It also includes Laurenz' fix to no longer ignore \pset null
> (minus the doc changes that suggest using \pset null to distinguish
> between default and empty privileges because that's no longer needed).
>
>
Thank you.

It looks good to me as-is, with one possible nit.

I wonder if it would be clearer to say:

"If the Access privileges column is *blank* for a given object..."

instead of "empty" to avoid having both "empty [string]" and "empty
privileges" present in the same paragraph and the empty string not
pertaining to the empty privileges.
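The display rule being converged on above can be modeled in a few lines (a hedged sketch of the behavior described for Erik's v3, not psql's actual code; the ACL representation here is invented for illustration):

```python
def format_privileges(acl, pset_null=""):
    """Hedged model of the proposed psql display rule: default privileges
    are a stored NULL and follow \\pset null; explicitly empty privileges
    print "(none)", so the two cases are no longer indistinguishable."""
    if acl is None:      # default privileges: the column is NULL
        return pset_null
    if acl == []:        # empty privileges: everything revoked
        return "(none)"
    return "\n".join(acl)
```

With `\pset null '(null)'`, a default-privilege object would show `(null)` while a zero-privilege object shows `(none)`.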

David J.


Re: speed up a logical replica setup

2023-10-22 Thread Euler Taveira
On Mon, Feb 21, 2022, at 9:09 AM, Euler Taveira wrote:
> A new tool called pg_subscriber does this conversion and is tightly integrated
> with Postgres.

After a long period of inactivity, I'm back to this client tool. As suggested
by Andres, I added a new helper function to change the system identifier as the
last step. I also thought about including the pg_basebackup support but decided
to keep it simple (at least for this current version). The user can always
execute pg_basebackup as a preliminary step to create a standby replica and it
will work. (I will post a separate patch that adds pg_basebackup
support on top of this one.)

Amit asked whether an extra replication slot is required. It is not. The reason I
keep it is as a reminder that at least the latest replication slot needs to be
created after pg_basebackup finishes (the pg_backup_stop() call). Regarding the
atexit() routine, it tries its best to remove all the objects it created;
however, there is no guarantee it can remove them, because that depends on
external resources such as connectivity and authorization. I added a new
warning message for the case where it cannot drop the transient replication
slot. It is probably a good idea to add such a warning message to the cleanup
routine too. On that point, another possible feature is one that checks for and
removes all leftover objects.
The transient replication slot is fine because it should always be removed at
the end. However, the big question is how to make sure you are not removing
objects (publications, subscriptions, replication slots) that belong to a
successful conversion.
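The best-effort cleanup described above can be sketched as follows (hedged Python, not the tool's C code; the object names are illustrative):

```python
def best_effort_cleanup(objects, drop):
    """Hedged sketch of the atexit-style cleanup discussed above: try to
    drop every object this tool created, but since connectivity or
    authorization can fail, collect warnings instead of aborting on the
    first failure."""
    warnings = []
    for obj in objects:
        try:
            drop(obj)
        except Exception as exc:
            warnings.append(f"could not drop {obj}: {exc}")
    return warnings
```

The caller can then report the warnings and leave the remaining objects for manual removal, which is exactly the situation the proposed warning section would document.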

Amit also asked about setting up a logical replica with m databases, where m is
less than the total number of databases. One option is to remove the "extra"
databases on the target server after promoting the physical replica, or in one
of the later steps. Maybe it is time to propose a partial physical replica that
contains only a subset of the databases on the primary (I'm not volunteering
for it). If pg_basebackup then had an option to exclude these "extra"
databases, this tool could take advantage of it.

Let's continue with the bikeshedding... I agree with Peter E. that this name
does not express what the tool does. At the moment, it has only one action:
create. If I had to suggest other actions, I would say it could support a
switchover option too (which would remove the infrastructure created by this
tool). If we decide to keep this name, it would be a good idea to add an option
indicating which action is being executed (similar to pg_recvlogical), as
suggested by Peter.

I included the documentation cleanups that Peter E. shared. I also made small
adjustments to the documentation. It probably deserves a warning section that
advertises the cleanup behavior.

I refactored the transient replication slot code and decided to use a permanent
(instead of temporary) slot to avoid keeping a replication connection open for
a long time until the target server catches up.

The system identifier functions (get_control_from_datadir() and
get_sysid_from_conn()) now return uint64, as suggested by Peter.

After reflection, the --verbose option should be renamed to --progress. There
are also some messages that should be converted to debug messages.

I fixed the useless malloc. I rearranged the code a bit, but main() still has
~370 lines (~255 lines without options/validation). I'm trying to rearrange the
code to make it easier to read and at the same time reduce the size of main();
I already have a few candidates in mind, such as the code that stops the standby
and the part that adds the recovery parameters. I removed the refactoring I
proposed in the previous patch, and the current code relies on pg_ctl --wait
behavior. Are there issues with this choice? Well, one annoying situation is
that pg_ctl does not have a "wait forever" option. If one of the pg_ctl calls
fails, you probably have to start over (unless you understand the pg_subscriber
internals and fix the setup yourself). You have to choose an arbitrary timeout
value and hope that pg_ctl performs the action within that timeout.

Real tests are included. The cleanup code does not have coverage because a
simple reproducible case isn't easy to construct. I'm also not sure it is worth
it; we can explain it in the proposed warning section.

It is still a WIP but I would like to share it and get some feedback.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From 0aca46ed57c15bce1972c9db6459c04bcc5cf73d Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Mon, 5 Jun 2023 14:39:40 -0400
Subject: [PATCH v2] Creates a new logical replica from a standby server

A new tool called pg_subscriber can convert a physical replica into a
logical replica. It runs on the target server and should be able to
connect to the source server (publisher) and the target server
(subscriber).

The conversion requires a few steps. Check if the target data directory
has the same system identifier than the source data directory. Stop the

Re: Removing unneeded self joins

2023-10-22 Thread Andrei Lepikhov

On 22/10/2023 05:01, Alexander Korotkov wrote:

On Thu, Oct 19, 2023 at 6:16 AM Andrei Lepikhov
 wrote:

On 19/10/2023 01:50, Alexander Korotkov wrote:

This query took 3778.432 ms with self-join removal disabled, and
3756.009 ms with self-join removal enabled.  So, no measurable
overhead.  Similar to the higher number of joins.  Can you imagine
some extreme case when self-join removal could introduce significant
overhead in comparison with other optimizer parts?  If not, should we
remove self_join_search_limit GUC?

Thanks,
It was Zhihong Yu who worried about that case [1], and my purpose was to
show a method to avoid such a problem if it were needed.
I guess the main concern is a case where we have a lot of self-joins, but
only a few of them (or none) can be removed.
I can't imagine a practical situation where we would get stuck on that,
so I vote to remove this GUC.


I've removed the self_join_search_limit.  Anyway there is
enable_self_join_removal if the self join removal algorithm causes any
problems.  I also did some grammar corrections for the comments.  I
think the patch is getting to the committable shape.  I noticed some
failures on commitfest.cputube.org.  I'd like to check how this
version will pass it.


I have reviewed the final patch. A couple of minor changes can be made
(see attachment).
I also see room for improvement, but that can be done later. For example,
we limit the optimization to ordinary tables in this patch; it could be
extended to at least partitioned and foreign tables soon.
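The equivalence this optimization exploits can be demonstrated with a tiny, self-contained example (sqlite3 stands in for PostgreSQL here; the table mirrors the rm32044_t1 style used elsewhere in this digest): a self-join on a unique key yields exactly the rows of a single scan, which is why the second relation can be removed.

```python
import sqlite3

# A self-join on a unique key (the primary key) matches each row to itself,
# so the join adds no rows and removes none: it is equivalent to one scan.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pkey INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "row1"), (2, "row2")])

self_join = conn.execute(
    "SELECT a.pkey, a.val FROM t a JOIN t b ON a.pkey = b.pkey ORDER BY a.pkey"
).fetchall()
single_scan = conn.execute(
    "SELECT pkey, val FROM t ORDER BY pkey"
).fetchall()
assert self_join == single_scan
```

The removal is only valid when the join key is unique and non-null on the joined side, which is what the patch's uniqueness proofs establish.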


--
regards,
Andrey Lepikhov
Postgres Professional
diff --git a/src/backend/optimizer/plan/analyzejoins.c 
b/src/backend/optimizer/plan/analyzejoins.c
index 6b848aadad..b84197dadb 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -23,7 +23,6 @@
 #include "postgres.h"
 
 #include "catalog/pg_class.h"
-#include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/clauses.h"
 #include "optimizer/joininfo.h"
@@ -50,7 +49,7 @@ typedef struct UniqueRelInfo
 } UniqueRelInfo;
 
 /*
- * The context for replace_varno_walker() containing source and target relids2
+ * The context for replace_varno_walker() containing source and target relids.
  */
 typedef struct
 {


Re: Why is hot_standby_feedback off by default?

2023-10-22 Thread Tom Lane
sirisha chamarthi  writes:
> I believe that any reasonable use of a standby server for queries
> requires hot_standby_feedback to be turned on.

The fact that it's not the default should suggest to you that that's
not the majority opinion.

regards, tom lane




Re: Why is hot_standby_feedback off by default?

2023-10-22 Thread Vik Fearing

On 10/23/23 04:02, sirisha chamarthi wrote:

On Sun, Oct 22, 2023 at 4:56 AM Vik Fearing  wrote:


On 10/22/23 09:50, sirisha chamarthi wrote:

Is there any specific reason hot_standby_feedback default is set to off?



Yes.  No one wants a rogue standby to ruin production.



Agreed.



Okay...



I believe that any reasonable use of a standby server for queries
requires hot_standby_feedback to be turned on. Otherwise, we can
potentially see query cancellations, increased replication lag because of
conflicts (while replaying vacuum cleanup records) on standby (resulting in
longer failover times if the server is configured for disaster recovery +
read scaling). Recent logical decoding on standby as well requires
hot_standby_feedback to be turned on to avoid slot invalidation [1]. If
there is no requirement to query the standby, admins can always  set
hot_standby to off. My goal here is to minimize the amount of configuration
tuning required to use these features.

[1]:
https://www.postgresql.org/docs/current/logicaldecoding-explanation.html



This does not sound like you agree.
--
Vik Fearing





RE: Remove wal_level settings for subscribers in tap tests

2023-10-22 Thread Hayato Kuroda (Fujitsu)
Dear Michael,

I found it was pushed. Thanks!

Best Regards,
Hayato Kuroda
FUJITSU LIMITED





Re: Why is hot_standby_feedback off by default?

2023-10-22 Thread sirisha chamarthi
On Sun, Oct 22, 2023 at 4:56 AM Vik Fearing  wrote:

> On 10/22/23 09:50, sirisha chamarthi wrote:
> > Is there any specific reason hot_standby_feedback default is set to off?
>
>
> Yes.  No one wants a rogue standby to ruin production.
>

Agreed. I believe that any reasonable use of a standby server for queries
requires hot_standby_feedback to be turned on. Otherwise, we can
potentially see query cancellations, increased replication lag because of
conflicts (while replaying vacuum cleanup records) on standby (resulting in
longer failover times if the server is configured for disaster recovery +
read scaling). The recent logical-decoding-on-standby feature likewise requires
hot_standby_feedback to be turned on to avoid slot invalidation [1]. If
there is no requirement to query the standby, admins can always set
hot_standby to off. My goal here is to minimize the amount of configuration
tuning required to use these features.

[1]:
https://www.postgresql.org/docs/current/logicaldecoding-explanation.html

Thanks,
Sirisha


Re: pgstatindex vs. !indisready

2023-10-22 Thread Michael Paquier
On Sun, Oct 22, 2023 at 02:02:45PM -0700, Noah Misch wrote:
> - /* OK, do it */
> - brinsummarize(indexRel, heapRel, heapBlk, true, , NULL);
> + /* see gin_clean_pending_list() */
> + if (indexRel->rd_index->indisvalid)
> + brinsummarize(indexRel, heapRel, heapBlk, true, , 
> NULL);
> + else
> + ereport(DEBUG1,
> + 
> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> +  errmsg("index \"%s\" is not valid",
> + 
> RelationGetRelationName(indexRel;

brinsummarize() could return 0 even for a valid index, and we would
not be able to make the difference with an invalid index.  Perhaps you
are right and this is not a big deal in practice to do as you are
suggesting.
--
Michael


signature.asc
Description: PGP signature


Re: Why is hot_standby_feedback off by default?

2023-10-22 Thread sirisha chamarthi
Hi Andres,

On Sun, Oct 22, 2023 at 12:08 PM Andres Freund  wrote:

> Hi,
>
> On October 22, 2023 4:56:15 AM PDT, Vik Fearing 
> wrote:
> >On 10/22/23 09:50, sirisha chamarthi wrote:
> >> Is there any specific reason hot_standby_feedback default is set to off?
> >
> >
> >Yes.  No one wants a rogue standby to ruin production.
>
> Medium term, I think we need an approximate xid->"time of assignment"
> mapping that's continually maintained on the primary. One of the things
> that'd show us to do is introduce a GUC to control the maximum effect of
> hs_feedback  on the primary, in a useful unit. Numbers of xids are not a
> useful unit (100k xids is forever on some systems, a few minutes at best on
> others, the rate is not necessarily that steady when plpgsql exception
> handles are used, ...)
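The mapping Andres sketches above could look roughly like this (a hedged illustration, not a proposal for the actual backend data structure): sample (xid, timestamp) pairs periodically on the primary, then answer "approximately when was this xid assigned?" so a feedback limit can be expressed in time rather than in xid counts.

```python
import bisect

class XidTimeMap:
    """Hedged sketch of an approximate xid -> "time of assignment" map:
    periodic samples taken in xid order allow a binary-search lookup of
    the newest sample at or before a given xid."""
    def __init__(self):
        self._xids = []
        self._times = []

    def sample(self, xid, now):
        # Samples arrive in xid order, so appending keeps the list sorted.
        self._xids.append(xid)
        self._times.append(now)

    def assigned_at(self, xid):
        # Newest sample with sample_xid <= xid; None if xid predates history.
        i = bisect.bisect_right(self._xids, xid) - 1
        return self._times[i] if i >= 0 else None
```

A GUC like "ignore feedback holding back xids older than N minutes" could then be enforced by comparing `assigned_at(feedback_xid)` against the current time. (A real implementation would also need a bounded ring buffer and xid wraparound handling, which this sketch omits.)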
>

+1 on this idea. Please let me give this a try.

Thanks,
Sirisha


Re: Remove extraneous break condition in logical slot advance function

2023-10-22 Thread Michael Paquier
On Sun, Oct 22, 2023 at 11:59:00PM +0530, Bharath Rupireddy wrote:
> AFAICS, there's no correctness argument for breaking before CFI. As
> rightly said, CFIs can happen before the break condition either down
> inside LogicalDecodingProcessRecord or XLogReadRecord (page_read
> callbacks for instance).
> 
> Having said that, what may happen if CFI happens and interrupts are
> processed before the break condition is that the decoding occurs again
> which IMV is not a big problem.
> 
> An idea to keep all of  XLogReadRecord() -
> LogicalDecodingProcessRecord() loops consistent is by having CFI at
> the start of the loops before the XLogReadRecord().
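The loop shape Bharath describes can be sketched abstractly (hedged Python, not the slot-advance C code): one interrupt check at the top of each iteration, before reading the next record, with no separate break-before-check path.

```python
def consume_records(read_record, process, interrupted):
    """Hedged model of the XLogReadRecord()/LogicalDecodingProcessRecord()
    loop shape discussed above: checking for interrupts at the top of each
    iteration means a pending interrupt is honored before any further
    decoding, and no extraneous break condition is needed."""
    processed = []
    while True:
        if interrupted():          # CHECK_FOR_INTERRUPTS() analogue
            raise KeyboardInterrupt
        rec = read_record()
        if rec is None:            # end of available WAL
            break
        processed.append(process(rec))
    return processed
```

As noted in the thread, re-decoding a record after an interrupt is harmless here, so the exact placement is a consistency question rather than a correctness one.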

Passing by...  All of that just looks like an oversight of 38a957316d7e,
which simplified the main while loop, so I've just applied your v2.
--
Michael


signature.asc
Description: PGP signature


Re: Show version of OpenSSL in ./configure output

2023-10-22 Thread Michael Paquier
On Sun, Oct 22, 2023 at 08:34:40PM -0400, Tom Lane wrote:
> +1, I've wished for that too.  It's not 100% clear that whatever
> openssl is in your PATH matches the libraries we select, but this
> will get it right in most cases and it seems like about the right
> level of effort.

Yes, I don't really want to add more macros for the sake of this
information.

> +  pgac_openssl_version="$($OPENSSL version 2> /dev/null || echo no)"
> 
> Maybe "echo 'openssl not found'" would be better.

Sure.
--
Michael


signature.asc
Description: PGP signature


Re: Casual Meson fixups

2023-10-22 Thread Andres Freund
Hi,

On 2023-09-01 11:31:07 -0500, Tristan Partin wrote:
> Muon development has slowed quite a bit this year. Postgres is probably the
> largest project which tries its best to support Muon. It seems like if we
> want to keep supporting Muon, we should get a buildfarm machine to use it
> instead of Meson to catch regressions. OR we should contemplate removing
> support for it.

I found it to be quite useful to find bugs in the meson.build files...


> Subject: [PATCH v1 2/7] Add Meson override for libpq
> 
> Meson has the ability to do transparent overrides when projects are used
> as subprojects. For instance, say I am building a Postgres extension. I
> can define Postgres to be a subproject of my extension given the
> following wrap file:
> 
> [wrap-git]
> url = https://git.postgresql.org/git/postgresql.git
> revision = master
> depth = 1
> 
> [provide]
> dependency_names = libpq
> 
> Then in my extension (root project), I can have the following line
> snippet:
> 
> libpq = dependency('libpq')
> 
> This will tell Meson to transparently compile libpq prior to it
> compiling my extension (because I depend on libpq) if libpq isn't found
> on the host system.
> ---
>  src/interfaces/libpq/meson.build | 2 ++
>  1 file changed, 2 insertions(+)

This example doesn't seem convincing, because if you build a postgres
extension, you need postgres' headers - which makes it extremely likely that
libpq is available :)


> From 5455426c9944ff8c8694db46929eaa37e03d907f Mon Sep 17 00:00:00 2001
> From: Tristan Partin 
> Date: Fri, 1 Sep 2023 11:07:40 -0500
> Subject: [PATCH v1 7/7] Disable building contrib targets by default
> 
> This matches the autotools build.

Why should we match it here? IIRC this would actually break running meson
install, because it doesn't grok targets that are installed but not built by
default :/.

Greetings,

Andres Freund




Re: Patch: Improve Boolean Predicate JSON Path Docs

2023-10-22 Thread Erik Wienhold
On 2023-10-20 15:49 +0200, David Wheeler wrote:
> On Oct 19, 2023, at 23:49, Erik Wienhold  wrote:
> 
> > I don't even know what that represents, probably not some fancy file
> > compression.

That's an AppleSingle file according to [1][2].  It only contains the
resource fork and file name but no data fork.

> Oh, weird. Trying from a webmail client instead.

Thanks.

> +Does JSON path return any item for the specified JSON value? Use only
> +SQL-standard JSON path expressions, not
> +predicate check
> +expressions.

Any reason for calling it "predicate check expressions" (e.g. the link
text) and sometimes "predicate path expressions" (e.g. the linked
section title)?  I think it should be named consistently to avoid
confusion and also to simplify searching.

> +Returns the result of a JSON path
> +predicate
> +check for the specified JSON value. If the result is not 
> Boolean,
> +then NULL is returned. Use only with
> +predicate check
> +expressions.

Linking the same section twice in the same paragraph seems excessive.

> += select jsonb_path_query(:'json', 
> '$.track.segments');
> +select jsonb_path_query(:'json', '$.track.segments');

Please remove the second SELECT.

> += select jsonb_path_query(:'json', 'strict 
> $.track.segments[0].location');
> + jsonb_path_query
> +---
> + [47.763, 13.4034]

Strict mode is unnecessary to get that result and I'd omit it because
the different modes are not introduced yet at this point.

> += select jsonb_path_query(:'json', 'strict 
> $.track.segments.size()');
> + jsonb_path_query
> +--
> + 2

Strict mode is unnecessary here as well.

> + using the lax mode. To avoid surprising results, we recommend using
> + the .** accessor only in the strict mode. The

Please change to "in strict mode" (without "the").

[1] https://www.rfc-editor.org/rfc/rfc1740.txt
[2] 
https://web.archive.org/web/20180311140826/http://kaiser-edv.de/documents/AppleSingle_AppleDouble.pdf

-- 
Erik




Re: Show version of OpenSSL in ./configure output

2023-10-22 Thread Tom Lane
Michael Paquier  writes:
> 5e4dacb9878c has reminded me that we don't show the version of OpenSSL
> in the output of ./configure.  This would be useful to know when
> looking at issues within the buildfarm, and I've wanted that a few
> times.

+1, I've wished for that too.  It's not 100% clear that whatever
openssl is in your PATH matches the libraries we select, but this
will get it right in most cases and it seems like about the right
level of effort.

+  pgac_openssl_version="$($OPENSSL version 2> /dev/null || echo no)"

Maybe "echo 'openssl not found'" would be better.

regards, tom lane




Show version of OpenSSL in ./configure output

2023-10-22 Thread Michael Paquier
Hi all,

5e4dacb9878c has reminded me that we don't show the version of OpenSSL
in the output of ./configure.  This would be useful to know when
looking at issues within the buildfarm, and I've wanted that a few
times.

How about the attached to use the openssl command, if available, to
display this information?  Libraries may be installed while the
command is not available, but in most cases I'd like to think that it
is around, and it is less complex than using something like
SSLeay_version() from libcrypto.
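The probe amounts to "run the tool, fall back to a marker when it is absent", which can be sketched generically (hedged Python analogue of the configure fragment below, not part of the patch):

```python
import shutil
import subprocess

def tool_version(tool, args=("version",)):
    """Hedged analogue of the configure-time probe discussed above: report
    the tool's self-declared version string, or a "not found" marker when
    the command is absent (libraries may be installed without the CLI)."""
    if shutil.which(tool) is None:
        return f"{tool} not found"
    out = subprocess.run([tool, *args], capture_output=True, text=True)
    return out.stdout.strip() or f"{tool} not found"
```

On a host with OpenSSL installed, `tool_version("openssl")` would return the `openssl version` banner; the patch's `|| echo` fallback plays the role of the "not found" branch.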

meson already shows this information, so no additions are required
there.  Also, LibreSSL uses `openssl`, right?

Thoughts or comments?
--
Michael
diff --git a/configure b/configure
index c2cb1b1b24..6cb0310fb3 100755
--- a/configure
+++ b/configure
@@ -648,13 +648,13 @@ MSGFMT
 PG_CRC32C_OBJS
 CFLAGS_CRC
 LIBOBJS
-OPENSSL
 ZSTD
 LZ4
 UUID_LIBS
 LDAP_LIBS_BE
 LDAP_LIBS_FE
 with_ssl
+OPENSSL
 PTHREAD_CFLAGS
 PTHREAD_LIBS
 PTHREAD_CC
@@ -904,6 +904,7 @@ LDFLAGS_EX
 LDFLAGS_SL
 PERL
 PYTHON
+OPENSSL
 MSGFMT
 TCLSH'
 
@@ -1615,6 +1616,7 @@ Some influential environment variables:
   LDFLAGS_SL  extra linker flags for linking shared libraries only
   PERLPerl program
   PYTHON  Python program
+  OPENSSL path to openssl command
   MSGFMT  msgfmt program for NLS
   TCLSH   Tcl interpreter program (tclsh)
 
@@ -12863,6 +12865,65 @@ else
 fi
 
   fi
+  # Print version of OpenSSL, if command is available.
+
+  if test -z "$OPENSSL"; then
+  for ac_prog in openssl
+do
+  # Extract the first word of "$ac_prog", so it can be a program name with args.
+set dummy $ac_prog; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_OPENSSL+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $OPENSSL in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_OPENSSL="$OPENSSL" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+ac_cv_path_OPENSSL="$as_dir/$ac_word$ac_exec_ext"
+$as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  ;;
+esac
+fi
+OPENSSL=$ac_cv_path_OPENSSL
+if test -n "$OPENSSL"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $OPENSSL" >&5
+$as_echo "$OPENSSL" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
+  test -n "$OPENSSL" && break
+done
+
+else
+  # Report the value of OPENSSL in configure's output in all cases.
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for OPENSSL" >&5
+$as_echo_n "checking for OPENSSL... " >&6; }
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $OPENSSL" >&5
+$as_echo "$OPENSSL" >&6; }
+fi
+
+  pgac_openssl_version="$($OPENSSL version 2> /dev/null || echo no)"
+  { $as_echo "$as_me:${as_lineno-$LINENO}: using openssl $pgac_openssl_version" >&5
+$as_echo "$as_me: using openssl $pgac_openssl_version" >&6;}
   # Function introduced in OpenSSL 1.0.2, not in LibreSSL.
   for ac_func in SSL_CTX_set_cert_cb
 do :
diff --git a/configure.ac b/configure.ac
index 440b08d113..35585df598 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1360,6 +1360,11 @@ if test "$with_ssl" = openssl ; then
  AC_SEARCH_LIBS(CRYPTO_new_ex_data, [eay32 crypto], [], [AC_MSG_ERROR([library 'eay32' or 'crypto' is required for OpenSSL])])
  AC_SEARCH_LIBS(SSL_new, [ssleay32 ssl], [], [AC_MSG_ERROR([library 'ssleay32' or 'ssl' is required for OpenSSL])])
   fi
+  # Print version of OpenSSL, if command is available.
+  AC_ARG_VAR(OPENSSL, [path to openssl command])
+  PGAC_PATH_PROGS(OPENSSL, openssl)
+  pgac_openssl_version="$($OPENSSL version 2> /dev/null || echo no)"
+  AC_MSG_NOTICE([using openssl $pgac_openssl_version])
   # Function introduced in OpenSSL 1.0.2, not in LibreSSL.
   AC_CHECK_FUNCS([SSL_CTX_set_cert_cb])
   # Functions introduced in OpenSSL 1.1.0. We used to check for


signature.asc
Description: PGP signature


Re: LLVM 16 (opaque pointers)

2023-10-22 Thread Thomas Munro
On Sun, Oct 22, 2023 at 2:44 PM Thomas Munro  wrote:
> On Sat, Oct 21, 2023 at 2:45 PM Tom Lane  wrote:
> > Thomas Munro  writes:
> > > (It'd be nice if the
> > > build farm logged "$LLVM_CONFIG --version" somewhere.)
> >
> > It's not really the buildfarm script's responsibility to do that,
> > but feel free to make configure do so.
>
> Done, copying the example of how we do it for perl and various other things.

Build farm measles update:

With that we can see that nicator (LLVM 15 on POWER) is green.  We can
see that cavefish (LLVM 6 on POWER) is red as expected.  We can also
see that bonito (LLVM 7 on POWER) is red, so my earlier theory that
this might be due to the known bug we got fixed in LLVM 7 is not
enough.  Maybe there are other things fixed on POWER somewhere between
those LLVM versions?  I suspect it'll be hard to figure out without
debug builds and backtraces.

One thing is definitely our fault, though.  xenodermus shows failures
on REL_12_STABLE and REL_13_STABLE, using debug LLVM 6 on x86.  I
couldn't reproduce this locally on (newer) debug LLVM, so I bugged
Andres for access to the host/libraries and chased it down.  There is
some type punning for a function parameter REL_13_STABLE and earlier,
removed by Andres in REL_14_STABLE, and when I back-patched my
refactoring I effectively back-patched a few changes from his commit
df99ddc70b97 that removed the type punning, but I should have brought
one more line from that commit to remove another trace of it.  See
attached.
From 52397f8c70641080f2487ee5f019f143dd35957c Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Sun, 22 Oct 2023 23:20:56 +
Subject: [PATCH] jit: Fix LLVM back-patching bug in 12 and 13.

While back-patching f90b4a84, I missed that branches before 14 did some
type punning in a function parameter.  That didn't cause any problem for
release builds of LLVM, but debug builds of some older versions would
fail a type cross-check assertion.  To fix this, we need to back-patch a
line of df99ddc7.

Per build farm animal xenodermus, which runs a debug build of LLVM 6
with jit_above_cost=0.

diff --git a/src/backend/jit/llvm/llvmjit_expr.c b/src/backend/jit/llvm/llvmjit_expr.c
index d84fbba7cc..c2e367f00d 100644
--- a/src/backend/jit/llvm/llvmjit_expr.c
+++ b/src/backend/jit/llvm/llvmjit_expr.c
@@ -1126,7 +1126,7 @@ llvm_compile_expr(ExprState *state)
 		 llvm_pg_var_type("TypeExecEvalSubroutine"));
 
 	v_params[0] = v_state;
-	v_params[1] = l_ptr_const(op, l_ptr(TypeSizeT));
+	v_params[1] = l_ptr_const(op, l_ptr(StructExprEvalStep));
 	v_params[2] = v_econtext;
 	l_call(b,
 		   LLVMGetFunctionType(ExecEvalSubroutineTemplate),
-- 
2.42.0



Re: SQL:2011 application time

2023-10-22 Thread jian he
hi. also based on v16.
-tests.
drop table if exists for_portion_of_test1;
CREATE unlogged TABLE for_portion_of_test1 (id int4range, valid_at
tsrange, name text);
INSERT INTO for_portion_of_test1 VALUES  ('[1,1]', NULL,
'[1,1]_NULL'),('[1,1]', '(,)', '()_[1,]')
,('[1,1]', 'empty', '[1,1]_empty'),(NULL,NULL, NULL), (nuLL,
'(2018-01-01,2019-01-01)','misc');
--1
UPDATE for_portion_of_test1 FOR PORTION OF valid_at FROM NULL TO NULL
SET name = 'for_portition_NULLtoNULL';
select * from for_portion_of_test1;
--2
UPDATE for_portion_of_test1 FOR PORTION OF valid_at FROM null TO
UNBOUNDED SET name = 'NULL_TO_UNBOUNDED';
select * from for_portion_of_test1;
--3
UPDATE for_portion_of_test1 FOR PORTION OF valid_at FROM UNBOUNDED TO
null SET name = 'UNBOUNDED__TO_NULL';
select * from for_portion_of_test1;
--4
UPDATE for_portion_of_test1 FOR PORTION OF valid_at FROM UNBOUNDED TO
UNBOUNDED SET name = 'UNBOUNDED__TO_UNBOUNDED';
select * from for_portion_of_test1;

File: /src/backend/executor/nodeModifyTable.c
1277: oldRange = slot_getattr(oldtupleSlot,
forPortionOf->rangeVar->varattno, &isNull);
1278:
1279: if (isNull)
1280: elog(ERROR, "found a NULL range in a temporal table");
1281: oldRangeType = DatumGetRangeTypeP(oldRange);

I wonder when this isNull branch can be reached; the above tests don't
trigger the error.
Also, in the above tests NULL seems equivalent to UNBOUNDED. Shouldn't
both the "from" and "to" bounds of FOR PORTION OF reject NULL?

Does that mean the following code does not work as intended? I also
cannot find a way to reach the following elog error branch.
File:src/backend/executor/nodeModifyTable.c
4458: exprState = ExecPrepareExpr((Expr *) forPortionOf->targetRange, estate);
4459: targetRange = ExecEvalExpr(exprState, econtext, &isNull);
4460: if (isNull)
4461: elog(ERROR, "Got a NULL FOR PORTION OF target range");

---
I also made some changes to the functions range_leftover_internal and
ExecForPortionOfLeftovers; please see the attached patch.
From 5cdd99f9a63d381985541b0df539c2bb92ef7276 Mon Sep 17 00:00:00 2001
From: pgaddict 
Date: Sat, 21 Oct 2023 17:18:50 +0800
Subject: [PATCH v1 1/1] refactor function range_leftover_internal

Rename variables; improve comments.

---
 src/backend/utils/adt/rangetypes.c | 55 +-
 src/include/utils/rangetypes.h |  5 ++-
 2 files changed, 33 insertions(+), 27 deletions(-)

diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index d4e17323..a5a34450 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -1207,47 +1207,54 @@ range_split_internal(TypeCacheEntry *typcache, const RangeType *r1, const RangeT
 }
 
 /*
- * range_leftover_internal - Sets output1 and output2 to the remaining parts of r1
- * after subtracting r2, or if nothing is left then to the empty range.
- * output1 will always be "before" r2 and output2 "after".
+ * Sets leftoverRange and rightoverRange to the remaining parts of oldRange
+ * after subtracting targetRange, or if nothing is left then to the empty range.
+ * leftoverRange will always be "before" targetRange and rightoverRange "after" targetRange.
  */
 void
-range_leftover_internal(TypeCacheEntry *typcache, const RangeType *r1,
-		const RangeType *r2, RangeType **output1, RangeType **output2)
+range_leftover_internal(TypeCacheEntry *typcache, const RangeType *oldRange,
+		const RangeType *targetRange, RangeType **leftoverRange, RangeType **rightoverRange)
 {
-	RangeBound	lower1,
-lower2;
-	RangeBound	upper1,
-upper2;
-	bool		empty1,
-empty2;
+	RangeBound	lower_oldRange,
+lower_targetRange;
+	RangeBound	upper_oldRange,
+upper_targetRange;
+	bool		oldRange_is_empty,
+targetRange_is_empty;
 
-	range_deserialize(typcache, r1, &lower1, &upper1, &empty1);
-	range_deserialize(typcache, r2, &lower2, &upper2, &empty2);
+	range_deserialize(typcache, oldRange, &lower_oldRange, &upper_oldRange, &oldRange_is_empty);
+	range_deserialize(typcache, targetRange, &lower_targetRange, &upper_targetRange, &targetRange_is_empty);
 
-	if (range_cmp_bounds(typcache, &lower1, &lower2) < 0)
+	/*
+	 * In the FOR PORTION OF case, oldRange must overlap targetRange,
+	 * so neither range can be empty.
+	 */
+	Assert(!oldRange_is_empty);
+	Assert(!targetRange_is_empty);
+
+	if (range_cmp_bounds(typcache, &lower_oldRange, &lower_targetRange) < 0)
 	{
-		lower2.inclusive = !lower2.inclusive;
-		lower2.lower = false;
-		*output1 = make_range(typcache, &lower1, &lower2, false, NULL);
+		lower_targetRange.inclusive = !lower_targetRange.inclusive;
+		lower_targetRange.lower = false;
+		*leftoverRange = make_range(typcache, &lower_oldRange, &lower_targetRange, false, NULL);
 	}
 	else
 	{
-		*output1 = make_empty_range(typcache);
+		*leftoverRange = make_empty_range(typcache);
 	}
 
-	if (range_cmp_bounds(typcache, &upper1, &upper2) > 0)
+	if (range_cmp_bounds(typcache, &upper_oldRange, &upper_targetRange) > 0)
 	{
-		upper2.inclusive = !upper2.inclusive;
-		upper2.lower = true;
-		*output2 = make_range(typcache, &upper2, &upper1, 

Re: UniqueKey v2

2023-10-22 Thread jian he
On Fri, Oct 20, 2023 at 4:33 PM  wrote:
>
>
> > i did some simple tests using text data type.
> >
> > it works with the primary key, not with unique indexes.
> > it does not work when the column is unique, not null.
> >
> > The following is my test.
>
> Can you simplify your test case please? I can't undertand what "doesn't
> work" mean here and for which case. FWIW, this feature has nothing with
> the real data, I don't think inserting any data is helpful unless I
> missed anything.

Sorry for not explaining it very well.
"make distinct as no-op."
My understanding: it means that if few rows meet the criterion
"columnX < const_a", then after analyzing the table, queryA should use
an index-only scan?
--queryA:
select distinct columnX from the_table where columnX <  const_a;

There are several ways for columnX to be unique: primary key, unique
key, unique key nulls distinct, unique key nulls not distinct, unique
key and not null.

After applying your patch, only the primary-key case makes the EXPLAIN
output of queryA use an index-only scan.
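For illustration, the check being discussed can be sketched abstractly: DISTINCT becomes a no-op once some unique key whose columns are all NOT NULL is contained in the DISTINCT column list. A rough Python sketch (the names here are hypothetical, not the patch's actual API):

```python
def distinct_is_noop(distinct_cols, unique_keys):
    """Return True if DISTINCT cannot remove any rows.

    distinct_cols: set of column names appearing in DISTINCT
    unique_keys:   list of (columns, all_not_null) pairs, one per
                   unique constraint/index on the relation

    If every column of some unique key is NOT NULL and present in the
    DISTINCT list, each output row is already unique.  A nullable
    unique key does not qualify, since multiple NULLs may coexist.
    """
    for cols, all_not_null in unique_keys:
        if all_not_null and set(cols) <= set(distinct_cols):
            return True
    return False
```

This also reflects the cases above: a plain unique index on a nullable column would not be enough on its own.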




Re: pgstatindex vs. !indisready

2023-10-22 Thread Noah Misch
On Wed, Oct 04, 2023 at 09:00:23AM +0900, Michael Paquier wrote:
> On Sun, Oct 01, 2023 at 06:31:26PM -0700, Noah Misch wrote:
> > The !indisvalid index may be missing tuples, yes.  In what way does that 
> > make
> > one of those four operations incorrect?
> 
> Reminding myself of what these four do, it looks that you're right and
> that the state of indisvalid is not going to change what they report.
> 
> Still, I'd like to agree with Tom's point to be more conservative and
> check also for indisvalid which is what the planner does.  These
> functions will be used in SELECTs, and one thing that worries me is
> that checks based on indisready may get copy-pasted somewhere else,
> leading to incorrect results where they get copied.  (indisready &&
> !indisvalid) is a "short"-term combination in a concurrent build
> process, as well (depends on how long one waits for the old snapshots
> before switching indisvalid, but that's way shorter than the period of
> time where the built indexes remain valid).

Neither choice would harm the user experience in an important way, so I've
followed your preference in the attached patch.
Author: Noah Misch 
Commit: Noah Misch 

Diagnose !indisvalid in more SQL functions.

pgstatindex failed with ERRCODE_DATA_CORRUPTED, of the "can't-happen"
class XX.  The other functions succeeded on an empty index; they might
have malfunctioned if the failed index build left torn I/O or other
complex state.  Report an ERROR in statistics functions pgstatindex,
pgstatginindex, pgstathashindex, and pgstattuple.  Report DEBUG1 and
skip all index I/O in maintenance functions brin_desummarize_range,
brin_summarize_new_values, brin_summarize_range, and
gin_clean_pending_list.  Back-patch to v11 (all supported versions).

Reviewed by FIXME.

Discussion: https://postgr.es/m/20231001195309...@google.com

diff --git a/contrib/pgstattuple/pgstatindex.c b/contrib/pgstattuple/pgstatindex.c
index d69ac1c..8e5a4d6 100644
--- a/contrib/pgstattuple/pgstatindex.c
+++ b/contrib/pgstattuple/pgstatindex.c
@@ -238,6 +238,18 @@ pgstatindex_impl(Relation rel, FunctionCallInfo fcinfo)
 errmsg("cannot access temporary tables of other sessions")));
 
/*
+* A !indisready index could lead to ERRCODE_DATA_CORRUPTED later, so exit
+* early.  We're capable of assessing an indisready&&!indisvalid index,
+* but the results could be confusing.  For example, the index's size
+* could be too low for a valid index of the table.
+*/
+   if (!rel->rd_index->indisvalid)
+   ereport(ERROR,
+   (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+errmsg("index \"%s\" is not valid",
+   RelationGetRelationName(rel))));
+
+   /*
 * Read metapage
 */
{
@@ -523,6 +535,13 @@ pgstatginindex_internal(Oid relid, FunctionCallInfo fcinfo)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("cannot access temporary indexes of other sessions")));
 
+   /* see pgstatindex_impl */
+   if (!rel->rd_index->indisvalid)
+   ereport(ERROR,
+   (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+errmsg("index \"%s\" is not valid",
+   RelationGetRelationName(rel))));
+
/*
 * Read metapage
 */
@@ -600,6 +619,13 @@ pgstathashindex(PG_FUNCTION_ARGS)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("cannot access temporary indexes of other sessions")));
 
+   /* see pgstatindex_impl */
+   if (!rel->rd_index->indisvalid)
+   ereport(ERROR,
+   (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+errmsg("index \"%s\" is not valid",
+   RelationGetRelationName(rel))));
+
/* Get the information we need from the metapage. */
memset(&stats, 0, sizeof(stats));
metabuf = _hash_getbuf(rel, HASH_METAPAGE, HASH_READ, LH_META_PAGE);
diff --git a/contrib/pgstattuple/pgstattuple.c b/contrib/pgstattuple/pgstattuple.c
index 93b7834..3bd8b96 100644
--- a/contrib/pgstattuple/pgstattuple.c
+++ b/contrib/pgstattuple/pgstattuple.c
@@ -259,6 +259,13 @@ pgstat_relation(Relation rel, FunctionCallInfo fcinfo)
}
else if (rel->rd_rel->relkind == RELKIND_INDEX)
{
+   /* see pgstatindex_impl */
+   if (!rel->rd_index->indisvalid)
+   ereport(ERROR,
+   (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+errmsg("index \"%s\" is not 

Re: Guiding principle for dropping LLVM versions?

2023-10-22 Thread Andres Freund
Hi, 

On October 21, 2023 7:46:17 PM PDT, Xing Guo  wrote:
>Can we also check if the clang's version is compatible with llvm's version
>in llvm.m4? I have multiple llvm toolchains installed on my system and I
>have to specify the $CLANG and $LLVM_CONFIG variables each time I build the
>server against a toolchain that is not present in $PATH. If one of the
>variables is missing, the build system will pick up a default one whose
>version might not be compatible with the other. E.g., If we use clang-16
>and llvm-config-15, there will be issues when creating indexes for bitcodes
>at the end of installation.

It's unfortunately not that obvious to figure out what is compatible and what 
not. Older clang versions work, except if too old. Newer versions sometimes 
work. We could perhaps write a script that will find many, but not all, 
incompatibilities. 

For the meson build I made it just use clang belonging to the llvm install - 
but that's very painful when building against an assert enabled llvm, clang is 
slower by an order of magnitude or so.

I wonder if we should change the search order to 1) CLANG, iff explicitly 
specified, 2) use explicitly specified or inferred llvm-config, 3) only if that 
didn't find clang, search path. 

>wrote:
>
>> Rebased.  I also noticed this woefully out of date line:
>>
>> -  PGAC_PATH_PROGS(LLVM_CONFIG, llvm-config llvm-config-7
>> llvm-config-6.0 llvm-config-5.0 llvm-config-4.0 llvm-config-3.9)
>> +  PGAC_PATH_PROGS(LLVM_CONFIG, llvm-config llvm-config-17
>> llvm-config-16 llvm-config-15 llvm-config-14)
>>

It's outdated, but not completely absurd - back then often no llvm-config -> 
llvm-config-XY was installed, but these days there pretty much always is.


Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: Guiding principle for dropping LLVM versions?

2023-10-22 Thread Thomas Munro
On Sun, Oct 22, 2023 at 3:46 PM Xing Guo  wrote:
> Can we also check if the clang's version is compatible with llvm's version in 
> llvm.m4? I have multiple llvm toolchains installed on my system and I have to 
> specify the $CLANG and $LLVM_CONFIG variables each time I build the server 
> against a toolchain that is not present in $PATH. If one of the variables is 
> missing, the build system will pick up a default one whose version might not 
> be compatible with the other. E.g., If we use clang-16 and llvm-config-15, 
> there will be issues when creating indexes for bitcodes at the end of 
> installation.

Hmm.  Problems that occur to me:

1.  We need to decide if our rule is that clang must be <= llvm, or
==.  I think this question has been left unanswered in the past when
it has come up.  So far I think <= would be enough to avoid the error
you showed but can we find where this policy (ie especially
commitments for future releases) is written down in LLVM literature?
2.  Apple's clang lies about its version (I don't know the story
behind that, but my wild guess is that someone from marketing wanted
the compiler's version numbers to align with xcode's version numbers?
they're off by 1 or something like that).

Another idea could be to produce some bitcode with clang, and then
check if a relevant LLVM tool can deal with it.
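Assuming the "clang <= llvm" policy holds (which is exactly what needs confirming), the configure-time check would reduce to a major-version comparison, something like this hypothetical sketch, which deliberately ignores Apple's renumbered clang:

```python
def clang_compatible(clang_version: str, llvm_version: str) -> bool:
    """Rough compatibility test under the "<=" rule discussed above:
    clang's major version must not be newer than llvm-config's.

    This is only a sketch; it does not handle Apple clang, whose
    version numbers do not correspond to upstream LLVM releases.
    """
    clang_major = int(clang_version.split(".")[0])
    llvm_major = int(llvm_version.split(".")[0])
    return clang_major <= llvm_major
```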




Re: Why is hot_standby_feedback off by default?

2023-10-22 Thread Andres Freund
Hi,

On October 22, 2023 4:56:15 AM PDT, Vik Fearing  
wrote:
>On 10/22/23 09:50, sirisha chamarthi wrote:
>> Is there any specific reason hot_standby_feedback default is set to off?
>
>
>Yes.  No one wants a rogue standby to ruin production.

Medium term, I think we need an approximate xid->"time of assignment" mapping 
that's continually maintained on the primary. One of the things that'd allow us 
to do is introduce a GUC to control the maximum effect of hs_feedback on the 
primary, in a useful unit. Numbers of xids are not a useful unit (100k xids is 
forever on some systems, a few minutes at best on others, and the rate is not 
necessarily that steady when plpgsql exception handlers are used, ...)
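A rough sketch of what such an approximate mapping could look like: record a timestamp for every Nth assigned xid in a bounded buffer, and answer lookups with the nearest earlier entry. All names here are hypothetical, and xid wraparound is ignored for brevity:

```python
from bisect import bisect_right

class XidTimeMap:
    """Approximate xid -> assignment-time map.

    record() is called occasionally as xids are assigned (e.g. every
    Nth assignment); lookup() returns the timestamp of the closest
    recorded xid at or before the one asked about.  Assumes xids are
    recorded in increasing order; wraparound is ignored for brevity.
    """
    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.xids = []
        self.times = []

    def record(self, xid, now):
        self.xids.append(xid)
        self.times.append(now)
        if len(self.xids) > self.capacity:      # drop the oldest entry
            self.xids.pop(0)
            self.times.pop(0)

    def lookup(self, xid):
        i = bisect_right(self.xids, xid)
        return self.times[i - 1] if i else None
```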

It'd be useful to have such a mapping for other features too. E.g.

 - making it visible in pg_stat_activity how problematic a long-running xact is 
- a 3 day old xact that doesn't have an xid assigned and has a recent xmin is 
fine, it won't prevent vacuum from doing things. But a somewhat recent xact 
that still has a snapshot from before an old xact was cancelled could be 
problematic.

- turn pg_class.relfrozenxid into an understandable timeframe. It's a fair bit 
of mental effort to classify "370M xids old" into problem/fine (it's e.g. not a 
problem on a system with a high xid rate, or on a big table that takes a bit to 
vacuum).

- using the mapping to compute an xid consumption rate IMO would be one 
building block for smarter AV scheduling. Together with historical vacuum 
runtimes it'd allow us to start vacuuming early enough to prevent hitting 
thresholds, adapt pacing, prioritize between tables etc. 

Greetings,

Andres 
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: Remove extraneous break condition in logical slot advance function

2023-10-22 Thread Bharath Rupireddy
On Sat, Oct 21, 2023 at 11:40 PM Tom Lane  wrote:
>
> Gurjeet Singh  writes:
> > On Fri, Oct 20, 2023 at 7:30 PM Bharath Rupireddy
> >  wrote:
> >> There exists an extraneous break condition in
> >> pg_logical_replication_slot_advance(). When the end of WAL or moveto
> >> LSN is reached, the main while condition helps to exit the loop, so no
> >> separate break condition is needed. Attached patch removes it.
>
> > The only advantage I see of the code as it stands right now is that it
> > avoids one last call to CHECK_FOR_INTERRUPTS() by break'ing early. I
> > don't think we'd lose much in terms of performance by making one (very
> > cheap, in common case) extra call of this macro.
>
> Agreed, bypassing the last CHECK_FOR_INTERRUPTS() shouldn't save
> anything noticeable.  Could there be a correctness argument for it
> though?  Can't see what.  We should assume that CFIs might happen
> down inside LogicalDecodingProcessRecord.

AFAICS, there's no correctness argument for breaking before CFI. As
rightly said, CFIs can happen before the break condition either down
inside LogicalDecodingProcessRecord or XLogReadRecord (page_read
callbacks for instance).

Having said that, if a CFI happens and interrupts are processed before
the break condition, the decoding may occur again, which IMV is not a
big problem.

An idea to keep all of the XLogReadRecord() -
LogicalDecodingProcessRecord() loops consistent is to have a CFI at
the start of each loop, before the XLogReadRecord() call.

> I wondered why the code looks like this, and whether there used
> to be more of a reason for it.  "git blame" reveals the probable
> answer: when this code was added, in 9c7d06d60, the loop
> condition was different so the break was necessary.
> 38a957316 simplified the loop condition to what we see today,
> but didn't notice that the break was thereby made pointless.

Right. Thanks for these references.

> While we're here ... the comment above the loop seems wrong
> already, and this makes it more so.  I suggest something like
>
> -   /* Decode at least one record, until we run out of records */
> +   /* Decode records until we reach the requested target */
> while (ctx->reader->EndRecPtr < moveto)

+1 and done so in the attached v2 patch.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v2-0001-Remove-extraneous-break-condition-in-logical-slot.patch
Description: Binary data


Re: BRIN minmax multi - incorrect distance for infinite timestamp/date

2023-10-22 Thread Tomas Vondra
On 10/20/23 11:52, Ashutosh Bapat wrote:
> On Thu, Oct 19, 2023 at 6:11 PM Ashutosh Bapat
>  wrote:
>>
>> I think we should provide generate_series(date, date, integer) which
>> will use date + integer -> date.
> 
> Just to be clear, I don't mean that this patch should add it.
> 

I'm not against adding such a generate_series() variant. For this patch
I'll use something like the query you proposed, I think.

I was thinking about the (date + interval) failure a bit more, and while
I think it's confusing, it's not quite wrong. The problem is that the
interval may have hours/minutes, so it makes sense that the operator
returns timestamp. That's not what most operators do, where the data
type does not change. So it's a bit unexpected, but seems correct.
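For comparison, Python's datetime module takes the opposite route: date + timedelta keeps the date type and silently discards any sub-day component, which shows why widening the result type, as (date + interval) does, is arguably the safer behavior:

```python
from datetime import date, timedelta

d = date(2024, 1, 1)

# date + timedelta only honours timedelta.days: the twelve hours below
# are silently discarded instead of widening the result to a datetime.
same_day = d + timedelta(hours=12)

# 36 hours normalizes to days=1, seconds=43200, so only one day is added.
next_day = d + timedelta(hours=36)
```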


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




PoC: prefetching index leaf pages (for inserts)

2023-10-22 Thread Tomas Vondra
Hi,

Some time ago I started a thread about prefetching heap pages during
index scans [1]. That however only helps when reading rows, not when
inserting them.

Imagine inserting a row into a table with many indexes - we insert the
row into the table, and then into the indexes one by one, synchronously.
We walk the index, determine the appropriate leaf page, read it from
disk into memory, insert the index tuple, and then do the same thing for
the next index.

If there are many indexes, and there's not much correlation with the
table, this may easily result in I/O happening synchronously with queue
depth 1. Hard to make it even slower ...

This can be a problem even with a modest number of indexes - imagine
bulk-loading data into a table using COPY (say, in 100-row batches).
Inserting the rows into heap happens in a bulk, but the indexes are
still modified in a loop, as if for single-row inserts. Not great.

With multiple connections, concurrent I/O may be generated that way,
but for low-concurrency workloads (e.g. batch jobs) that may not
really work.

I had an idea what we might do about this - we can walk the index,
almost as if we're inserting the index tuple, but only the "inner"
non-leaf pages. And instead of descending to the leaf page, we just
prefetch it. The non-leaf pages are typically <1% of the index, and hot,
so likely already cached, so not worth prefetching those.

The attached patch does a PoC of this. It adds a new AM function
"amprefetch", with an implementation for btree indexes, mimicking the
index lookup, except that it only prefetches the leaf page as explained
a bit earlier.
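Conceptually, the descent looks like the sketch below: walk the cached inner nodes exactly as an insert would, but stop one level above the leaves and only note which leaf page would be visited. This is an abstract illustration, not the patch's actual btree code:

```python
import bisect

class Inner:
    """Inner B-tree node: keys[i] separates children[i] and children[i+1]."""
    def __init__(self, keys, children):
        self.keys = keys
        self.children = children

def leaf_to_prefetch(root, key):
    """Walk inner nodes only and return the leaf page a subsequent
    insert of `key` would touch, without reading the leaf itself.
    Leaves are represented here by plain page ids (ints).
    """
    node = root
    while isinstance(node, Inner):
        node = node.children[bisect.bisect_right(node.keys, key)]
    return node  # a leaf page id, ready to hand to a prefetch call
```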

In the executor, this is wrapped in ExecInsertPrefetchIndexes() which
gets called in various places right before ExecInsertIndexTuples().
I thought about doing that in ExecInsertIndexTuples() directly, but
that would not work for COPY, where we want to issue the prefetches for
the whole batch, not for individual tuples.

This may need various improvements - the prefetch duplicates a couple
steps that could be expensive (e.g. evaluation of index predicates,
forming index tuples, and so on). Would be nice to improve this, but
good enough for PoC I think.

Another gap is lack of incremental prefetch (ramp-up). We just prefetch
all the indexes, for all tuples. But I think that's OK. We know we'll
need those pages, and the number is fairly limited.

There's a GUC enable_insert_prefetch, that can be used to enable this
insert prefetching.

I did a simple test on two machines - one with SATA SSD RAID, one with
NVMe SSD. In both cases the data (table+indexes) are an order of
magnitude larger than RAM. The indexes are on UUID, so pretty random and
there's no correlation. Then batches of 100, 1000 and 10000 rows are
inserted, with/without the prefetching.

With 5 indexes, the results look like this:

SATA SSD RAID
-

 rows    no prefetch     prefetch
  100     176.872 ms    70.910 ms
 1000    1035.056 ms   590.495 ms
10000    8494.836 ms  3216.206 ms


NVMe


 rows    no prefetch     prefetch
  100     133.365 ms    72.899 ms
 1000    1572.379 ms   829.298 ms
10000   11889.143 ms  3621.981 ms


Not bad, I guess. Cutting the time to ~30% is nice.

The fewer the indexes, the smaller the difference (with 1 index there is
almost no difference), of course.


regards


[1] https://commitfest.postgresql.org/45/4351/

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Companydiff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index af392bc032..dc3197401c 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -117,6 +117,7 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ambuild = brinbuild;
 	amroutine->ambuildempty = brinbuildempty;
 	amroutine->aminsert = brininsert;
+	amroutine->amprefetch = NULL;
 	amroutine->ambulkdelete = brinbulkdelete;
 	amroutine->amvacuumcleanup = brinvacuumcleanup;
 	amroutine->amcanreturn = NULL;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 7a4cd93f30..666d58a750 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -64,6 +64,7 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ambuild = ginbuild;
 	amroutine->ambuildempty = ginbuildempty;
 	amroutine->aminsert = gininsert;
+	amroutine->amprefetch = NULL;
 	amroutine->ambulkdelete = ginbulkdelete;
 	amroutine->amvacuumcleanup = ginvacuumcleanup;
 	amroutine->amcanreturn = NULL;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 8ef5fa0329..3ed72cce44 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -86,6 +86,7 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ambuild = gistbuild;
 	amroutine->ambuildempty = gistbuildempty;
 	amroutine->aminsert = gistinsert;
+	amroutine->amprefetch = NULL;
 	amroutine->ambulkdelete = gistbulkdelete;
 	amroutine->amvacuumcleanup = 

Re: Why is hot_standby_feedback off by default?

2023-10-22 Thread Vik Fearing

On 10/22/23 09:50, sirisha chamarthi wrote:

Is there any specific reason hot_standby_feedback default is set to off?



Yes.  No one wants a rogue standby to ruin production.
--
Vik Fearing





Re: missing dependencies PostgreSQL 16 OpenSUSE Tumbleweed (SLES 15.5 packages)

2023-10-22 Thread Michael Banck
Hi,

On Sun, Oct 22, 2023 at 11:00:07AM +0200, André Verwijs wrote:
> missing dependencies  PostgreSQL  16 OpenSUSE Tumbleweed   (SLES 15.5
> packages)
> 
> ---
> 
>  YaST2 conflicts list - generated 2023-10-22 10:30:07 
> 
> there is no package providing 'libldap_r-2.4.so.2())(64bit)' required by
> installing postgresql16-server-16.0-1PGDG.sles15.x86_64

Those are the packages from zypp.postgresql.org, right? There is a link
to the issue tracker at https://redmine.postgresql.org/projects/pgrpms/
from the home page, I think it would best to report the problem there.


Michael




missing dependencies PostgreSQL 16 OpenSUSE Tumbleweed (SLES 15.5 packages)

2023-10-22 Thread André Verwijs



missing dependencies  PostgreSQL  16 OpenSUSE Tumbleweed   (SLES 15.5 
packages)


---

 YaST2 conflicts list - generated 2023-10-22 10:30:07 

there is no package providing 'libldap_r-2.4.so.2())(64bit)' required by 
installing postgresql16-server-16.0-1PGDG.sles15.x86_64


 [ ] break 'postgresql16-server-16.0-1PGDG.sles15.x86_64' by 
ignoring some dependencies


 [ ] do not install postgresql16-server-16.0-1PGDG.sles15.x86_64


 YaST2 conflicts list END ###

dependencies needed:

libpq.so.5() (64bit)
libpthread.so.0() (64bit)
libpthread.so.0(GLIBC_2.2.5)(64bit)
libm.so.6() (64bit)
libm.so.6(GLIBC_2.2.5)(64bit)
libm.so.6(GLIBC_2.29)(64bit)
libcrypto.so.1.1() (64bit)
libcrypto.so.1.1(OPENSSL_1_1_0)(64bit)
libssl.so.1.1() (64bit)
liblz4.so.1() (64bit)
libssl.so.1.1(OPENSSL_1_1_0)(64bit)
libz.so.1() (64bit)
libxml2.so.2() (64bit)
libxml2.so.2(LIBXML2_2.4.30)(64bit)
libzstd.so.1() (64bit)
libxml2.so.2(LIBXML2_2.6.0)(64bit)
libc.so.6(GLIBC_2.25)(64bit)
libcrypto.so.1.1(OPENSSL_1_1_1)(64bit)
libdl.so.2() (64bit)
libgssapi_krb5.so.2() (64bit)
libgssapi_krb5.so.2(gssapi_krb5_2_MIT)(64bit)
libldap_r-2.4.so.2() (64bit)
libpam.so.0() (64bit)
libpam.so.0(LIBPAM_1.0)(64bit)
libsystemd.so.0() (64bit)
libsystemd.so.0(LIBSYSTEMD_209)(64bit)
libdl.so.2(GLIBC_2.2.5)(64bit)
libicudata.so.suse65.1() (64bit)
libicui18n.so.suse65.1() (64bit)
libicuuc.so.suse65.1() (64bit)
librt.so.1() (64bit)
librt.so.1(GLIBC_2.2.5)(64bit)
libxml2.so.2(LIBXML2_2.6.23)(64bit)
libxml2.so.2(LIBXML2_2.6.8)(64bit)
util-linux
postgresql16(x86-64) = 16.0-1PGDG.sles15
postgresql16-libs(x86-64) = 16.0-1PGDG.sles15
/bin/sh
/usr/sbin/useradd
systemd
/usr/sbin/groupadd
glibc


used repos:

https://download.postgresql.org/pub/repos/zypp/16/suse/sles-15.5-x86_64/
https://download.postgresql.org/pub/repos/zypp/srpms/16/suse/sles-15.5-x86_64/
http://download.opensuse.org/repositories/server:database:postgresql/openSUSE_Tumbleweed/


how can i resolve dependencies..?





Why is hot_standby_feedback off by default?

2023-10-22 Thread sirisha chamarthi
Hi Hackers,

Is there any specific reason hot_standby_feedback's default is set to off?
I see some explanation in the thread [1] about recovery_min_apply_delay > 0
causing table bloat. However, recovery_min_apply_delay is set to 0 by
default. So, if a server admin wants to change this value, they can change
hot_standby_feedback as well if needed, right?

Thanks!

[1]: https://www.postgresql.org/message-id/55f981ec.5040...@gmx.net