RE: [PoC] pg_upgrade: allow to upgrade publisher node
Dear Bharath,

Thank you for reviewing! PSA new version.

> 1. A nit:
> +/*
> + * We also skip decoding in 'fast_forward' mode. In passing set the
> + * 'processing_required' flag to indicate, were it not for this mode,
> + * processing *would* have been required.
> + */
> How about "We also skip decoding in fast_forward mode. In passing set
> the processing_required flag to indicate that if it were not for
> fast_forward mode, processing would have been required."?

Fixed.

> 2. Don't we need InvalidateSystemCaches() after FreeDecodingContext()?
> +/* Clean up */
> +FreeDecodingContext(ctx);

Right. Older system caches should be thrown away here for the upcoming pg_dump.

> 3. Don't we need to put CreateDecodingContext in PG_TRY-PG_CATCH with
> InvalidateSystemCaches() in the PG_CATCH block? I think we need to clear
> all timetravel entries with InvalidateSystemCaches(), no?

Added.

> 4. The following assertion better be an error? Or we ensure that
> binary_upgrade_slot_has_caught_up isn't called for an invalidated slot
> at all?
> +/* Slots must be valid as otherwise we won't be able to scan the WAL */
> +Assert(MyReplicationSlot->data.invalidated == RS_INVAL_NONE);

I kept the Assert() because pg_upgrade won't call this function for
invalidated slots.

> 5. This better be an error instead of returning false? IMO, a null value
> for the slot name is an error.
> +/* Quick exit if the input is NULL */
> +if (PG_ARGISNULL(0))
> +PG_RETURN_BOOL(false);

Hmm, OK, changed to elog(ERROR). If the current style were kept and a NULL
were input, an empty string could be reported as the slot name in
invalid_logical_replication_slots.txt, which would be quite strange. Note
again that this is not expected to happen.

> 6. A nit: how about is_decodable_txn or is_decodable_change or some
> other name instead of just plain processing_required?
> +/* Do we need to process any change in 'fast_forward' mode? */
> +bool processing_required;

I prefer the current name, because not only decodable transactions but also
non-transactional changes and empty transactions must be processed.

> 7. Can the following pg_fatal message be consistent and start with a
> lowercase letter, something like "expected 0 logical replication slots ..."?
> +pg_fatal("Expected 0 logical replication slots but found %d.",
> + nslots_on_new);

Note that the upper/lower-case rule is already broken in this file. Lower
case was used here because I regard this sentence as a hint message. Please
see the previous posts [1] [2].

> 8. s/problem/problematic - "A list of problematic slots is in the file:\n"
> + "A list of the problem slots is in the file:\n"

Fixed.

> 9. IMO, binary_upgrade_logical_replication_slot_has_caught_up seems
> better, more meaningful, and consistent, despite being a bit longer than
> binary_upgrade_slot_has_caught_up.

Fixed.

> 10. How about an assert that the passed-in replication slot is logical
> in binary_upgrade_slot_has_caught_up?

Fixed.

> 11. How about adding CheckLogicalDecodingRequirements too in
> binary_upgrade_slot_has_caught_up, after CheckSlotPermissions, just in
> case?

Not added. CheckLogicalDecodingRequirements() ensures that WAL can be decoded
and that the changes can be applied, but neither is needed in fast_forward
mode. Also, the pre-existing function pg_logical_replication_slot_advance()
does not call it.

> 12. Not necessary, but adding ReplicationSlotValidateName(slot_name, ERROR);
> for the passed-in slot name in binary_upgrade_slot_has_caught_up may be a
> good idea, at least in assert builds, to help with input validation.

Not added, because ReplicationSlotAcquire() reports an error if an invalid
name is passed. Also, the pre-existing function
pg_logical_replication_slot_advance() does not call it.

> 13. Can the functionality of LogicalReplicationSlotHasPendingWal be
> moved to binary_upgrade_slot_has_caught_up to get rid of a separate
> function LogicalReplicationSlotHasPendingWal? Or is it that the
> function exists in logical.c to avoid extra dependencies between
> logical.c and pg_upgrade_support.c?

I kept the current style. I think upgrade functions should be short, so the
actual work should be done elsewhere. SetAttrMissing() is called only from an
upgrade function, so we do not have a policy against splitting out such
functions. Also, LogicalDecodingProcessRecord() is called only from files in
src/backend/replication, so we can keep them there.

> 14. I think it's better to check whether the old cluster contains the
> necessary function binary_upgrade_slot_has_caught_up instead of just
> relying on the major version.
> +/* Logical slots can be migrated since PG17. */
> +if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
> +return;

I kept the current style because I could not find a merit in that approach.
If the patch is committed, PG17.X will surely have
binary_upgrade_logical_replication_slot_has_caught_up(). Also, other upgrade
functions are not checked from the
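The caught-up check discussed above (scan the remaining WAL in fast_forward mode and merely note whether anything would have required processing — decodable transactions, non-transactional changes, and empty transactions alike) can be sketched conceptually. All names below are illustrative, not PostgreSQL's actual C code:

```python
# Conceptual sketch (not PostgreSQL source): in fast_forward mode the
# decoder skips the actual work but records that work *would* have been
# required.  Decodable transactions, non-transactional changes, and even
# empty transactions all count as "processing required".
RELEVANT_CHANGES = {"txn", "non_txn_change", "empty_txn"}

def slot_has_pending_wal(records, fast_forward=True):
    """Return True if any pending record would require processing."""
    processing_required = False
    for rec in records:
        if rec in RELEVANT_CHANGES:
            processing_required = True
            if fast_forward:
                continue  # skip decoding; the flag is already set
    return processing_required

# A slot whose pending WAL holds only irrelevant records has caught up.
assert slot_has_pending_wal(["noop"]) is False
assert slot_has_pending_wal(["noop", "empty_txn"]) is True
```

The point of the flag (and of keeping its generic name) is visible here: the check is not limited to decodable transactions.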
RE: pg_upgrade's interaction with pg_resetwal seems confusing
Dear Vignesh,

Thank you for reviewing! A new patch is available in [1].

> We can update the commit message with the details of the same, it will
> help to understand that it is intentionally done.

Both the comments and the commit message were updated accordingly.

> There are a couple of typos in the new patch:
> 1) "uprade logical replication slot" should be "upgrade logical
> replication slot":
> Previously, the OID counter is restored by invoking pg_resetwal with the -o
> option, at the end of upgrade. This is not problematic for now, but WAL
> removals are not happy if we want to uprade logical replication slot.
> Therefore, a new upgrade function is introduced to reset next OID.

Fixed.

> 2) "becasue the value" should be "because the value":
> Note that we only update the on-memory data to avoid concurrent update of
> control with the chekpointer. It is harmless becasue the value would be set
> at shutdown checkpoint.

Fixed.

[1]: https://www.postgresql.org/message-id/TYAPR01MB5866DAFE000F8677C49ACD66F5D8A%40TYAPR01MB5866.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
RE: pg_upgrade's interaction with pg_resetwal seems confusing
Dear Alvaro,

Thank you for updating! PSA new version.

> Note that this patch falsifies the comment in SetNextObjectId that
> taking the lock is pro forma only -- it no longer is, since in upgrade
> mode there can be multiple subprocesses running -- so I think it should
> be updated.

Indeed; some comments were updated.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment: v3-0001-pg_upgrade-use-upgrade-function-to-restore-OID.patch
Re: Server crash on RHEL 9/s390x platform against PG16
On Sat, Oct 21, 2023 at 5:17 AM Andres Freund wrote:
> Hi,
>
> On 2023-09-12 15:27:21 +0530, Suraj Kharage wrote:
> > [edb@9428da9d2137 postgres]$ cat /etc/redhat-release
> > AlmaLinux release 9.2 (Turquoise Kodkod)
> > [edb@9428da9d2137 postgres]$ lscpu
> > Architecture: s390x
> > CPU op-mode(s): 32-bit, 64-bit
> > Address sizes: 39 bits physical, 48 bits virtual
> > Byte Order: Big Endian
>
> Can you provide the rest of the lscpu output? There have been issues with
> Z14 vs Z15: https://github.com/llvm/llvm-project/issues/53009
>
> You're apparently not hitting that, but given that fact, you either are on
> a slightly older CPU, or you have applied a patch to work around it.
> Because otherwise your build instructions below would hit that problem, I
> think.
>
> > Configure command:
> > ./configure --prefix=/home/edb/postgres/ --with-lz4 --with-zstd --with-llvm
> > --with-perl --with-python --with-tcl --with-openssl --enable-nls
> > --with-libxml --with-libxslt --with-systemd --with-libcurl --without-icu
> > --enable-debug --enable-cassert --with-pgport=5414
>
> Hm, based on "--with-libcurl" this isn't upstream postgres, correct? Have
> you verified the issue reproduces on upstream postgres?

Yes, I can reproduce this on upstream postgres master and v16 branch.
Here are details:

./configure --prefix=/home/edb/postgres/ --with-zstd --with-llvm --with-perl
--with-python --with-tcl --with-openssl --enable-nls --with-libxml
--with-libxslt --with-systemd --without-icu --enable-debug --enable-cassert
--with-pgport=5414 CFLAGS="-g -O0"

[edb@9428da9d2137 postgres]$ cat /etc/redhat-release
AlmaLinux release 9.2 (Turquoise Kodkod)

[edb@9428da9d2137 edbas]$ lscpu
Architecture:         s390x
CPU op-mode(s):       32-bit, 64-bit
Address sizes:        39 bits physical, 48 bits virtual
Byte Order:           Big Endian
CPU(s):               9
On-line CPU(s) list:  0-8
Vendor ID:            GenuineIntel
Model name:           Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
CPU family:           6
Model:                158
Thread(s) per core:   1
Core(s) per socket:   1
Socket(s):            9
Stepping:             10
BogoMIPS:             5200.00
Flags:                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht pbe syscall nx pdpe1gb lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid pni pclmulqdq dtes64 ds_cpl ssse3 sdbg fma cx16 xtpr pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch fsgsbase bmi1 avx2 bmi2 erms xsaveopt arat
Caches (sum of all):
  L1d:                288 KiB (9 instances)
  L1i:                288 KiB (9 instances)
  L2:                 2.3 MiB (9 instances)
  L3:                 108 MiB (9 instances)
Vulnerabilities:
  Itlb multihit:      KVM: Mitigation: VMX unsupported
  L1tf:               Mitigation; PTE Inversion
  Mds:                Vulnerable; SMT Host state unknown
  Meltdown:           Vulnerable
  Mmio stale data:    Vulnerable
  Spec store bypass:  Vulnerable
  Spectre v1:         Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
  Spectre v2:         Vulnerable, STIBP: disabled
  Srbds:              Unknown: Dependent on hypervisor status
  Tsx async abort:    Not affected

[edb@9428da9d2137 postgres]$ clang --version
clang version 15.0.7 (Red Hat 15.0.7-2.el9)
Target: s390x-ibm-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

[edb@9428da9d2137 postgres]$ rpm -qa | grep llvm
llvm-libs-15.0.7-1.el9.s390x
llvm-15.0.7-1.el9.s390x
llvm-test-15.0.7-1.el9.s390x
llvm-static-15.0.7-1.el9.s390x
llvm-devel-15.0.7-1.el9.s390x

Please let me know if any further information is required.

> > Test case:
> > CREATE TABLE rm32044_t1
> > (
> >     pkey integer,
> >     val text
> > );
> > CREATE TABLE rm32044_t2
> > (
> >     pkey integer,
> >     label text,
> >     hidden boolean
> > );
> > CREATE TABLE rm32044_t3
> > (
> >     pkey integer,
> >     val integer
> > );
> > CREATE TABLE rm32044_t4
> > (
> >     pkey integer
> > );
> > insert into rm32044_t1 values ( 1 , 'row1');
> > insert into rm32044_t1 values ( 2 , 'row2');
> > insert into rm32044_t2 values ( 1 , 'hidden', true);
> > insert into rm32044_t2 values ( 2 , 'visible', false);
> > insert into rm32044_t3 values (1 , 1);
> > insert into rm32044_t3 values (2 , 1);
> >
> > postgres=# SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON rm32044_t1.pkey
> > = rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON rm32044_t3.pkey =
> > rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;
> >
> > server closed the connection unexpectedly
> > This probably means the server terminated abnormally
> > before or while processing the request.
> > The connection to the server was lost. Attempting reset: Failed.
Re: Fix output of zero privileges in psql
On Fri, Oct 20, 2023 at 7:29 PM Erik Wienhold wrote:
> On 2023-10-20 22:35 +0200, David G. Johnston wrote:
> > In short, I don't want default privileges to start to obey \pset null when
> > it never has before and is documented as displaying the empty string. I do
> > want the empty string produced by empty privileges to change to (none) so
> > that it no longer is indistinguishable from our choice of presentation for
> > the default privilege case.
>
> I hadn't thought of this yet. The attached v3 of my initial patch
> does that. It also includes Laurenz' fix to no longer ignore \pset null
> (minus the doc changes that suggest using \pset null to distinguish
> between default and empty privileges, because that's no longer needed).

Thank you. It looks good to me as-is, with one possible nit. I wonder if it
would be clearer to say "If the Access privileges column is *blank* for a
given object..." instead of "empty", to avoid having both "empty [string]"
and "empty privileges" in the same paragraph when the empty string does not
pertain to empty privileges.

David J.
Re: speed up a logical replica setup
On Mon, Feb 21, 2022, at 9:09 AM, Euler Taveira wrote:
> A new tool called pg_subscriber does this conversion and is tightly
> integrated with Postgres.

After a long period of inactivity, I'm back to this client tool.

As suggested by Andres, I added a new helper function to change the system
identifier as the last step. I also thought about including pg_basebackup
support but decided to keep it simple (at least for this version). The user
can always execute pg_basebackup as a preliminary step to create a standby
replica and it will work. (I will post a separate patch that adds
pg_basebackup support on top of this one.)

Amit asked if an extra replication slot is required. It is not. The reason I
keep it is to remember that at least the latest replication slot needs to be
created after pg_basebackup finishes (the pg_backup_stop() call).

Regarding the atexit() routine, it tries its best to remove all the objects
it created; however, there is no guarantee it can remove them, because that
depends on external resources such as connectivity and authorization. I added
a new warning message for the case where it cannot drop the transient
replication slot. It is probably a good idea to add such a warning message to
the cleanup routine too. On this point, another possible feature is one that
checks for and removes all leftover objects. The transient replication slot
is fine because it should always be removed at the end. However, the big
question is how to detect that you are not removing objects (publications,
subscriptions, replication slots) from a successful conversion.

Amit also asked about setting up a logical replica with m databases where m
is less than the total number of databases. One option is to remove the
"extra" databases on the target server after promoting the physical replica,
or in one of the last steps. Maybe it is time to propose a partial physical
replica that contains only a subset of the databases on the primary. (I'm not
volunteering for it.) In that case, pg_basebackup could have an option to
remove these "extra" databases so this tool could take advantage of it.

Let's continue with the bike-shedding... I agree with Peter E. that this name
does not express what this tool is. At the moment, it only has one action:
create. If I had to suggest other actions, I would say that it could support
a switchover option too (which removes the infrastructure created by this
tool). If we decide to keep this name, it would be a good idea to add an
option to indicate which action is being executed (similar to
pg_recvlogical), as suggested by Peter.

I included the documentation cleanups that Peter E. shared. I also made small
adjustments to the documentation. It probably deserves a warning section that
advertises the cleanup behavior.

I refactored the transient replication slot code and decided to use a
permanent (instead of temporary) slot to avoid keeping a replication
connection open for a long time until the target server catches up. The
system identifier functions (get_control_from_datadir() and
get_sysid_from_conn()) now return uint64, as suggested by Peter.

After reflection, the --verbose option should be renamed to --progress. There
are also some messages that should be converted to debug messages.

I fixed the useless malloc. I rearranged the code a bit, but main() still has
~370 lines (~255 without options/validation). I'm trying to rearrange the
code to make it easier to read and at the same time reduce the size of
main(). I already have a few candidates in mind, such as the code that stops
the standby and the part that includes the recovery parameters.

I removed the refactoring I proposed in the previous patch; the current code
relies on pg_ctl --wait behavior. Are there issues with this choice? Well,
one annoying situation is that pg_ctl does not have a "wait forever" option.
If one of the pg_ctl calls fails, you would probably have to start again
(unless you understand the pg_subscriber internals and fix the setup
yourself). You have to choose an arbitrary timeout value and expect that
pg_ctl performs the action within that timeout.

Real tests are included. The cleanup code does not have coverage because a
simple reproducible case isn't easy to construct. I'm also not sure it is
worth it. We can explain it in the warning section that was proposed.

It is still a WIP, but I would like to share it and get some feedback.

-- 
Euler Taveira
EDB
https://www.enterprisedb.com/

From 0aca46ed57c15bce1972c9db6459c04bcc5cf73d Mon Sep 17 00:00:00 2001
From: Euler Taveira
Date: Mon, 5 Jun 2023 14:39:40 -0400
Subject: [PATCH v2] Creates a new logical replica from a standby server

A new tool called pg_subscriber can convert a physical replica into a logical
replica. It runs on the target server and should be able to connect to the
source server (publisher) and the target server (subscriber).

The conversion requires a few steps. Check if the target data directory has
the same system identifier as the source data directory. Stop the
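The best-effort atexit() cleanup described above can be sketched in miniature. This is purely illustrative — the names are invented and the real tool issues server commands (e.g. dropping replication slots, publications, subscriptions) rather than calling a placeholder:

```python
# Illustrative sketch of best-effort cleanup: removal depends on external
# resources (connectivity, authorization), so a failure produces a warning
# instead of aborting, and anything left behind must be cleaned up manually.
import atexit

created_objects = []   # e.g. publications, subscriptions, replication slots

def drop_object(obj):
    """Placeholder for a server round trip that may fail."""
    pass

def cleanup():
    while created_objects:
        obj = created_objects.pop()
        try:
            drop_object(obj)
        except OSError:
            print(f"warning: could not drop {obj}")

atexit.register(cleanup)               # runs at normal exit
created_objects.append("transient_replication_slot")
cleanup()                              # also callable directly on error paths
assert created_objects == []
```

The design point from the thread is visible in the comment: because the cleanup is best-effort, the tool cannot guarantee that no objects are left over, which is why a documentation warning section was proposed.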
Re: Removing unneeded self joins
On 22/10/2023 05:01, Alexander Korotkov wrote:
> On Thu, Oct 19, 2023 at 6:16 AM Andrei Lepikhov wrote:
>> On 19/10/2023 01:50, Alexander Korotkov wrote:
>>> This query took 3778.432 ms with self-join removal disabled, and
>>> 3756.009 ms with self-join removal enabled. So, no measurable overhead.
>>> Similar for a higher number of joins. Can you imagine some extreme case
>>> where self-join removal could introduce significant overhead in
>>> comparison with other optimizer parts? If not, should we remove the
>>> self_join_search_limit GUC? Thanks,
>>
>> It was Zhihong Yu who worried about that case [1]. And my purpose was to
>> show a method to avoid such a problem if it were needed. I guess the main
>> idea here is that we might have a lot of self-joins, but only a few of
>> them (or none) can be removed. I can't imagine a practical situation
>> where we would get stuck on problems here. So, I vote to remove this GUC.
>
> I've removed the self_join_search_limit. Anyway, there is
> enable_self_join_removal if the self-join removal algorithm causes any
> problems. I also made some grammar corrections to the comments. I think
> the patch is getting into committable shape. I noticed some failures on
> commitfest.cputube.org. I'd like to check how this version passes it.

I have reviewed the final patch. A couple of minor changes can be made (see
attachment). Also, I see room for improvement, but that can be done later.
For example, we limit the optimization to ordinary tables in this patch; it
could be extended to at least partitioned and foreign tables soon.
-- 
regards, Andrey Lepikhov
Postgres Professional

diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 6b848aadad..b84197dadb 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -23,7 +23,6 @@
 #include "postgres.h"
 
 #include "catalog/pg_class.h"
-#include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/clauses.h"
 #include "optimizer/joininfo.h"
@@ -50,7 +49,7 @@ typedef struct UniqueRelInfo
 } UniqueRelInfo;
 
 /*
- * The context for replace_varno_walker() containing source and target relids2
+ * The context for replace_varno_walker() containing source and target relids.
  */
 typedef struct
 {
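The transformation this thread implements can be illustrated with a toy sketch (nothing here is the planner's actual logic; names are invented). The key observation: a join of a relation with itself on a unique key matches each row to itself at most once, so the two scans can collapse into one with the quals of both sides merged:

```python
# Toy illustration of self-join removal: if both sides are the same
# relation and the join column is unique, a single scan suffices.
def try_remove_self_join(left, right, join_col, unique_cols):
    """Return a single scan if the self-join is removable, else the join."""
    if left == right and join_col in unique_cols:
        return left            # one scan, with both sides' quals merged
    return (left, right)       # keep the join as-is

assert try_remove_self_join("t1", "t1", "pkey", {"pkey"}) == "t1"
assert try_remove_self_join("t1", "t2", "pkey", {"pkey"}) == ("t1", "t2")
# Not removable when the join column is not unique:
assert try_remove_self_join("t1", "t1", "val", {"pkey"}) == ("t1", "t1")
```

The uniqueness requirement is why the real patch tracks per-relation unique-key information (cf. UniqueRelInfo in the diff above).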
Re: Why is hot_standby_feedback off by default?
sirisha chamarthi writes:
> I believe that any reasonable use of a standby server for queries
> requires hot_standby_feedback to be turned on.

The fact that it's not the default should suggest to you that that's not the
majority opinion.

regards, tom lane
Re: Why is hot_standby_feedback off by default?
On 10/23/23 04:02, sirisha chamarthi wrote:
> On Sun, Oct 22, 2023 at 4:56 AM Vik Fearing wrote:
>> On 10/22/23 09:50, sirisha chamarthi wrote:
>>> Is there any specific reason hot_standby_feedback default is set to off?
>>
>> Yes. No one wants a rogue standby to ruin production.
>
> Agreed.

Okay...

> I believe that any reasonable use of a standby server for queries requires
> hot_standby_feedback to be turned on. Otherwise, we can potentially see
> query cancellations, increased replication lag because of conflicts (while
> replaying vacuum cleanup records) on standby (resulting in longer failover
> times if the server is configured for disaster recovery + read scaling).
> Recent logical decoding on standby as well requires hot_standby_feedback to
> be turned on to avoid slot invalidation [1]. If there is no requirement to
> query the standby, admins can always set hot_standby to off. My goal here is
> to minimize the amount of configuration tuning required to use these
> features.
>
> [1]: https://www.postgresql.org/docs/current/logicaldecoding-explanation.html

This does not sound like you agree.

-- 
Vik Fearing
RE: Remove wal_level settings for subscribers in tap tests
Dear Michael,

I see it has been pushed. Thanks!

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Re: Why is hot_standby_feedback off by default?
On Sun, Oct 22, 2023 at 4:56 AM Vik Fearing wrote:
> On 10/22/23 09:50, sirisha chamarthi wrote:
> > Is there any specific reason hot_standby_feedback default is set to off?
>
> Yes. No one wants a rogue standby to ruin production.

Agreed.

I believe that any reasonable use of a standby server for queries requires
hot_standby_feedback to be turned on. Otherwise, we can potentially see query
cancellations and increased replication lag because of conflicts (while
replaying vacuum cleanup records) on the standby, resulting in longer
failover times if the server is configured for disaster recovery + read
scaling. Recent logical decoding on standby also requires
hot_standby_feedback to be turned on to avoid slot invalidation [1]. If there
is no requirement to query the standby, admins can always set hot_standby to
off. My goal here is to minimize the amount of configuration tuning required
to use these features.

[1]: https://www.postgresql.org/docs/current/logicaldecoding-explanation.html

Thanks, Sirisha
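For reference, the standby-side combination discussed in this thread corresponds to a postgresql.conf fragment like the following (a sketch, not a recommendation; hot_standby is already on by default, and the feedback setting is the one whose default is being debated):

```
# On the standby:
hot_standby = on              # allow read-only queries (the default)
hot_standby_feedback = on     # off by default; avoids query cancellations
                              # and logical-decoding slot invalidation
```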
Re: pgstatindex vs. !indisready
On Sun, Oct 22, 2023 at 02:02:45PM -0700, Noah Misch wrote:
> -	/* OK, do it */
> -	brinsummarize(indexRel, heapRel, heapBlk, true, &numSummarized, NULL);
> +	/* see gin_clean_pending_list() */
> +	if (indexRel->rd_index->indisvalid)
> +		brinsummarize(indexRel, heapRel, heapBlk, true, &numSummarized, NULL);
> +	else
> +		ereport(DEBUG1,
> +				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> +				 errmsg("index \"%s\" is not valid",
> +						RelationGetRelationName(indexRel))));

brinsummarize() could return 0 even for a valid index, and we would not be
able to tell the difference from an invalid index. Perhaps you are right,
and in practice it is not a big deal to do as you are suggesting.

-- Michael
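The ambiguity Michael points out — a return value of 0 from a valid index is indistinguishable from an invalid index being skipped — can be seen in a toy version of the guard. All names here are illustrative, not the C code:

```python
# Toy version of the validity guard: only summarize when the index is
# valid, otherwise log at DEBUG level and do nothing.  A caller who sees
# only the return value cannot distinguish "valid but nothing to
# summarize" from "skipped because invalid".
def summarize_if_valid(index, log):
    if index["indisvalid"]:
        return 0  # a valid index may legitimately have 0 pages to summarize
    log.append('index "%s" is not valid' % index["name"])
    return 0

log = []
assert summarize_if_valid({"name": "idx", "indisvalid": True}, log) == 0
assert summarize_if_valid({"name": "idx", "indisvalid": False}, log) == 0
assert log == ['index "idx" is not valid']
```

Both calls return 0, which is exactly why the thread settles for the DEBUG message as the only distinguishing signal.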
Re: Why is hot_standby_feedback off by default?
Hi Andres,

On Sun, Oct 22, 2023 at 12:08 PM Andres Freund wrote:
> Hi,
>
> On October 22, 2023 4:56:15 AM PDT, Vik Fearing wrote:
> > On 10/22/23 09:50, sirisha chamarthi wrote:
> > > Is there any specific reason hot_standby_feedback default is set to off?
> >
> > Yes. No one wants a rogue standby to ruin production.
>
> Medium term, I think we need an approximate xid->"time of assignment"
> mapping that's continually maintained on the primary. One of the things
> that'd allow us to do is introduce a GUC to control the maximum effect of
> hs_feedback on the primary, in a useful unit. Numbers of xids are not a
> useful unit (100k xids is forever on some systems, a few minutes at best on
> others, the rate is not necessarily that steady when plpgsql exception
> handlers are used, ...)

+1 on this idea. Please let me give this a try.

Thanks, Sirisha
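Andres's idea — an approximate xid-to-assignment-time mapping maintained on the primary, so feedback limits can be expressed in seconds rather than raw xid counts — can be sketched with a list of sampled pairs. Entirely illustrative; names are invented:

```python
# Rough sketch: sample (xid, assignment_time) pairs on the primary so a
# "maximum feedback effect" GUC could be expressed in time, not xids.
import bisect

samples = []  # ascending (xid, assign_time) pairs

def record_assignment(xid, now):
    samples.append((xid, now))

def assignment_time(xid):
    """Approximate assignment time of xid (None if older than all samples)."""
    i = bisect.bisect_right(samples, (xid, float("inf"))) - 1
    return None if i < 0 else samples[i][1]

record_assignment(100, 1000.0)
record_assignment(200, 1060.0)
assert assignment_time(150) == 1000.0   # between samples: use the older one
assert assignment_time(250) == 1060.0
assert assignment_time(50) is None      # predates all samples
```

Rounding toward the older sample errs on the conservative side: the mapping over-estimates an xid's age rather than under-estimating it, which is the safe direction for capping feedback.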
Re: Remove extraneous break condition in logical slot advance function
On Sun, Oct 22, 2023 at 11:59:00PM +0530, Bharath Rupireddy wrote:
> AFAICS, there's no correctness argument for breaking before CFI. As
> rightly said, CFIs can happen before the break condition either down
> inside LogicalDecodingProcessRecord or XLogReadRecord (page_read
> callbacks, for instance).
>
> Having said that, what may happen if a CFI processes interrupts before
> the break condition is that the decoding occurs again, which IMV is not
> a big problem.
>
> An idea to keep all of the XLogReadRecord() ->
> LogicalDecodingProcessRecord() loops consistent is to have a CFI at
> the start of the loops, before XLogReadRecord().

Passing by... All of that just looks like an oversight in 38a957316d7e, which
simplified the main while loop, so I've just applied your v2.

-- Michael
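The loop shape under discussion — an interrupt check at the top of each read/process iteration — can be sketched as an analogy (this is not the server code; the function and flag are invented):

```python
# Analogy for the XLogReadRecord()/LogicalDecodingProcessRecord() loop:
# checking for interrupts at the top of each iteration means a pending
# interrupt is honored before the next record is read; as noted above,
# re-decoding a record after an interrupt is merely redundant, not wrong.
def advance_slot(records, interrupted=lambda: False):
    processed = 0
    for rec in records:
        if interrupted():          # CHECK_FOR_INTERRUPTS() analogue
            break
        processed += 1             # process the record
    return processed

assert advance_slot(["r1", "r2", "r3"]) == 3
assert advance_slot(["r1", "r2"], interrupted=lambda: True) == 0
```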
Re: Show version of OpenSSL in ./configure output
On Sun, Oct 22, 2023 at 08:34:40PM -0400, Tom Lane wrote:
> +1, I've wished for that too. It's not 100% clear that whatever
> openssl is in your PATH matches the libraries we select, but this
> will get it right in most cases and it seems like about the right
> level of effort.

Yes, I don't really want to add more macros for the sake of this information.

> + pgac_openssl_version="$($OPENSSL version 2> /dev/null || echo no)"
>
> Maybe "echo 'openssl not found'" would be better.

Sure.

-- Michael
Re: Casual Meson fixups
Hi,

On 2023-09-01 11:31:07 -0500, Tristan Partin wrote:
> Muon development has slowed quite a bit this year. Postgres is probably the
> largest project which tries its best to support Muon. It seems like if we
> want to keep supporting Muon, we should get a buildfarm machine to use it
> instead of Meson to catch regressions. OR we should contemplate removing
> support for it.

I found it to be quite useful for finding bugs in the meson.build files...

> Subject: [PATCH v1 2/7] Add Meson override for libpq
>
> Meson has the ability to do transparent overrides when projects are used
> as subprojects. For instance, say I am building a Postgres extension. I
> can define Postgres to be a subproject of my extension given the
> following wrap file:
>
> [wrap-git]
> url = https://git.postgresql.org/git/postgresql.git
> revision = master
> depth = 1
>
> [provide]
> dependency_names = libpq
>
> Then in my extension (root project), I can have the following snippet:
>
> libpq = dependency('libpq')
>
> This will tell Meson to transparently compile libpq prior to compiling
> my extension (because I depend on libpq) if libpq isn't found on the
> host system.
> ---
> src/interfaces/libpq/meson.build | 2 ++
> 1 file changed, 2 insertions(+)

This example doesn't seem convincing, because if you build a postgres
extension, you need postgres' headers - which makes it extremely likely that
libpq is available :)

> From 5455426c9944ff8c8694db46929eaa37e03d907f Mon Sep 17 00:00:00 2001
> From: Tristan Partin
> Date: Fri, 1 Sep 2023 11:07:40 -0500
> Subject: [PATCH v1 7/7] Disable building contrib targets by default
>
> This matches the autotools build.

Why should we match it here? IIRC this would actually break running meson
install, because it doesn't grok targets that are installed but not built by
default :/

Greetings,

Andres Freund
Re: Patch: Improve Boolean Predicate JSON Path Docs
On 2023-10-20 15:49 +0200, David Wheeler wrote:
> On Oct 19, 2023, at 23:49, Erik Wienhold wrote:
> > I don't even know what that represents, probably not some fancy file
> > compression.

That's an AppleSingle file according to [1][2]. It only contains the resource
fork and file name but no data fork.

> Oh, weird. Trying from a webmail client instead.

Thanks.

> +    Does JSON path return any item for the specified JSON value? Use only
> +    SQL-standard JSON path expressions, not
> +    predicate check
> +    expressions.

Any reason for calling it "predicate check expressions" (e.g. the link text)
and sometimes "predicate path expressions" (e.g. the linked section title)?
I think it should be named consistently to avoid confusion and also to
simplify searching.

> +    Returns the result of a JSON path
> +    predicate
> +    check for the specified JSON value. If the result is not Boolean,
> +    then NULL is returned. Use only with
> +    predicate check
> +    expressions.

Linking the same section twice in the same paragraph seems excessive.

> += select jsonb_path_query(:'json', '$.track.segments');
> +select jsonb_path_query(:'json', '$.track.segments');

Please remove the second SELECT.

> += select jsonb_path_query(:'json', 'strict $.track.segments[0].location');
> +  jsonb_path_query
> + -------------------
> +  [47.763, 13.4034]

Strict mode is unnecessary to get that result, and I'd omit it because the
different modes are not introduced yet at this point.

> += select jsonb_path_query(:'json', 'strict $.track.segments.size()');
> +  jsonb_path_query
> + ------------------
> +  2

Strict mode is unnecessary here as well.

> +    using the lax mode. To avoid surprising results, we recommend using
> +    the .** accessor only in the strict mode. The

Please change to "in strict mode" (without "the").

[1] https://www.rfc-editor.org/rfc/rfc1740.txt
[2] https://web.archive.org/web/20180311140826/http://kaiser-edv.de/documents/AppleSingle_AppleDouble.pdf

-- Erik
Re: Show version of OpenSSL in ./configure output
Michael Paquier writes:
> 5e4dacb9878c has reminded me that we don't show the version of OpenSSL
> in the output of ./configure. This would be useful to know when
> looking at issues within the buildfarm, and I've wanted that a few
> times.

+1, I've wished for that too. It's not 100% clear that whatever openssl is in
your PATH matches the libraries we select, but this will get it right in most
cases, and it seems like about the right level of effort.

+ pgac_openssl_version="$($OPENSSL version 2> /dev/null || echo no)"

Maybe "echo 'openssl not found'" would be better.

regards, tom lane
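The `$OPENSSL version 2> /dev/null || echo no` idiom quoted above — run the tool, fall back to a fixed string if it is missing or fails — translates directly to other languages. A hedged Python equivalent (illustrative only, not part of the patch):

```python
# Same pattern as the configure snippet: try the command, fall back to a
# fixed string when it is missing or exits non-zero.
import subprocess

def tool_version(argv, fallback="openssl not found"):
    try:
        out = subprocess.run(argv, capture_output=True, text=True, check=True)
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return fallback

# A deliberately nonexistent command exercises the fallback path.
assert tool_version(["no-such-binary-xyz", "version"]) == "openssl not found"
```

As in the shell version, the fallback keeps configure-style output well-formed even on systems where the library is installed but the command-line tool is not.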
Show version of OpenSSL in ./configure output
Hi all, 5e4dacb9878c has reminded me that we don't show the version of OpenSSL in the output of ./configure. This would be useful to know when looking at issues within the buildfarm, and I've wanted that a few times. How about the attached to use the openssl command, if available, to display this information? Libraries may be installed while the command is not available, but in most cases I'd like to think that it is around, and it is less complex than using something like SSLeay_version() from libcrypto. meson already shows this information, so no additions are required there. Also, LibreSSL uses `openssl`, right? Thoughts or comments? -- Michael diff --git a/configure b/configure index c2cb1b1b24..6cb0310fb3 100755 --- a/configure +++ b/configure @@ -648,13 +648,13 @@ MSGFMT PG_CRC32C_OBJS CFLAGS_CRC LIBOBJS -OPENSSL ZSTD LZ4 UUID_LIBS LDAP_LIBS_BE LDAP_LIBS_FE with_ssl +OPENSSL PTHREAD_CFLAGS PTHREAD_LIBS PTHREAD_CC @@ -904,6 +904,7 @@ LDFLAGS_EX LDFLAGS_SL PERL PYTHON +OPENSSL MSGFMT TCLSH' @@ -1615,6 +1616,7 @@ Some influential environment variables: LDFLAGS_SL extra linker flags for linking shared libraries only PERLPerl program PYTHON Python program + OPENSSL path to openssl command MSGFMT msgfmt program for NLS TCLSH Tcl interpreter program (tclsh) @@ -12863,6 +12865,65 @@ else fi fi + # Print version of OpenSSL, if command is available. + + if test -z "$OPENSSL"; then + for ac_prog in openssl +do + # Extract the first word of "$ac_prog", so it can be a program name with args. +set dummy $ac_prog; ac_word=$2 +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 +$as_echo_n "checking for $ac_word... " >&6; } +if ${ac_cv_path_OPENSSL+:} false; then : + $as_echo_n "(cached) " >&6 +else + case $OPENSSL in + [\\/]* | ?:[\\/]*) + ac_cv_path_OPENSSL="$OPENSSL" # Let the user override the test with a path. + ;; + *) + as_save_IFS=$IFS; IFS=$PATH_SEPARATOR +for as_dir in $PATH +do + IFS=$as_save_IFS + test -z "$as_dir" && as_dir=. 
+for ac_exec_ext in '' $ac_executable_extensions; do + if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then +ac_cv_path_OPENSSL="$as_dir/$ac_word$ac_exec_ext" +$as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5 +break 2 + fi +done + done +IFS=$as_save_IFS + + ;; +esac +fi +OPENSSL=$ac_cv_path_OPENSSL +if test -n "$OPENSSL"; then + { $as_echo "$as_me:${as_lineno-$LINENO}: result: $OPENSSL" >&5 +$as_echo "$OPENSSL" >&6; } +else + { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 +$as_echo "no" >&6; } +fi + + + test -n "$OPENSSL" && break +done + +else + # Report the value of OPENSSL in configure's output in all cases. + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for OPENSSL" >&5 +$as_echo_n "checking for OPENSSL... " >&6; } + { $as_echo "$as_me:${as_lineno-$LINENO}: result: $OPENSSL" >&5 +$as_echo "$OPENSSL" >&6; } +fi + + pgac_openssl_version="$($OPENSSL version 2> /dev/null || echo no)" + { $as_echo "$as_me:${as_lineno-$LINENO}: using openssl $pgac_openssl_version" >&5 +$as_echo "$as_me: using openssl $pgac_openssl_version" >&6;} # Function introduced in OpenSSL 1.0.2, not in LibreSSL. for ac_func in SSL_CTX_set_cert_cb do : diff --git a/configure.ac b/configure.ac index 440b08d113..35585df598 100644 --- a/configure.ac +++ b/configure.ac @@ -1360,6 +1360,11 @@ if test "$with_ssl" = openssl ; then AC_SEARCH_LIBS(CRYPTO_new_ex_data, [eay32 crypto], [], [AC_MSG_ERROR([library 'eay32' or 'crypto' is required for OpenSSL])]) AC_SEARCH_LIBS(SSL_new, [ssleay32 ssl], [], [AC_MSG_ERROR([library 'ssleay32' or 'ssl' is required for OpenSSL])]) fi + # Print version of OpenSSL, if command is available. + AC_ARG_VAR(OPENSSL, [path to openssl command]) + PGAC_PATH_PROGS(OPENSSL, openssl) + pgac_openssl_version="$($OPENSSL version 2> /dev/null || echo no)" + AC_MSG_NOTICE([using openssl $pgac_openssl_version]) # Function introduced in OpenSSL 1.0.2, not in LibreSSL. 
AC_CHECK_FUNCS([SSL_CTX_set_cert_cb]) # Functions introduced in OpenSSL 1.1.0. We used to check for
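The `$OPENSSL version` probe relies on both OpenSSL and LibreSSL printing a one-line banner, which is why the same `openssl` command works for either. A hedged Python sketch of parsing such a banner (the helper name and sample strings are hypothetical, not part of the patch):

```python
def parse_version_banner(banner):
    """Split an 'openssl version' banner like 'OpenSSL 3.0.2 15 Mar 2022'
    or 'LibreSSL 3.8.2' into (implementation, version).  The fallback
    mirrors configure's `|| echo no` by reporting ('no', None)."""
    parts = banner.split()
    if len(parts) >= 2:
        return parts[0], parts[1]
    return "no", None
```

The first two whitespace-separated fields are enough for both implementations, so no implementation-specific casing is needed.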
Re: LLVM 16 (opaque pointers)
On Sun, Oct 22, 2023 at 2:44 PM Thomas Munro wrote: > On Sat, Oct 21, 2023 at 2:45 PM Tom Lane wrote: > > Thomas Munro writes: > > > (It'd be nice if the > > > build farm logged "$LLVM_CONFIG --version" somewhere.) > > > > It's not really the buildfarm script's responsibility to do that, > > but feel free to make configure do so. > > Done, copying the example of how we do it for perl and various other things.

Build farm measles update: With that we can see that nicator (LLVM 15 on POWER) is green. We can see that cavefish (LLVM 6 on POWER) is red as expected. We can also see that bonito (LLVM 7 on POWER) is red, so my earlier theory that this might be due to the known bug we got fixed in LLVM 7 is not enough. Maybe there are other things fixed on POWER somewhere between those LLVM versions? I suspect it'll be hard to figure out without debug builds and backtraces.

One thing is definitely our fault, though. xenodermus shows failures on REL_12_STABLE and REL_13_STABLE, using debug LLVM 6 on x86. I couldn't reproduce this locally on (newer) debug LLVM, so I bugged Andres for access to the host/libraries and chased it down. There is some type punning for a function parameter in REL_13_STABLE and earlier, removed by Andres in REL_14_STABLE, and when I back-patched my refactoring I effectively back-patched a few changes from his commit df99ddc70b97 that removed the type punning, but I should have brought one more line from that commit to remove another trace of it. See attached.

From 52397f8c70641080f2487ee5f019f143dd35957c Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Sun, 22 Oct 2023 23:20:56 +0000
Subject: [PATCH] jit: Fix LLVM back-patching bug in 12 and 13.

While back-patching f90b4a84, I missed that branches before 14 did some type punning in a function parameter. That didn't cause any problem for release builds of LLVM, but debug builds of some older versions would fail a type cross-check assertion. To fix this, we need to back-patch a line of df99ddc7.
Per build farm animal xenodermus, which runs a debug build of LLVM 6 with jit_above_cost=0. diff --git a/src/backend/jit/llvm/llvmjit_expr.c b/src/backend/jit/llvm/llvmjit_expr.c index d84fbba7cc..c2e367f00d 100644 --- a/src/backend/jit/llvm/llvmjit_expr.c +++ b/src/backend/jit/llvm/llvmjit_expr.c @@ -1126,7 +1126,7 @@ llvm_compile_expr(ExprState *state) llvm_pg_var_type("TypeExecEvalSubroutine")); v_params[0] = v_state; - v_params[1] = l_ptr_const(op, l_ptr(TypeSizeT)); + v_params[1] = l_ptr_const(op, l_ptr(StructExprEvalStep)); v_params[2] = v_econtext; l_call(b, LLVMGetFunctionType(ExecEvalSubroutineTemplate), -- 2.42.0
Re: SQL:2011 application time
hi. also based on v16.

--tests:
drop table if exists for_portion_of_test1;
CREATE unlogged TABLE for_portion_of_test1 (id int4range, valid_at tsrange, name text);
INSERT INTO for_portion_of_test1 VALUES
  ('[1,1]', NULL, '[1,1]_NULL'),
  ('[1,1]', '(,)', '()_[1,]'),
  ('[1,1]', 'empty', '[1,1]_empty'),
  (NULL, NULL, NULL),
  (NULL, '(2018-01-01,2019-01-01)', 'misc');
--1
UPDATE for_portion_of_test1 FOR PORTION OF valid_at FROM NULL TO NULL SET name = 'for_portition_NULLtoNULL';
select * from for_portion_of_test1;
--2
UPDATE for_portion_of_test1 FOR PORTION OF valid_at FROM null TO UNBOUNDED SET name = 'NULL_TO_UNBOUNDED';
select * from for_portion_of_test1;
--3
UPDATE for_portion_of_test1 FOR PORTION OF valid_at FROM UNBOUNDED TO null SET name = 'UNBOUNDED__TO_NULL';
select * from for_portion_of_test1;
--4
UPDATE for_portion_of_test1 FOR PORTION OF valid_at FROM UNBOUNDED TO UNBOUNDED SET name = 'UNBOUNDED__TO_UNBOUNDED';
select * from for_portion_of_test1;

File: /src/backend/executor/nodeModifyTable.c
1277: oldRange = slot_getattr(oldtupleSlot, forPortionOf->rangeVar->varattno, &isNull);
1278:
1279: if (isNull)
1280:     elog(ERROR, "found a NULL range in a temporal table");
1281: oldRangeType = DatumGetRangeTypeP(oldRange);

I wonder when this isNull branch can be reached; the above tests won't trigger the error. Also, in the above tests, NULL seems equivalent to UNBOUNDED. Shouldn't both the "from" and "to" bounds of FOR PORTION OF be non-null? If so, does the following code work as intended? I also cannot find a way to reach the following elog error branch.

File: src/backend/executor/nodeModifyTable.c
4458: exprState = ExecPrepareExpr((Expr *) forPortionOf->targetRange, estate);
4459: targetRange = ExecEvalExpr(exprState, econtext, &isNull);
4460: if (isNull)
4461:     elog(ERROR, "Got a NULL FOR PORTION OF target range");

---
I also made some changes in the functions range_leftover_internal and ExecForPortionOfLeftovers; please see the attached patch.
From 5cdd99f9a63d381985541b0df539c2bb92ef7276 Mon Sep 17 00:00:00 2001
From: pgaddict
Date: Sat, 21 Oct 2023 17:18:50 +0800
Subject: [PATCH v1 1/1] refactor function range_leftover_internal

renaming variable. better comment.
---
 src/backend/utils/adt/rangetypes.c | 55 +-
 src/include/utils/rangetypes.h     |  5 ++-
 2 files changed, 33 insertions(+), 27 deletions(-)

diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index d4e17323..a5a34450 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -1207,47 +1207,54 @@ range_split_internal(TypeCacheEntry *typcache, const RangeType *r1, const RangeT
 }

 /*
- * range_leftover_internal - Sets output1 and output2 to the remaining parts of r1
- * after subtracting r2, or if nothing is left then to the empty range.
- * output1 will always be "before" r2 and output2 "after".
+ * Sets leftoverRange and rightoverRange to the remaining parts of oldRange
+ * after subtracting targetRange, or if nothing is left then to the empty range.
+ * leftoverRange will always be "before" targetRange and rightoverRange "after" targetRange.
  */
 void
-range_leftover_internal(TypeCacheEntry *typcache, const RangeType *r1,
-						const RangeType *r2, RangeType **output1, RangeType **output2)
+range_leftover_internal(TypeCacheEntry *typcache, const RangeType *oldRange,
+						const RangeType *targetRange, RangeType **leftoverRange, RangeType **rightoverRange)
 {
-	RangeBound	lower1,
-				lower2;
-	RangeBound	upper1,
-				upper2;
-	bool		empty1,
-				empty2;
+	RangeBound	lower_oldRange,
+				lower_targetRange;
+	RangeBound	upper_oldRange,
+				upper_targetRange;
+	bool		oldRange_is_empty,
+				targetRange_is_empty;

-	range_deserialize(typcache, r1, &lower1, &upper1, &empty1);
-	range_deserialize(typcache, r2, &lower2, &upper2, &empty2);
+	range_deserialize(typcache, oldRange, &lower_oldRange, &upper_oldRange, &oldRange_is_empty);
+	range_deserialize(typcache, targetRange, &lower_targetRange, &upper_targetRange, &targetRange_is_empty);

-	if (range_cmp_bounds(typcache, &lower1, &lower2) < 0)
+	/*
+	 * FOR PORTION OF will not call this function if oldRange does not
+	 * overlap with targetRange.  So these two ranges must overlap, which
+	 * means neither range should be empty.
+	 */
+	Assert(!oldRange_is_empty);
+	Assert(!targetRange_is_empty);
+
+	if (range_cmp_bounds(typcache, &lower_oldRange, &lower_targetRange) < 0)
 	{
-		lower2.inclusive = !lower2.inclusive;
-		lower2.lower = false;
-		*output1 = make_range(typcache, &lower1, &lower2, false, NULL);
+		lower_targetRange.inclusive = !lower_targetRange.inclusive;
+		lower_targetRange.lower = false;
+		*leftoverRange = make_range(typcache, &lower_oldRange, &lower_targetRange, false, NULL);
 	}
 	else
 	{
-		*output1 = make_empty_range(typcache);
+		*leftoverRange = make_empty_range(typcache);
 	}

-	if (range_cmp_bounds(typcache, &upper1, &upper2) > 0)
+	if (range_cmp_bounds(typcache, &upper_oldRange, &upper_targetRange) > 0)
 	{
-		upper2.inclusive = !upper2.inclusive;
-		upper2.lower = true;
-		*output2 = make_range(typcache, &upper2, &upper1,
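To make the leftover computation concrete, here is a minimal Python sketch of what range_leftover_internal does, reduced to half-open integer ranges (it ignores bound inclusivity, infinite bounds, and the typcache machinery of the C code; the function and tuple representation are hypothetical simplifications, not the patch itself):

```python
def range_leftover(old_range, target_range):
    """Subtract target_range from old_range (half-open [lo, hi) tuples).

    Returns (leftover, rightover): the part of old_range before
    target_range and the part after it; () stands for the empty range.
    Assumes the two ranges overlap, as the C code asserts.
    """
    old_lo, old_hi = old_range
    target_lo, target_hi = target_range

    # Anything of old_range strictly below target_range's lower bound
    leftover = (old_lo, target_lo) if old_lo < target_lo else ()
    # Anything of old_range strictly above target_range's upper bound
    rightover = (target_hi, old_hi) if old_hi > target_hi else ()
    return leftover, rightover
```

For example, subtracting [3, 5) from [1, 10) leaves [1, 3) and [5, 10); when the target covers one side of the old range, that side's leftover is empty.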
Re: UniqueKey v2
On Fri, Oct 20, 2023 at 4:33 PM wrote: > > > > i did some simple tests using text data type. > > > > it works with the primary key, not with unique indexes. > > it does not work when the column is unique, not null. > > > > The following is my test. > > Can you simplify your test case please? I can't understand what "doesn't > work" means here and for which case. FWIW, this feature has nothing to do with > the real data; I don't think inserting any data is helpful unless I > missed anything.

Sorry for not explaining it very well. "make distinct a no-op" - my understanding is: if few rows meet the criteria "columnX < const_a", then after analyzing the table, queryA should use an index-only scan:

--queryA:
select distinct columnX from the_table where columnX < const_a;

There are several ways for columnX to be unique: primary key, unique key, unique key nulls distinct, unique key nulls not distinct, unique key and not null. After applying your patch, only the primary key case makes queryA's plan use an index-only scan.
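The distinction in the cases above can be sketched as a tiny, hypothetical planner rule (not the patch's actual code): DISTINCT on a column is provably redundant only when the column is unique AND not null, because a plain UNIQUE constraint with the default NULLS DISTINCT semantics still allows multiple NULLs (UNIQUE NULLS NOT DISTINCT would also qualify, which this sketch deliberately ignores):

```python
def distinct_is_noop(column, unique_columns, not_null_columns):
    """Return True if 'SELECT DISTINCT column' cannot produce duplicates.

    A primary key implies unique + not null.  A UNIQUE constraint alone
    is not enough: under default NULLS DISTINCT semantics, several NULL
    values may coexist, so DISTINCT still has deduplication work to do.
    """
    return column in unique_columns and column in not_null_columns

# Column sets for a hypothetical table
unique_cols = {"pk_col", "uniq_col", "uniq_nn_col"}
not_null_cols = {"pk_col", "uniq_nn_col"}
```

Under this rule both the primary-key column and the unique-plus-not-null column should qualify, which is why the reported behavior (only the primary key case working) looks like a gap in the patch.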
Re: pgstatindex vs. !indisready
On Wed, Oct 04, 2023 at 09:00:23AM +0900, Michael Paquier wrote: > On Sun, Oct 01, 2023 at 06:31:26PM -0700, Noah Misch wrote: > > The !indisvalid index may be missing tuples, yes. In what way does that > > make > > one of those four operations incorrect? > > Reminding myself of what these four do, it looks that you're right and > that the state of indisvalid is not going to change what they report. > > Still, I'd like to agree with Tom's point to be more conservative and > check also for indisvalid which is what the planner does. These > functions will be used in SELECTs, and one thing that worries me is > that checks based on indisready may get copy-pasted somewhere else, > leading to incorrect results where they get copied. (indisready && > !indisvalid) is a "short"-term combination in a concurrent build > process, as well (depends on how long one waits for the old snapshots > before switching indisvalid, but that's way shorter than the period of > time where the built indexes remain valid). Neither choice would harm the user experience in an important way, so I've followed your preference in the attached patch. Author: Noah Misch Commit: Noah Misch Diagnose !indisvalid in more SQL functions. pgstatindex failed with ERRCODE_DATA_CORRUPTED, of the "can't-happen" class XX. The other functions succeeded on an empty index; they might have malfunctioned if the failed index build left torn I/O or other complex state. Report an ERROR in statistics functions pgstatindex, pgstatginindex, pgstathashindex, and pgstattuple. Report DEBUG1 and skip all index I/O in maintenance functions brin_desummarize_range, brin_summarize_new_values, brin_summarize_range, and gin_clean_pending_list. Back-patch to v11 (all supported versions). Reviewed by FIXME. 
Discussion: https://postgr.es/m/20231001195309...@google.com

diff --git a/contrib/pgstattuple/pgstatindex.c b/contrib/pgstattuple/pgstatindex.c
index d69ac1c..8e5a4d6 100644
--- a/contrib/pgstattuple/pgstatindex.c
+++ b/contrib/pgstattuple/pgstatindex.c
@@ -238,6 +238,18 @@ pgstatindex_impl(Relation rel, FunctionCallInfo fcinfo)
 				 errmsg("cannot access temporary tables of other sessions")));

 	/*
+	 * A !indisready index could lead to ERRCODE_DATA_CORRUPTED later, so exit
+	 * early.  We're capable of assessing an indisready&&!indisvalid index,
+	 * but the results could be confusing.  For example, the index's size
+	 * could be too low for a valid index of the table.
+	 */
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+				 errmsg("index \"%s\" is not valid",
+						RelationGetRelationName(rel))));
+
+	/*
 	 * Read metapage
 	 */
 	{
@@ -523,6 +535,13 @@ pgstatginindex_internal(Oid relid, FunctionCallInfo fcinfo)
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("cannot access temporary indexes of other sessions")));

+	/* see pgstatindex_impl */
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+				 errmsg("index \"%s\" is not valid",
+						RelationGetRelationName(rel))));
+
 	/*
 	 * Read metapage
 	 */
@@ -600,6 +619,13 @@ pgstathashindex(PG_FUNCTION_ARGS)
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("cannot access temporary indexes of other sessions")));

+	/* see pgstatindex_impl */
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+				 errmsg("index \"%s\" is not valid",
+						RelationGetRelationName(rel))));
+
 	/* Get the information we need from the metapage.
	 */
	memset(&stats, 0, sizeof(stats));
	metabuf = _hash_getbuf(rel, HASH_METAPAGE, HASH_READ, LH_META_PAGE);

diff --git a/contrib/pgstattuple/pgstattuple.c b/contrib/pgstattuple/pgstattuple.c
index 93b7834..3bd8b96 100644
--- a/contrib/pgstattuple/pgstattuple.c
+++ b/contrib/pgstattuple/pgstattuple.c
@@ -259,6 +259,13 @@ pgstat_relation(Relation rel, FunctionCallInfo fcinfo)
 	}
 	else if (rel->rd_rel->relkind == RELKIND_INDEX)
 	{
+		/* see pgstatindex_impl */
+		if (!rel->rd_index->indisvalid)
+			ereport(ERROR,
+					(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+					 errmsg("index \"%s\" is not
Re: Guiding principle for dropping LLVM versions?
Hi, On October 21, 2023 7:46:17 PM PDT, Xing Guo wrote: >Can we also check if the clang's version is compatible with llvm's version >in llvm.m4? I have multiple llvm toolchains installed on my system and I >have to specify the $CLANG and $LLVM_CONFIG variables each time I build the >server against a toolchain that is not present in $PATH. If one of the >variables is missing, the build system will pick up a default one whose >version might not be compatible with the other. E.g., If we use clang-16 >and llvm-config-15, there will be issues when creating indexes for bitcodes >at the end of installation. It's unfortunately not that obvious to figure out what is compatible and what not. Older clang versions work, except if too old. Newer versions sometimes work. We could perhaps write a script that will find many, but not all, incompatibilities. For the meson build I made it just use clang belonging to the llvm install - but that's very painful when building against an assert enabled llvm, clang is slower by an order of magnitude or so. I wonder if we should change the search order to 1) CLANG, iff explicitly specified, 2) use explicitly specified or inferred llvm-config, 3) only if that didn't find clang, search path. >wrote: > >> Rebased. I also noticed this woefully out of date line: >> >> - PGAC_PATH_PROGS(LLVM_CONFIG, llvm-config llvm-config-7 >> llvm-config-6.0 llvm-config-5.0 llvm-config-4.0 llvm-config-3.9) >> + PGAC_PATH_PROGS(LLVM_CONFIG, llvm-config llvm-config-17 >> llvm-config-16 llvm-config-15 llvm-config-14) >> It's outdated, but not completely absurd - back then often no llvm-config -> llvm-config-XY was installed, but these days there pretty much always is. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: Guiding principle for dropping LLVM versions?
On Sun, Oct 22, 2023 at 3:46 PM Xing Guo wrote: > Can we also check if the clang's version is compatible with llvm's version in > llvm.m4? I have multiple llvm toolchains installed on my system and I have to > specify the $CLANG and $LLVM_CONFIG variables each time I build the server > against a toolchain that is not present in $PATH. If one of the variables is > missing, the build system will pick up a default one whose version might not > be compatible with the other. E.g., If we use clang-16 and llvm-config-15, > there will be issues when creating indexes for bitcodes at the end of > installation. Hmm. Problems that occur to me: 1. We need to decide if our rule is that clang must be <= llvm, or ==. I think this question has been left unanswered in the past when it has come up. So far I think <= would be enough to avoid the error you showed but can we find where this policy (ie especially commitments for future releases) is written down in LLVM literature? 2. Apple's clang lies about its version (I don't know the story behind that, but my wild guess is that someone from marketing wanted the compiler's version numbers to align with xcode's version numbers? they're off by 1 or something like that). Another idea could be to produce some bitcode with clang, and then check if a relevant LLVM tool can deal with it.
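A rough sketch of the configure-time check being discussed, treating clang as compatible when its major version does not exceed llvm-config's - note the "<=" policy is exactly the open question above, not something LLVM documents, and the Apple-clang caveat means raw version strings can't always be trusted (helper names are hypothetical):

```python
import re

def major_version(version_string):
    """Extract the leading major version from strings like '16.0.6'
    or 'clang version 15.0.7 (Fedora ...)'; returns None if absent."""
    match = re.search(r"(\d+)\.\d+", version_string)
    return int(match.group(1)) if match else None

def clang_compatible_with_llvm(clang_version, llvm_version):
    """Hypothetical policy: clang's major version must be <= LLVM's."""
    clang_major = major_version(clang_version)
    llvm_major = major_version(llvm_version)
    if clang_major is None or llvm_major is None:
        return False
    return clang_major <= llvm_major
```

This would reject the clang-16 + llvm-config-15 combination that breaks bitcode index creation, while still allowing an older clang with a newer LLVM.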
Re: Why is hot_standby_feedback off by default?
Hi, On October 22, 2023 4:56:15 AM PDT, Vik Fearing wrote: >On 10/22/23 09:50, sirisha chamarthi wrote: >> Is there any specific reason hot_standby_feedback default is set to off? > > >Yes. No one wants a rogue standby to ruin production.

Medium term, I think we need an approximate xid->"time of assignment" mapping that's continually maintained on the primary. One of the things that'd allow us to do is introduce a GUC to control the maximum effect of hs_feedback on the primary, in a useful unit. Numbers of xids are not a useful unit (100k xids is forever on some systems, a few minutes at best on others, and the rate is not necessarily that steady when plpgsql exception handlers are used, ...).

It'd be useful to have such a mapping for other features too. E.g.:
- making it visible in pg_stat_activity how problematic a long-running xact is - a 3 day old xact that doesn't have an xid assigned and has a recent xmin is fine, it won't prevent vacuum from doing things. But a somewhat recent xact that still has a snapshot from before an old xact was cancelled could be problematic.
- turning pg_class.relfrozenxid into an understandable timeframe. It's a fair bit of mental effort to classify "370M xids old" as problem/fine (it's e.g. not a problem on a system with a high xid rate, but it is on a big table that takes a while to vacuum).
- using the mapping to compute an xid consumption rate, which IMO would be one building block for smarter AV scheduling. Together with historical vacuum runtimes it'd allow us to start vacuuming early enough to avoid hitting thresholds, adapt pacing, prioritize between tables, etc.

Greetings, Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
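The xid->"time of assignment" mapping idea above can be made concrete with a small sketch: keep periodic (xid, timestamp) samples in xid order and estimate an xid's age from the newest sample at or below it. Everything here (class name, sampling scheme) is hypothetical, just to illustrate the concept:

```python
import bisect

class XidTimeMap:
    """Approximate xid -> assignment-time map built from periodic
    (xid, timestamp) samples appended in ascending xid order.

    An xid's age is estimated from the newest sample at or below it,
    so the estimate errs toward "younger" - a safe direction when the
    mapping is used to cap the effect of hot_standby_feedback."""

    def __init__(self):
        self._samples = []  # list of (xid, timestamp), ascending

    def record_sample(self, xid, timestamp):
        self._samples.append((xid, timestamp))

    def age_seconds(self, xid, now):
        """Estimated seconds since 'xid' was assigned, or None if the
        xid predates all recorded samples."""
        i = bisect.bisect_right(self._samples, (xid, float("inf"))) - 1
        if i < 0:
            return None
        return now - self._samples[i][1]
```

With such a map, a limit like "ignore standby feedback older than 30 minutes" becomes expressible in time rather than in xid counts.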
Re: Remove extraneous break condition in logical slot advance function
On Sat, Oct 21, 2023 at 11:40 PM Tom Lane wrote: > > Gurjeet Singh writes: > > On Fri, Oct 20, 2023 at 7:30 PM Bharath Rupireddy > > wrote: > >> There exists an extraneous break condition in > >> pg_logical_replication_slot_advance(). When the end of WAL or moveto > >> LSN is reached, the main while condition helps to exit the loop, so no > >> separate break condition is needed. Attached patch removes it. > > > The only advantage I see of the code as it stands right now is that it > > avoids one last call to CHECK_FOR_INTERRUPTS() by break'ing early. I > > don't think we'd lose much in terms of performance by making one (very > > cheap, in common case) extra call of this macro. > > Agreed, bypassing the last CHECK_FOR_INTERRUPTS() shouldn't save > anything noticeable. Could there be a correctness argument for it > though? Can't see what. We should assume that CFIs might happen > down inside LogicalDecodingProcessRecord. AFAICS, there's no correctness argument for breaking before CFI. As rightly said, CFIs can happen before the break condition either down inside LogicalDecodingProcessRecord or XLogReadRecord (page_read callbacks for instance). Having said that, what may happen if CFI happens and interrupts are processed before the break condition is that the decoding occurs again which IMV is not a big problem. An idea to keep all of XLogReadRecord() - LogicalDecodingProcessRecord() loops consistent is by having CFI at the start of the loops before the XLogReadRecord(). > I wondered why the code looks like this, and whether there used > to be more of a reason for it. "git blame" reveals the probable > answer: when this code was added, in 9c7d06d60, the loop > condition was different so the break was necessary. > 38a957316 simplified the loop condition to what we see today, > but didn't notice that the break was thereby made pointless. Right. Thanks for these references. > While we're here ... 
the comment above the loop seems wrong > already, and this makes it more so. I suggest something like > > - /* Decode at least one record, until we run out of records */ > + /* Decode records until we reach the requested target */ > while (ctx->reader->EndRecPtr < moveto) +1 and done so in the attached v2 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com

v2-0001-Remove-extraneous-break-condition-in-logical-slot.patch
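The redundancy being removed is easy to demonstrate with a hypothetical miniature of the loop (not the actual C code): an early break that tests exactly what the while condition tests changes nothing, because the condition is re-checked on the next iteration anyway.

```python
def advance_with_break(record_ptrs, moveto):
    """Mimics the old shape: loop condition plus an extraneous break."""
    i, processed = 0, []
    while i < len(record_ptrs) and record_ptrs[i] < moveto:
        processed.append(record_ptrs[i])  # "decode" the record
        i += 1
        # The extraneous break: same test the while condition performs.
        if i >= len(record_ptrs) or record_ptrs[i] >= moveto:
            break
    return processed

def advance_without_break(record_ptrs, moveto):
    """The simplified shape: the loop condition alone suffices."""
    i, processed = 0, []
    while i < len(record_ptrs) and record_ptrs[i] < moveto:
        processed.append(record_ptrs[i])
        i += 1
    return processed
```

The only observable difference in the real code is whether one extra CHECK_FOR_INTERRUPTS() runs before exiting, which, as discussed above, is harmless.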
Re: BRIN minmax multi - incorrect distance for infinite timestamp/date
On 10/20/23 11:52, Ashutosh Bapat wrote: > On Thu, Oct 19, 2023 at 6:11 PM Ashutosh Bapat > wrote: >> >> I think we should provide generate_series(date, date, integer) which >> will use date + integer -> date. > > Just to be clear, I don't mean that this patch should add it. > I'm not against adding such generate_series() variant. For this patch I'll use something like the query you proposed, I think. I was thinking about the (date + interval) failure a bit more, and while I think it's confusing it's not quite wrong. The problem is that the interval may have hours/minutes, so it makes sense that the operator returns timestamp. That's not what most operators do, where the data type does not change. So a bit unexpected, but seems correct. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
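Python's datetime module draws the same line: adding a duration that carries hours or minutes to a calendar date cannot meaningfully stay a pure date, so one first promotes the date to a timestamp. This is a hedged analogy for the (date + interval) -> timestamp behavior discussed above, not how PostgreSQL implements its operator:

```python
from datetime import date, datetime, timedelta

d = date(2018, 1, 1)

# A whole-day interval can keep the result a plain date...
plus_one_day = d + timedelta(days=1)

# ...but an interval with a time-of-day component forces a timestamp:
# promote the date to midnight first, then add.
plus_90_minutes = datetime.combine(d, datetime.min.time()) + timedelta(minutes=90)
```

So the result type changes exactly when the added interval carries sub-day precision, mirroring the "unexpected but correct" behavior of the SQL operator.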
PoC: prefetching index leaf pages (for inserts)
Hi, Some time ago I started a thread about prefetching heap pages during index scans [1]. That however only helps when reading rows, not when inserting them. Imagine inserting a row into a table with many indexes - we insert the row into the table, and then into the indexes one by one, synchronously. We walk the index, determine the appropriate leaf page, read it from disk into memory, insert the index tuple, and then do the same thing for the next index. If there are many indexes, and there's not much correlation with the table, this may easily result in I/O happening synchronously with queue depth 1. Hard to make it even slower ...

This can be a problem even with a modest number of indexes - imagine bulk-loading data into a table using COPY (say, in 100-row batches). Inserting the rows into the heap happens in bulk, but the indexes are still modified in a loop, as if for single-row inserts. Not great. With multiple connections, concurrent I/O may be generated that way, but for low-concurrency workloads (e.g. batch jobs) that may not really work.

I had an idea what we might do about this - we can walk the index, almost as if we're inserting the index tuple, but only through the "inner" non-leaf pages. And instead of descending to the leaf page, we just prefetch it. The non-leaf pages are typically <1% of the index, and hot, so likely already cached, so not worth prefetching those.

The attached patch does a PoC of this. It adds a new AM function "amprefetch", with an implementation for btree indexes, mimicking the index lookup, except that it only prefetches the leaf page as explained a bit earlier. In the executor, this is wrapped in ExecInsertPrefetchIndexes(), which gets called in various places right before ExecInsertIndexTuples(). I thought about doing the prefetching in ExecInsertIndexTuples() directly, but that would not work for COPY, where we want to issue the prefetches for the whole batch, not for individual tuples.
This may need various improvements - the prefetch duplicates a couple of steps that could be expensive (e.g. evaluation of index predicates, forming index tuples, and so on). It would be nice to improve this, but it's good enough for a PoC, I think. Another gap is the lack of incremental prefetch (ramp-up). We just prefetch all the indexes, for all tuples. But I think that's OK. We know we'll need those pages, and the number is fairly limited. There's a GUC enable_insert_prefetch that can be used to enable this insert prefetching.

I did a simple test on two machines - one with SATA SSD RAID, one with NVMe SSD. In both cases the data (table + indexes) are an order of magnitude larger than RAM. The indexes are on UUID, so pretty random, and there's no correlation. Then batches of 100, 1000 and 10000 rows are inserted, with/without the prefetching. With 5 indexes, the results look like this:

SATA SSD RAID
-------------
  rows    no prefetch       prefetch
   100     176.872 ms      70.910 ms
  1000    1035.056 ms     590.495 ms
 10000    8494.836 ms    3216.206 ms

NVMe
----
  rows    no prefetch       prefetch
   100     133.365 ms      72.899 ms
  1000    1572.379 ms     829.298 ms
 10000   11889.143 ms    3621.981 ms

Not bad, I guess. Cutting the time to ~30% is nice. The fewer the indexes, the smaller the difference (with 1 index there is almost no difference), of course.
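The prefetch the patch issues for a leaf page boils down to an advisory read-ahead hint on the block's file range, which is what lets many hints be queued before any of the reads are needed. A hedged Python sketch of that idea (hypothetical helper; os.posix_fadvise is Linux/POSIX-specific, and this is an analogy for PrefetchBuffer(), not the patch's code):

```python
import os
import tempfile

BLCKSZ = 8192  # PostgreSQL's default block size

def prefetch_blocks(fd, block_numbers):
    """Hint the kernel to read the given blocks ahead of use.

    Purely advisory: it may start I/O in the background (if the pages
    are not already cached) and returns immediately, which is what lets
    a caller batch hints for many index leaf pages at queue depth > 1.
    """
    for blkno in sorted(set(block_numbers)):
        os.posix_fadvise(fd, blkno * BLCKSZ, BLCKSZ, os.POSIX_FADV_WILLNEED)

# Usage: write a small file, prefetch a few "leaf pages", then read one.
path = os.path.join(tempfile.gettempdir(), "prefetch_demo.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * (BLCKSZ * 16))

fd = os.open(path, os.O_RDONLY)
prefetch_blocks(fd, [7, 2, 11, 2])
data = os.pread(fd, BLCKSZ, 2 * BLCKSZ)
os.close(fd)
os.remove(path)
```

Because the hint returns immediately, issuing it for every index of a whole COPY batch overlaps the leaf-page reads instead of serializing them, which matches the speedups in the table above.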
regards

[1] https://commitfest.postgresql.org/45/4351/

-- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index af392bc032..dc3197401c 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -117,6 +117,7 @@ brinhandler(PG_FUNCTION_ARGS)
 	amroutine->ambuild = brinbuild;
 	amroutine->ambuildempty = brinbuildempty;
 	amroutine->aminsert = brininsert;
+	amroutine->amprefetch = NULL;
 	amroutine->ambulkdelete = brinbulkdelete;
 	amroutine->amvacuumcleanup = brinvacuumcleanup;
 	amroutine->amcanreturn = NULL;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 7a4cd93f30..666d58a750 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -64,6 +64,7 @@ ginhandler(PG_FUNCTION_ARGS)
 	amroutine->ambuild = ginbuild;
 	amroutine->ambuildempty = ginbuildempty;
 	amroutine->aminsert = gininsert;
+	amroutine->amprefetch = NULL;
 	amroutine->ambulkdelete = ginbulkdelete;
 	amroutine->amvacuumcleanup = ginvacuumcleanup;
 	amroutine->amcanreturn = NULL;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 8ef5fa0329..3ed72cce44 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -86,6 +86,7 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->ambuild = gistbuild;
 	amroutine->ambuildempty = gistbuildempty;
 	amroutine->aminsert = gistinsert;
+	amroutine->amprefetch = NULL;
 	amroutine->ambulkdelete = gistbulkdelete;
 	amroutine->amvacuumcleanup =
Re: Why is hot_standby_feedback off by default?
On 10/22/23 09:50, sirisha chamarthi wrote: Is there any specific reason hot_standby_feedback default is set to off? Yes. No one wants a rogue standby to ruin production. -- Vik Fearing
Re: missing dependencies PostgreSQL 16 OpenSUSE Tumbleweed (SLES 15.5 packages)
Hi, On Sun, Oct 22, 2023 at 11:00:07AM +0200, André Verwijs wrote: > missing dependencies PostgreSQL 16 OpenSUSE Tumbleweed (SLES 15.5 > packages) > > --- > > YaST2 conflicts list - generated 2023-10-22 10:30:07 > > there is no package providing 'libldap_r-2.4.so.2()(64bit)' required by > installing postgresql16-server-16.0-1PGDG.sles15.x86_64 Those are the packages from zypp.postgresql.org, right? There is a link to the issue tracker at https://redmine.postgresql.org/projects/pgrpms/ from the home page; I think it would be best to report the problem there. Michael
missing dependencies PostgreSQL 16 OpenSUSE Tumbleweed (SLES 15.5 packages)
missing dependencies PostgreSQL 16 OpenSUSE Tumbleweed (SLES 15.5 packages)

---

YaST2 conflicts list - generated 2023-10-22 10:30:07

there is no package providing 'libldap_r-2.4.so.2()(64bit)' required by installing postgresql16-server-16.0-1PGDG.sles15.x86_64

[ ] break 'postgresql16-server-16.0-1PGDG.sles15.x86_64' by ignoring some dependencies
[ ] do not install postgresql16-server-16.0-1PGDG.sles15.x86_64

YaST2 conflicts list END
###

dependencies needed:
libpq.so.5()(64bit)
libpthread.so.0()(64bit)
libpthread.so.0(GLIBC_2.2.5)(64bit)
libm.so.6()(64bit)
libm.so.6(GLIBC_2.2.5)(64bit)
libm.so.6(GLIBC_2.29)(64bit)
libcrypto.so.1.1()(64bit)
libcrypto.so.1.1(OPENSSL_1_1_0)(64bit)
libssl.so.1.1()(64bit)
liblz4.so.1()(64bit)
libssl.so.1.1(OPENSSL_1_1_0)(64bit)
libz.so.1()(64bit)
libxml2.so.2()(64bit)
libxml2.so.2(LIBXML2_2.4.30)(64bit)
libzstd.so.1()(64bit)
libxml2.so.2(LIBXML2_2.6.0)(64bit)
libc.so.6(GLIBC_2.25)(64bit)
libcrypto.so.1.1(OPENSSL_1_1_1)(64bit)
libdl.so.2()(64bit)
libgssapi_krb5.so.2()(64bit)
libgssapi_krb5.so.2(gssapi_krb5_2_MIT)(64bit)
libldap_r-2.4.so.2()(64bit)
libpam.so.0()(64bit)
libpam.so.0(LIBPAM_1.0)(64bit)
libsystemd.so.0()(64bit)
libsystemd.so.0(LIBSYSTEMD_209)(64bit)
libdl.so.2(GLIBC_2.2.5)(64bit)
libicudata.so.suse65.1()(64bit)
libicui18n.so.suse65.1()(64bit)
libicuuc.so.suse65.1()(64bit)
librt.so.1()(64bit)
librt.so.1(GLIBC_2.2.5)(64bit)
libxml2.so.2(LIBXML2_2.6.23)(64bit)
libxml2.so.2(LIBXML2_2.6.8)(64bit)
util-linux
postgresql16(x86-64) = 16.0-1PGDG.sles15
postgresql16-libs(x86-64) = 16.0-1PGDG.sles15
/bin/sh
/usr/sbin/useradd
systemd
/usr/sbin/groupadd
glibc

used repos:
https://download.postgresql.org/pub/repos/zypp/16/suse/sles-15.5-x86_64/
https://download.postgresql.org/pub/repos/zypp/srpms/16/suse/sles-15.5-x86_64/
http://download.opensuse.org/repositories/server:database:postgresql/openSUSE_Tumbleweed/

how can I resolve these dependencies?
Why is hot_standby_feedback off by default?
Hi Hackers, Is there any specific reason hot_standby_feedback default is set to off? I see some explanation in the thread [1] about a recovery_min_apply_delay value > 0 causing table bloat. However, recovery_min_apply_delay is set to 0 by default. So, if a server admin wants to change this value, they can change hot_standby_feedback as well if needed, right? Thanks! [1]: https://www.postgresql.org/message-id/55f981ec.5040...@gmx.net