Re: DSA overflow in hash join

2025-07-31 Thread Konstantin Knizhnik
On 27/07/2025 8:24 PM, Konstantin Knizhnik wrote: I still trying to understand the reason of DSA overflow in hash join. In addition to two suspicious places where number of buckets is doubled without chek for overflow (nodeHash.c:1668 and nodeHash.c:3290), there is one  more place  where

Re: DSA overflow in hash join

2025-07-27 Thread Konstantin Knizhnik
I still trying to understand the reason of DSA overflow in hash join. In addition to two suspicious places where number of buckets is doubled without chek for overflow (nodeHash.c:1668 and nodeHash.c:3290), there is one  more place  where number of batches is multiplied by `EstimateParallelHashJ

Re: Logical replication prefetch

2025-07-26 Thread Konstantin Knizhnik
Completely rewritten version of prefetch patch. Now prefetch workers do not try to apply LR application and then rollback transaction. They just perform indexes lookup and so prefetch index and referenced heap pages. So no any hacks are needed to prevent lock conflicts and WAL logging. Perform

DSA overflow in hash join

2025-07-20 Thread Konstantin Knizhnik
Hi hackers! There is weird error rarely reproduced with sqlancer: `ERROR: invalid DSA memory alloc request size 1140850688`: ** *-* * FinalizeAggregate(cost=114075075706156

Re: Logical replication prefetch

2025-07-19 Thread Konstantin Knizhnik
On 15/07/2025 2:31 PM, Amit Kapila wrote: If you are interested, I would like to know your opinion on a somewhat related topic, which has triggered my interest in your patch. We are working on an update_delete conflict detection patch. The exact problem was explained in the initial email [1]. T

Re: Logical replication prefetch

2025-07-13 Thread Konstantin Knizhnik
On 14/07/2025 4:20 am, Zhijie Hou (Fujitsu) wrote: Thank you for the proposal ! I find it to be a very interesting feature。 I tested the patch you shared in your original email and encountered potential deadlocks when testing pgbench TPC-B like workload. Could you please provide an updated patc

Re: Logical replication prefetch

2025-07-13 Thread Konstantin Knizhnik
On 13/07/2025 1:28 pm, Amit Kapila wrote: On Tue, Jul 8, 2025 at 12:06 PM Konstantin Knizhnik wrote: There is well known Postgres problem that logical replication subscriber can not caught-up with publisher just because LR changes are applied by single worker and at publisher changes are

Re: Logical replication prefetch

2025-07-13 Thread Konstantin Knizhnik
On 13/07/2025 9:28 am, Amit Kapila wrote: I didn't understand your scenario. pa_launch_parallel_worker() should spawn a new worker only if all the workers in the pool are busy, and then it will free the worker if the pool already has enough workers. So, do you mean to say that the workers in th

Re: Logical replication prefetch

2025-07-11 Thread Konstantin Knizhnik
On 08/07/2025 2:51 pm, Amit Kapila wrote: On Tue, Jul 8, 2025 at 12:06 PM Konstantin Knizhnik wrote: There is well known Postgres problem that logical replication subscriber can not caught-up with publisher just because LR changes are applied by single worker and at publisher changes are

Re: Logical replication prefetch

2025-07-11 Thread Konstantin Knizhnik
On 11/07/2025 11:52 am, Amit Kapila wrote: On Wed, Jul 9, 2025 at 12:08 AM Konstantin Knizhnik wrote: On 08/07/2025 2:51 pm, Amit Kapila wrote: On Tue, Jul 8, 2025 at 12:06 PM Konstantin Knizhnik wrote: It is possible to enforce parallel apply of short transactions using

Re: Logical replication prefetch

2025-07-08 Thread Konstantin Knizhnik
On 08/07/2025 2:51 pm, Amit Kapila wrote: On Tue, Jul 8, 2025 at 12:06 PM Konstantin Knizhnik wrote: It is possible to enforce parallel apply of short transactions using `debug_logical_replication_streaming` but then performance is ~2x times slower than in case of sequential apply by single

Logical replication prefetch

2025-07-07 Thread Konstantin Knizhnik
There is well known Postgres problem that logical replication subscriber can not caught-up with publisher just because LR changes are applied by single worker and at publisher changes are made by multiple concurrent backends. The problem is not logical replication specific: physical replication

Re: Non-reproducible AIO failure

2025-06-19 Thread Konstantin Knizhnik
On 18/06/2025 7:08 pm, Andres Freund wrote: Hi, On 2025-06-18 10:32:08 +0300, Konstantin Knizhnik wrote: On 17/06/2025 6:08 pm, Andres Freund wrote: I don't think it can - this must be an independent bug from the one that Tom and I were encountering. I see... It's a pit

Re: Non-reproducible AIO failure

2025-06-18 Thread Konstantin Knizhnik
On 17/06/2025 6:08 pm, Andres Freund wrote: I don't think it can - this must be an independent bug from the one that Tom and I were encountering. I see... It's a pity. By the way, I have a questions concerning using interrupts in AIO. The comments say: pgaio_io_release(PgAioHandle *ioh)  

Re: Non-reproducible AIO failure

2025-06-17 Thread Konstantin Knizhnik
On 17/06/2025 4:47 pm, Andres Freund wrote: I and Alexandr are using completely different devices with different hardware, OS and clang version. Both of you are running Ventura, right? No, Alexandr is using darwin23.5 Alexandr also noticed that he can reproduce the problem only with --with-l

Re: Non-reproducible AIO failure

2025-06-17 Thread Konstantin Knizhnik
On 12/06/2025 4:57 pm, Andres Freund wrote: The problem appears to be in that switch between "when submitted, by the IO worker" and "then again by the backend". It's not concurrent access in the sense of two processes writing to the same value, it's that when switching from the worker updating

Re: Non-reproducible AIO failure

2025-06-17 Thread Konstantin Knizhnik
On 17/06/2025 4:35 pm, Andres Freund wrote: Konstantin, Alexander - are you using the same device to reproduce this or different ones? I wonder if this somehow depends on some MDM / corporate enforcement tooling running or such. What does: - profiles status -type enrollment - kextstat -l show?

Re: Non-reproducible AIO failure

2025-06-17 Thread Konstantin Knizhnik
On 17/06/2025 3:22 am, Tom Lane wrote: Konstantin Knizhnik writes: On 16/06/2025 6:11 pm, Andres Freund wrote: I unfortunately can't repro this issue so far. But unfortunately it means that the problem is not fixed. FWIW, I get similar results to Andres' on a Mac Mini M4 Pro usin

Re: Non-reproducible AIO failure

2025-06-16 Thread Konstantin Knizhnik
On 16/06/2025 6:11 pm, Andres Freund wrote: Hi, On 2025-06-16 14:11:39 +0300, Konstantin Knizhnik wrote: One more update: with the proposed patch (memory barrier before `ConditionVariableBroadcast` in `pgaio_io_process_completion` I don't see how that barrier could be require

Re: Non-reproducible AIO failure

2025-06-16 Thread Konstantin Knizhnik
One more update: with the proposed patch (memory barrier before `ConditionVariableBroadcast` in `pgaio_io_process_completion` and replacing bit fields with `uint8`) the problem is not reproduced at my system during 5 seconds.

Re: Non-reproducible AIO failure

2025-06-15 Thread Konstantin Knizhnik
With this two additional changes: diff --git a/src/backend/storage/aio/aio.c b/src/backend/storage/aio/aio.c index 6c6c0a908e2..6dd2816bea9 100644 --- a/src/backend/storage/aio/aio.c +++ b/src/backend/storage/aio/aio.c @@ -538,6 +538,9 @@ pgaio_io_process_completion(PgAioHandle *ioh, int result)

Re: Non-reproducible AIO failure

2025-06-15 Thread Konstantin Knizhnik
On 13/06/2025 11:20 pm, Andres Freund wrote: Attached is a patch that fixes the problem for me. Alexander, Konstantin, could you verify that it also fixes the problem for you? Given that it does address the problem for me, I'm inclined to push this fairly soon, the barrier is pretty obviously r

Re: Non-reproducible AIO failure

2025-06-14 Thread Konstantin Knizhnik
On 13/06/2025 11:20 pm, Andres Freund wrote: Hi, On 2025-06-12 12:23:13 -0400, Andres Freund wrote: On 2025-06-12 11:52:31 -0400, Andres Freund wrote: On 2025-06-12 17:22:22 +0300, Konstantin Knizhnik wrote: On 12/06/2025 4:57 pm, Andres Freund wrote: The problem appears to be in that

Re: Non-reproducible AIO failure

2025-06-12 Thread Konstantin Knizhnik
On 12/06/2025 4:57 pm, Andres Freund wrote: The problem appears to be in that switch between "when submitted, by the IO worker" and "then again by the backend". It's not concurrent access in the sense of two processes writing to the same value, it's that when switching from the worker updating

Re: Non-reproducible AIO failure

2025-06-12 Thread Konstantin Knizhnik
On 12/06/2025 4:13 pm, Andres Freund wrote: Hi, On 2025-06-12 15:12:00 +0300, Konstantin Knizhnik wrote: Reproduced it once again with with write-protected io handle. But once again - no access violation, just assert failure. Previously "op" field was overwritten somewhe

Re: Non-reproducible AIO failure

2025-06-12 Thread Konstantin Knizhnik
Reproduced it once again with with write-protected io handle. But once again - no access violation, just assert failure. Previously "op" field was overwritten somewhere between `pgaio_io_reclaim` and `AsyncReadBuffers`: !!!pgaio_io_reclaim [20376]| ioh: 0x1019bc000, ioh->op: 0, ioh->generatio

Re: Non-reproducible AIO failure

2025-06-11 Thread Konstantin Knizhnik
I tried to catch moment when memory is changed using mprotect. I have aligned PgAioHandle on page boundary (16kb at MacOS), and disable writes in `pgaio_io_reclaim`: ``` static void pgaio_io_reclaim(PgAioHandle *ioh) {    RESUME_INTERRUPTS();     rc = mprotect(ioh, sizeof(*ioh), PROT_READ);    

Re: Non-reproducible AIO failure

2025-06-10 Thread Konstantin Knizhnik
On 10/06/2025 8:41 pm, Andres Freund wrote: I was able to reproduce it with gcc, too. I've reproduced it without that bitfield, unfortunately :(. But also only at MacOS? I wonder if it is possible to set hardware watchpoint fro program itself (not using gdb)? I.e. using ptrace? Looks lik

Re: Non-reproducible AIO failure

2025-06-10 Thread Konstantin Knizhnik
On 09/06/2025 2:05 am, Thomas Munro wrote: On Sat, Jun 7, 2025 at 6:47 AM Andres Freund wrote: On 2025-06-06 14:03:12 +0300, Konstantin Knizhnik wrote: There is really essential difference in code generated by clang 15 (working) and 16 (not working). There also are code gen differences

Re: Non-reproducible AIO failure

2025-06-07 Thread Konstantin Knizhnik
On 06/06/2025 10:21 pm, Tom Lane wrote: Konstantin Knizhnik writes: There is really essential difference in code generated by clang 15 (working) and 16 (not working). It's a mistake to think that this is a compiler bug. The C standard explicitly allows compilers to use word-wide opera

Re: Non-reproducible AIO failure

2025-06-06 Thread Konstantin Knizhnik
On 06/06/2025 9:47 pm, Andres Freund wrote: Hi, On 2025-06-06 14:03:12 +0300, Konstantin Knizhnik wrote: There is really essential difference in code generated by clang 15 (working) and 16 (not working). There also are code gen differences between upstream clang 17 and apple's clang,

Re: Non-reproducible AIO failure

2025-06-06 Thread Konstantin Knizhnik
There is really essential difference in code generated by clang 15 (working) and 16 (not working). ``` pgaio_io_stage(PgAioHandle *ioh, PgAioOp op) { ... HOLD_INTERRUPTS();     ioh->op = op;     ioh->result = 0;     pgaio_io_update_state(ioh, PGAIO_HS_DEFINED);     ... } ``` c

Re: Non-reproducible AIO failure

2025-06-05 Thread Konstantin Knizhnik
On 06/06/2025 2:31 am, Tom Lane wrote: Matthias van de Meent writes: I have a very wild guess that's probably wrong in a weird way, but here goes anyway: Did anyone test if interleaving the enum-typed bitfield fields of PgAioHandle with the uint8 fields might solve the issue? Ugh. I think y

Re: Changing shared_buffers without restart

2025-04-18 Thread Konstantin Knizhnik
On 25/02/2025 11:52 am, Dmitry Dolgov wrote: On Fri, Oct 18, 2024 at 09:21:19PM GMT, Dmitry Dolgov wrote: TL;DR A PoC for changing shared_buffers without PostgreSQL restart, via changing shared memory mapping layout. Any feedback is appreciated. Hi Dmitry, I am sorry that I have not participa

Re: Changing shared_buffers without restart

2025-04-18 Thread Konstantin Knizhnik
On 18/04/2025 12:26 am, Dmitry Dolgov wrote: On Thu, Apr 17, 2025 at 02:21:07PM GMT, Konstantin Knizhnik wrote: 1. Performance of Postgres CLOCK page eviction algorithm depends on number of shared buffers. My first native attempt just to mark unused buffers as invalid cause significant

Re: New criteria for autovacuum

2025-04-05 Thread Konstantin Knizhnik
On 03/04/2025 6:29 pm, Aleksander Alekseev wrote: I have mixed feelings about addressing this. Although this behavior is somewhat counterintuitive, if the user has a read-only lookup table he/she can always execute VACUUM manually. In order to relieve him of this unbearable burden we are going

Re: New criteria for autovacuum

2025-04-04 Thread Konstantin Knizhnik
On 04/04/2025 10:41 pm, Melanie Plageman wrote: On Fri, Apr 4, 2025 at 3:27 PM Konstantin Knizhnik wrote: From logical point of view I agree with you: taken in account number of inserted tuples makes sense if it allows to mark page as all-visible. So `ins_since_vacuum` should be better

Re: New criteria for autovacuum

2025-04-04 Thread Konstantin Knizhnik
On 04/04/2025 7:10 pm, Melanie Plageman wrote: On Fri, Apr 4, 2025 at 1:53 AM Konstantin Knizhnik wrote: What is needed to reproduce the problem? 1. Table with populated data 2. Presence of transaction with assigned XID which prevents vacuum from marking pages of this table as all visible 3

Re: New criteria for autovacuum

2025-04-03 Thread Konstantin Knizhnik
On 03/04/2025 6:50 pm, Aleksander Alekseev wrote: Hi, ... and it is claimed that autovacuum will never be triggered in order to set hint bits, assuming we never modify the table again. Actually I waited a bit and got a better EXPLAIN: ``` Index Only Scan using humidity_idx on humidity (c

New criteria for autovacuum

2025-04-03 Thread Konstantin Knizhnik
Hi hackers, Sometime ago I investigated slow query performance case of one customer and noticed that index-only scan has made a lot of heap fetches. -> Index Only Scan using ix_client_objects_vendor_object_id on client_objects client_objects_1 (cost=0.56..2.78 rows=1 width=0) (actual time=0

Re: Contradictory behavior of array_agg(distinct) aggregate.

2024-12-03 Thread Konstantin Knizhnik
On 04/12/2024 9:03 am, Tom Lane wrote: Konstantin Knizhnik writes: postgres=# create table t(x integer unique); CREATE TABLE postgres=# insert into t values (null),(null); INSERT 0 2 postgres=# select count(distinct x) from t;  count --- 0 (1 row) postgres=# select array_agg

Contradictory behavior of array_agg(distinct) aggregate.

2024-12-03 Thread Konstantin Knizhnik
Hi hackers! Is it only me who consider that current behavior of array_agg(distinct) contradicts to interpretation of nulls in other cases ("null" is something like "unknown" which means that we can not say weather two nulls are the same or not). This is why it is allowed to insert multiple n

Incorrect result of bitmap heap scan.

2024-12-02 Thread Konstantin Knizhnik
Hi hackers, Attached script reproduces the problem with incorrect results of `select count(*)` (it returns larger number of records than really available in the table). It is not always reproduced, so you may need to repeat it multiple times - at my system it failed 3 times from 10. The prob

asynchronous commit&synchronous replication

2024-08-10 Thread Konstantin Knizhnik
Hi hackers, Logical replication apply worker by default switches off asynchronous commit. Cite from documentation of subscription parameters: ``` |synchronous_commit|(|enum|)

Re: 回复: An implementation of multi-key sort

2024-07-06 Thread Konstantin Knizhnik
On 04/07/2024 3:45 pm, Yao Wang wrote: Generally, the benefit of mksort is mainly from duplicated values and sort keys: the more duplicated values and sort keys are, the bigger benefit it gets. ... 1. Use distinct stats info of table to enable mksort It's kind of heuristics: in optimizer, ch

Re: Flush pgstats file during checkpoints

2024-06-28 Thread Konstantin Knizhnik
On 18/06/2024 9:01 am, Michael Paquier wrote: Hi all, On HEAD, xlog.c has the following comment, which has been on my own TODO list for a couple of weeks now: * TODO: With a bit of extra work we could just start with a pgstat file * associated with the checkpoint redo location we'

Re: PGC_SIGHUP shared_buffers?

2024-02-18 Thread Konstantin Knizhnik
On 16/02/2024 10:37 pm, Thomas Munro wrote: On Fri, Feb 16, 2024 at 5:29 PM Robert Haas wrote: 3. Reserve lots of address space and then only use some of it. I hear rumors that some forks of PG have implemented something like this. The idea is that you convince the OS to give you a whole bunch

Re: index prefetching

2024-01-21 Thread Konstantin Knizhnik
On 22/01/2024 1:39 am, Tomas Vondra wrote: Why we can prefer covering index  to compound index? I see only two good reasons: 1. Extra columns type do not  have comparison function need for AM. 2. The extra columns are never used in query predicate. Or maybe you don't want to include the column

Re: index prefetching

2024-01-21 Thread Konstantin Knizhnik
On 22/01/2024 1:47 am, Tomas Vondra wrote: h, right. Well, you're right in this case we perhaps could set just one of those flags, but the "purpose" of the two places is quite different. The "prefetch" flag is fully controlled by the prefetcher, and it's up to it to change it (e.g. I can easily

Re: index prefetching

2024-01-21 Thread Konstantin Knizhnik
On 19/01/2024 2:35 pm, Tomas Vondra wrote: On 1/19/24 09:34, Konstantin Knizhnik wrote: On 18/01/2024 6:00 pm, Tomas Vondra wrote: On 1/17/24 09:45, Konstantin Knizhnik wrote: I have integrated your prefetch patch in Neon and it actually works! Moreover, I combined it with prefetch of leaf

Re: index prefetching

2024-01-21 Thread Konstantin Knizhnik
On 20/01/2024 12:14 am, Tomas Vondra wrote: Looks like I was not true, even if it is not index-only scan but index condition involves only index attributes, then heap is not accessed until we find tuple satisfying search condition. Inclusive index case described above (https://commitfest.postgr

Re: index prefetching

2024-01-19 Thread Konstantin Knizhnik
On 18/01/2024 5:57 pm, Tomas Vondra wrote: On 1/16/24 21:10, Konstantin Knizhnik wrote: ... 4. I think that performing prefetch at executor level is really great idea and so prefetch can be used by all indexes, including custom indexes. But prefetch will be efficient only if index can

Re: index prefetching

2024-01-19 Thread Konstantin Knizhnik
On 18/01/2024 6:00 pm, Tomas Vondra wrote: On 1/17/24 09:45, Konstantin Knizhnik wrote: I have integrated your prefetch patch in Neon and it actually works! Moreover, I combined it with prefetch of leaf pages for IOS and it also seems to work. Cool! And do you think this is the right design

Re: index prefetching

2024-01-17 Thread Konstantin Knizhnik
I have integrated your prefetch patch in Neon and it actually works! Moreover, I combined it with prefetch of leaf pages for IOS and it also seems to work. Just small notice: you are reporting `blks_prefetch_rounds` in explain, but it is not incremented anywhere. Moreover, I do not precisely u

Re: index prefetching

2024-01-17 Thread Konstantin Knizhnik
On 16/01/2024 11:58 pm, Jim Nasby wrote: On 1/16/24 2:10 PM, Konstantin Knizhnik wrote: Amazon RDS is just vanilla Postgres with file system mounted on EBS (Amazon  distributed file system). EBS provides good throughput but larger latencies comparing with local SSDs. I am not sure if read

Re: index prefetching

2024-01-16 Thread Konstantin Knizhnik
On 16/01/2024 11:58 pm, Jim Nasby wrote: On 1/16/24 2:10 PM, Konstantin Knizhnik wrote: Amazon RDS is just vanilla Postgres with file system mounted on EBS (Amazon  distributed file system). EBS provides good throughput but larger latencies comparing with local SSDs. I am not sure if read

Re: index prefetching

2024-01-16 Thread Konstantin Knizhnik
On 16/01/2024 6:25 pm, Tomas Vondra wrote: On 1/16/24 09:13, Konstantin Knizhnik wrote: Hi, On 12/01/2024 6:42 pm, Tomas Vondra wrote: Hi, Here's an improved version of this patch, finishing a lot of the stuff that I alluded to earlier - moving the code from indexam.c, renaming a bun

Re: Custom explain options

2024-01-16 Thread Konstantin Knizhnik
On 16/01/2024 5:38 pm, Tomas Vondra wrote: By "broken" you mean that you prefetch items only from a single leaf page, so immediately after reading the next one nothing is prefetched. Correct? Yes, exactly. It means that reading first heap page from next leaf page will be done without prefetc

Re: index prefetching

2024-01-16 Thread Konstantin Knizhnik
Hi, On 12/01/2024 6:42 pm, Tomas Vondra wrote: Hi, Here's an improved version of this patch, finishing a lot of the stuff that I alluded to earlier - moving the code from indexam.c, renaming a bunch of stuff, etc. I've also squashed it into a single patch, to make it easier to review. I am th

Re: Custom explain options

2024-01-15 Thread Konstantin Knizhnik
On 15/01/2024 5:08 pm, Tomas Vondra wrote: My patch does not care about prefetching internal index pages. Yes, it's a limitation, but my assumption is the internal pages are maybe 0.1% of the index, and typically very hot / cached. Yes, if the index is not used very often, this may be untrue. B

Re: Custom explain options

2024-01-15 Thread Konstantin Knizhnik
On 14/01/2024 11:47 pm, Tomas Vondra wrote: The thing that was not clear to me is who decides what to prefetch, which code issues the prefetch requests etc. In the github links you shared I see it happens in the index AM code (in nbtsearch.c). It is up to the particular plan node (seqscan, in

Re: Custom explain options

2024-01-13 Thread Konstantin Knizhnik
On 13/01/2024 4:51 pm, Tomas Vondra wrote: On 1/12/24 20:30, Konstantin Knizhnik wrote: On 12/01/2024 7:03 pm, Tomas Vondra wrote: On 10/21/23 14:16, Konstantin Knizhnik wrote: Hi hackers, EXPLAIN statement has a list of options (i.e. ANALYZE, BUFFERS, COST,...) which help to provide

Re: Custom explain options

2024-01-12 Thread Konstantin Knizhnik
On 12/01/2024 7:03 pm, Tomas Vondra wrote: On 10/21/23 14:16, Konstantin Knizhnik wrote: Hi hackers, EXPLAIN statement has a list of options (i.e. ANALYZE, BUFFERS, COST,...) which help to provide useful details of query execution. In Neon we have added PREFETCH option which shows

Re: Custom explain options

2024-01-10 Thread Konstantin Knizhnik
On 09/01/2024 10:33 am, vignesh C wrote: On Sat, 21 Oct 2023 at 18:34, Konstantin Knizhnik wrote: Hi hackers, EXPLAIN statement has a list of options (i.e. ANALYZE, BUFFERS, COST,...) which help to provide useful details of query execution. In Neon we have added PREFETCH option which shows

Re: Custom explain options

2024-01-10 Thread Konstantin Knizhnik
On 10/01/2024 8:29 am, Andrei Lepikhov wrote: On 30/11/2023 22:40, Konstantin Knizhnik wrote: In all this cases we are using array of `Instrumentation` and if it contains varying part, then it is not clear where to place it. Yes, there is also code which serialize and sends instrumentations

Re: Custom explain options

2024-01-10 Thread Konstantin Knizhnik
On 10/01/2024 8:46 am, Michael Paquier wrote: On Wed, Jan 10, 2024 at 01:29:30PM +0700, Andrei Lepikhov wrote: What do you think about this really useful feature? Do you wish to develop it further? I am biased here. This seems like a lot of code for something we've been delegating to the exp

Re: Custom explain options

2023-11-30 Thread Konstantin Knizhnik
On 30/11/2023 5:59 am, Andrei Lepikhov wrote: On 21/10/2023 19:16, Konstantin Knizhnik wrote: EXPLAIN statement has a list of options (i.e. ANALYZE, BUFFERS, COST,...) which help to provide useful details of query execution. In Neon we have added PREFETCH option which shows information about

Custom explain options

2023-10-21 Thread Konstantin Knizhnik
Hi hackers, EXPLAIN statement has a list of options (i.e. ANALYZE, BUFFERS, COST,...) which help to provide useful details of query execution. In Neon we have added PREFETCH option which shows information about page prefetching during query execution (prefetching is more critical for Neon archi

Re: Can concurrent create index concurrently block each other?

2023-10-15 Thread Konstantin Knizhnik
On 15/10/2023 10:59 pm, Tom Lane wrote: Konstantin Knizhnik writes: One our customer complains that he spawned two `create index concurrently` for two different tables and both stuck in"waiting for old snapshots". I wonder if two CIC can really block each other in `WaitForOlde

Can concurrent create index concurrently block each other?

2023-10-15 Thread Konstantin Knizhnik
One our customer complains that he spawned two `create index concurrently` for two different tables and both stuck in"waiting for old snapshots". I wonder if two CIC can really block each other in `WaitForOlderSnapshots`? I found the similar question in hacker archive: https://www.postgresql.or

Re: Index range search optimization

2023-10-06 Thread Konstantin Knizhnik
On 04/10/2023 3:00 am, Alexander Korotkov wrote: On Wed, Oct 4, 2023 at 12:59 AM Pavel Borisov wrote: I've looked through the patch v8. I think it's good enough to be pushed if Peter has no objections. Thank you, Pavel. I'll push this if there are no objections. -- Regards, Alexander Ko

Re: Support prepared statement invalidation when result types change

2023-08-28 Thread Konstantin Knizhnik
On 25.08.2023 8:57 PM, Jelte Fennema wrote: The cached plan for a prepared statements can get invalidated when DDL changes the tables used in the query, or when search_path changes. When this happens the prepared statement can still be executed, but it will be replanned in the new context. Thi

Sync scan & regression tests

2023-08-06 Thread Konstantin Knizhnik
Hi hackers, Is it is ok, that regression tests do not pass with small value of shared buffers (for example 1Mb)? Two tests are failed because of sync scan - this tests cluster.sql and portals.sql perform seqscan without explicit order by and expect that data will be returned in particular or

One more problem with JIT

2023-08-01 Thread Konstantin Knizhnik
Hi hackers, I am using pg_embedding extension for Postgres which implements HNSW index (some kind of ANN search). Search query looks something like this:     SELECT _id FROM documents ORDER BY openai <=> ARRAY[0.024466066, -0.00042, -0.0012917554,... , -0.008700027] LIMIT 1; I do not p

Index range search optimization

2023-06-23 Thread Konstantin Knizhnik
Hi hackers. _bt_readpage performs key check for each item on the page trying to locate upper boundary. While comparison of simple integer keys are very fast, comparison of long strings can be quite expensive. We can first make check for the largest key on the page and if it is not larger than

Re: Let's make PostgreSQL multi-threaded

2023-06-15 Thread Konstantin Knizhnik
On 15.06.2023 12:04 PM, Hannu Krosing wrote: So a fair bit of work but also a clearly defined benefits of 1) reduced memory usage 2) no need to rebuild caches for each new connection 3) no need to track PREPARE statements inside connection poolers. Shared plan cache (not only prepared statem

Re: Let's make PostgreSQL multi-threaded

2023-06-15 Thread Konstantin Knizhnik
On 15.06.2023 11:41 AM, James Addison wrote: On Thu, 15 Jun 2023 at 08:12, Konstantin Knizhnik wrote: On 15.06.2023 1:23 AM, James Addison wrote: On Tue, 13 Jun 2023 at 07:55, Konstantin Knizhnik wrote: On 12.06.2023 3:23 PM, Pavel Borisov wrote: Is the following true or not? 1. If

Re: Bypassing shared_buffers

2023-06-15 Thread Konstantin Knizhnik
On 15.06.2023 4:37 AM, Vladimir Churyukin wrote: Ok, got it, thanks. Is there any alternative approach to measuring the performance as if the cache was empty? The goal is basically to calculate the max possible I/O time for a query, to get a range between min and max timing. It's ok if it's

Re: Let's make PostgreSQL multi-threaded

2023-06-15 Thread Konstantin Knizhnik
On 15.06.2023 1:23 AM, James Addison wrote: On Tue, 13 Jun 2023 at 07:55, Konstantin Knizhnik wrote: On 12.06.2023 3:23 PM, Pavel Borisov wrote: Is the following true or not? 1. If we switch processes to threads but leave the amount of session local variables unchanged, there would be

Re: Let's make PostgreSQL multi-threaded

2023-06-13 Thread Konstantin Knizhnik
On 13.06.2023 11:46 AM, Kyotaro Horiguchi wrote: So we can assume that catalog  and relation cache should always fit in memory memory (otherwise significant rewriting of all Postgtres code working with relations will be needed). I'm not sure that is ture.. But likely to be? Sorry, looks li

Re: Let's make PostgreSQL multi-threaded

2023-06-13 Thread Konstantin Knizhnik
On 13.06.2023 10:55 AM, Kyotaro Horiguchi wrote: At Tue, 13 Jun 2023 09:55:36 +0300, Konstantin Knizhnik wrote in Postgres backend is "thick" not because of large number of local variables. It is because of local caches: catalog cache, relation cache, prepared statements cache,.

Re: Let's make PostgreSQL multi-threaded

2023-06-12 Thread Konstantin Knizhnik
On 12.06.2023 3:23 PM, Pavel Borisov wrote: Is the following true or not? 1. If we switch processes to threads but leave the amount of session local variables unchanged, there would be hardly any performance gain. 2. If we move some backend's local variables into shared memory then the perfor

Re: Named Prepared statement problems and possible solutions

2023-06-08 Thread Konstantin Knizhnik
On 08.06.2023 6:18 PM, Dave Cramer wrote: On Thu, 8 Jun 2023 at 11:15, Jan Wieck wrote: On 6/8/23 10:56, Dave Cramer wrote: > > > > > On Thu, 8 Jun 2023 at 10:31, Jan Wieck > wrote: > >     On 6/8/23 09:53, Jan Wieck wrote: >     

Re: Let's make PostgreSQL multi-threaded

2023-06-08 Thread Konstantin Knizhnik
On 07.06.2023 3:53 PM, Robert Haas wrote: I think I remember a previous conversation with Andres where he opined that thread-local variables are "really expensive" (and I apologize in advance if I'm mis-remembering this). Now, Andres is not a man who accepts a tax on performance of any size wi

Re: Named Prepared statement problems and possible solutions

2023-06-08 Thread Konstantin Knizhnik
On 08.06.2023 3:43 PM, Jan Wieck wrote: On 6/8/23 02:15, Konstantin Knizhnik wrote: There is a PR with support of prepared statement support to pgbouncer: https://github.com/pgbouncer/pgbouncer/pull/845 any feedback, reviews and suggestions are welcome. I was about to say that the support

Re: Named Prepared statement problems and possible solutions

2023-06-07 Thread Konstantin Knizhnik
On 07.06.2023 10:48 PM, Dave Cramer wrote: Greetings, At pgcon last week I was speaking to some people about the problem we have with connection pools and named prepared statements. For context pgjdbc (and others) use un-named statements and then switch to named statements after using the

Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread Konstantin Knizhnik
On 06.06.2023 5:13 PM, Robert Haas wrote: On Tue, Jun 6, 2023 at 9:40 AM Robert Haas wrote: I'm not sure that there's a strong consensus, but I do think it's a good idea. Let me elaborate on this a bit. Not all databases have this problem, and PostgreSQL isn't going to be able to stop ha

Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread Konstantin Knizhnik
On 06.06.2023 12:07 AM, Jonah H. Harris wrote: On Mon, Jun 5, 2023 at 8:18 AM Tom Lane wrote: For the record, I think this will be a disaster.  There is far too much code that will get broken, largely silently, and much of it is not under our control. While I've long been in

Re: Memory leak from ExecutorState context?

2023-04-21 Thread Konstantin Knizhnik
On 21.04.2023 1:51 AM, Melanie Plageman wrote: On Thu, Apr 20, 2023 at 12:42 PM Konstantin Knizhnik wrote: On 11.04.2023 8:14 PM, Jehan-Guillaume de Rorthais wrote: On Sat, 8 Apr 2023 02:01:19 +0200 Jehan-Guillaume de Rorthais wrote: On Fri, 31 Mar 2023 14:06:11 +0200 Jehan-Guillaume de

Re: Memory leak from ExecutorState context?

2023-04-20 Thread Konstantin Knizhnik
On 11.04.2023 8:14 PM, Jehan-Guillaume de Rorthais wrote: On Sat, 8 Apr 2023 02:01:19 +0200 Jehan-Guillaume de Rorthais wrote: On Fri, 31 Mar 2023 14:06:11 +0200 Jehan-Guillaume de Rorthais wrote: [...] After rebasing Tomas' memory balancing patch, I did some memory measures to answer

OOM in hash join

2023-04-14 Thread Konstantin Knizhnik
Hi hackers, Too small value of work_mem cause memory overflow in parallel hash join because of too much number batches. There is the plan: explain SELECT * FROM solixschema.MIG_50GB_APR04_G1_H a join solixschema.MIG_50GB_APR04_G2_H b on a.seq_pk = b.seq_pk join solixschema.MIG_50GB_APR04_G3_

Parallel plan cost

2023-03-27 Thread Konstantin Knizhnik
Hi hackers, I wonder why while calculating cost of parallel scan we divide by parallel_divisor only CPU run cost, but not storage access cost? So we do not take in account that reading pages is also performed in parallel. Actually I observed strange behavior when increasing work_mem disables p

Speed-up shared buffers prewarming

2023-03-15 Thread Konstantin Knizhnik
Hi hackers, It is well known fact that queries using sequential scan can not be used to prewarm cache, because them are using ring buffer even if shared buffers are almost empty. I have searched hackers archive but failed to find any discussion about it. What are the drawbacks of using free buf

Index-only scan and random_page_cost

2023-02-03 Thread Konstantin Knizhnik
Hi hackers, Right now cost of index-only scan is using `random_page_cost`. Certainly for point selects we really have random access pattern, but queries like "select count(*) from hits"  access pattern is more or less sequential: we are iterating through subsequent leaf B-Tree pages.  As far as

Re: Lack of PageSetLSN in heap_xlog_visible

2022-11-11 Thread Konstantin Knizhnik
On 11.11.2022 03:20, Jeff Davis wrote: On Thu, 2022-10-13 at 12:49 -0700, Jeff Davis wrote: It may violate our torn page protections for checksums, as well... I could not reproduce a problem here, but I believe one exists when checksums are enabled, because it bypasses the protections of Updat

Lack of PageSetLSN in heap_xlog_visible

2022-10-13 Thread Konstantin Knizhnik
Hi hackers! heap_xlog_visible is not bumping heap page LSN when setting all-visible flag in it. There is long comment explaining it:    /*     * We don't bump the LSN of the heap page when setting the visibility     * map bit (unless checksums or wal_hint_bits is enabled, in w

Re: SLRUs in the main buffer pool, redux

2022-06-16 Thread Konstantin Knizhnik
On 28.05.2022 04:13, Thomas Munro wrote: On Fri, May 27, 2022 at 11:24 PM Thomas Munro wrote: Rebased, debugged and fleshed out a tiny bit more, but still with plenty of TODO notes and questions. I will talk about this idea at PGCon, so I figured it'd help to have a patch that actually appli

Race condition in TransactionIdIsInProgress

2022-02-10 Thread Konstantin Knizhnik
Hi hackers, Postgres first records state transaction in CLOG, then removes transaction from procarray and finally releases locks. But it can happen that transaction is marked as committed in CLOG, XMAX_COMMITTED bit is set in modified tuple but TransactionIdIsInProgress still returns true. As a

Re: Throttling WAL inserts when the standby falls behind more than the configured replica_lag_in_bytes

2022-01-11 Thread Konstantin Knizhnik
On 11.01.2022 03:06, Bossart, Nathan wrote: I noticed this thread and thought I'd share my experiences building something similar for Multi-AZ DB clusters [0]. It's not a strict RPO mechanism, but it does throttle backends in an effort to keep the replay lag below a configured maximum. I can

Re: Add ZSON extension to /contrib/

2021-05-26 Thread Konstantin Knizhnik
On 25.05.2021 13:55, Aleksander Alekseev wrote: Hi hackers, Back in 2016 while being at PostgresPro I developed the ZSON extension [1]. The extension introduces the new ZSON type, which is 100% compatible with JSONB but uses a shared dictionary of strings most frequently used in given JSON

  1   2   3   4   5   6   7   8   >