RE: Poor row estimates from planner, stat `most_common_elems` sometimes missing for a text[] column

2025-06-06 Thread Mark Frost
> On 6/5/25 17:42, Mark Frost wrote: > > Is there any good explanation for this behaviour? Preferably we’d like > > some way for proper `most_common_elems` statistics to be collected in > > our production database, in the hope that influences a good query plan > > to always be selected. > most_c

Re: Poor row estimates from planner, stat `most_common_elems` sometimes missing for a text[] column

2025-06-06 Thread Frédéric Yhuel
On 6/5/25 23:52, Tom Lane wrote: The idea of treating lack of MCELEM differently from complete lack of stats still seems to have merit, though. Couldn't we count / estimate the number of distinct two-by-two elements, and use that instead of the default selectivity estimate?

Re: Poor row estimates from planner, stat `most_common_elems` sometimes missing for a text[] column

2025-06-05 Thread Tom Lane
I wrote: > The part of that that seems to be going off the rails is > this selection of a cutoff frequency below which element values > will be dropped: > cutoff_freq = 9 * element_no / bucket_width; > The first thing I find suspicious here is that the calculation is > based on element_no

Re: Poor row estimates from planner, stat `most_common_elems` sometimes missing for a text[] column

2025-06-05 Thread Tom Lane
Mark Frost writes: > We're seeing intermittently very poor performance of a query, when > occasionally a poor query plan is chosen. We're using Postgres 16.9. > One suspicious factor when looking at the EXPLAIN ANALYZE output, is a very > wrong estimated number of rows to be returned from a text

Re: Database creation performance drop going from pg 14 to pg 15+

2025-06-05 Thread Mahdi Bahrami
Here's what the OP of the PostgresNIO issue has mentioned about what performance impact usage of `file_copy` has in his setup (https://github.com/SwiftPackageIndex/SwiftPackageIndex-Server/pull/3812): `Series: Default Strategy Suite AllTests passed after 5.081 seconds Suite AllTests passed after

Re: Poor row estimates from planner, stat `most_common_elems` sometimes missing for a text[] column

2025-06-05 Thread Frédéric Yhuel
On 6/5/25 17:42, Mark Frost wrote: Is there any good explanation for this behaviour? Preferably we’d like some way for proper `most_common_elems` statistics to be collected in our production database, in the hope that influences a good query plan to always be selected. most_common_elems h

Re: ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized

2025-06-05 Thread Frédéric Yhuel
On 6/5/25 16:13, Frédéric Yhuel wrote: On 6/4/25 16:12, Dimitrios Apostolou wrote: In general I have noticed most operations are slower after a succesful pg_restore until VACUUM is complete, which is unfortunate as the database is huge and it takes days to run. Something I have on my list

Re: ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized

2025-06-05 Thread Frédéric Yhuel
On 6/4/25 16:12, Dimitrios Apostolou wrote: In general I have noticed most operations are slower after a succesful pg_restore until VACUUM is complete, which is unfortunate as the database is huge and it takes days to run. Something I have on my list to try, is whether a COPY FREEZE would al

Re: ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized

2025-06-04 Thread Dimitrios Apostolou
On Wed, 4 Jun 2025, Frédéric Yhuel wrote: On 6/3/25 17:34, Dimitrios Apostolou wrote: The backend process for each of the above ALTER TABLE commands, does not   parallelize the foreign key checks for the different partitions. I   know, because in the logs I see gigabytes of temporary files

Re: ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized

2025-06-04 Thread Frédéric Yhuel
On 6/3/25 17:34, Dimitrios Apostolou wrote: The backend process for each of the above ALTER TABLE commands, does not   parallelize the foreign key checks for the different partitions. I   know, because in the logs I see gigabytes of temporary files being   written, with the CONTEXT showing

Re: Database creation performance drop going from pg 14 to pg 15+

2025-05-28 Thread Mahdi Bahrami
To be clear the "15s instead of 1m30s" was related to another issue about SASL implementation of PostgresNIO being slow, not this CREATE DATABASE strategy change. I was just trying to explain that in their set up, these saved milliseconds do matter for whatever reason.

Re: Database creation performance drop going from pg 14 to pg 15+

2025-05-28 Thread Thomas Munro
On Wed, May 28, 2025 at 11:09 PM Mahdi Bahrami wrote: > their tests now run in 15s instead of 1m30s Wow. If/when they get to v18, they might also try setting file_copy_method=clone when using strategy=file_copy.

Re: Database creation performance drop going from pg 14 to pg 15+

2025-05-28 Thread Mahdi Bahrami
Thanks for the hint, that was exactly it! I also asked the original author of the issue and he has confirmed the performance of pg 15+ now matches those of pg 13/14. As mentioned "This issue is not necessarily a big deal to us, but can anyone please explain the why?" so it was partially out of

Re: Database creation performance drop going from pg 14 to pg 15+

2025-05-27 Thread Álvaro Herrera
On 2025-May-27, Mahdi Bahrami wrote: > Here's how I ran the commands that seem to be at fault when using > PostgresNIO. I only ran the drop-db create-db commands, not the whole > chain, which should still be sufficient: > ``` > PGPASSWORD=xxx psql -U spi_test -h 127.0.0.1 postgres -c "DROP DATABAS

Re: inefficient/wrong plan cache mode selection for queries with partitioned tables (postgresql 17)

2025-05-18 Thread Andrei Lepikhov
On 5/12/25 16:04, Maxim Boguk wrote: On Mon, May 12, 2025 at 4:48 PM Andrei Lepikhov If I'm not mistaken, it will work with all PG versions that are currently in support. What do you think? Such extension would be very useful (and in general - the solution based on the actual execution

Re: a lot of session wait on lock relation

2025-05-15 Thread James Pang
Thanks, we are checking the partition maintain job , we have 12 partitions , each week one partition there is a default partition attached with this table and huge rows in default partition too , default partition has primary key that include partition key (time range based) too. partition job det

Re: a lot of session wait on lock relation

2025-05-15 Thread James Pang
thanks, we are checking partition maintain jobs ,that hold access exclusive lock. Tom Lane 於 2025年5月15日週四 下午9:24寫道: > Laurenz Albe writes: > > On Thu, 2025-05-15 at 16:27 +0800, James Pang wrote: > >> why inserts into partition table cause "relation lock" ? > > > Something else does; use the

Re: a lot of session wait on lock relation

2025-05-15 Thread Tom Lane
Laurenz Albe writes: > On Thu, 2025-05-15 at 16:27 +0800, James Pang wrote: >> why inserts into partition table cause "relation lock" ? > Something else does; use the pg_blocking_pids() function with the process ID > of > a blocked backend to find out who is holding the lock. More specifically:

Re: a lot of session wait on lock relation

2025-05-15 Thread Laurenz Albe
On Thu, 2025-05-15 at 16:27 +0800, James Pang wrote: > PGv 15.10,  many session coming in doing "insert into tablexxx values" in > parallel, > this is a range partition tables, total 12 partitions plus one default > partition, > only three btree indexes, no others constraints ,no foreign key.

Re: a lot of session wait on lock relation

2025-05-15 Thread Chetan
Hi James, Could share the related test setup details? Would like to look into this. Thanks, Chetan On Thu, 15 May 2025 at 09:27, James Pang wrote: > Hi, >PGv 15.10, many session coming in doing "insert into tablexxx > values" in parallel, this is a range partition tables, total 12 >

Re: inefficient/wrong plan cache mode selection for queries with partitioned tables (postgresql 17)

2025-05-13 Thread Maxim Boguk
On Mon, May 12, 2025 at 9:07 PM Tom Lane wrote: > Maxim Boguk writes: > > Reading the code - probably the lowest hanging fruit is to make > > 'The current multiplier of 1000 * cpu_operator_cost' configurable in the > > future versions. Is the 100x backend memory usage per cached plan differenc

Re: inefficient/wrong plan cache mode selection for queries with partitioned tables (postgresql 17)

2025-05-13 Thread David Rowley
On Tue, 13 May 2025 at 03:19, Maxim Boguk wrote: > On Mon, May 12, 2025 at 6:01 PM David Rowley wrote: >> This is just an artifact of the fact that runtime pruning is not factored >> into the costs. Note the cost of the generic plan. The plan_cache_mode GUC >> is about the only way to overrule

Re: inefficient/wrong plan cache mode selection for queries with partitioned tables (postgresql 17)

2025-05-12 Thread Andrei Lepikhov
On 5/12/25 20:07, Tom Lane wrote: Maxim Boguk writes: Reading the code - probably the lowest hanging fruit is to make 'The current multiplier of 1000 * cpu_operator_cost' configurable in the future versions. I'm wondering whether we should try to make the planner not expend the effort in the

Re: inefficient/wrong plan cache mode selection for queries with partitioned tables (postgresql 17)

2025-05-12 Thread Tom Lane
Maxim Boguk writes: > Reading the code - probably the lowest hanging fruit is to make > 'The current multiplier of 1000 * cpu_operator_cost' configurable in the > future versions. I'm wondering whether we should try to make the planner not expend the effort in the first place, but leave partition

Re: inefficient/wrong plan cache mode selection for queries with partitioned tables (postgresql 17)

2025-05-12 Thread Maxim Boguk
On Mon, May 12, 2025 at 6:33 PM Tom Lane wrote: > Maxim Boguk writes: > > And the problem is that the cost of a custom plan ignores the cost of > > planning itself (which is like 2x orders of magnitude worse than the cost > > of real time partition pruning of a generic plan). > > False. The est

Re: inefficient/wrong plan cache mode selection for queries with partitioned tables (postgresql 17)

2025-05-12 Thread Andrei Lepikhov
On 12/5/2025 16:04, Maxim Boguk wrote: On Mon, May 12, 2025 at 4:48 PM Andrei Lepikhov It is not hard to write such a tiny extension. As I see, the only extra stored "C" procedure is needed to set up force-plan-type flag employing FetchPreparedStatement(). The rest of the code - query

Re: inefficient/wrong plan cache mode selection for queries with partitioned tables (postgresql 17)

2025-05-12 Thread Tom Lane
Maxim Boguk writes: > And the problem is that the cost of a custom plan ignores the cost of > planning itself (which is like 2x orders of magnitude worse than the cost > of real time partition pruning of a generic plan). False. The estimate is evidently pretty wrong, but it's not that there is n

Re: inefficient/wrong plan cache mode selection for queries with partitioned tables (postgresql 17)

2025-05-12 Thread Maxim Boguk
On Mon, May 12, 2025 at 6:01 PM David Rowley wrote: > On Mon, 12 May 2025, 05:08 Andrei Lepikhov, wrote: > >> Thanks for this puzzle! >> I suppose, in case generic planning is much faster than custom one, >> there are two candidates exist: >> 1. Touching the index during planning causes too much

Re: inefficient/wrong plan cache mode selection for queries with partitioned tables (postgresql 17)

2025-05-12 Thread David Rowley
On Mon, 12 May 2025, 05:08 Andrei Lepikhov, wrote: > Thanks for this puzzle! > I suppose, in case generic planning is much faster than custom one, > there are two candidates exist: > 1. Touching the index during planning causes too much overhead - see > get_actual_variable_range > 2. You have a m

Re: inefficient/wrong plan cache mode selection for queries with partitioned tables (postgresql 17)

2025-05-12 Thread Maxim Boguk
On Mon, May 12, 2025 at 4:48 PM Andrei Lepikhov wrote: > On 5/12/25 15:08, Maxim Boguk wrote: > > PS: problem not with difference between custom and generic planning time > > but with prepared statements > > generic plan plans only once, but custom plan plan every call (and plan > > time cost 95%

Re: inefficient/wrong plan cache mode selection for queries with partitioned tables (postgresql 17)

2025-05-12 Thread Maxim Boguk
On Mon, May 12, 2025 at 4:48 PM Andrei Lepikhov wrote: > On 5/12/25 15:08, Maxim Boguk wrote: > > PS: problem not with difference between custom and generic planning time > > but with prepared statements > > generic plan plans only once, but custom plan plan every call (and plan > > time cost 95%

Re: inefficient/wrong plan cache mode selection for queries with partitioned tables (postgresql 17)

2025-05-12 Thread Andrei Lepikhov
On 5/12/25 15:08, Maxim Boguk wrote: PS: problem not with difference between custom and generic planning time but with prepared statements generic plan plans only once, but custom plan plan every call (and plan time cost 95% on total query runtime). Ah, now I got it. I'm aware of this problem f

Re: inefficient/wrong plan cache mode selection for queries with partitioned tables (postgresql 17)

2025-05-12 Thread Maxim Boguk
On Mon, May 12, 2025 at 3:08 PM Andrei Lepikhov wrote: > On 5/12/25 13:49, Maxim Boguk wrote: > > I suspect this situation should be quite common with queries over > > partitioned tables (where planning time is usually quite a high). > > > > Any suggestions what could be done there outside of usi

Re: inefficient/wrong plan cache mode selection for queries with partitioned tables (postgresql 17)

2025-05-12 Thread Andrei Lepikhov
On 5/12/25 13:49, Maxim Boguk wrote: I suspect this situation should be quite common with queries over partitioned tables (where planning time is usually quite a high). Any suggestions what could be done there outside of using force_generic_plan for a particular db user (which will kill perfor

Re: Vacuum Questions

2025-05-05 Thread Craig Jackson
One additional point about reindexing I should have mentioned, make sure you have adequate disk space to complete the reindexing. Craig On Mon, May 5, 2025 at 8:33 AM Craig Jackson wrote: > For the two indexes that take 12 hours to vacuum: If you can drop and > rebuild them in less than the 12

Re: Vacuum Questions

2025-05-05 Thread Craig Jackson
For the two indexes that take 12 hours to vacuum: If you can drop and rebuild them in less than the 12 hours it takes to vacuum them and you can have them be offline then I would do that. If you can't take the indexes offline then consider reindexing online. Also, if the indexes aren't needed for

RE: Vacuum Questions

2025-05-03 Thread msalais
Hi It is not your question but for such situations, you should consider using partitioning. And more closely to your question: I would not disable autovacuum but it must not work with default values. Best regards Michel SALAIS De : Leo Envoyé : vendredi 2 mai 2025 16:23 À : pgsql-

Re: Vacuum Questions

2025-05-02 Thread John Naylor
On Fri, May 2, 2025 at 9:23 PM Leo wrote: > I am purging old records from a table (500 million rows, but I am doing it in > sets of 50,000,000 with a smaller loop of 100,000). That works just fine. > > Because of the amount of data/rows deleted, I disabled the autovacuum for > this table (I w

Re: Vacuum Questions

2025-05-02 Thread Leo
Also, is there a way to estimate the vacuum execution? Something like explain plan - without actually vacuuming, just to see how it will perform it - like a degree of parallelism? On Fri, May 2, 2025 at 10:23 AM Leo wrote: > I have been working on AWS PostgreSQL RDS for a few years, but still n

Re: PostgreSQL Choosing Full Index Over Partial Index

2025-04-30 Thread Tom Lane
=?UTF-8?Q?Felipe_L=C3=B3pez_Montes?= writes: > Thanks a lot for your response Tom. > May I ask how do you debug those functions? > Or is it just that you read the code and more or less guess what should be > the value for each variable with information coming from querying Postgres > tables? The

Re: PostgreSQL Choosing Full Index Over Partial Index

2025-04-30 Thread Felipe López Montes
Thanks a lot for your response Tom. May I ask how do you debug those functions? Or is it just that you read the code and more or less guess what should be the value for each variable with information coming from querying Postgres tables? Thanks a lot. El lun, 28 abr 2025 a las 17:07, Tom Lane ()

Re: PostgreSQL Choosing Full Index Over Partial Index

2025-04-28 Thread Tom Lane
Laurenz Albe writes: > On Mon, 2025-04-28 at 15:22 +0200, Felipe López Montes wrote: >> Following the book PostgreSQL Query Optimization (Second Edition), there is a >> statement on page 90 talking about Partial Indexes that says that the planner >> will use the partial index rather than the full

Re: PostgreSQL Choosing Full Index Over Partial Index

2025-04-28 Thread Felipe López Montes
Hi Mr. Laurenz, Thanks a lot for your response :). The full index is bigger as it has an entry for all the rows of the table, whilst the partial one only has entries for canceled flights. Output of pgstatindex() for the *partial index:* version,tree_level,index_size,root_block_no,internal_pages

Re: PostgreSQL Choosing Full Index Over Partial Index

2025-04-28 Thread Laurenz Albe
On Mon, 2025-04-28 at 15:22 +0200, Felipe López Montes wrote: > I am using PostgreSQL 17.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) > 12.4.0, 64-bit, > and working with the postgres_air Database. > > I have a very simple query (please forget about the sense of the query itself, > I just want

Re: Constraints elimination during runtime

2025-04-16 Thread Laurenz Albe
On Wed, 2025-04-16 at 11:16 +, Weck, Luis wrote: > I am not sure if this list is the most appropriate, but I figured I’d share > it here… > > If a column has a check constraint, such as CHECK (length(value) < 10) or even > something like a VARCHAR(10) shouldn’t a query like this become a no-o

Re: many sessions wait on LWlock WALWrite suddenly

2025-04-15 Thread Yura Sokolov
15.04.2025 13:00, Andres Freund пишет: > 1) Increasing NUM_XLOGINSERT_LOCKS allows more contention on insertpos_lck and >spinlocks scale really badly under heavy contention > > I think we can redesign the mechanism so that there's an LSN ordered > ringbuffer of in-progress insertions, with the

Re: many sessions wait on LWlock WALWrite suddenly

2025-04-15 Thread Yura Sokolov
15.04.2025 13:53, Andres Freund пишет: > Hi, > > On 2025-04-15 13:44:09 +0300, Yura Sokolov wrote: >> 15.04.2025 13:00, Andres Freund пишет: >>> 1) Increasing NUM_XLOGINSERT_LOCKS allows more contention on insertpos_lck >>> and >>>spinlocks scale really badly under heavy contention >>> >>> I

Re: many sessions wait on LWlock WALWrite suddenly

2025-04-15 Thread Andres Freund
Hi, On 2025-04-15 13:44:09 +0300, Yura Sokolov wrote: > 15.04.2025 13:00, Andres Freund пишет: > > 1) Increasing NUM_XLOGINSERT_LOCKS allows more contention on insertpos_lck > > and > >spinlocks scale really badly under heavy contention > > > > I think we can redesign the mechanism so that t

Re: many sessions wait on LWlock WALWrite suddenly

2025-04-15 Thread Andres Freund
Hi, On 2025-04-15 12:16:40 +0300, Yura Sokolov wrote: > 11.04.2025 17:36, James Pang пишет: > >    pgv14.8 , during peak time, we suddenly see hundreds of active sessions > > waiting on LWlock  WALWrite at the same time, but we did not find any issue > > on storage . > > any suggestions ? > > No

Re: many sessions wait on LWlock WALWrite suddenly

2025-04-15 Thread Yura Sokolov
11.04.2025 17:36, James Pang пишет: >    pgv14.8 , during peak time, we suddenly see hundreds of active sessions > waiting on LWlock  WALWrite at the same time, but we did not find any issue > on storage . > any suggestions ? No real suggestions... There is single WALWrite lock. So only single pr

Re: many sessions wait on LWlock WALWrite suddenly

2025-04-11 Thread MichaelDBA
LWLock always shows up in the case where you have too many concurrent active connections.  Do a select from the pg_stat_activity table where state in ('idle in transaction','active'); Then count how many CPUs you have. If the sql query count returned is greater than 2-3 times the number of CPUs, y

Re: many sessions wait on LWlock WALWrite suddenly

2025-04-11 Thread Laurenz Albe
On Fri, 2025-04-11 at 22:36 +0800, James Pang wrote: > pgv14.8 , during peak time, we suddenly see hundreds of active sessions > waiting on LWlock > WALWrite at the same time, but we did not find any issue on storage . > any suggestions ?  You should get a reasonably sized (much smaller) connecti

Re: parallel pg_restore blocks on heavy random read I/O on all children processes

2025-04-09 Thread Hannu Krosing
You may be interested in a patch "Adding pg_dump flag for parallel export to pipes"[1] which allows using pipes in directory former parallel dump and restore. There the offsets are implicitly taken care of by the file system. [1] https://www.postgresql.org/message-id/CAH5HC97p4kkpikar%2BswuC0Lx4Y

RE: Question on what Duration in the log

2025-04-09 Thread Vitale, Anthony, Sony Music
-performance@lists.postgresql.org Subject: Re: Question on what Duration in the log EXTERNAL SENDER "Vitale, Anthony, Sony Music" writes: > In my PG Version 14 log I have set to log the duration of sql (Example of one > log row below). > 2025-04-08 07:31:57 UTC:XXX(55994):XXX:[8

Re: Question on what Duration in the log

2025-04-09 Thread Tom Lane
"Vitale, Anthony, Sony Music" writes: > In my PG Version 14 log I have set to log the duration of sql (Example of one > log row below). > 2025-04-08 07:31:57 UTC:XXX(55994):XXX:[8949]:LOG: duration: 6555.756 ms > execute S_381: {Actual Parameterized Sql executed} > This Shows the Execution of

Re: Memory Not Released After Batch Completion – Checkpointer/Background Writer Behavior , postgres 15

2025-04-07 Thread Евгений Чекан
Checkpointer process holds the shared_buffers memory and it is the expected behaviour. Even during idle, shared buffers are there for reads and future modifications. You have SB configured to be 10GB and it will stay allocated throughout the whole uptime of the server. This is totally OK and sho

Re: Memory Not Released After Batch Completion – Checkpointer/Background Writer Behavior , postgres 15

2025-04-06 Thread Motog Plus
Thank you so much for your prompt response. Regards, Ramzy On Sun, Apr 6, 2025, 19:35 Евгений Чекан wrote: > Checkpointer process holds the shared_buffers memory and it is the > expected behaviour. Even during idle, shared buffers are there for reads > and future modifications. You have SB conf

Re: parallel pg_restore blocks on heavy random read I/O on all children processes

2025-04-04 Thread Tom Lane
Dimitrios Apostolou writes: > On Thu, 20 Mar 2025, Tom Lane wrote: >> I am betting that the problem is that the dump's TOC (table of >> contents) lacks offsets to the actual data of the database objects, >> and thus the readers have to reconstruct that information by scanning >> the dump file. No

Re: Very slow query performance when using CTE

2025-04-03 Thread Chris Joysn
> CREATE STATISTICS st_simrun_component_metadata (dependencies) ON >> sim_run_id, key FROM sim_run_component_metadata; >> ANALYZE sim_run_component_metadata; >> >> When I run this query, no statistics are returned: >> >> SELECT m.* FROM pg_statistic_ext join pg_statistic_ext_data on (oid = >> stxoi

Re: partition table optimizer join cost misestimation

2025-04-03 Thread James Pang
Follow your suggestion to increase statistics_target (I increase target_statistic to 1000 for aa.mmm_id and cc.sss_id ,analyze tablea, tablec again), optimizer choose the good SQL plan. Thanks, James Andrei Lepikhov 於 2025年4月3日週四 下午4:44寫道: > On 4/3/25 10:04, James Pang wrote: > > one more

Re: Very slow query performance when using CTE

2025-04-03 Thread Michael Christofides
> > CREATE STATISTICS st_simrun_component_metadata (dependencies) ON > sim_run_id, key FROM sim_run_component_metadata; > ANALYZE sim_run_component_metadata; > > When I run this query, no statistics are returned: > > SELECT m.* FROM pg_statistic_ext join pg_statistic_ext_data on (oid = > stxoid), >

Re: partition table optimizer join cost misestimation

2025-04-02 Thread Andrei Lepikhov
On 4/3/25 02:46, James Pang wrote: Andrei,    Yes, from explain output, since optimizer already get the merge_append cost but not take account into total cost, that make a big difference.  I shared table DDLs and explain analyze,buffers output , I think  the data maybe generated by other way

Re: partition table optimizer join cost misestimation

2025-04-02 Thread James Pang
Andrei, Yes, from explain output, since optimizer already get the merge_append cost but not take account into total cost, that make a big difference. I shared table DDLs and explain analyze,buffers output , I think the data maybe generated by other way to reproduce this issue. sorry for not sh

Re: partition table optimizer join cost misestimation

2025-04-02 Thread Andrei Lepikhov
On 4/2/25 12:18, James Pang wrote: Hi,    Postgresq v14.8, we found optimizer doest not take "merge append" cost into sql plan total cost and then make a bad sql plan. attached please find details. I suppose there is a different type of issue. MegeJoin sometimes doesn't need to scan the whole

Re: Very slow query performance when using CTE

2025-04-01 Thread Renan Alves Fonseca
Chris Joysn writes: > Hello, > I have an issue when using CTEs. A query, which consists of multiple CTEs, > runs usually rather fast (~5s on my > environment). But it turned out that using one CTE can lead to execution > times of up to one minute. > That CTE is used two times within the query.

Re: Very slow query performance when using CTE

2025-04-01 Thread Tom Lane
Chris Joysn writes: > I have an issue when using CTEs. A query, which consists of multiple CTEs, > runs usually rather fast (~5s on my environment). But it turned out that > using one CTE can lead to execution times of up to one minute. > That CTE is used two times within the query. Try labeling

Re: parallel pg_restore blocks on heavy random read I/O on all children processes

2025-03-28 Thread Dimitrios Apostolou
Sent proper patch to pgsql-hackers mailing list, at: https://www.postgresql.org/message-id/flat/2edb7a57-b225-3b23-a680-62ba90658fec%40gmx.net

Re: parallel pg_restore blocks on heavy random read I/O on all children processes

2025-03-27 Thread Dimitrios Apostolou
Hello again, I traced the seeking-reading behaviour of parallel pg_restore inside _skipData() when called from _PrintTocData(). Since most of today's I/O devices (both rotating and solid state) can read 1MB faster sequentially than it takes to seek and read 4KB, I tried the following change: dif

Re: parallel pg_restore blocks on heavy random read I/O on all children processes

2025-03-24 Thread Dimitrios Apostolou
On Sun, 23 Mar 2025, Tom Lane wrote: Dimitrios Apostolou writes: On Thu, 20 Mar 2025, Tom Lane wrote: I am betting that the problem is that the dump's TOC (table of contents) lacks offsets to the actual data of the database objects, and thus the readers have to reconstruct that information by

Re: parallel pg_restore blocks on heavy random read I/O on all children processes

2025-03-23 Thread Dimitrios Apostolou
On Thu, 20 Mar 2025, Tom Lane wrote: I am betting that the problem is that the dump's TOC (table of contents) lacks offsets to the actual data of the database objects, and thus the readers have to reconstruct that information by scanning the dump file. Normally, pg_dump will back-fill offset da

Re: parallel pg_restore blocks on heavy random read I/O on all children processes

2025-03-20 Thread Tom Lane
Dimitrios Apostolou writes: > I noticed the weird behaviour that doing a pg_restore of a huge database > dump, leads to constant read I/O (at about 15K IOPS from the NVMe drive > that has the dump file) for about one hour. I believe it happens with > any -j value>=2. > In particular, I get output

Re: PostgreSQL 16 - Detach partition with FK - Error

2025-03-19 Thread Álvaro Herrera
On 2025-Mar-19, Adithya Kumaranchath wrote: > --Scenario 1: Detach parent partition > alter table table1 detach partition table1_202402 > > ERROR: Key (parnt_id, archive_dt)=(a6955e39-22eb-48e2-bd59-cad9650a4f6b, > 2024-02-11) is still referenced from table "table1_child_202402".removing > part

Re: Bad perf when using DECLARE CURSOR on big table

2025-03-18 Thread kimaidou
Indeed, increasing cursor_tuple_fraction TO 0.2 did the trick. Thanks for the hint Tom ! Le lun. 17 mars 2025 à 16:22, Tom Lane a écrit : > kimaidou writes: > > I have seen that this DECLARE has bad perf compared to a simple SQL > query : > > > Simple SQL query > > = > > https://explain.dal

Re: Efficient pagination using multi-column cursors

2025-03-17 Thread large . goose2829
Folks, thanks everyone for the valuable inputs, I think I more-or-less understand now what the options are for my particular problem. On Wed, Feb 26, 2025, at 17:14, Peter Geoghegan wrote: > On Wed, Feb 26, 2025 at 10:40 AM wrote: > > Does this mean that it is not possible to come up with a plan

Re: Bulk DML performance

2025-03-17 Thread Renan Alves Fonseca
Hi, Here are some observations. Em seg., 17 de mar. de 2025 às 09:19, escreveu: > > PostgreSQL has a lot of overhead per row. > > Okay, thanks. I'm not actually too worried about this since in my > scenario, each row is about 1.5 kB, so the % overhead is negligible. > > > It is probably not the

Re: Bad perf when using DECLARE CURSOR on big table

2025-03-17 Thread Tom Lane
kimaidou writes: > I have seen that this DECLARE has bad perf compared to a simple SQL query : > Simple SQL query > = > https://explain.dalibo.com/plan/042bc4dc2449adfe > 96ms > DECLARE CURSOR for the same query > = > https://explain.dalibo.com/plan/bh83fc0db500a79g# > 171 031 ms !! Rai

Re: Bulk DML performance

2025-03-17 Thread Greg Sabino Mullane
On Mon, Mar 17, 2025 at 4:19 AM wrote: Can you help me understand why performing 3 million lookups on a b-tree > index with all pages cached in memory takes so long? It's not the lookup, it's writing the 3 million rows (and in this particular upsert case, deleting 3 million, then inserting 3 m

Re: Bulk DML performance

2025-03-17 Thread Álvaro Herrera
On 2025-Mar-13, bill.po...@ymail.com wrote: > I need to perform a merge (INSERT ... ON CONFLICT ... DO UPDATE) on > the data, so sadly I cannot use COPY. > > I have discovered that for some reason, performing the original insert > without the ON CONFLICT statement is twice as fast as performing t

RE: Bulk DML performance

2025-03-17 Thread bill.poole
parallelized in the future. Best, Bill From: Renan Alves Fonseca Sent: Friday, 14 March 2025 5:25 AM To: bill.po...@ymail.com Cc: pgsql-performa...@postgresql.org Subject: Re: Bulk DML performance Hello, Regarding the additional time for UPDATE, you can try the following: CREATE

RE: Bulk DML performance

2025-03-17 Thread bill.poole
est4.id = generate_series This also now means that updating 3 million rows takes 4x longer than inserting those rows. Do we expect updates to be 4x slower than inserts? Regards, Bill -Original Message- From: Laurenz Albe Sent: Thursday, 13 March 2025 7:28 PM To: bill.po...@ymail

RE: Bulk DML performance

2025-03-17 Thread bill.poole
in the > database. Sadly, I cannot do that because I need all rows to be inserted in a single database transaction, which I cannot do over multiple database connections. Regards, Bill -Original Message- From: Laurenz Albe Sent: Thursday, 13 March 2025 5:21 PM To: bill.po...

Re: Bulk DML performance

2025-03-13 Thread Renan Alves Fonseca
Hello, Regarding the additional time for UPDATE, you can try the following: CREATE TABLE test3 ( id bigint PRIMARY KEY, text1 text ) WITH (fillfactor=30); See: https://www.postgresql.org/docs/17/storage-hot.html My local test gives me almost the same time for INSERT (first insert) and UPDATES

Re: Bulk DML performance

2025-03-13 Thread Laurenz Albe
On Thu, 2025-03-13 at 12:28 +0100, I wrote: > Then the best you can do is to use COPY rather than INSERT. > It will perform better (but now vastly better). Sorry, I meant "*not* vastly better". Yours, Laurenz Albe

Re: Bulk DML performance

2025-03-13 Thread Laurenz Albe
On Thu, 2025-03-13 at 18:13 +0800, bill.po...@ymail.com wrote: > > it is noteworthy that inserting 27 MB of data into a newly created table > creates > 191 MB of data including the index and 127 MB of data excluding the index. PostgreSQL has a lot of overhead per row. > > Can you help me under

Re: Bulk DML performance

2025-03-13 Thread Laurenz Albe
On Thu, 2025-03-13 at 12:05 +0800, bill.po...@ymail.com wrote: > The following INSERT .. ON CONFLICT (…) DO UPDATE statement inserts/updates > 3 million rows with only 9 bytes per row and takes about 8 seconds on first > run (to insert the rows) and about 14 seconds on subsequent runs (to update >

Re: Slow performance of collate "en_US.utf8"

2025-02-28 Thread Joe Conway
On 2/28/25 17:49, Thomas Munro wrote: On Sat, Mar 1, 2025 at 9:03 AM Joe Conway wrote: On 2/28/25 09:16, Laurenz Albe wrote: > On Thu, 2025-02-27 at 16:54 +0300, Alexey Borschev wrote: >> I see poor performance of text sorting of collate "en_US.utf8" in PG 17.4. > > I'd say that you would have

Re: [PERFORM] Unused index influencing sequential scan plan

2025-02-28 Thread Thom Brown
mns. That is, > > > > create index i on t (foo(x), x); > > > > The planner isn't terribly bright about this, but it will use that index > > for a query that only requires foo(x), and it won't re-evaluate foo() > > (though I think it will cost the p

Re: Slow performance of collate "en_US.utf8"

2025-02-28 Thread Thomas Munro
On Sat, Mar 1, 2025 at 9:03 AM Joe Conway wrote: > On 2/28/25 09:16, Laurenz Albe wrote: > > On Thu, 2025-02-27 at 16:54 +0300, Alexey Borschev wrote: > >> I see poor performance of text sorting of collate "en_US.utf8" in PG 17.4. > > > > I'd say that you would have to complain to the authors of t

Re: Slow performance of collate "en_US.utf8"

2025-02-28 Thread Joe Conway
On 2/28/25 09:16, Laurenz Albe wrote: On Thu, 2025-02-27 at 16:54 +0300, Alexey Borschev wrote: I see poor performance of text sorting of collate "en_US.utf8" in PG 17.4. I'd say that you would have to complain to the authors of the GNU C library, which provides this collation. Yep -- glibc

Re: Slow performance of collate "en_US.utf8"

2025-02-28 Thread Laurenz Albe
On Thu, 2025-02-27 at 16:54 +0300, Alexey Borschev wrote: > I see poor performance of text sorting of collate "en_US.utf8" in PG 17.4. I'd say that you would have to complain to the authors of the GNU C library, which provides this collation. Yours, Laurenz Albe

Re: Slow performance of collate "en_US.utf8"

2025-02-28 Thread Achilleas Mantzios - cloud
Hi Alexey On 2/27/25 15:54, Alexey Borschev wrote: Hi everyone! I see poor performance of text sorting of collate "en_US.utf8" in PG 17.4. Test query: explain (analyze, costs, buffers, verbose) SELECT ('БвЁжЫйяЙSёYz&$' || gen.id) collate "en_US.utf8" FROM generate_series(1, 1) AS gen(

Re: Efficient pagination using multi-column cursors

2025-02-26 Thread Laurenz Albe
On Wed, 2025-02-26 at 15:27 +0100, large.goose2...@salomvary.com wrote: > I am working on optimizing a query that attempts to efficiently paginate > through a large table using multi-column "cursors" aka. the "seek method" > (as described in detail here: > https://use-the-index-luke.com/sql/partial

Re: Efficient pagination using multi-column cursors

2025-02-26 Thread Peter Geoghegan
On Wed, Feb 26, 2025 at 10:40 AM wrote: > My understanding is that given this "mixed order" index: > CREATE INDEX data_index_desc ON data (col_1, col_2 DESC, col_3); > > The index tuples are physically organized exactly in this way: > ORDER BY col_1, col_2 DESC, col_3 > > So that I should be able

Re: Efficient pagination using multi-column cursors

2025-02-26 Thread large . goose2829
Thanks for the insights! On Wed, Feb 26, 2025, at 16:05, Peter Geoghegan wrote: > On Wed, Feb 26, 2025 at 9:29 AM wrote: > > Without being familiar the internals of the query planner, I *think* there > > *should* be a way to come up with WHERE conditions that results in the > > "perfect" plan.

Re: Efficient pagination using multi-column cursors

2025-02-26 Thread Peter Geoghegan
On Wed, Feb 26, 2025 at 9:29 AM wrote: > Without being familiar the internals of the query planner, I *think* there > *should* be a way to come up with WHERE conditions that results in the > "perfect" plan. There is a fundamental trade-off involved here. The simple, fast "WHERE (col_1, col_2, c

Re: Unfortunate Nested Loop + Missing Autovacuum

2025-02-24 Thread Lincoln Swaine-Moore
e the case that those are running frequently in our case. Ditto re: database restarts--this is out of my control because it would be DigitalOcean's doing, but I don't see any evidence of it. Nor anything amiss in \dt+, unfortunately. I'll try to figure out if I can get access to th

Re: Unfortunate Nested Loop + Missing Autovacuum

2025-02-24 Thread Andrei Lepikhov
On 23/2/2025 23:49, Lincoln Swaine-Moore wrote: Thanks for the reply! I tried the analysis on our much shorter staging table and it did change the plan. I haven’t tried it on the production ones because my understanding is that the autovacuum process is gentler with resource consumption and I d

Re: Unfortunate Nested Loop + Missing Autovacuum

2025-02-23 Thread Jeff Janes
On Sun, Feb 23, 2025 at 5:49 PM Lincoln Swaine-Moore wrote: > Thanks for the reply! I tried the analysis on our much shorter staging > table and it did change the plan. I haven’t tried it on the production ones > because my understanding is that the autovacuum process is gentler with > resource c

Re: Unfortunate Nested Loop + Missing Autovacuum

2025-02-23 Thread Lincoln Swaine-Moore
Thanks for the reply! I tried the analysis on our much shorter staging table and it did change the plan. I haven’t tried it on the production ones because my understanding is that the autovacuum process is gentler with resource consumption and I didn’t want to gum things up in the meantime. But tha

Re: Unfortunate Nested Loop + Missing Autovacuum

2025-02-22 Thread Andrei Lepikhov
On 22/2/2025 00:46, Lincoln Swaine-Moore wrote: So, obviously there's a statistics problem, which led me to realize that actually these tables have *never* been autovacuumed/analyzed according to pg_stat_user_tables. I'm using a managed database which makes it a little tricky to debug, but all

  1   2   3   4   5   6   7   8   9   10   >