Re: [EXT] Re: How to delete column level Stats/Histogram

2024-05-30 Thread David Rowley
(please keep communication on the list) On Fri, 31 May 2024 at 13:43, Wong, Kam Fook (TR Technology) wrote: > > Silly question why did I run into this problem below? Will the autovacuum > analyze abc reset it back which I don't want it to. > > DELETE FROM pg_statistic WHERE starelid =

Re: How to delete column level Stats/Histogram

2024-05-30 Thread David Rowley
On Fri, 31 May 2024 at 09:28, Wong, Kam Fook (TR Technology) wrote: > Is there a way to delete a specific column level stats/histogram. The > following approach does not work. > > alter table abc alter column bg_org_partner set statistics 0; > analyze abc; You'd have to: DELETE FROM

Re: Use of inefficient index in the presence of dead tuples

2024-05-29 Thread David Rowley
On Wed, 29 May 2024 at 12:53, Tom Lane wrote: > It would be interesting to see a non-artificial example that took > into account when the last auto-vacuum and auto-analyze really > happened, so we could see if there's any less-fragile way of > dealing with this situation. I think we need to find

Re: problem with query

2024-05-21 Thread David Rowley
On Tue, 21 May 2024 at 23:14, Laurenz Albe wrote: > We still don't know the query. hmm, it was posted on this thread: https://postgr.es/m/CAGB0_6600w5C=hvhgfmwcqo9bcwcg+3s0pxxuoqv48nlqtp...@mail.gmail.com David

Re: problem with query

2024-05-21 Thread David Rowley
On Mon, 20 May 2024 at 23:09, Sašo Gantar wrote: > what helps is > SET enable_nestloop = off; > query takes less then 2seconds > but it's probably not a good idea to change this flag Looks like it's slow due to a bad selectivity estimate on the join between pgn and pgc. This results in: ->

Re: signal 11: Segmentation fault ; query constraint list; pg16.3

2024-05-20 Thread David Rowley
On Mon, 20 May 2024 at 22:32, milist ujang wrote: > > postgres 16.1; rocky 9.3 > > when connect to database postgres this query is OK, but run on user database, > got segmentation fault. I tried your query on 16.1 and I'm unable to reproduce the crash. Are you able to recreate this on a

Re: problem with query

2024-05-15 Thread David Rowley
On Wed, 15 May 2024 at 21:08, Sašo Gantar wrote: > this query takes more than 8 seconds, > if i remove "AND ((pgn.nspname='servicedesk'))" and test it, it takes <1s Including the EXPLAIN rather than EXPLAIN (ANALYZE, BUFFERS) isn't very useful as there's no way to tell if the planner's estimates

Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions

2024-05-13 Thread David Rowley
On Tue, 14 May 2024 at 02:07, Dimitrios Apostolou wrote: > > On Tue, 14 May 2024, David Rowley wrote: > > Parallel Append can also run in a way that the Append child nodes will > > only get 1 worker each. > > How can I tell which case it is, from the EXPLAIN output (for ex

Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions

2024-05-13 Thread David Rowley
On Tue, 14 May 2024 at 01:52, Dimitrios Apostolou wrote: > > On Tue, 14 May 2024, David Rowley wrote: > > The query does contain an ORDER BY, so if the index is not chosen to > > provide pre-sorted input, then something has to put the results in the > > correct order bef

Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions

2024-05-13 Thread David Rowley
On Tue, 14 May 2024 at 00:46, Dimitrios Apostolou wrote: > > On Mon, 13 May 2024, Dimitrios Apostolou wrote: > > > On Sat, 11 May 2024, David Rowley wrote: > >> If you look at [1], it says "Tuples changed in partitions and > >> inheritance children do n

Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions

2024-05-13 Thread David Rowley
On Tue, 14 May 2024 at 00:41, Dimitrios Apostolou wrote: > > On Sat, 11 May 2024, David Rowley wrote: > > It will. It's just that Sorting requires fetching everything from its > > subnode. > > Isn't it plain wrong to have a sort step in the plan than? The different > p

Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions

2024-05-13 Thread David Rowley
On Tue, 14 May 2024 at 00:28, Dimitrios Apostolou wrote: > > On Sat, 11 May 2024, David Rowley wrote: > > > On Sat, 11 May 2024 at 13:33, Tom Lane wrote: > >> I do kind of wonder why it's producing both a hashagg and a Unique > >> step --- seems l

Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions

2024-05-10 Thread David Rowley
On Sat, 11 May 2024 at 13:33, Tom Lane wrote: > I do kind of wonder why it's producing both a hashagg and a Unique > step --- seems like it should do one or the other. It still needs to make the duplicate groups from parallel workers unique. David

Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions

2024-05-10 Thread David Rowley
On Sat, 11 May 2024 at 13:11, Dimitrios Apostolou wrote: > Indeed that's an awful estimate, the table has more than 1M of unique > values in that column. Looking into pg_stat_user_tables, I can't see the > partitions having been vacuum'd or analyzed at all. I think they should > have been

Re: Performance degradation after upgrading from 9.5 to 14

2024-04-22 Thread David Rowley
On Tue, 23 Apr 2024 at 07:01, Олександр Янін wrote: > Try setting enable_memoize to off. > Our practice has shown that enabling this parameter by default often resulted > in less than optimal query plans in the cache. It would be good to see a thread opened with details on this. I understand

Re: CLUSTER vs. VACUUM FULL

2024-04-21 Thread David Rowley
On Mon, 22 Apr 2024 at 12:16, Ron Johnson wrote: > > On Sun, Apr 21, 2024 at 6:45 PM Tom Lane wrote: >> >> Ron Johnson writes: >> > Why is VACUUM FULL recommended for compressing a table, when CLUSTER does >> > the same thing (similarly doubling disk space), and apparently runs just as >> >

Re: Foreign Key error between two partitioned tables

2024-04-19 Thread David Rowley
On Fri, 19 Apr 2024 at 05:48, Michael Corey wrote: > ALTER TABLE ONLY par_log_file > ADD CONSTRAINT plf_pk PRIMARY KEY (par_file_id); > ALTER TABLE par_log_definition > ADD CONSTRAINT pld_fk FOREIGN KEY (par_file_id) REFERENCES > par_log_file(par_file_id); > I receive the following

Re: Assistance needed for the query execution in non-public schema

2024-04-16 Thread David Rowley
On Tue, 16 Apr 2024 at 23:00, Sasmit Utkarsh wrote: > msshctd=> SELECT setval(pg_get_serial_sequence('mqa_flfo_cstr', 'id'), > coalesce(MAX(id), 1)) from mqa_flfo_cstr; > ERROR: permission denied for sequence mqa_flfo_cstr_id_seq > msshctd=> SELECT setval(pg_get_serial_sequence('mqa_ffp_nval',

Re: constant crashing

2024-04-14 Thread David Rowley
On Mon, 15 Apr 2024 at 02:25, jack wrote: > Then I tried to update the data in blocks of 100,000 records and it crashed 4 > times on 4 different blocks. > So I updated the first crashed block down to the a block of 10 records, until > it crashed. > Then I updated each of the 10 records

Re: effect on planner of turning a subquery to a table, sql function returning table

2024-04-12 Thread David Rowley
On Fri, 12 Apr 2024 at 23:27, Thierry Henrio wrote: > JIT: >Functions: 36 >Options: Inlining true, Optimization true, Expressions true, Deforming true >Timing: Generation 1.949 ms, Inlining 28.891 ms, Optimization 207.481 ms, > Emission 134.907 ms, Total 373.228 ms > Execution

Re: effect on planner of turning a subquery to a table, sql function returning table

2024-04-12 Thread David Rowley
On Fri, 12 Apr 2024 at 22:33, Thierry Henrio wrote: > Here is (B) execution plan: To be able to determine if a plan is good or bad, we'd really need to see the EXPLAIN (ANALYZE) output rather than just the EXPLAIN ouput. To save a possible roundtrip; SET track_io_timing = ON; EXPLAIN (ANALYZE,

Re: Seeing high query planning time on Azure Postgres Single Server version 11.

2024-03-14 Thread David Rowley
On Fri, 15 Mar 2024 at 08:01, hassan rafi wrote: > We have migrated to postgres version 16.1, but still due to very high update > activity on our DB, we are seeing elevated response times, though now the > planning time is less. >Buffers: shared hit=33359 read=6590 dirtied=9379 >

Re: select results on pg_class incomplete

2024-03-14 Thread David Rowley
On Fri, 15 Mar 2024 at 07:13, David G. Johnston wrote: > On Thu, Mar 14, 2024, 11:08 Thiemo Kellner > wrote: >> >> Thanks for the enlightenment. A pity. I suppose, there is no working >> around this? > > Write a script to do the query in a loop on all databases - that catalog is > global.

Re: Behavior of debug_parallel_query=regress

2024-02-27 Thread David Rowley
On Tue, 27 Feb 2024 at 23:23, Rafsun Masud Prince wrote: > I am looking for a combination of the 'off' and 'regress' state, which is: > use parallel if improves performance + suppress context line (if > parallel is used) > > Our project, Apache AGE, has a regression test for cypher MATCH

Re: Partitioning, Identity and Uniqueness (given pg 16 changes)

2024-02-19 Thread David Rowley
On Mon, 19 Feb 2024 at 22:07, Darryl Green wrote: > > On Mon, 19 Feb 2024 at 14:23, David Rowley wrote: > > > > On Mon, 19 Feb 2024 at 16:32, Darryl Green wrote: > > > 2) It would be nice to be able to specify the id as pk on the table being > > > partition

Re: Partitioning, Identity and Uniqueness (given pg 16 changes)

2024-02-18 Thread David Rowley
On Mon, 19 Feb 2024 at 16:32, Darryl Green wrote: > 2) It would be nice to be able to specify the id as pk on the table being > partitioned (as it was in the non-partitioned definition of the table) once > to document and enforce that the partitions simply inherit the id pk. This > would seem

Re: Why doesn't Postgres apply limit on groups when retrieving N results per group using WHERE IN + ORDER BY

2024-02-07 Thread David Rowley
On Tue, 6 Feb 2024 at 01:23, Sean v wrote: > SELECT "orders".* > FROM "orders" > WHERE (user_id IN ?, ?, ?) > ORDER BY "orders"."created_at" LIMIT 50 > > I have two indexes - `(user_id)` and `(user_id, created_at)`. Only the first > index is ever used with this query. > I imagined that it would

Re: Query performance in 9.6.24 vs 14.10

2024-01-28 Thread David Rowley
On Mon, 29 Jan 2024 at 07:37, Ron Johnson wrote: > 08 9.6.24 1,142.164 1,160.801 1,103.716 1,249.852 1,191.081 > 14.10 159.354 155.111 155.111 162.797 158.157 86.72% > Your speedup per cent calculation undersells PG14 by quite a bit. I'd call that an increase of ~639% rather than 86.72%. I

Re: vacuum visibility relevance

2023-12-03 Thread David Rowley
On Mon, 4 Dec 2023 at 15:08, senor wrote: > PG version 11.4 You're missing out on over 3 years of bug fixes running that version. I see you're planning an upgrade to v15. You should update to 11.22 in the meantime. That's the latest and final version of v11. > If I start a vacuum on this table

Re: Understanding partial index selection

2023-11-28 Thread David Rowley
On Wed, 29 Nov 2023 at 11:23, Owen Nelson wrote: > "message_payload_not_null_pidx" btree (expiration) WHERE payload IS NOT > NULL > I periodically run a query like this: > ``` > UPDATE message SET payload = NULL WHERE id IN ( > Update on message (cost=1773.41..44611.36 rows=5000

Re: Inefficient query plan for SELECT ... EXCEPT ...

2023-10-31 Thread David Rowley
On Wed, 1 Nov 2023 at 11:41, Dimitrios Apostolou wrote: > I'm wondering why the planner doesn't see that the left table is very small > and follow a different path. > From an abstract computer science POV, I would > > 1. sort the left table (the right one is already indexed) > 2. "merge" the

Re: partitioning

2023-10-23 Thread David Rowley
On Tue, 24 Oct 2023 at 10:39, Torsten Förtsch wrote: > Then I added this constraint to the small table: > > ALTER TABLE original_small_table > ADD CONSTRAINT partition_boundaries > CHECK((false, '-infinity')<=(is_sold, purchase_time) > AND (is_sold, purchase_time)<(false, 'infinity')) > NOT

Re: pg_stat_statements IN problem

2023-10-03 Thread David Rowley
On Tue, 3 Oct 2023 at 22:03, wrote: > So one last question, should I expect the patch to land in version 17 only or > is there chance that it will also be in lower versions right away? It wouldn't ever be put into anything earlier than 17. David

Re: pg_stat_statements IN problem

2023-10-03 Thread David Rowley
On Tue, 3 Oct 2023 at 21:07, wrote: > P.S.: The only serious discussion I was able to find about it was from 2015 > here, everyone basically stating that the improvement would be useful. > https://postgrespro.com/list/thread-id/1880012 There is some active discussion and a patch which aims to

Re: Access plan selection logic PG9.2 -> PG14

2023-09-15 Thread David Rowley
On Fri, 15 Sept 2023 at 21:13, Ryo Yamaji (Fujitsu) wrote: > The following example shows a table with 1 million tuples: > * The cost of using PK was higher than the cost of using user index. > * It was faster to use PK. > > Index Scan using tbl_ix1 on tbl (cost=0.43..0.67 rows=1 width=61)

Re: Access plan selection logic PG9.2 -> PG14

2023-09-14 Thread David Rowley
On Fri, 15 Sept 2023 at 01:36, Ryo Yamaji (Fujitsu) wrote: > Question: > I am assuming that the version upgrade has changed the behavior of the > planner. Is this correct? It's possible. 9.2 was a long time ago. It would be quite a bit of work to determine if this is the case. You could

Re: Query performance going from Oracle to Postgres

2023-09-07 Thread David Rowley
On Thu, 7 Sept 2023 at 19:17, Peter Geoghegan wrote: > It seems likely that the problem here is that some of the predicates > appear as so-called "Filter:" conditions, as opposed to true index > quals. hmm, if that were true we'd see "Rows Removed by Filter" in the explain analyze. I think all

Re: Query performance going from Oracle to Postgres

2023-09-06 Thread David Rowley
On Thu, 7 Sept 2023 at 11:14, Dirschel, Steve wrote: > select count(historyeve0_.HISTORY_EVENT_SID) as col_0_0_ from > hist28.history_event_display_timestamp_20230301 historyeve0_ where > historyeve0_.IS_DELETED=0 > history_event_sid | character varying(32) | | not >

Re: Question on Partition key

2023-09-04 Thread David Rowley
On Sun, 3 Sept 2023 at 23:52, veem v wrote: > Additionally, is it true that optimizer will also get fooled on getting the > math correct during cardinality estimates, as because there is a big > difference between , comparing or substracting, two date values VS two number > values. And storing

Re: Partitionwise JOIN scanning all partitions, even unneeded ones

2023-08-30 Thread David Rowley
On Thu, 31 Aug 2023 at 07:55, Dimitrios Apostolou wrote: > I'd appreciate help on whether it's a real issue, and if it's unknown I > can forward this to the psql-bugs mailing list. I'd also appreciate any > critique on the clarity of my description and on my schema and queries, > since I'm new to

Re: pb with big volumes

2023-08-13 Thread David Rowley
On Mon, 14 Aug 2023 at 11:14, Marc Millas wrote: > that's exactly my question. > does the analyze buffers data, generated when track_io_timing is on, keep > track of multiple reloads of the same data while executing one operation ? Yes, the timing for reads will include the time it took to

Re: pb with big volumes

2023-08-10 Thread David Rowley
On Fri, 11 Aug 2023 at 13:54, Ron wrote: > Wouldn't IO contention make for additive timings instead of exponential? No, not necessarily. Imagine one query running that's doing a parameterised nested loop join resulting in the index on the inner side being descended several, say, million times.

Re: How to improve the performance of my SQL query?

2023-07-26 Thread David Rowley
On Wed, 26 Jul 2023 at 19:46, gzh wrote: > QUERY PLAN (enable_seqscan=on) > Execution Time: 167183.133 ms > QUERY PLAN (enable_seqscan=off) > Execution Time: 22320.153 ms effective_cache_size and random_page_cost are the settings you should be adjusting to coax the planner into using the

Re: How to improve the performance of my SQL query?

2023-07-25 Thread David Rowley
On Fri, 21 Jul 2023 at 13:44, gzh wrote: > > The definitions of the columns used in SQL are as follows. > > TBL_SHA > ms_cd character(6) NOT NULL -- PRIMARY KEY > et_cd character(8) > etrys character(8) > explain analyze > select COUNT(ET_CD) > from TBL_SHA > WHERE TBL_SHA.MS_CD = 'MLD009'

Re: Re: How to improve the performance of my SQL query?

2023-07-25 Thread David Rowley
On Thu, 20 Jul 2023 at 23:36, gzh wrote: > > > Thank you very much for taking the time to reply to my question. > > > Sorry, I provided incorrect information. > > The index also does not work in the following query statement. > > > > select COUNT(ET_CD) > > from TBL_SHA > > WHERE MS_CD = '009' >

Re: Effects of dropping a large table

2023-07-19 Thread David Rowley
On Wed, 19 Jul 2023 at 07:41, Rob Sargent wrote: > You might consider deleting portions of the table in separate (consecutive) > batches (maybe 5% per delete). And then truncate table is not logged so that > might be an alternative. Can you explain why this would be a useful thing to do? It

Re: Query take a long time and use no index

2023-07-18 Thread David Rowley
On Mon, 17 Jul 2023 at 21:13, basti wrote: > volkszaehler=# explain analyze SELECT COUNT(DISTINCT DATE_TRUNC('day', > TIMESTAMP 'epoch' + timestamp * INTERVAL '1 millisecond')) FROM data > WHERE channel_id = 5 AND timestamp >= 0; Alternatively, you could express this as: SELECT COUNT(*) FROM

Re: Query take a long time and use no index

2023-07-18 Thread David Rowley
On Tue, 18 Jul 2023 at 06:19, basti wrote: > > Thanks a lot tomas, i will try it. > > I have find out that there is a 'aggregation' function in the frontend. > But this is MySQL specific and I have no idea the transform it to postgres. > > It looks like: > 'REPLACE INTO aggregate (channel_id,

Re: need explanation about an explain plan

2023-07-02 Thread David Rowley
On Fri, 30 Jun 2023 at 00:42, Umut TEKİN wrote: > @Marc, I think there is no problem.Even though it says it is filtered by > ladate, it is not. Because of the partition. > As you can see for each index scan it uses a different partition and those > partition boundaries are already specified

Re: Helping planner to chose sequential scan when it improves performance

2023-06-25 Thread David Rowley
On Wed, 14 Jun 2023 at 07:28, Patrick O'Toole wrote: > Maybe we are barking up the wrong tree with the previous questions. Are there > other configuration parameters we should consider first to improve > performance in situations like the one illustrated? random_page_cost and

Re: bug or lacking doc hint

2023-06-25 Thread David Rowley
On Mon, 26 Jun 2023 at 03:02, Marc Millas wrote: > When I ask this list, David Rowley suggest to rewrite the SQL, replacing the > OR by a union. > > Fine, this do work, even if a bit complex as the original SQL was a set of > intricate joins. > > > So, either this beh

Re: pb with join plan

2023-06-20 Thread David Rowley
On Wed, 21 Jun 2023 at 10:26, Marc Millas wrote: > link to the plan with both clauses ORed (the one not finishing) > https://explain.depesz.com/s/jHO2 I'd go with the UNION or UNION ALL idea I mentioned earlier. David

Re: pb with join plan

2023-06-20 Thread David Rowley
On Wed, 21 Jun 2023 at 08:34, Marc Millas wrote: > > On Tue, Jun 20, 2023 at 10:14 PM David Rowley wrote: >> >> On Wed, 21 Jun 2023 at 07:42, Marc Millas wrote: >> > But if I do the same with clause one OR clause 2, I have to kill the >> > request after an

Re: pb with join plan

2023-06-20 Thread David Rowley
On Wed, 21 Jun 2023 at 07:42, Marc Millas wrote: > But if I do the same with clause one OR clause 2, I have to kill the request > after an hour, seeing the filesystem showing more than 140 Mb of increased > usage. > So, before providing the 3 explain plans (I must anonymize everything, so >

Re: Is there any good optimization solution to improve the query efficiency?

2023-06-05 Thread David Rowley
On Mon, 5 Jun 2023 at 18:56, gzh wrote: > I'm running into some performance issues with my SQL query. > The following SQL query is taking a long time to execute. > -> Hash Join (cost=253388.44..394112.07 rows=1 width=56) (actual > time=1197.484..2954.084 rows=330111 loops=1) >

Re: Pg 16: will pg_dump & pg_restore be faster?

2023-06-04 Thread David Rowley
On Sat, 3 Jun 2023 at 00:14, Jonathan S. Katz wrote: > Typically once a release announcement is out, we'll only edit it if it's > inaccurate. I don't think the statement in the release announcement is > inaccurate, as it specifies that concurrent bulk loading is faster. Understood. I had

Re: speed up full table scan using psql

2023-05-30 Thread David Rowley
On Wed, 31 May 2023 at 16:26, Lian Jiang wrote: > I am using psql to periodically dump the postgres tables into json files > which are imported into snowflake. For large tables (e.g. 70M rows), it takes > hours for psql to complete. Using spark to read the postgres table seems not > to work as

Re: Pg 16: will pg_dump & pg_restore be faster?

2023-05-30 Thread David Rowley
On Wed, 31 May 2023 at 14:11, Bruce Momjian wrote: > > On Wed, May 31, 2023 at 02:05:10PM +1200, David Rowley wrote: > > "Relation extensions have been improved allowing faster bulk loading > > of data using COPY. These improvements are more significant when

Re: Pg 16: will pg_dump & pg_restore be faster?

2023-05-30 Thread David Rowley
On Wed, 31 May 2023 at 13:13, Bruce Momjian wrote: > There is no mention of concurrency being a requirement. Is it wrong? I > think there was a question of whether you had to add _multiple_ blocks > ot get a benefit, not if concurrency was needed. This email about the > release notes didn't

Re: Pg 16: will pg_dump & pg_restore be faster?

2023-05-30 Thread David Rowley
On Wed, 31 May 2023 at 08:54, Ron wrote: > https://www.postgresql.org/about/news/postgresql-16-beta-1-released-2643/ > says "PostgreSQL 16 can also improve the performance of concurrent bulk > loading of data using COPY up to 300%." > > Since pg_dump & pg_restore use COPY (or something very

Re: [E] Re: parallel aggregation

2023-04-12 Thread David Rowley
On Thu, 13 Apr 2023 at 14:31, Alexander Saydakov wrote: > 1. I wonder if there is a clean separation between the phases: once partial > aggregation is done only combining is done after that (state transition > function never receives results of combining). Currently the transfn won't be called

Re: [E] Re: parallel aggregation

2023-04-12 Thread David Rowley
(Let's keep this on this list) On Thu, 13 Apr 2023 at 12:08, Alexander Saydakov wrote: > > Yes, I am talking about a custom aggregate in C++ with internal state type. > You did not really answer my first question. Does the state always go through > serialfinc-deserialfunc before reaching a

Re: parallel aggregation

2023-04-12 Thread David Rowley
On Wed, 12 Apr 2023 at 22:14, Alexander Saydakov wrote: > > I have a few questions regarding aggregate functions that would be parallel > safe. > 1. do the inputs of combinefunc always go through serialfunc-deserialfunc or > they can come directly from sfunc (worker on the same machine,

Re: Performance issue after migration from 9.4 to 15

2023-04-11 Thread David Rowley
On Tue, 11 Apr 2023 at 23:03, Christian Schröder wrote: > We have recently migrated from our old PostgreSQL 9.4 server to a PostgreSQL > 15 server. Even though the new machine has more resources, we see a > considerable decrease in the performance of some of our heavier queries, and > I have

Re: Get dead tuples data

2023-03-30 Thread David Rowley
On Thu, 30 Mar 2023 at 22:21, 任重 wrote: > Here is anyway to Get dead tuples data from table when I had disable > autovacuum? > I need whole row data https://www.postgresql.org/docs/current/pageinspect.html#id-1.11.7.34.5 David

Re: Do BRIN indexes support MIN/MAX?

2023-03-29 Thread David Rowley
On Thu, 30 Mar 2023 at 17:18, Tom Lane wrote: > > Julien Rouhaud writes: > > brin indexes don't work the way you would hope for. the stored min/max > > values per range guarantees that all values in the underlying relation > > pages are contained in that range, but it doesn't mean that those

Re: Idea: PostgreSQL equivalent to Oracle's KEEP clause

2023-03-14 Thread David Rowley
On Tue, 14 Mar 2023 at 16:07, Ben Clements wrote: > Similar to your "TOP() and BOTTOM() aggregate" idea, you might find Erwin > Brandstetter's solution using the LAST() aggregate function interesting: > (https://dba.stackexchange.com/a/324646/100880) Interesting. Just note that ORDER BY

Re: Idea: PostgreSQL equivalent to Oracle's KEEP clause

2023-03-14 Thread David Rowley
On Tue, 14 Mar 2023 at 21:01, Alban Hertroys wrote: > > On 7 Mar 2023, at 4:11, David G. Johnston > > wrote: > > TOP(city, ROW(population, land_area)) ? > > What should be the expected behaviour on a tie though? Undefined. Same as having an ORDER BY on a column that's not unique. The sort

Re: Idea: PostgreSQL equivalent to Oracle's KEEP clause

2023-03-06 Thread David Rowley
On Tue, 7 Mar 2023 at 16:11, David G. Johnston wrote: > > On Mon, Mar 6, 2023 at 7:51 PM David Rowley wrote: >> the transfn for bottom() would need to remember the city and the >> population for the highest yet seen value of the 2nd arg. > > > BOTTOM() reme

Re: Idea: PostgreSQL equivalent to Oracle's KEEP clause

2023-03-06 Thread David Rowley
On Tue, 7 Mar 2023 at 12:40, Tom Lane wrote: > > Ben Clements writes: > > As shown above, the following calculated column can bring in the city name, > > even though the city name isn't in the GROUP BY: > >max(city) keep (dense_rank first order by population desc) > > You haven't really

Re: Dropping behavior for unique CONSTRAINTs

2023-03-03 Thread David Rowley
On Sat, 4 Mar 2023 at 10:55, Ron wrote: > On 3/3/23 04:54, David Rowley wrote: > If you have a look at > https://www.postgresql.org/docs/15/sql-dropindex.html check out the > CONCURRENTLY option. That option allows an index to be dropped without > blocking concurrent reads and writ

Re: PG16devel - vacuum_freeze_table_age seems not being taken into account

2023-03-03 Thread David Rowley
On Fri, 3 Mar 2023 at 23:43, Simon Elbaz wrote: > hydrodb=# SELECT c.oid::regclass as table_name, >greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age > FROM pg_class c > LEFT JOIN pg_class t ON c.reltoastrelid = t.oid > WHERE c.relkind IN ('r', 'm') and c.relname='test'; >

Re: Dropping behavior for unique CONSTRAINTs

2023-03-03 Thread David Rowley
On Fri, 3 Mar 2023 at 23:17, Conner Bean wrote: I wanted to avoid using a unique index since dropping them requires an exclusive lock and cannot be done concurrently. My thought was to then use a unique constraint, since I've read unofficial docs[0] that say these can be dropped safely with no

Re: Postgres undeterministically uses a bad plan, how to convince it otherwise?

2023-03-03 Thread David Rowley
On Fri, 3 Mar 2023 at 22:35, cen wrote: > Does equivalency only work for constants as in the sample you provided > or will it also be found in b1."number" and t0."block_number" in my > sample query? It works for more than constants, but in this case, it's the presence of the constant that would

Re: Postgres undeterministically uses a bad plan, how to convince it otherwise?

2023-03-02 Thread David Rowley
On Fri, 3 Mar 2023 at 02:20, cen wrote: > I understand that even though both colums are indexed, the indexes are > completely different but the point is, how would one know in advance > which one will be faster when designing the query? Likely to be safe, you'd just include both. The problem is

Re: Is Autovacuum running?

2023-02-21 Thread David Rowley
On Wed, 22 Feb 2023 at 11:28, Brad White wrote: > datname | stats_reset| now > DB | 2023-02-17 14:28:27-06 | 2023-02-21 16:16:34-06 > > I heard that the system was running slowly on Friday. They may have cycled > the service in an attempt to resolve that. You'll probably want

Re: Is Autovacuum running?

2023-02-21 Thread David Rowley
On Tue, 21 Feb 2023 at 08:42, Brad White wrote: > > I'm concerned that Autovacuum may not be running based on the results of this > query. > > SELECT relname, last_vacuum, last_autovacuum FROM pg_stat_user_tables; > gives 211 rows like this... > relname| last_vacuum |

Re: Sequential scan faster than index

2023-02-21 Thread David Rowley
On Tue, 21 Feb 2023 at 22:30, Arthur Ramsey wrote: > > I'm trying to figure out why a sequential scan is out performing. I've tried > psql 13.7, psql14.6 and REINDEX. The REINDEX didn't help. This is on an RDS > instance that's a db.m5.large (2 * vCPU, 8 GB memory) with 200 storage on io1 >

Re: A performance issue in ROW_NUMBER() OVER(ORDER BY NULL) [27 times slow than OVER()] V14.5

2023-02-20 Thread David Rowley
On Mon, 20 Feb 2023 at 13:17, Tom Lane wrote: > I suspect most of the remaining performance discrepancy is just triggered > by having to pass the extra always-NULL column forward through the various > plan steps. We could teach createplan.c to generate a WindowAgg plan node > that omits the

Re: A performance issue in ROW_NUMBER() OVER(ORDER BY NULL) [27 times slow than OVER()] V14.5

2023-02-19 Thread David Rowley
On Mon, 20 Feb 2023 at 10:18, Tom Lane wrote: > I don't see it at all. Comparing your two test queries on released > branches, I see maybe 2x penalty for the ORDER BY NULL, not 30x. > (In HEAD there's only about 13% penalty.) I wonder what PG version > you are testing. I suspect ed1a88dda

Re: Query plan for "id IS NULL" on PK

2023-02-14 Thread David Rowley
On Wed, 15 Feb 2023 at 11:39, Peter J. Holzer wrote: > OTOH it could also be argued that the optimizer should be able to > perform the same simplifications as I did above and produce the same > code for WHERE (("id" > ? OR "id" IS NULL)) AND (("id" <= ?)) > as for WHERE (("id" > ?)) AND (("id" <=

Re: Boyer-Moore string searching in LIKE, ILIKE, and/or POSITION?

2023-02-08 Thread David Rowley
On Thu, 9 Feb 2023 at 14:49, Tom Lane wrote: > > David Rowley writes: > > Tom's argument seems to think it's impossible, so if you find that > > it's definitely not impossible, then you can assume he's wrong about > > that. > > My point was that it seems like yo

Re: Boyer-Moore string searching in LIKE, ILIKE, and/or POSITION?

2023-02-08 Thread David Rowley
On Thu, 9 Feb 2023 at 13:05, Martin L. Buchanan wrote: > For the common and simple cases of find this string anywhere in another > string: > > str LIKE '%foo%' > > str ILIKE '%foo%' > > position('foo' in str) > 0 > > Is Boyer-Moore string searching now used by any of these three? We use a sort

Re: alter table impact on view

2023-01-30 Thread David Rowley
On Tue, 31 Jan 2023 at 01:14, Marc Millas wrote: > But if I alter table to change a column that is a varchar 20 into a varchar 21 > postgres refuse saying that it cannot due to the return rule... using said > column > > why ?? as the view is not a materialized object, the impact of the length of

Re: Why is a hash join preferred when it does not fit in work_mem

2023-01-12 Thread David Rowley
On Fri, 13 Jan 2023 at 07:33, Dimitrios Apostolou wrote: > > I have a very simple NATURAL JOIN that does not fit in the work_mem. Why > does the query planner prefer a hash join that needs 361s, while with a > sort operation and a merge join it takes only 13s? It's a simple matter of that the

Re: Implementing foreign data wrappers and avoiding n+1 querying

2022-12-21 Thread David Rowley
On Thu, 22 Dec 2022 at 13:31, David Gilman wrote: > > When a fdw table participates in query planning and finds itself as > part of a join it can output a parameterized path. If chosen, Postgres > will dutifully call the fdw over and over via IterateForeignScan to > fetch matching tuples. Many

Re: Is there a way to know write statistics on an individual index

2022-12-08 Thread David Rowley
On Thu, 8 Dec 2022 at 19:59, higherone wrote: > I know there's a pg_stats_user_indexes that gives statistics about the index > scans/fetches, but I don't find any information about how many > writes(insert/update/delete) happen on this individual index. > Is there some tool that could achieve

Re: sort performance better with little memory than big memory

2022-12-03 Thread David Rowley
On Sun, 4 Dec 2022 at 00:14, yang zhao wrote: > So,can we try to split memory to pieces and qsort every one,and merge than > all in memory,I have tried this in my local code, and got about 12% > improvement when memory is enough. We're not very good at CPU cache awareness. The hard part here

Re: Postgres upgrade 12 - issues with OIDs

2022-11-27 Thread David Rowley
On Mon, 28 Nov 2022 at 12:46, Venkata B Nagothi wrote: > Coming back to this thread after a while.. we have to remove OID on a 6 TB (5 > TB of indexes) table and ALTER TABLE is gonna block the table and is gonna > take hours... You may want to look into exploiting table inheritance for this.

Re: table inheritance partition and indexes

2022-11-23 Thread David Rowley
On Thu, 24 Nov 2022 at 11:34, Ted Toth wrote: > > On Wed, Nov 23, 2022 at 4:01 PM Tom Lane wrote: >> Then you're stuck managing it manually. But ATTACH PARTITION is >> not relevant to INHERITS-style partitioning. > > That's the part that wasn't clear to me, thanks. Would this have been more

Re: Unnecessary locks for partitioned tables

2022-11-09 Thread David Rowley
On Thu, 10 Nov 2022 at 04:11, wrote: > If someone would create delayed locking for generic plans, after parameters > are known and partition pruning occurs, I believe generic plan will be on > pars with custom. > So, I`m sticking with plan cache parameter for feature development, that was >

Re: Different execution plan between PostgreSQL 8.4 and 12.11

2022-10-11 Thread David Rowley
On Wed, 12 Oct 2022 at 13:06, Klint Gore wrote: > Limit (cost=0.56..28349.28 rows=1 width=4) (actual time=0.039..0.040 rows=1 > loops=1) > -> Unique (cost=0.56..28349.28 rows=1 width=4) (actual time=0.039..0.039 > rows=1 loops=1) > -> Index Only Scan using idx on tbl

Re: Re: Different execution plan between PostgreSQL 8.4 and 12.11

2022-10-11 Thread David Rowley
On Tue, 11 Oct 2022 at 16:13, gzh wrote: > new=# explain analyze select 2 from analyze_word_reports where (cseid = 94) > limit 1; > Limit (cost=0.00..0.43 rows=1 width=4) (actual time=2156.964..2156.966 > rows=1 loops=1) >-> Seq Scan on analyze_word_reports (cost=0.00..528550.75

Re: Different execution plan between PostgreSQL 8.4 and 12.11

2022-10-11 Thread David Rowley
On Tue, 11 Oct 2022 at 22:52, Josef Šimánek wrote: > This can make some queries fail since there will be no way to gather > data without seqscan. Disabling enable_seqscan only adds a const penalty to Seq Scans. It does not outright disallow them altogether. Having said that, having Paths with

Re: Feature proposal: immutable/sealed partitions (and maybe tables, too)

2022-09-06 Thread David Rowley
On Wed, 7 Sept 2022 at 13:33, Levi Aul wrote: > To be clear, this isn't a bug report. There is no bug—everything is working > exactly as it should. The partitions are not being pruned because the > workload consists of OLAP aggregations that fetch a small number of rows > spread across all

Re: Feature proposal: immutable/sealed partitions (and maybe tables, too)

2022-09-06 Thread David Rowley
On Wed, 7 Sept 2022 at 07:40, Levi Aul wrote: > In other words, our workload is inherently one that acquires "way too many > locks." Our largest performance bottleneck, according to pg_wait_sampling, is > the LockManager itself. Despite most of our queries spending only > milliseconds actually

Re: Restriction on table partition expressions

2022-08-25 Thread David Rowley
On Fri, 26 Aug 2022 at 03:08, James Vanns wrote: > Also, is there a chance that this > limitation will be relaxed in the future? (forgot to answer this part) Certainly not in the near future, I'm afraid. It would require allowing a single index to exist over multiple tables. There has been

Re: Restriction on table partition expressions

2022-08-25 Thread David Rowley
On Fri, 26 Aug 2022 at 03:08, James Vanns wrote: > CREATE TABLE foobar( > id BIGINT NOT NULL PRIMARY KEY, > baz VARCHAR NULL DEFAULT NULL > ) PARTITION BY HASH(my_func(id)); > > Error: primary key constraints cannot be used when partition keys > include expressions. > I couldn't find

Re: Different execution plan between PostgreSQL 8.2 and 12.5

2022-08-17 Thread David Rowley
On Thu, 18 Aug 2022 at 15:32, Tom Lane wrote: > The 12.5 plan looks like it thinks that the join condition is not > hashable --- and probably not mergeable as well, else it would have > done a mergejoin. This is odd if we assume that the lower() > outputs are just text. But you haven't said

Re: ***SPAM*** Re: Can I get the number of results plus the results with a single query?

2022-08-16 Thread David Rowley
On Tue, 16 Aug 2022 at 21:15, Walter Dörwald wrote: > select count(*) over (), e.* from email.email e; Depending on the complexity of the query, putting the count(*) as a subquery in the SELECT clause might execute more quickly. i.e. select (select count(*) from email.email) c, * from

  1   2   3   >