Re: Query performance in 9.6.24 vs 14.10
Thanks for the update.

On Mon, Jan 29, 2024, 16:53 Ron Johnson wrote:
> According to my tests, sometimes JIT is a little faster, and sometimes
> it's a little slower. Mostly within the realm of statistical noise
> (especially with each query having a sample size of only 13, on a VM that
> lives on a probably-busy host).
Re: Query performance in 9.6.24 vs 14.10
According to my tests, sometimes JIT is a little faster, and sometimes it's a little slower. Mostly within the realm of statistical noise (especially with each query having a sample size of only 13, on a VM that lives on a probably-busy host).

On Mon, Jan 29, 2024 at 9:18 AM Ron Johnson wrote:
> Yes, jit=on.
>
> I'll test them with jit=off, to see the difference. (The application is
> 3rd party, so will change it at the system level.)
Re: Query performance in 9.6.24 vs 14.10
Yes, jit=on.

I'll test them with jit=off to see the difference. (The application is third-party, so I will change it at the system level.)

On Mon, Jan 29, 2024 at 7:09 AM Bob Jolliffe wrote:
> Out of curiosity, is the pg14 running with the default jit=on setting?
>
> This is obviously entirely due to the nature of the particular queries
> themselves, but we found that for our workloads that pg versions
> greater than 11 were exacting a huge cost due to the jit compiler. Once we
> explicitly turned jit=off we started to see improvements.
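Changing jit at the system level, as described above, can be done without touching the third-party application. A minimal sketch:

```sql
-- Check the current setting
SHOW jit;

-- Try it out in the current session only
SET jit = off;

-- Disable it cluster-wide (written to postgresql.auto.conf;
-- takes effect for new sessions after a reload)
ALTER SYSTEM SET jit = off;
SELECT pg_reload_conf();
```

ALTER SYSTEM avoids hand-editing postgresql.conf, and the per-session SET makes it easy to benchmark the same query with JIT on and off.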
Re: Query performance in 9.6.24 vs 14.10
Out of curiosity, is the pg14 running with the default jit=on setting?

This is obviously entirely due to the nature of the particular queries themselves, but we found for our workloads that pg versions greater than 11 were exacting a huge cost due to the jit compiler. Once we explicitly turned jit=off we started to see improvements.

On Mon, 29 Jan 2024 at 07:55, Ron Johnson wrote:
> The spreadsheet function, using the Median cells, is (PG9.6 - PG14) / PG9.6.
> That's essentially the same as what you wrote.
>
> 158.157 / 1191.081 = 0.13278
>
> 1191.081 / 158.157 = 7.53, so 9.6.24 on that query is 7.53x slower.
Re: Query performance in 9.6.24 vs 14.10
On Sun, Jan 28, 2024 at 10:44 PM David Rowley wrote:
> Your speedup per cent calculation undersells PG14 by quite a bit. I'd
> call that an increase of ~639% rather than 86.72%.
>
> I think you've done "1 - sum( <14.10 numbers> ) / sum( <9.6.24 numbers> )"
> whereas I think you should have done "sum( <9.6.24 numbers> ) / sum( <14.10
> numbers> ) - 1"
>
> Nonetheless, thanks for testing this out. I assume this is just a report
> giving good feedback about progress in this area...?

The spreadsheet function, using the Median cells, is (PG9.6 - PG14) / PG9.6. That's essentially the same as what you wrote.

158.157 / 1191.081 = 0.13278

1191.081 / 158.157 = 7.53, so 9.6.24 on that query is 7.53x slower.
Re: Query performance in 9.6.24 vs 14.10
On Mon, 29 Jan 2024 at 07:37, Ron Johnson wrote:
> 08  9.6.24  1,142.164  1,160.801  1,103.716  1,249.852  1,191.081
>     14.10     159.354    155.111    155.111    162.797    158.157  86.72%

Your speedup per cent calculation undersells PG14 by quite a bit. I'd call that an increase of ~639% rather than 86.72%.

I think you've done "1 - sum( <14.10 numbers> ) / sum( <9.6.24 numbers> )" whereas I think you should have done "sum( <9.6.24 numbers> ) / sum( <14.10 numbers> ) - 1"

Nonetheless, thanks for testing this out. I assume this is just a report giving good feedback about progress in this area...?

David
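The difference between the two formulas is easy to check against the median timings quoted above (1,191.081 ms on 9.6.24 vs 158.157 ms on 14.10); any psql session can evaluate them:

```sql
-- "1 - new/old": fraction of run time eliminated, roughly 0.8672 (86.72%)
SELECT 1 - 158.157 / 1191.081 AS fraction_of_time_saved;

-- "old/new - 1": how much faster the new version is, roughly 6.53 (~653%)
-- (David's ~639% figure is the same formula applied to the summed runs
-- rather than the medians.)
SELECT 1191.081 / 158.157 - 1 AS speedup;
```

Both numbers describe the same measurement; they just answer different questions, which is why they diverge so dramatically for large speedups.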
RE: [EXT] Re: Query performance going from Oracle to Postgres
Thanks to all who replied on this issue. I had overlooked the difference in the execution plans between Oracle and Postgres: Oracle in fact does an index range scan only, with no table lookup. When I changed the query to count(*), Postgres went from 332 logical reads to 19; Oracle did 20. Oracle did 20 regardless of whether it was count(*) or count(non-indexed-column).

Regards,
Steve
Re: Query performance going from Oracle to Postgres
On Thu, Sep 7, 2023 at 3:48 AM David Rowley wrote:
> On Thu, 7 Sept 2023 at 19:17, Peter Geoghegan wrote:
> > It seems likely that the problem here is that some of the predicates
> > appear as so-called "Filter:" conditions, as opposed to true index
> > quals.
>
> hmm, if that were true we'd see "Rows Removed by Filter" in the
> explain analyze.

That's why I hedged, with "seems likely". The problem with using filter conditions rather than true index quals isn't limited to the problem of extra heap accesses. It happened to be convenient to make my point that way, but that isn't particularly fundamental here. I deliberately chose to make my example involve an index-only scan (that doesn't require any heap accesses) for this reason.

> I think all that's going on is that each tuple is on a different page
> and the heap accesses are just causing many buffers to be accessed.

This index is an absolute monstrosity. I find it far easier to believe that the real explanation is the one that Steve intuited: that there is an issue with the way that the physical data structures (which are more or less comparable in both systems) are accessed in Postgres.

The index in question ("history_event_display_timesta_prism_guid_display_timestamp_idx1") has certain columns that are omitted from the query. These columns nevertheless appear between other columns that the query filters on. The leading two columns ("prism_guid" and "display_timestamp") are made into index quals by Postgres, but the next index column after those two ("unique_lookup_key") is omitted by the query, and so isn't an index qual. In fact *four* columns are omitted after that one. But, the very-low-order index column "product_sid" *does* appear in the query, and so also appears as Postgres index quals.

There is every chance that the true underlying explanation is that Oracle is able to skip over significant parts of the index structure dynamically. In fact I'm practically certain that that's the case, since the "product_sid" column appears as an "access predicate", rather than as a "filter predicate". These terms are explained here:

https://use-the-index-luke.com/sql/explain-plan/oracle/filter-predicates
https://use-the-index-luke.com/sql/explain-plan/postgresql/filter-predicates

How could "product_sid" be used as an "access predicate" given the omitted index columns? It seems very likely that parts of the index can be skipped in Oracle, but not in Postgres -- at least not yet.

Like Markus Winand, I think that it's a real problem that EXPLAIN doesn't yet expose the difference between access predicates and filter predicates. Some of the index quals shown by EXPLAIN are marked SK_BT_REQFWD/SK_BT_REQBKWD by nbtree (meaning that they're what Oracle calls "access predicates"), while others are not (meaning that they're what Oracle calls "filter predicates"). That issue makes it far harder to spot these sorts of problems.

> It seems to me that an IOS would likely fix that problem which is why
> I suggested count(*) instead of count(historyeve0_.HISTORY_EVENT_SID)

Probably, but why should we have to use an index-only scan? And what about my example, that already used one, and still showed a big disparity where there is no inherent reason why there had to be? My example didn't involve jumping to another part of the index because the problem seems to be more general than that.

--
Peter Geoghegan
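The column layout Peter describes can be sketched in miniature with hypothetical names (these tables and columns are illustrative, not from the thread): an index on (a, b, c), queried with predicates on a and c but not on the middle column b. In Postgres the predicate on c cannot bound the B-Tree descent, so it is checked against each index tuple visited rather than used to skip ahead:

```sql
-- Hypothetical illustration of omitted middle index columns
CREATE INDEX idx_abc ON t (a, b, c);

-- "a = 1" can bound the index scan (what Oracle calls an access predicate),
-- while "c = 3" is merely checked against every index tuple in the
-- "a = 1" range (what Oracle calls a filter predicate), because the
-- intervening column b has no predicate.
EXPLAIN SELECT * FROM t WHERE a = 1 AND c = 3;
```

A system that can skip within the index, as Peter suggests Oracle does, would instead jump between distinct values of b and probe each one for c = 3.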
Re: Query performance going from Oracle to Postgres
On Thu, 7 Sept 2023 at 19:17, Peter Geoghegan wrote:
> It seems likely that the problem here is that some of the predicates
> appear as so-called "Filter:" conditions, as opposed to true index
> quals.

hmm, if that were true we'd see "Rows Removed by Filter" in the explain analyze.

I think all that's going on is that each tuple is on a different page and the heap accesses are just causing many buffers to be accessed. It seems to me that an IOS would likely fix that problem, which is why I suggested count(*) instead of count(historyeve0_.HISTORY_EVENT_SID).

David
Re: Query performance going from Oracle to Postgres
On Wed, Sep 6, 2023 at 1:07 PM Dirschel, Steve wrote:
> Oracle will find the same 332 rows using the same index but in Oracle it
> only does 20 logical reads. I thought maybe the index was fragmented so I
> reindexed that index:

It seems likely that the problem here is that some of the predicates appear as so-called "Filter:" conditions, as opposed to true index quals.

For reasons that aren't particularly fundamental, Postgres B-Trees cannot push down != (or <>) to the index level. Strangely enough, this is possible with IS NOT NULL. I'm working to improve things in this area.

That difference is easy to see in the following example. The first query does far fewer buffer accesses than the second query, despite being very similar:

regression=# explain (analyze, buffers) select * from multi_test where a = 1 and b is not null;
                                                             QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using multi_test_idx on multi_test  (cost=0.29..50.10 rows=2191 width=8) (actual time=0.095..0.100 rows=14 loops=1)
   Index Cond: ((a = 1) AND (b IS NOT NULL))
   Heap Fetches: 0
   Buffers: shared hit=3
 Planning Time: 0.056 ms
 Execution Time: 0.145 ms
(6 rows)

regression=# explain (analyze, buffers) select * from multi_test where a = 1 and b != 42;
                                                             QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using multi_test_idx on multi_test  (cost=0.29..222.57 rows=2191 width=8) (actual time=0.087..2.982 rows=14 loops=1)
   Index Cond: (a = 1)
   Filter: (b <> 42)
   Rows Removed by Filter: 1
   Heap Fetches: 0
   Buffers: shared hit=11
 Planning Time: 0.076 ms
 Execution Time: 3.204 ms
(8 rows)

(There are lots of index tuples matching (a, b) = "(1, NULL)" here, you see.)

--
Peter Geoghegan
Re: Query performance going from Oracle to Postgres
On Thu, 7 Sept 2023 at 11:14, Dirschel, Steve wrote:
> select count(historyeve0_.HISTORY_EVENT_SID) as col_0_0_ from
> hist28.history_event_display_timestamp_20230301 historyeve0_ where
> historyeve0_.IS_DELETED=0

> history_event_sid | character varying(32) |           | not null |

Do you get an Index Only Scan if you use count(*) instead of count(historyeve0_.HISTORY_EVENT_SID)? It seems that column is the only one missing from the index, and that might be the only thing that's stopping PostgreSQL using an Index Only Scan.

Perhaps Oracle realises that the column is NOT NULL so can do count(*) instead of counting the non-nulls from that column. That may allow it to Index Only Scan? I'm not versed in reading Oracle plans so don't know if it's possible to tell.

David
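David's point, that count(*) needs no particular column while count(col) must test that column for NULL, can be tried directly against the table from the thread. This comparison is my sketch, not from the original mails:

```sql
-- count(col) references history_event_sid, which is not in the index,
-- so the plan has to visit the heap for every matching row.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(historyeve0_.history_event_sid)
FROM hist28.history_event_display_timestamp_20230301 historyeve0_
WHERE historyeve0_.is_deleted = 0;

-- count(*) only counts rows, so it can be satisfied entirely from the
-- index (an Index Only Scan), provided the visibility map is mostly set.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM hist28.history_event_display_timestamp_20230301 historyeve0_
WHERE historyeve0_.is_deleted = 0;
```

Steve's follow-up at the top of the thread confirms the effect: switching to count(*) dropped Postgres from 332 logical reads to 19.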
Re: Query performance going from Oracle to Postgres
On Wed, 2023-09-06 at 20:06, Dirschel, Steve wrote:
> We are in the process of converting from Oracle to Postgres and I have a
> query that is using the same index in Postgres as is used in Oracle but in
> Postgres the query does 16x more buffer/logical reads. I'd like to
> understand why.
>
> explain (analyze, buffers)
> select count(historyeve0_.HISTORY_EVENT_SID) as col_0_0_
> from hist28.history_event_display_timestamp_20230301 historyeve0_ where
> historyeve0_.IS_DELETED=0
> and historyeve0_.PRISM_GUID='i0accd6a2018405f095ee669dc5b4'
> and historyeve0_.PRODUCT_SID='CARSWELL.WESTLAW'
> and (historyeve0_.EVENT_TYPE not in
> ('versionsSearchWithinQueryEvent','notesOfDecisionsSearchWithinQueryEvent','citingReferencesSearchWithinQueryEvent','tocSearchWithinQueryEvent','searchWithinDocumentEvent','coCitesSearchWithinQueryEvent'))
> and (historyeve0_.PRODUCT_VIEW in
> ('DefaultProductView','TNPPlus','PLCUS','Indigo','IndigoPremium','INDIGOCA')
> or historyeve0_.PRODUCT_VIEW is null)
> and historyeve0_.CLIENT_ID='WLNCE_VNJXL7'
> and (historyeve0_.DISPLAY_TIMESTAMP between '2022-03-01 00:00:00' and
> '2023-03-01 23:59:59.999');
>
> Aggregate  (cost=56.64..56.65 rows=1 width=8) (actual time=0.930..0.931 rows=1 loops=1)
>   Buffers: shared hit=341
>   ->  Index Scan using history_event_display_timesta_prism_guid_display_timestamp_idx1 on history_event_display_timestamp_20230301 historyeve0_  (cost=0.42..56.64 rows=1 width=33) (actual time=0.034..0.897 rows=332 loops=1)
>         Index Cond: (((prism_guid)::text = 'i0accd6a2018405f095ee669dc5b4'::text) AND (display_timestamp >= '2022-03-01 00:00:00'::timestamp without time zone) AND (display_timestamp <= '2023-03-01 23:59:59.999'::timestamp without time zone) AND ((product_sid)::text = 'CARSWELL.WESTLAW'::text) AND (is_deleted = '0'::numeric) AND ((client_id)::text = 'WLNCE_VNJXL7'::text))
>         Filter: (((event_type)::text <> ALL ('{versionsSearchWithinQueryEvent,notesOfDecisionsSearchWithinQueryEvent,citingReferencesSearchWithinQueryEvent,tocSearchWithinQueryEvent,searchWithinDocumentEvent,coCitesSearchWithinQueryEvent}'::text[])) AND (((product_view)::text = ANY ('{DefaultProductView,TNPPlus,PLCUS,Indigo,IndigoPremium,INDIGOCA}'::text[])) OR (product_view IS NULL)))
>         Buffers: shared hit=341
> Planning:
>   Buffers: shared hit=6
> Planning Time: 0.266 ms
> Execution Time: 0.965 ms
> (10 rows)
>
> *** 341 logical reads to find 332 rows.
>
> Plan hash value: 3361538278
>
> -----------------------------------------------------------------------------------------------------------
> | Id  | Operation                 | Name                | Starts | E-Rows | A-Rows |   A-Time    | Buffers |
> -----------------------------------------------------------------------------------------------------------
> |   0 | SELECT STATEMENT          |                     |      1 |        |      1 | 00:00:00.01 |      20 |
> |   1 |  SORT AGGREGATE           |                     |      1 |      1 |      1 | 00:00:00.01 |      20 |
> |*  2 |   FILTER                  |                     |      1 |        |    332 | 00:00:00.01 |      20 |
> |   3 |    PARTITION RANGE SINGLE |                     |      1 |      1 |    332 | 00:00:00.01 |      20 |
> |*  4 |     INDEX RANGE SCAN      | HISTORY_EVENT_IDX02 |      1 |      1 |    332 | 00:00:00.01 |      20 |
> -----------------------------------------------------------------------------------------------------------
>
> Predicate Information (identified by operation id):
> ---------------------------------------------------
>
>    2 - filter(TO_TIMESTAMP(:SYS_B_16,:SYS_B_17)<=TO_TIMESTAMP(:SYS_B_18,:SYS_B_19))
>    4 - access("HISTORYEVE0_"."PRISM_GUID"=:SYS_B_01 AND
>               "HISTORYEVE0_"."DISPLAY_TIMESTAMP">=TO_TIMESTAMP(:SYS_B_16,:SYS_B_17) AND
>               "HISTORYEVE0_"."PRODUCT_SID"=:SYS_B_02 AND
>               "HISTORYEVE0_"."IS_DELETED"=:SYS_B_00 AND
>               "HISTORYEVE0_"."CLIENT_ID"=:SYS_B_15 AND
>               "HISTORYEVE0_"."DISPLAY_TIMESTAMP"<=TO_TIMESTAMP(:SYS_B_18,:SYS_B_19))
>        filter(("HISTORYEVE0_"."CLIENT_ID"=:SYS_B_15 AND
>               "HISTORYEVE0_"."PRODUCT_SID"=:SYS_B_02 AND
>               "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_03 AND "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_04 AND
>               "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_05 AND "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_06 AND
>               "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_07 AND "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_08 AND
>               "HISTORYEVE0_"."IS_DELETED"=:SYS_B_00 AND
>               (INTERNAL_FUNCTION("HISTORYEVE0_"."PRODUCT_VIEW") OR "HISTORYEVE0_"."PRODUCT_VIEW" IS NULL)))
>
> It finds the same 332 rows but it does only 20 logical reads.

In PostgreSQL, the 332 matching rows seem to be stored mostly on different pages, while they are better clustered in your Oracle table. If it is really very important for you,
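The reply above is cut off in the archive. One standard remedy for poor physical clustering (my suggestion here, not necessarily what the author went on to say) is to rewrite the table in index order:

```sql
-- Rewrites the table so rows are physically ordered by this index.
-- Note: CLUSTER takes an ACCESS EXCLUSIVE lock for the duration of the
-- rewrite, and the ordering decays as rows are inserted and updated,
-- so it has to be repeated periodically.
CLUSTER history_event_display_timestamp_20230301
    USING history_event_display_timesta_prism_guid_display_timestamp_idx1;
```

After clustering, the 332 matching rows would tend to share far fewer heap pages, bringing the buffer counts closer to what Oracle shows.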
Re: Query Performance
I should have given you the full query. Here it is:

Select a.locationname, a.usercode, a.itemname, a.uomname, a.batchnumber,
       a.expirydate, a.itemnamefk, a.itemuomfk, f.itemgroupname, a.locationpk,
       Sum(a.quantity) as quantity, Sum(a.freequantity) as freequantity,
       b.beforetaxrate as beforetaxrate, (b.rate / b.rateper) as purchaserate,
       b.netrate as netrate, b.netsalesrate as netsalesrate,
       b.effectiverate as effectiverate, b.rateper, a.reportformat,
       g.standardcost, g.defaultpurchaserate, g.salesrateone, g.salesratetwo,
       g.salesratethree, g.salesratefour, g.salesratefive, g.salesratesix,
       g.salesrateseven, g.salesrateeight, g.salesratenine, 'N' as negativerate
From (Select k.locationname, a.usercode, a.itemname, c.uomname, j.batchnumber,
             j.expirydate, j.itemnamefk, j.itemuomfk, k.locationpk,
             j.receivedquantity as quantity,
             j.receivedfreequantity as freequantity, c.reportformat
      From in_item_name a
      Join in_item_uom b On b.itemnamefk = a.itemnamepk
      Join gl_uom c On c.uompk = b.uomfk
      Join view_item_receipts j On j.itemnamefk = a.itemnamepk
                               And j.itemuomfk = b.itemuompk
      Join in_location k On k.locationpk = j.locationfk
      Where j.companycode = 'SDM' And j.branchcode = '001'
        And j.accountperiodid = 1 And j.voucherdate <= '2022/09/17'
        And j.billstatus <> 'C' And j.topparentcode <> 4
        And (j.receivedquantity <> 0 Or j.receivedfreequantity <> 0)
      UNION ALL
      Select k.locationname, a.usercode, a.itemname, c.uomname, j.batchnumber,
             j.expirydate, j.itemnamefk, j.itemuomfk, k.locationpk,
             (j.issuedquantity * -1) as quantity,
             (j.issuedfreequantity * -1) as freequantity, c.reportformat
      From in_item_name a
      Join in_item_uom b On b.itemnamefk = a.itemnamepk
      Join gl_uom c On c.uompk = b.uomfk
      Join view_item_issues j On j.itemnamefk = a.itemnamepk
                             And j.itemuomfk = b.itemuompk
      Join in_location k On k.locationpk = j.locationfk
      Where j.companycode = 'SDM' And j.branchcode = '001'
        And j.accountperiodid = 1 And j.voucherdate <= '2022/09/17'
        And j.billstatus <> 'C' And j.topparentcode <> 4
        And (j.issuedquantity <> 0 Or j.issuedfreequantity <> 0)) a
Left Outer Join view_item_receipts b   <- It's actually a view of 4 tables which tries to arrive at the last purchase rate
  On b.itemreceiptspk = (Select c.itemreceiptspk
                         From view_item_receipts c
                         Where c.companycode = 'SDM' And c.branchcode = '001'
                           And c.accountperiodid = 1
                           And c.voucherdate <= '2022/09/17'
                           And c.billstatus <> 'C'
                           And c.itemnamefk = a.itemnamefk
                           And c.itemuomfk = a.itemuomfk
                           And c.batchnumber = a.batchnumber
                           And c.expirydate = a.expirydate
                           And (c.receivedquantity <> 0 Or c.receivedfreequantity <> 0)
                         Order by c.voucherdate desc, c.vouchernumber desc,
                                  c.sequencenumber desc
                         Limit 1)
Join in_item_name c On c.itemnamepk = a.itemnamefk
Join in_item_group f On f.itemgrouppk = c.itemgroupfk
Left Outer Join in_item_rate g On g.itemuomfk = b.itemuomfk
                              And g.itemnamefk = b.itemnamefk
                              And '2022/09/17' between g.fromdate and g.todate
Group By a.locationname, a.usercode, a.itemname, a.uomname, a.batchnumber,
         a.expirydate, a.itemnamefk, a.itemuomfk, f.itemgroupname, a.locationpk,
         b.beforetaxrate, b.rate, b.netrate, b.netsalesrate, b.effectiverate,
         b.rateper, a.reportformat, g.standardcost, g.defaultpurchaserate,
         g.salesrateone, g.salesratetwo, g.salesratethree, g.salesratefour,
         g.salesratefive, g.salesratesix, g.salesrateseven, g.salesrateeight,
         g.salesratenine
Having (sum(a.quantity) + sum(a.freequantity)) <> 0
Order by 1, 3, 2, 5

Create an index for companycode, branchcode, accountperiodid, voucherdate, billstatus, itemnamefk, itemuomfk, batchnumber, expirydate, receivedquantity, receivedfreequantity in all the 4 tables that this view is built on.

Happiness Always
BKR Sivaprakash

On Saturday, 17 September, 2022 at 03:33:48 pm IST, Peter J. Holzer wrote:
> What is the output of `explain (analyze)` for the two queries? Is there
> an obvious place where an index would help? Can you restructure the query?
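That suggested composite index can be written like so; the table name is hypothetical, standing in for each of the four base tables behind view_item_receipts (and its counterpart behind view_item_issues, with the issued* columns instead):

```sql
-- Hypothetical base-table name; repeat for each of the four tables
-- the view is built on.
CREATE INDEX item_receipts_lookup_idx
    ON item_receipts (companycode, branchcode, accountperiodid, voucherdate,
                      billstatus, itemnamefk, itemuomfk, batchnumber,
                      expirydate, receivedquantity, receivedfreequantity);
```

Putting the equality columns (companycode, branchcode, accountperiodid) first and the range column (voucherdate) after them matches the WHERE clauses in the query, so the scan can be bounded rather than filtered.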
Re: Query Performance
On 2022-09-17 05:28:25, sivapostg...@yahoo.com wrote:
> My query is like this
>
> Select a.field1, a.field2, a.field3
> From (Select a.field1, b.field2, c.field3
>       From table1 a
>       Join table2 b on b.something = a.something
>       Join table3 c On c.something = a.something
>       Where a.field7 = 'value'
>
>       UNION ALL
>
>       Select a.field4, a.field5, a.field6
>       From table11 a
>       Join table21 b On b.something = a.something
>       Where a.field8 = 'something' ) a
> Join table10 b
>   On b.field11 = (Select c.field11
>                   From table10 c
>                   Where c.field10 = a.field1 )  <- instead of a.field1, if I hardcode value (eg. '100') query runs faster
> Join table21 c
>   On c.something = a.something
> ...

Well, you are now searching table10 for a constant value (which can be done once) instead of the output of the union (which has to be done for each line of the union), so I'm not surprised that it's faster.

What is the output of `explain (analyze)` for the two queries? Is there an obvious place where an index would help? Can you restructure the query?

BTW, it is (at least for me) very hard to give advice on a query with only completely abstract names like `table11` or `field4`: I have no idea what this is supposed to do, so it's hard to tell if there is a better way. Using `a` to refer to 3 different things doesn't help either.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | h...@hjp.at        |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
Re: Query Performance
On 9/17/22 00:28, sivapostg...@yahoo.com wrote:
> In the above query, If I substitute a value for a.field1, query runs
> faster.
>
> Any suggestion/guidance/links to improve the query performance without
> substituting the value ?

Is there an index on table1.field1?

Have you vacuumed or analyzed table1 lately?

Have you decomposed the query into the smallest possible query that demonstrates the problem?

--
Angular momentum makes the world go 'round.
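Ron's first two checks can be run directly, using the placeholder names from the thread (table1/field1 are the poster's stand-ins, not real objects):

```sql
-- Is there an index on table1.field1? List what exists.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'table1';

-- Refresh planner statistics for the table...
ANALYZE table1;

-- ...and create the index if it turns out to be missing.
CREATE INDEX ON table1 (field1);
```

Stale statistics and a missing index are the two cheapest hypotheses to eliminate before restructuring the query itself.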
Re: Query Performance
Hello,

My query is like this

Select a.field1, a.field2, a.field3
From (Select a.field1, b.field2, c.field3
      From table1 a
      Join table2 b on b.something = a.something
      Join table3 c On c.something = a.something
      Where a.field7 = 'value'

      UNION ALL

      Select a.field4, a.field5, a.field6
      From table11 a
      Join table21 b On b.something = a.something
      Where a.field8 = 'something' ) a
Join table10 b
  On b.field11 = (Select c.field11
                  From table10 c
                  Where c.field10 = a.field1 )  <- instead of a.field1, if I hardcode value (eg. '100') query runs faster
Join table21 c
  On c.something = a.something
...

In the above query, if I substitute a value for a.field1, the query runs faster.

Any suggestion/guidance/links to improve the query performance without substituting the value?

Happiness Always
BKR Sivaprakash
Re: Query performance with min and filter
"Yorwerth, Adam" writes: > We seem to have found a situation where a query run using explain analyse or > pgbench is incredibly fast, but run via Java under load performs very poorly > – we’ve checked query performance metrics for our Postgres instance and can > confirm that it’s the query running slowly, and not a Java problem. Usually when people report something like this, the explanation is that it's not actually the same query being tested in both cases. In particular, if you're parameterizing the query in Java but substituting constant values when running it manually, that can make a difference. A further-afield possibility is that your application has been set up to run with custom values of work_mem, planner cost constants, etc, which you're failing to duplicate in the manual tests. If all else fails, try installing auto_explain so that you can capture the actual query plan used when the application executes this query. regards, tom lane
Re: Query performance with min and filter
Hi Adam,

What're the query times when you run the query directly at the psql prompt, but without explain/analyze?

Can you check the cache hit rate vs disk reads on explain analyze vs from Java? Sometimes the data's in RAM when you run a query manually, but the live Java app might be hitting the disk, causing the slowdown.

Are you also sure that what's being measured for the query, when it is called from the Java app, is the query time itself and not network delays etc.?

Regards,
Karthik.

--
HoSD, In-Store Platform
Ph: +91 99406 59452

From: Yorwerth, Adam
Sent: Monday, August 31, 2020 2:57 PM
To: pgsql-general@lists.postgresql.org
Cc: Cianciaruso, Lorenzo
Subject: Query performance with min and filter

Hi Postgres community,

We seem to have found a situation where a query run using explain analyse or pgbench is incredibly fast, but run via Java under load performs very poorly – we've checked query performance metrics for our Postgres instance and can confirm that it's the query running slowly, and not a Java problem.

The table has ~7 million rows, with a total size of 44GB. As shown below, explain analyse gives a sub-1ms execution time, yet in practice the query takes an average of 2 seconds – it's worth mentioning that a small percentage of requests do take ~1ms, so something non-deterministic is happening here.

The query in question is below, with explain analyse:

explain analyse SELECT coalesce (
  (SELECT min(msg_offset) - 1 FROM events
   WHERE created_utc >= CURRENT_TIMESTAMP - INTERVAL '60 SECONDS'),
  (SELECT max(msg_offset) FROM events),
  0);

                                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=2.03..2.04 rows=1 width=8) (actual time=0.183..0.183 rows=1 loops=1)
   InitPlan 1 (returns $1)
     ->  Aggregate  (cost=1.56..1.57 rows=1 width=8) (actual time=0.028..0.028 rows=1 loops=1)
           ->  Index Only Scan using test_4 on events  (cost=0.44..1.56 rows=1 width=8) (actual time=0.023..0.023 rows=0 loops=1)
                 Index Cond: (created_utc >= (now() - '00:01:00'::interval))
                 Heap Fetches: 0
   InitPlan 3 (returns $3)
     ->  Result  (cost=0.45..0.46 rows=1 width=8) (actual time=0.153..0.153 rows=1 loops=1)
           InitPlan 2 (returns $2)
             ->  Limit  (cost=0.43..0.45 rows=1 width=8) (actual time=0.152..0.152 rows=1 loops=1)
                   ->  Index Only Scan Backward using events_pkey on events events_1  (cost=0.43..156842.57 rows=7643099 width=8) (actual time=0.151..0.151 rows=1 loops=1)
                         Index Cond: (msg_offset IS NOT NULL)
                         Heap Fetches: 0
 Planning time: 2.480 ms
 Execution time: 0.242 ms
(15 rows)

Output of \d events, with a few test indexes added to try and improve performance:

                                          Table "public.events"
    Column    |            Type             | Collation | Nullable |                  Default
--------------+-----------------------------+-----------+----------+--------------------------------------------
 msg_offset   | bigint                      |           | not null | nextval('events_msg_offset_seq'::regclass)
 msg_key      | character varying           |           | not null |
 content_type | character varying           |           | not null |
 type         | character varying           |           | not null |
 created_utc  | timestamp without time zone |           | not null |
 data         | text                        |           |          |
 event_size   | integer                     |           | not null | 1048576
 cluster_id   | bigint                      |           | not null | 1
Indexes:
    "events_pkey" PRIMARY KEY, btree (msg_offset)
    "cluster_idx" btree (msg_key, cluster_id)
    "cluster_type_filter_idx" btree (type, cluster_id)
    "created_idx" btree (created_utc, msg_key)
    "key_idx" btree (msg_key)
    "test_1" btree (created_utc, msg_offset)
    "test_2" btree (msg_offset, created_utc)
    "test_3" btree (created_utc, msg_offset)
    "test_4" btree (created_utc, msg_offset DESC)
    "type_idx" btree (type)

Any help understanding what is happening here, or guidance to improve performance, would be much appreciated.

Regards,

Adam
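Karthik's cache-vs-disk check can be made from the statistics views; this is a sketch, and the ratio is only meaningful if the table has been queried since the last stats reset:

```sql
-- Shared-buffer hits vs blocks read from disk for the events table.
SELECT heap_blks_hit,
       heap_blks_read,
       round(heap_blks_hit::numeric
             / nullif(heap_blks_hit + heap_blks_read, 0), 4) AS hit_ratio
FROM pg_statio_user_tables
WHERE relname = 'events';
```

Sampling this before and after a burst of application traffic shows whether the slow executions coincide with heap_blks_read climbing, i.e. with cold cache rather than a different plan.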
Re: query performance
On Sun, 18 Feb 2018 00:35:18 +0100 hmidi slim wrote:
> Hi,
> I have two tables: establishment, which contains these columns: id, name,
> longitude, latitude, geom (geometric column); Product contains: id, name,
> establishment_id.
>
> First of all I want to select the establishments within a radius. I run
> this query:
>
> select e.name, e1.name
> from establishment as e, establishment as e1
> where e.id <> e1.id
> and e1.id = 1
> and ST_DWithin(geom, ST_MakePoint(e1.longitude, e1.latitude)::geography, 1000)
>
> The result of this query returns all the establishments within a radius of
> 1KM from a given establishment which has id = 1.
>
> After that I want to get the product's name of each establishment from the
> query's result.
>
> Is there another, optimized way to write a query such as this:
>
> select * from (
>   select e.name, e1.name, e.id
>   from establishment as e, establishment as e1
>   where e.id <> e1.id
>   and e1.id = 1
>   and ST_DWithin(geom, ST_MakePoint(e1.longitude, e1.latitude)::geography, 1000)
> ) as tmp inner join product as p on p.establishment_id = tmp.id

Did you not see my earlier response?

--
Bill Moran
Re: query performance
On 02/18/2018 06:37 AM, David Rowley wrote:
> On 18 February 2018 at 12:35, hmidi slim wrote:
>> Is there an other optimized solution to make a query such this:
>> select * from (
>>   select e.name, e1.name, e.id
>>   from establishment as e, establishment as e1
>>   where e.id <> e1.id
>>   and e1.id = 1
>>   and ST_DWithin(geom, ST_MakePoint(e1.longitude, e1.latitude)::geography, 1000)
>> ) as tmp inner join product as p on p.establishment_id = tmp.id
>
> You really should state what is wrong with the performance of the
> above version and also include the EXPLAIN (ANALYZE, BUFFERS) of that
> query.
>
> The version of PostgreSQL that you're running it on is also a good
> thing to share.
>
> Details of the indexes which you've defined on the tables are also
> useful information. It would especially be good to know if you've
> added an index on product (establishment_id), for example.

There's actually a wiki page with an overview of what should be included in "slow query" questions:

https://wiki.postgresql.org/wiki/Slow_Query_Questions

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: query performance
On 18 February 2018 at 12:35, hmidi slim wrote:
> Is there an other optimized solution to make a query such this:
> select * from (
>   select e.name, e1.name, e.id
>   from establishment as e, establishment as e1
>   where e.id <> e1.id
>   and e1.id = 1
>   and ST_DWithin(geom, ST_MakePoint(e1.longitude, e1.latitude)::geography, 1000)
> ) as tmp inner join product as p on p.establishment_id = tmp.id

You really should state what is wrong with the performance of the above version and also include the EXPLAIN (ANALYZE, BUFFERS) of that query.

The version of PostgreSQL that you're running it on is also a good thing to share.

Details of the indexes which you've defined on the tables are also useful information. It would especially be good to know if you've added an index on product (establishment_id), for example.

--
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
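The two indexes David alludes to can be sketched like this, assuming geom is a PostGIS geometry/geography column as the query implies (names follow the poster's schema):

```sql
-- Support the join from the radius result to product
CREATE INDEX ON product (establishment_id);

-- Let ST_DWithin use a spatial index instead of checking the distance
-- against every row of establishment
CREATE INDEX ON establishment USING gist (geom);
```

With both in place, the inner radius search becomes an index scan on the GiST index and the outer join becomes an index lookup per matching establishment, rather than two sequential scans.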