Re: Query performance in 9.6.24 vs 14.10

2024-01-29 Thread Bob Jolliffe
Thanks for the update.

On Mon, Jan 29, 2024, 16:53 Ron Johnson  wrote:

> According to my tests, sometimes JIT is a little faster, and sometimes
> it's a little slower. Mostly within the realm of statistical noise
> (especially with each query having a sample size of only 13, on a VM that
> lives on a probably-busy host).


Re: Query performance in 9.6.24 vs 14.10

2024-01-29 Thread Ron Johnson
According to my tests, sometimes JIT is a little faster, and sometimes it's
a little slower. Mostly within the realm of statistical noise (especially
with each query having a sample size of only 13, on a VM that lives on a
probably-busy host).

On Mon, Jan 29, 2024 at 9:18 AM Ron Johnson  wrote:

> Yes, jit=on.
>
> I'll test them with jit=off, to see the difference.  (The application is
> 3rd party, so I'll change it at the system level.)


Re: Query performance in 9.6.24 vs 14.10

2024-01-29 Thread Ron Johnson
Yes, jit=on.

I'll test them with jit=off, to see the difference.  (The application is
3rd party, so I'll change it at the system level.)
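Changing it at the system level, without touching the third-party application, can be sketched as follows (these are standard PostgreSQL commands; `jit` is reloadable, so no restart is needed):

```sql
ALTER SYSTEM SET jit = off;   -- written to postgresql.auto.conf
SELECT pg_reload_conf();      -- pick up the change in running sessions
SHOW jit;                     -- verify: should now report "off"
```

Per-database (`ALTER DATABASE ... SET jit = off`) or per-role settings are narrower alternatives if only one application is affected.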

On Mon, Jan 29, 2024 at 7:09 AM Bob Jolliffe  wrote:

> Out of curiosity, is the pg14 running with the default jit=on setting?
>
> This is obviously entirely due to the nature of the particular queries
> themselves, but we found that for our workloads that pg versions
> greater than 11 were exacting a huge cost due to the jit compiler.  Once we
> explicitly turned jit=off we started to see improvements.
>


Re: Query performance in 9.6.24 vs 14.10

2024-01-29 Thread Bob Jolliffe
Out of curiosity, is the pg14 running with the default jit=on setting?

This is obviously entirely due to the nature of the particular queries
themselves, but we found that, for our workloads, pg versions
greater than 11 were exacting a huge cost due to the jit compiler.  Once we
explicitly turned jit=off we started to see improvements.



Re: Query performance in 9.6.24 vs 14.10

2024-01-28 Thread Ron Johnson
On Sun, Jan 28, 2024 at 10:44 PM David Rowley  wrote:

> On Mon, 29 Jan 2024 at 07:37, Ron Johnson  wrote:
>
>> 08 9.6.24 1,142.164 1,160.801 1,103.716 1,249.852 1,191.081
>> 14.10 159.354 155.111 155.111 162.797 158.157 86.72%
>>
>
> Your speedup per cent calculation undersells PG14 by quite a bit.  I'd
> call that an increase of ~639% rather than 86.72%.
>
> I think you've done "1 - sum( <14.10 numbers> ) / sum( <9.6.24 numbers>)"
> whereas I think you should have done "sum( <9.6.24 numbers>) / sum( <14.10
> numbers> ) - 1"
>
> Nonetheless, thanks for testing this out.  I assume this is just a report
> giving good feedback about progress in this area...?
>

The spreadsheet function, using the Median cells, is (PG9.6 - PG14) /
PG9.6.  That's essentially the same as what you wrote.

158.157 / 1191.081 = 0.13278

1191.081 / 158.157 = 7.53, so 9.6.24 on that query is 7.53x slower.


Re: Query performance in 9.6.24 vs 14.10

2024-01-28 Thread David Rowley
On Mon, 29 Jan 2024 at 07:37, Ron Johnson  wrote:

> 08 9.6.24 1,142.164 1,160.801 1,103.716 1,249.852 1,191.081
> 14.10 159.354 155.111 155.111 162.797 158.157 86.72%
>

Your speedup per cent calculation undersells PG14 by quite a bit.  I'd call
that an increase of ~639% rather than 86.72%.

I think you've done "1 - sum( <14.10 numbers> ) / sum( <9.6.24 numbers>)"
whereas I think you should have done "sum( <9.6.24 numbers>) / sum( <14.10
numbers> ) - 1"
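The two formulas, evaluated on the thread's numbers, reproduce both figures quoted above (Ron's 86.72% comes from the median columns; David's ~639% from the sums of the five timings):

```sql
-- 1 - new/old  = fraction of runtime eliminated (what the spreadsheet shows)
SELECT round(100 * (1 - 158.157 / 1191.081), 2);   -- ≈ 86.72 (% of runtime saved)

-- old/new - 1  = how much faster the new version is
SELECT round(100 * (5847.614 / 790.530 - 1), 2);   -- ≈ 639.7 (% faster, using sums)
```

Both describe the same measurements; the second form is the one that reads naturally as a "speedup".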

Nonetheless, thanks for testing this out.  I assume this is just a report
giving good feedback about progress in this area...?

David


RE: [EXT] Re: Query performance going from Oracle to Postgres

2023-09-13 Thread Dirschel, Steve
Thanks to all who replied on this issue.  I overlooked the difference in the
execution plan between Oracle and Postgres.  Oracle in fact does an index range
scan only, with no table lookup.  When I changed the query to a count(*),
Postgres went from 332 logical reads to 19; Oracle did 20.  Oracle did 20
regardless of whether it was count(*) or count(non-indexed-column).

Regards
Steve



Re: Query performance going from Oracle to Postgres

2023-09-07 Thread Peter Geoghegan
On Thu, Sep 7, 2023 at 3:48 AM David Rowley  wrote:
> On Thu, 7 Sept 2023 at 19:17, Peter Geoghegan  wrote:
> > It seems likely that the problem here is that some of the predicates
> > appear as so-called "Filter:" conditions, as opposed to true index
> > quals.
>
> hmm, if that were true we'd see "Rows Removed by Filter" in the
> explain analyze.

That's why I hedged, with "seems likely". The problem with using
filter conditions rather than true index quals isn't limited to the
problem of extra heap accesses. It happened to be convenient to make
my point that way, but that isn't particularly fundamental here.

I deliberately chose to make my example involve an index-only scan
(that doesn't require any heap accesses) for this reason.

> I think all that's going on is that each tuple is on a different page
> and the heap accesses are just causing many buffers to be accessed.

This index is an absolute monstrosity. I find it far easier to believe
that the real explanation is the one that Steve intuited: that there
is an issue with the way that the physical data structures (which are
more or less comparable in both systems) are accessed in Postgres.

The index in question
("history_event_display_timesta_prism_guid_display_timestamp_idx1")
has certain columns that are omitted from the query. These columns
nevertheless appear between other columns that the query filters on.
The leading two columns ("prism_guid" and "display_timestamp") are
made into index quals by Postgres, but the next index column after
those two ("unique_lookup_key") is omitted by the query, and so isn't an index
qual. In fact *four* columns are omitted after that one. But, the very-low-order
index column "product_sid" *does* appear in the query, and so also
appears as Postgres index quals.

There is every chance that the true underlying explanation is that
Oracle is able to skip over significant parts of the index structure
dynamically. In fact I'm practically certain that that's the case,
since the "product_sid" column appears as an "access predicate", rather
than as a "filter predicate". These terms are explained here:

https://use-the-index-luke.com/sql/explain-plan/oracle/filter-predicates

https://use-the-index-luke.com/sql/explain-plan/postgresql/filter-predicates

How could "product_sid" be used as an "access predicate" given the
omitted index columns? It seems very likely that parts of the index
can be skipped in Oracle, but not in Postgres -- at least not yet.

Like Markus Winand, I think that it's a real problem that EXPLAIN
doesn't yet expose the difference between access predicates and filter
predicates. Some of the index quals shown by EXPLAIN are marked
SK_BT_REQFWD/SK_BT_REQBKWD by nbtree (meaning that they're what Oracle
calls "access predicates"), while others are not (meaning that they're
what Oracle calls "filter predicates"). That issue makes it far harder to spot
these sorts of problems.

> It seems to me that an IOS would likely fix that problem which is why I
> suggested count(*) instead of count()

Probably, but why should we have to use an index-only scan? And what
about my example, which already used one, and still showed a big
disparity where there was no inherent reason why there had to be? My
example didn't involve jumping to another part of the index because the
problem seems to be more general than that.

--
Peter Geoghegan




Re: Query performance going from Oracle to Postgres

2023-09-07 Thread David Rowley
On Thu, 7 Sept 2023 at 19:17, Peter Geoghegan  wrote:
> It seems likely that the problem here is that some of the predicates
> appear as so-called "Filter:" conditions, as opposed to true index
> quals.

hmm, if that were true we'd see "Rows Removed by Filter" in the
explain analyze.

I think all that's going on is that each tuple is on a different page
and the heap accesses are just causing many buffers to be accessed. It
seems to me that an IOS would likely fix that problem, which is why I
suggested count(*) instead of count()

David




Re: Query performance going from Oracle to Postgres

2023-09-06 Thread Peter Geoghegan
On Wed, Sep 6, 2023 at 1:07 PM Dirschel, Steve
 wrote:
> Oracle will find the same 332 rows using the same index but in Oracle it only 
> does 20 logical reads.  I thought maybe the index was fragmented so I 
> reindexed that index:

It seems likely that the problem here is that some of the predicates
appear as so-called "Filter:" conditions, as opposed to true index
quals. For reasons that aren't particularly fundamental, Postgres
B-Trees cannot push down != (or <>) to the index level. Strangely
enough, this is possible with IS NOT NULL. I'm working to improve
things in this area.

That difference is easy to see in the following example. The first
query does far fewer buffer accesses than the second query, despite
being very similar:

regression=# explain (analyze, buffers) select * from multi_test where
a = 1 and b is not null;
                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using multi_test_idx on multi_test  (cost=0.29..50.10
rows=2191 width=8) (actual time=0.095..0.100 rows=14 loops=1)
   Index Cond: ((a = 1) AND (b IS NOT NULL))
   Heap Fetches: 0
   Buffers: shared hit=3
 Planning Time: 0.056 ms
 Execution Time: 0.145 ms
(6 rows)

regression=# explain (analyze, buffers) select * from multi_test where
a = 1 and b != 42;
                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using multi_test_idx on multi_test
(cost=0.29..222.57 rows=2191 width=8) (actual time=0.087..2.982
rows=14 loops=1)
   Index Cond: (a = 1)
   Filter: (b <> 42)
   Rows Removed by Filter: 1
   Heap Fetches: 0
   Buffers: shared hit=11
 Planning Time: 0.076 ms
 Execution Time: 3.204 ms
(8 rows)

(There are lots of index tuples matching (a, b) = "(1, NULL)" here, you see.)
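The shape of this example can be reproduced with a setup along these lines. It is a guess at the data distribution, inferred only from the plans above (many `(1, NULL)` index entries next to a handful of non-NULL ones); exact row counts will differ from Peter's:

```sql
-- Hypothetical reconstruction of multi_test from the plans above.
CREATE TABLE multi_test (a int, b int);
INSERT INTO multi_test
SELECT 1, NULL FROM generate_series(1, 2000)   -- lots of (1, NULL) entries
UNION ALL
SELECT 1, g FROM generate_series(1, 14) g;     -- a few non-NULL b values
CREATE INDEX multi_test_idx ON multi_test (a, b);
VACUUM ANALYZE multi_test;  -- enable index-only scans, refresh statistics
-- "b IS NOT NULL" becomes an index qual and skips the NULL entries;
-- "b != 42" is only a filter, applied after visiting them.
```

Running the two EXPLAIN (ANALYZE, BUFFERS) queries from the message against this table should show the same pattern: far fewer buffer hits for the IS NOT NULL form.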

-- 
Peter Geoghegan




Re: Query performance going from Oracle to Postgres

2023-09-06 Thread David Rowley
On Thu, 7 Sept 2023 at 11:14, Dirschel, Steve
 wrote:
> select count(historyeve0_.HISTORY_EVENT_SID) as col_0_0_ from 
> hist28.history_event_display_timestamp_20230301 historyeve0_ where 
> historyeve0_.IS_DELETED=0

> history_event_sid  | character varying(32)  |   | not null |

Do you get an Index Only Scan if you use count(*) instead of
count(historyeve0_.HISTORY_EVENT_SID)?

It seems that column is the only one missing from the index and that
might be the only thing that's stopping PostgreSQL from using an Index
Only Scan. Perhaps Oracle realises that the column is NOT NULL, so it
can do count(*) instead of counting the non-nulls from that column.
That may allow it to use an Index Only Scan. I'm not versed in reading
Oracle plans, so I don't know if it's possible to tell.
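The count(*) versus count(column) distinction in one line: count(column) skips NULLs, so on a NOT NULL column the two always agree, which is why the rewrite is safe here:

```sql
SELECT count(*) AS all_rows,      -- counts every row: 3
       count(x) AS non_null_x     -- counts non-NULL x values: 2
FROM   (VALUES (1), (NULL::int), (3)) AS t(x);
```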

David




Re: Query performance going from Oracle to Postgres

2023-09-06 Thread Laurenz Albe
On Wed, 2023-09-06 at 20:06 +, Dirschel, Steve wrote:
> We are in the process of converting from Oracle to Postgres and I have a 
> query that is using
> the same index in Postgres as is used in Oracle but in Postgres the query 
> does 16x more
> buffer/logical reads.  I’d like to understand why.
>  
> explain (analyze, buffers)
> select count(historyeve0_.HISTORY_EVENT_SID) as col_0_0_
> from hist28.history_event_display_timestamp_20230301 historyeve0_ where 
> historyeve0_.IS_DELETED=0
> and historyeve0_.PRISM_GUID='i0accd6a2018405f095ee669dc5b4'
> and historyeve0_.PRODUCT_SID='CARSWELL.WESTLAW'
> and (historyeve0_.EVENT_TYPE not in 
> ('versionsSearchWithinQueryEvent','notesOfDecisionsSearchWithinQueryEvent','citingReferencesSearchWithinQueryEvent','tocSearchWithinQueryEvent','searchWithinDocumentEvent','coCitesSearchWithinQueryE
> vent'))
> and (historyeve0_.PRODUCT_VIEW in 
> ('DefaultProductView','TNPPlus','PLCUS','Indigo','IndigoPremium','INDIGOCA')
> or historyeve0_.PRODUCT_VIEW is null)
> and historyeve0_.CLIENT_ID='WLNCE_VNJXL7'
> and (historyeve0_.DISPLAY_TIMESTAMP between '2022-03-01 00:00:00' and 
> '2023-03-01 23:59:59.999');
>  
> Aggregate  (cost=56.64..56.65 rows=1 width=8) (actual time=0.930..0.931 rows=1 loops=1)
>    Buffers: shared hit=341
>    ->  Index Scan using history_event_display_timesta_prism_guid_display_timestamp_idx1 on history_event_display_timestamp_20230301 historyeve0_  (cost=0.42..56.64 rows=1 width=33) (actual time=0.034..0.897 rows=332 loops=1)
>          Index Cond: (((prism_guid)::text = 'i0accd6a2018405f095ee669dc5b4'::text) AND (display_timestamp >= '2022-03-01 00:00:00'::timestamp without time zone) AND (display_timestamp <= '2023-03-01 23:59:59.999'::timestamp without time zone) AND ((product_sid)::text = 'CARSWELL.WESTLAW'::text) AND (is_deleted = '0'::numeric) AND ((client_id)::text = 'WLNCE_VNJXL7'::text))
>          Filter: (((event_type)::text <> ALL ('{versionsSearchWithinQueryEvent,notesOfDecisionsSearchWithinQueryEvent,citingReferencesSearchWithinQueryEvent,tocSearchWithinQueryEvent,searchWithinDocumentEvent,coCitesSearchWithinQueryEvent}'::text[])) AND (((product_view)::text = ANY ('{DefaultProductView,TNPPlus,PLCUS,Indigo,IndigoPremium,INDIGOCA}'::text[])) OR (product_view IS NULL)))
>          Buffers: shared hit=341
> Planning:
>    Buffers: shared hit=6
> Planning Time: 0.266 ms
> Execution Time: 0.965 ms
> (10 rows)
>  
> *** 341 logical reads to find 332 rows.
>  
> Plan hash value: 3361538278
>  
> ------------------------------------------------------------------------------------------------------------
> | Id  | Operation                 | Name                | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
> ------------------------------------------------------------------------------------------------------------
> |   0 | SELECT STATEMENT          |                     |      1 |        |      1 |00:00:00.01 |      20 |
> |   1 |  SORT AGGREGATE           |                     |      1 |      1 |      1 |00:00:00.01 |      20 |
> |*  2 |   FILTER                  |                     |      1 |        |    332 |00:00:00.01 |      20 |
> |   3 |    PARTITION RANGE SINGLE |                     |      1 |      1 |    332 |00:00:00.01 |      20 |
> |*  4 |     INDEX RANGE SCAN      | HISTORY_EVENT_IDX02 |      1 |      1 |    332 |00:00:00.01 |      20 |
> ------------------------------------------------------------------------------------------------------------
>  
> Predicate Information (identified by operation id):
> ---
>  
>    2 - 
> filter(TO_TIMESTAMP(:SYS_B_16,:SYS_B_17)<=TO_TIMESTAMP(:SYS_B_18,:SYS_B_19))
>    4 - access("HISTORYEVE0_"."PRISM_GUID"=:SYS_B_01 AND
>   
> "HISTORYEVE0_"."DISPLAY_TIMESTAMP">=TO_TIMESTAMP(:SYS_B_16,:SYS_B_17) AND
>   "HISTORYEVE0_"."PRODUCT_SID"=:SYS_B_02 AND 
> "HISTORYEVE0_"."IS_DELETED"=:SYS_B_00 AND
>   "HISTORYEVE0_"."CLIENT_ID"=:SYS_B_15 AND 
> "HISTORYEVE0_"."DISPLAY_TIMESTAMP"<=TO_TIMESTAMP(:SYS_B_1
>   8,:SYS_B_19))
>    filter(("HISTORYEVE0_"."CLIENT_ID"=:SYS_B_15 AND 
> "HISTORYEVE0_"."PRODUCT_SID"=:SYS_B_02
>   AND "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_03 AND 
> "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_04 AND
>   "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_05 AND 
> "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_06 AND
>   "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_07 AND 
> "HISTORYEVE0_"."EVENT_TYPE"<>:SYS_B_08 AND
>   "HISTORYEVE0_"."IS_DELETED"=:SYS_B_00 AND 
> (INTERNAL_FUNCTION("HISTORYEVE0_"."PRODUCT_VIEW") OR
>   "HISTORYEVE0_"."PRODUCT_VIEW" IS NULL)))
>  
> It finds the same 332 rows but it does only 20 logical reads.

In PostgreSQL, the 332 matching rows seem to be stored mostly on different pages, while they are
better clustered in your Oracle table.

If it is really very important for you, 
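The message breaks off above. For reference, a sketch of how one might inspect and change physical row ordering in PostgreSQL, assuming the truncated advice was heading toward clustering (note that `CLUSTER` takes an exclusive lock while it rewrites the table, and the ordering decays as rows change):

```sql
-- How well is the table physically ordered by the index's leading column?
-- (correlation near 1 or -1 = well clustered; near 0 = scattered)
SELECT attname, correlation
FROM   pg_stats
WHERE  schemaname = 'hist28'
AND    tablename  = 'history_event_display_timestamp_20230301'
AND    attname    = 'prism_guid';

-- Rewrite the table in index order:
CLUSTER hist28.history_event_display_timestamp_20230301
  USING history_event_display_timesta_prism_guid_display_timestamp_idx1;
ANALYZE hist28.history_event_display_timestamp_20230301;
```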

Re: Query Performance

2022-09-17 Thread sivapostg...@yahoo.com
I should have given you the full query.  Here it is:

Select a.locationname, a.usercode, a.itemname, a.uomname, a.batchnumber, a.expirydate,
       a.itemnamefk, a.itemuomfk, f.itemgroupname, a.locationpk,
       Sum(a.quantity) as quantity, Sum(a.freequantity) as freequantity,
       b.beforetaxrate as beforetaxrate, (b.rate / b.rateper) as purchaserate, b.netrate as netrate,
       b.netsalesrate as netsalesrate, b.effectiverate as effectiverate, b.rateper, a.reportformat,
       g.standardcost, g.defaultpurchaserate,
       g.salesrateone, g.salesratetwo, g.salesratethree, g.salesratefour, g.salesratefive, g.salesratesix,
       g.salesrateseven, g.salesrateeight, g.salesratenine, 'N' as negativerate
From   (Select k.locationname, a.usercode, a.itemname, c.uomname, j.batchnumber, j.expirydate,
               j.itemnamefk, j.itemuomfk, k.locationpk,
               j.receivedquantity as quantity, j.receivedfreequantity as freequantity, c.reportformat
        From   in_item_name a
        Join   in_item_uom b        On b.itemnamefk = a.itemnamepk
        Join   gl_uom c             On c.uompk = b.uomfk
        Join   view_item_receipts j On j.itemnamefk = a.itemnamepk And j.itemuomfk = b.itemuompk
        Join   in_location k        On k.locationpk = j.locationfk
        Where  j.companycode = 'SDM'
        And    j.branchcode = '001'
        And    j.accountperiodid = 1
        And    j.voucherdate <= '2022/09/17'
        And    j.billstatus <> 'C'
        And    j.topparentcode <> 4
        And    (j.receivedquantity <> 0 Or j.receivedfreequantity <> 0)

        UNION ALL

        Select k.locationname, a.usercode, a.itemname, c.uomname, j.batchnumber, j.expirydate,
               j.itemnamefk, j.itemuomfk, k.locationpk,
               (j.issuedquantity * -1) as quantity, (j.issuedfreequantity * -1) as freequantity, c.reportformat
        From   in_item_name a
        Join   in_item_uom b        On b.itemnamefk = a.itemnamepk
        Join   gl_uom c             On c.uompk = b.uomfk
        Join   view_item_issues j   On j.itemnamefk = a.itemnamepk And j.itemuomfk = b.itemuompk
        Join   in_location k        On k.locationpk = j.locationfk
        Where  j.companycode = 'SDM'
        And    j.branchcode = '001'
        And    j.accountperiodid = 1
        And    j.voucherdate <= '2022/09/17'
        And    j.billstatus <> 'C'
        And    j.topparentcode <> 4
        And    (j.issuedquantity <> 0 Or j.issuedfreequantity <> 0)) a
Left Outer Join view_item_receipts b    <- It's actually a view of 4 tables which tries to arrive at the last purchase rate
On     b.itemreceiptspk = (Select c.itemreceiptspk
                           From  view_item_receipts c
                           Where c.companycode = 'SDM'
                           And   c.branchcode = '001'
                           And   c.accountperiodid = 1
                           And   c.voucherdate <= '2022/09/17'
                           And   c.billstatus <> 'C'
                           And   c.itemnamefk = a.itemnamefk
                           And   c.itemuomfk = a.itemuomfk
                           And   c.batchnumber = a.batchnumber
                           And   c.expirydate = a.expirydate
                           And   (c.receivedquantity <> 0 Or c.receivedfreequantity <> 0)
                           Order by c.voucherdate desc, c.vouchernumber desc, c.sequencenumber desc
                           Limit 1)
Join   in_item_name c  On c.itemnamepk = a.itemnamefk
Join   in_item_group f On f.itemgrouppk = c.itemgroupfk
Left Outer Join in_item_rate g
On     g.itemuomfk = b.itemuomfk
And    g.itemnamefk = b.itemnamefk
And    '2022/09/17' between g.fromdate and g.todate
Group By a.locationname, a.usercode, a.itemname, a.uomname, a.batchnumber, a.expirydate,
         a.itemnamefk, a.itemuomfk, f.itemgroupname, a.locationpk,
         b.beforetaxrate, b.rate, b.netrate, b.netsalesrate, b.effectiverate, b.rateper, a.reportformat,
         g.standardcost, g.defaultpurchaserate,
         g.salesrateone, g.salesratetwo, g.salesratethree, g.salesratefour, g.salesratefive, g.salesratesix,
         g.salesrateseven, g.salesrateeight, g.salesratenine
Having   (sum(a.quantity) + sum(a.freequantity)) <> 0
Order by 1, 3, 2, 5

Create an index on companycode, branchcode, accountperiodid, voucherdate,
billstatus, itemnamefk, itemuomfk, batchnumber, expirydate, receivedquantity,
receivedfreequantity
in all 4 tables that this view is built from.
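A sketch of that index on one of the view's base tables (the table name here is a placeholder, since the view's real base tables aren't named in the thread; the column list follows the suggestion above):

```sql
-- Hypothetical base-table name; repeat for each of the view's 4 tables.
CREATE INDEX idx_item_receipts_lookup
    ON item_receipts_base (companycode, branchcode, accountperiodid,
                           voucherdate, billstatus, itemnamefk, itemuomfk,
                           batchnumber, expirydate);
-- The two quantity columns could instead be carried as INCLUDE columns
-- (PostgreSQL 11+) so their <> 0 filter can be checked from the index:
--   ... INCLUDE (receivedquantity, receivedfreequantity);
```

Whether an eleven-column key index pays off should be verified with EXPLAIN (ANALYZE, BUFFERS) before and after.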
Happiness Always
BKR Sivaprakash
On Saturday, 17 September, 2022 at 03:33:48 pm IST, Peter J. Holzer 
 wrote:  
 
 On 2022-09-17 05:28:25 +, sivapostg...@yahoo.com wrote:
> My query is like this  
> 
> Select a.field1, a.field2, a.field3
> From  (Select a.field1, b.field2, c.field3
>        From  table1 a
>       

Re: Query Performance

2022-09-17 Thread Peter J. Holzer
On 2022-09-17 05:28:25 +, sivapostg...@yahoo.com wrote:
> My query is like this   
> 
> Select a.field1, a.field2, a.field3
> From   (Select a.field1, b.field2, c.field3
> From   table1 a
> Join   table2 b
> on b.something = a.something
> Join   table3 c
> On c.something = a.something 
> Where  a.field7 = 'value'
> 
> UNION ALL
> 
> Select a.field4, a.field5, a.field6
> From   table11 a
> Join   table21 b
> On b.something = a.something
> Where  a.field8 = 'something' ) a
> Join   table10 b
> On b.field11 = (Select c.field11
> From   table10 c
> Where  c.field10 = a.field1 ) <-
> instead of a.field1, if I hardcode value (eg. '100') query runs faster
> Join   table21 c
> On c.something = a.something
> ...

Well, you are now searching table10 for a constant value (which can be
done once) instead of the output of the union (which has to be done for
each row of the union), so I'm not surprised that it's faster.

What is the output of `explain (analyze)` for the two queries? Is there
an obvious place where an index would help? Can you restructure the
query?
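If restructuring is an option, one common rewrite is to turn the correlated scalar subquery in the join condition into a LATERAL subquery. This is only a sketch using the thread's abstract names (whether it is equivalent and whether it helps depends on the real schema and data):

```sql
-- The scalar subquery runs once per row of the union; LATERAL makes the
-- per-row lookup explicit and often gives the planner more options.
SELECT a.field1, a.field2, a.field3
FROM   ( /* the UNION ALL derived table from the original query */
         SELECT 100 AS field1, 2 AS field2, 3 AS field3 ) a
CROSS JOIN LATERAL (
    SELECT c.field11
    FROM   table10 c
    WHERE  c.field10 = a.field1
    LIMIT  1             -- a scalar subquery may return at most one row;
) s                      -- add ORDER BY here to make the pick deterministic
JOIN   table10 b ON b.field11 = s.field11;
```

An index on table10 (field10, field11) would let each per-row lookup be a cheap index probe either way.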

BTW, it is (at least for me) very hard to give advice on a query with
only completely abstract names like `table11` or `field4`: I have no
idea what this is supposed to do, so it's hard to tell if there is a
better way. Using `a` to refer to 3 different things doesn't help
either.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"




Re: Query Performance

2022-09-17 Thread Ron

On 9/17/22 00:28, sivapostg...@yahoo.com wrote:

Hello,

My query is like this

Select a.field1, a.field2, a.field3
From   (Select a.field1, b.field2, c.field3
        From   table1 a
        Join   table2 b
        on     b.something = a.something
        Join   table3 c
        On     c.something = a.something
        Where  a.field7 = 'value'

        UNION ALL

        Select a.field4, a.field5, a.field6
        From   table11 a
        Join   table21 b
        On     b.something = a.something
        Where  a.field8 = 'something' ) a
Join   table10 b
On     b.field11 = (Select c.field11
                    From   table10 c
                    Where  c.field10 = a.field1 )                <- 
instead of a.field1, if I hardcode value (eg. '100') query runs faster

Join   table21 c
On     c.something = a.something
...


In the above query, If I substitute a value for a.field1, query runs faster.
Any suggestion/guidance/links to improve the query performance without 
substituting the value ?


Is there an index on table1.field1?
Have you vacuumed or analyzed table1 lately?
Have you decomposed the query into the smallest possible query that 
demonstrates the problem?



--
Angular momentum makes the world go 'round.

Re: Query Performance

2022-09-16 Thread sivapostg...@yahoo.com
Hello,
My query is like this:

Select a.field1, a.field2, a.field3
From   (Select a.field1, b.field2, c.field3
        From   table1 a
        Join   table2 b
        on     b.something = a.something
        Join   table3 c
        On     c.something = a.something
        Where  a.field7 = 'value'

        UNION ALL

        Select a.field4, a.field5, a.field6
        From   table11 a
        Join   table21 b
        On     b.something = a.something
        Where  a.field8 = 'something' ) a
Join   table10 b
On     b.field11 = (Select c.field11
                    From   table10 c
                    Where  c.field10 = a.field1 )    <- instead of a.field1, if I hardcode value (eg. '100') query runs faster
Join   table21 c
On     c.something = a.something
...

In the above query, if I substitute a value for a.field1, the query runs faster.
Any suggestion/guidance/links to improve the query performance without
substituting the value?
Happiness Always
BKR Sivaprakash



Re: Query performance with min and filter

2020-08-31 Thread Tom Lane
"Yorwerth, Adam"  writes:
> We seem to have found a situation where a query run using explain analyse or 
> pgbench is incredibly fast, but run via Java under load performs very poorly 
> – we’ve checked query performance metrics for our Postgres instance and can 
> confirm that it’s the query running slowly, and not a Java problem.

Usually when people report something like this, the explanation is that
it's not actually the same query being tested in both cases.

In particular, if you're parameterizing the query in Java but substituting
constant values when running it manually, that can make a difference.

A further-afield possibility is that your application has been set up to
run with custom values of work_mem, planner cost constants, etc, which
you're failing to duplicate in the manual tests.

If all else fails, try installing auto_explain so that you can capture
the actual query plan used when the application executes this query.
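An auto_explain setup for this can be sketched as follows (the GUC names are the standard ones; the threshold values are examples to be tuned):

```sql
-- Log the actual plan of any statement slower than 500 ms in this session:
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '500ms';
SET auto_explain.log_analyze = on;    -- include actual timings/row counts
SET auto_explain.log_buffers = on;    -- include buffer usage

-- To capture the application's sessions cluster-wide, set instead in
-- postgresql.conf (then reload):
--   session_preload_libraries = 'auto_explain'
--   auto_explain.log_min_duration = '500ms'
```

Note that log_analyze adds measurement overhead to every statement, so it is usually enabled only while diagnosing.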

regards, tom lane




Re: Query performance with min and filter

2020-08-31 Thread Sengottuvelusamy, Karthikeyan
Hi Adam,

What are the query times when you run the query directly at the psql prompt, but
without explain/analyze?

Can you check the cache hit rate vs disk reads for the explain analyze run vs
the Java run? Sometimes the data is in RAM when you run a query manually, but
the live Java app might be hitting the disk, causing the slowdown. Are you also
sure that what's being measured when the query is called from the Java app is
the query execution itself, and not network delays etc.?

Regards,
Karthik.
--
HoSD, In-Store Platform
Ph: +91 99406 59452


From: Yorwerth, Adam 
Sent: Monday, August 31, 2020 2:57 PM
To: pgsql-general@lists.postgresql.org 
Cc: Cianciaruso, Lorenzo 
Subject: Query performance with min and filter


Hi Postgres community,



We seem to have found a situation where a query run using explain analyse or 
pgbench is incredibly fast, but run via Java under load performs very poorly – 
we’ve checked query performance metrics for our Postgres instance and can 
confirm that it’s the query running slowly, and not a Java problem.



The table has ~7 million rows, with a total size of 44GB. As shown below, explain
analyse gives a sub-1ms execution time, yet in practice the query takes an
average of 2 seconds – it's worth mentioning that a small percentage of
requests do take ~1ms, so something non-deterministic is happening here.



The query in question is below, with explain analyse:



explain analyse SELECT coalesce (
    (SELECT min(msg_offset) - 1 FROM events
     WHERE created_utc >= CURRENT_TIMESTAMP - INTERVAL '60 SECONDS'),
    (SELECT max(msg_offset) FROM events),
    0
);


                                                                    QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=2.03..2.04 rows=1 width=8) (actual time=0.183..0.183 rows=1 loops=1)
   InitPlan 1 (returns $1)
     ->  Aggregate  (cost=1.56..1.57 rows=1 width=8) (actual time=0.028..0.028 rows=1 loops=1)
           ->  Index Only Scan using test_4 on events  (cost=0.44..1.56 rows=1 width=8) (actual time=0.023..0.023 rows=0 loops=1)
                 Index Cond: (created_utc >= (now() - '00:01:00'::interval))
                 Heap Fetches: 0
   InitPlan 3 (returns $3)
     ->  Result  (cost=0.45..0.46 rows=1 width=8) (actual time=0.153..0.153 rows=1 loops=1)
           InitPlan 2 (returns $2)
             ->  Limit  (cost=0.43..0.45 rows=1 width=8) (actual time=0.152..0.152 rows=1 loops=1)
                   ->  Index Only Scan Backward using events_pkey on events events_1  (cost=0.43..156842.57 rows=7643099 width=8) (actual time=0.151..0.151 rows=1 loops=1)
                         Index Cond: (msg_offset IS NOT NULL)
                         Heap Fetches: 0
 Planning time: 2.480 ms
 Execution time: 0.242 ms
(15 rows)



Output of \d events, with a few test indexes added to try and improve 
performance:



                                         Table "public.events"
    Column    |            Type             | Collation | Nullable |                  Default
--------------+-----------------------------+-----------+----------+--------------------------------------------
 msg_offset   | bigint                      |           | not null | nextval('events_msg_offset_seq'::regclass)
 msg_key      | character varying           |           | not null |
 content_type | character varying           |           | not null |
 type         | character varying           |           | not null |
 created_utc  | timestamp without time zone |           | not null |
 data         | text                        |           |          |
 event_size   | integer                     |           | not null | 1048576
 cluster_id   | bigint                      |           | not null | 1
Indexes:
    "events_pkey" PRIMARY KEY, btree (msg_offset)
    "cluster_idx" btree (msg_key, cluster_id)
    "cluster_type_filter_idx" btree (type, cluster_id)
    "created_idx" btree (created_utc, msg_key)
    "key_idx" btree (msg_key)
    "test_1" btree (created_utc, msg_offset)
    "test_2" btree (msg_offset, created_utc)
    "test_3" btree (created_utc, msg_offset)
    "test_4" btree (created_utc, msg_offset DESC)
    "type_idx" btree (type)


Any help understanding what is happening here, or guidance to improve 
performance would be much appreciated.



Regards,



Adam



This is a confidential email. Tesco may monitor and record all emails. The 
views expressed in this email are those of the sender and not Tesco. Tesco 
Stores Limited Company Number: 519500 Registered in England Registered Office: 
Tesco House, Shire Park, Kestrel Way, Welwyn Garden City, AL7 1GA VAT 
Registration Number: GB 220 4302 31


Re: query performance

2018-02-20 Thread PT
On Sun, 18 Feb 2018 00:35:18 +0100
hmidi slim  wrote:

> Hi,
> I have two tables: establishment which contains these columns: id, name,
> longitude, latitude, geom (Geometric column)
> Product contains: id, name, establishment_id
> First of all I want to select the establishment within a radius.
> I run this query:
> select e.name, e1.name
> from establishment as e, establishment as e1
> where e.id <> e1.id
> and e1.id = 1
> and ST_DWithin(geom, ST_MakePoint(e1.longitude, e1.latitude)::geography,
> 1000)
> 
> The result of this query returns all the establishments within a 1KM radius
> of a given establishment which has an id = 1.
> 
> After that I want to get the product's name of each establishment from the
> query's result.
> 
> Is there another, more optimized way to write a query such as this:
> select * from (
> select e.name, e1.name, e.id
> from establishment as e, establishment as e1
> where e.id <> e1.id
> and e1.id = 1
> and ST_DWithin(geom, ST_MakePoint(e1.longitude, e1.latitude)::geography,
> 1000)) as tmp inner join product as p on p.establishment_id = tmp.id

Did you not see my earlier response?

-- 
Bill Moran



Re: query performance

2018-02-18 Thread Tomas Vondra

On 02/18/2018 06:37 AM, David Rowley wrote:
> On 18 February 2018 at 12:35, hmidi slim  wrote:
>> Is there another, more optimized way to write a query such as this:
>> select * from (
>> select e.name, e1.name, e.id
>> from establishment as e, establishment as e1
>> where e.id <> e1.id
>> and e1.id = 1
>> and ST_DWithin(geom, ST_MakePoint(e1.longitude, e1.latitude)::geography,
>> 1000)) as tmp inner join product as p on p.establishment_id = tmp.id
> 
> You really should state what is wrong with the performance of the
> above version and also include the EXPLAIN  (ANALYZE, BUFFERS) of that
> query.
> 
> The version of PostgreSQL that you're running it on is also a good
> thing to share.
> 
> Details of the indexes which you've defined on the tables are also
> useful information. It would especially be good to know if you've
> added an index on product (establishment_id), for example.
> 

There's actually a wiki page with overview of what should be included in
"slow query" questions:

https://wiki.postgresql.org/wiki/Slow_Query_Questions

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: query performance

2018-02-17 Thread David Rowley
On 18 February 2018 at 12:35, hmidi slim  wrote:
> Is there another, more optimized way to write a query such as this:
> select * from (
> select e.name, e1.name, e.id
> from establishment as e, establishment as e1
> where e.id <> e1.id
> and e1.id = 1
> and ST_DWithin(geom, ST_MakePoint(e1.longitude, e1.latitude)::geography,
> 1000)) as tmp inner join product as p on p.establishment_id = tmp.id

You really should state what is wrong with the performance of the
above version and also include the EXPLAIN  (ANALYZE, BUFFERS) of that
query.

The version of PostgreSQL that you're running it on is also a good
thing to share.

Details of the indexes which you've defined on the tables are also
useful information. It would especially be good to know if you've
added an index on product (establishment_id), for example.
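For instance (a sketch only — it assumes PostGIS and the table/column names described in the question), the join key can be indexed and the plan re-checked like this:

```sql
-- Without this index, the join to product may force a sequential scan
-- of the whole product table for every matching establishment.
CREATE INDEX IF NOT EXISTS product_establishment_id_idx
    ON product (establishment_id);

-- Then re-check the plan for the full query:
EXPLAIN (ANALYZE, BUFFERS)
SELECT e.name, e1.name, p.name
FROM establishment AS e
JOIN establishment AS e1 ON e1.id = 1 AND e.id <> e1.id
JOIN product AS p ON p.establishment_id = e.id
WHERE ST_DWithin(e.geom,
                 ST_MakePoint(e1.longitude, e1.latitude)::geography,
                 1000);
```

If the plan then shows an index scan on product_establishment_id_idx instead of a sequential scan, the index is doing its job.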

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services