Unexpected Performance for the Function simplify_function

Ba Jinsheng Fri, 25 Oct 2024 01:18:44 -0700

Dear PostgreSQL Community,

For the query 10 in TPC-H benchmark:
select
      c_custkey,
      c_name,
      sum(l_extendedprice * (1 - l_discount)) as revenue,
      c_acctbal,
      n_name,
      c_address,
      c_phone,
      c_comment
from
      CUSTOMER,
      ORDERS,
      LINEITEM,
      NATION
where
      c_custkey = o_custkey
      and l_orderkey = o_orderkey
      and o_orderdate >= date '1993-08-01'
      and o_orderdate < date '1993-08-01' + interval '3' month
      and l_returnflag = 'R'
      and c_nationkey = n_nationkey
group by
      c_custkey,
      c_name,
      c_acctbal,
      c_phone,
      n_name,
      c_address,
      c_comment
order by
      revenue desc
limit
      20;



Its query plan is:

                                                                                
        QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=200479.71..200479.76 rows=20 width=205) (actual 
time=506.558..510.594 rows=20 loops=1)
   ->  Sort  (cost=200479.71..200622.37 rows=57064 width=205) (actual 
time=506.557..510.591 rows=20 loops=1)
         Sort Key: (sum((lineitem.l_extendedprice * ('1'::numeric - 
lineitem.l_discount)))) DESC
         Sort Method: top-N heapsort  Memory: 34kB
         ->  Finalize GroupAggregate  (cost=191629.63..198961.25 rows=57064 
width=205) (actual time=441.132..501.986 rows=37925 loops=1)
               Group Key: customer.c_custkey, nation.n_name
               ->  Gather Merge  (cost=191629.63..197772.41 rows=47554 
width=205) (actual time=441.124..474.623 rows=37925 loops=1)
                     Workers Planned: 2
                     Workers Launched: 2
                     ->  Partial GroupAggregate  (cost=190629.61..191283.48 
rows=23777 width=205) (actual time=437.497..464.923 rows=12642 loops=3)
                           Group Key: customer.c_custkey, nation.n_name
                           ->  Sort  (cost=190629.61..190689.05 rows=23777 
width=185) (actual time=437.485..441.339 rows=38183 loops=3)
                                 Sort Key: customer.c_custkey, nation.n_name
                                 Sort Method: external merge  Disk: 7184kB
                                 Worker 0:  Sort Method: external merge  Disk: 
7448kB
                                 Worker 1:  Sort Method: external merge  Disk: 
7264kB
                                 ->  Hash Join  (cost=181606.66..186706.85 
rows=23777 width=185) (actual time=385.555..418.269 rows=38183 loops=3)
                                       Hash Cond: (customer.c_nationkey = 
nation.n_nationkey)
                                       ->  Parallel Hash Join  
(cost=181605.09..186632.29 rows=23777 width=160) (actual time=385.484..411.936 
rows=38183 loops=3)
                                             Hash Cond: (customer.c_custkey = 
orders.o_custkey)
                                             ->  Parallel Seq Scan on customer  
(cost=0.00..4225.00 rows=62500 width=148) (actual time=0.028..9.805 rows=50000 
loops=3)
                                             ->  Parallel Hash  
(cost=181307.88..181307.88 rows=23777 width=16) (actual time=385.060..385.063 
rows=38183 loops=3)
                                                   Buckets: 131072 (originally 
65536)  Batches: 1 (originally 1)  Memory Usage: 7648kB
                                                   ->  Parallel Hash Join  
(cost=35809.22..181307.88 rows=23777 width=16) (actual time=69.608..371.381 
rows=38183 loops=3)
                                                         Hash Cond: 
(lineitem.l_orderkey = orders.o_orderkey)
                                                         ->  Parallel Seq Scan 
on lineitem  (cost=0.00..143863.66 rows=622855 width=16) (actual 
time=0.024..255.818 rows=492957 loops=3)
                                                               Filter: 
(l_returnflag = 'R'::bpchar)
                                                               Rows Removed by 
Filter: 1507448
                                                         ->  Parallel Hash  
(cost=35511.00..35511.00 rows=23858 width=8) (actual time=68.857..68.858 
rows=19046 loops=3)
                                                               Buckets: 65536  
Batches: 1  Memory Usage: 2816kB
                                                               ->  Parallel Seq 
Scan on orders  (cost=0.00..35511.00 rows=23858 width=8) (actual 
time=0.033..62.907 rows=19046 loops=3)
                                                                     Filter: 
((o_orderdate >= '1993-08-01'::date) AND (o_orderdate < '1993-11-01 
00:00:00'::timestamp without time zone))
                                                                     Rows 
Removed by Filter: 480954
                                       ->  Hash  (cost=1.25..1.25 rows=25 
width=33) (actual time=0.037..0.037 rows=25 loops=3)
                                             Buckets: 1024  Batches: 1  Memory 
Usage: 10kB
                                             ->  Seq Scan on nation  
(cost=0.00..1.25 rows=25 width=33) (actual time=0.020..0.024 rows=25 loops=3)
 Planning Time: 2.295 ms
 Execution Time: 512.015 ms
(38 rows)


While if I apply this patch to disable the simplify_function():

diff --git a/src/backend/optimizer/util/clauses.c 
b/src/backend/optimizer/util/clauses.c
index b4e085e9d4..155bbd9fbc 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -2636,15 +2636,6 @@ eval_const_expressions_mutator(Node *node,
                                 * Code for op/func reduction is pretty bulky, 
so split it out
                                 * as a separate function.
                                 */
-                               simple = simplify_function(expr->opfuncid,
-                                                                               
   expr->opresulttype, -1,
-                                                                               
   expr->opcollid,
-                                                                               
   expr->inputcollid,
-                                                                               
   &args,
-                                                                               
   false,
-                                                                               
   true,
-                                                                               
   true,
-                                                                               
   context);
                                if (simple)             /* successfully 
simplified it */
                                        return (Node *) simple;


Then we can get a more efficient query plan:

                                                                                
        QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=120917.71..120917.76 rows=20 width=202) (actual 
time=336.255..344.250 rows=20 loops=1)
   ->  Sort  (cost=120917.71..120936.18 rows=7387 width=202) (actual 
time=336.254..344.248 rows=20 loops=1)
         Sort Key: (sum((lineitem.l_extendedprice * ((1)::numeric - 
lineitem.l_discount)))) DESC
         Sort Method: top-N heapsort  Memory: 34kB
         ->  Finalize GroupAggregate  (cost=119764.35..120721.15 rows=7387 
width=202) (actual time=271.425..335.755 rows=37925 loops=1)
               Group Key: customer.c_custkey, nation.n_name
               ->  Gather Merge  (cost=119764.35..120567.25 rows=6156 
width=202) (actual time=271.420..309.049 rows=37925 loops=1)
                     Workers Planned: 2
                     Workers Launched: 2
                     ->  Partial GroupAggregate  (cost=118764.33..118856.67 
rows=3078 width=202) (actual time=267.897..296.298 rows=12642 loops=3)
                           Group Key: customer.c_custkey, nation.n_name
                           ->  Sort  (cost=118764.33..118772.02 rows=3078 
width=182) (actual time=267.881..271.626 rows=38183 loops=3)
                                 Sort Key: customer.c_custkey, nation.n_name
                                 Sort Method: external merge  Disk: 7248kB
                                 Worker 0:  Sort Method: external merge  Disk: 
7328kB
                                 Worker 1:  Sort Method: external merge  Disk: 
7328kB
                                 ->  Hash Join  (cost=113641.60..118585.99 
rows=3078 width=182) (actual time=222.247..249.682 rows=38183 loops=3)
                                       Hash Cond: (customer.c_nationkey = 
nation.n_nationkey)
                                       ->  Parallel Hash Join  
(cost=113640.04..118574.98 rows=3078 width=160) (actual time=222.183..243.657 
rows=38183 loops=3)
                                             Hash Cond: (customer.c_custkey = 
orders.o_custkey)
                                             ->  Parallel Seq Scan on customer  
(cost=0.00..4219.00 rows=62500 width=148) (actual time=0.029..5.646 rows=50000 
loops=3)
                                             ->  Parallel Hash  
(cost=113601.56..113601.56 rows=3078 width=16) (actual time=221.947..221.948 
rows=38183 loops=3)
                                                   Buckets: 131072 (originally 
8192)  Batches: 1 (originally 1)  Memory Usage: 8064kB
                                                   ->  Nested Loop  
(cost=0.43..113601.56 rows=3078 width=16) (actual time=0.096..204.897 
rows=38183 loops=3)
                                                         ->  Parallel Seq Scan 
on orders  (cost=0.00..37062.50 rows=3125 width=8) (actual time=0.029..63.908 
rows=19046 loops=3)
                                                               Filter: 
((o_orderdate >= '1993-08-01'::date) AND (o_orderdate < ('1993-08-01'::date + 
'3 mons'::interval month)))
                                                               Rows Removed by 
Filter: 480954
                                                         ->  Index Scan using 
lineitem_pkey on lineitem  (cost=0.43..24.45 rows=4 width=16) (actual 
time=0.006..0.007 rows=2 loops=57138)
                                                               Index Cond: 
(l_orderkey = orders.o_orderkey)
                                                               Filter: 
(l_returnflag = 'R'::bpchar)
                                                               Rows Removed by 
Filter: 2
                                       ->  Hash  (cost=1.25..1.25 rows=25 
width=30) (actual time=0.034..0.035 rows=25 loops=3)
                                             Buckets: 1024  Batches: 1  Memory 
Usage: 10kB
                                             ->  Seq Scan on nation  
(cost=0.00..1.25 rows=25 width=30) (actual time=0.018..0.023 rows=25 loops=3)
 Planning Time: 1.207 ms
 Execution Time: 345.427 ms
(36 rows)


The estimated cost is reduced by 39.69%, and the execution time is reduced by 
32.54%. I measured the execution time on average of 10 executions.
I am not proposing a fixing patch, as the patch is incorrect. Instead, I just 
want to show disabling the simplify_function() function brings performance 
benefit, and it seems unexpected. I am wondering whether we can optimize 
simplify_function() to make the performance better for this workload?



Environment:
I used 1 GB data of TPC-H benchmark, and my entire data folder can be 
downloaded here: 
https://drive.google.com/file/d/1ZBLHanIRwxbaMQIhRUSPv4I7y8g_0AWi/view?usp=sharing
The connection string is: postgresql://ubuntu:[email protected]:5432/tpch"

tpch=# select version();
                                             version
--------------------------------------------------------------------------------------------------
 PostgreSQL 17.0 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 
13.2.0-23ubuntu4) 13.2.0, 64-bit
(1 row)



Best regards,

Jinsheng Ba



Notice: This email is generated from the account of an NUS alumnus. Contents, 
views, and opinions therein are solely those of the sender.

Unexpected Performance for the Function simplify_function

Reply via email to