(Please read this message on wide display)

Our team recently tries to run TPC-DS benchmark to know capability of
PostgreSQL towards typical analytic queries.
TPC-DS defines about 100 complicated queries. We noticed optimizer made
unreasonable execution plan towards some of queries.

Here is an example (query.04) below, on bottom of this message.

Its query execution plan by EXPLAIN is below. The most time consuming
portion is multilevel nested-loop on bottom of the EXPLAIN output,
because it sequentially runs on the CTE result.
I cannot complete the query execution within 30minutes towards SF=1
on Xeon E5-2670v3 and RAM=384GB environment.

----------------------------------------------------------------------
Limit  (cost=1248808.70..1248808.71 rows=1 width=132)
   CTE year_total
     ->  Append  (cost=193769.66..496076.44 rows=4778919 width=220)
           ->  HashAggregate  (cost=193769.66..226692.26 rows=2633808 width=178)
                 Group Key: customer.c_customer_id, customer.c_first_name, 
customer.c_last_name, customer.c_preferred_cust_flag, customer.c_birth_country, 
customer.c_login, customer.c_email_address, date_dim.d_year
                 ->  Custom Scan (GpuJoin)  (cost=14554.84..108170.90 
rows=2633808 width=178)
                       Bulkload: On (density: 100.00%)
                       Depth 1: Logic: GpuHashJoin, HashKeys: 
(ss_sold_date_sk), JoinQual: (ss_sold_date_sk = d_date_sk), nrows_ratio: 
0.95623338
                       Depth 2: Logic: GpuHashJoin, HashKeys: (ss_customer_sk), 
JoinQual: (ss_customer_sk = c_customer_sk), nrows_ratio: 0.91441411
                       ->  Custom Scan (BulkScan) on store_sales  
(cost=0.00..96501.23 rows=2880323 width=38)
                       ->  Seq Scan on date_dim  (cost=0.00..2705.49 rows=73049 
width=16)
                       ->  Seq Scan on customer  (cost=0.00..4358.00 
rows=100000 width=156)
           ->  HashAggregate  (cost=125474.72..143301.10 rows=1426111 width=181)
                 Group Key: customer_1.c_customer_id, customer_1.c_first_name, 
customer_1.c_last_name, customer_1.c_preferred_cust_flag, 
customer_1.c_birth_country, customer_1.c_login, customer_1.c_email_address, 
date_dim_1.d_year
                 ->  Custom Scan (GpuJoin)  (cost=14610.07..79126.11 
rows=1426111 width=181)
                       Bulkload: On (density: 100.00%)
                       Depth 1: Logic: GpuHashJoin, HashKeys: 
(cs_bill_customer_sk), JoinQual: (c_customer_sk = cs_bill_customer_sk), 
nrows_ratio: 0.99446636
                       Depth 2: Logic: GpuHashJoin, HashKeys: 
(cs_sold_date_sk), JoinQual: (cs_sold_date_sk = d_date_sk), nrows_ratio: 
0.98929483
                       ->  Custom Scan (BulkScan) on catalog_sales  
(cost=0.00..65628.43 rows=1441543 width=41)
                       ->  Seq Scan on customer customer_1  (cost=0.00..4358.00 
rows=100000 width=156)
                       ->  Seq Scan on date_dim date_dim_1  (cost=0.00..2705.49 
rows=73049 width=16)
           ->  HashAggregate  (cost=69306.38..78293.88 rows=719000 width=181)
                 Group Key: customer_2.c_customer_id, customer_2.c_first_name, 
customer_2.c_last_name, customer_2.c_preferred_cust_flag, 
customer_2.c_birth_country, customer_2.c_login, customer_2.c_email_address, 
date_dim_2.d_year
                 ->  Custom Scan (GpuJoin)  (cost=14702.18..45938.88 
rows=719000 width=181)
                       Bulkload: On (density: 100.00%)
                       Depth 1: Logic: GpuHashJoin, HashKeys: 
(ws_bill_customer_sk), JoinQual: (c_customer_sk = ws_bill_customer_sk), 
nrows_ratio: 0.99973309
                       Depth 2: Logic: GpuHashJoin, HashKeys: 
(ws_sold_date_sk), JoinQual: (ws_sold_date_sk = d_date_sk), nrows_ratio: 
0.99946618
                       ->  Custom Scan (BulkScan) on web_sales  
(cost=0.00..32877.84 rows=719384 width=41)
                       ->  Seq Scan on customer customer_2  (cost=0.00..4358.00 
rows=100000 width=156)
                       ->  Seq Scan on date_dim date_dim_2  (cost=0.00..2705.49 
rows=73049 width=16)
   ->  Sort  (cost=752732.27..752732.27 rows=1 width=132)
         Sort Key: t_s_secyear.customer_id, t_s_secyear.customer_first_name, 
t_s_secyear.customer_last_name, t_s_secyear.customer_email_address
         ->  Nested Loop  (cost=0.00..752732.26 rows=1 width=132)
               Join Filter: ((t_s_secyear.customer_id = 
t_w_secyear.customer_id) AND (CASE WHEN (t_c_firstyear.year_total > 
'0'::numeric) THEN (t_c_secyear.year_total / t_c_firstyear.year_total) ELSE 
NULL::numeric END > CASE WHEN (t_w_firstyear.year_total > '0'::numeric) THEN 
(t_w_secyear.year_total / t_w_firstyear.year_total) ELSE NULL::numeric END))
               ->  Nested Loop  (cost=0.00..633256.31 rows=1 width=308)
                     Join Filter: ((t_s_secyear.customer_id = 
t_c_secyear.customer_id) AND (CASE WHEN (t_c_firstyear.year_total > 
'0'::numeric) THEN (t_c_secyear.year_total / t_c_firstyear.year_total) ELSE 
NULL::numeric END > CASE WHEN (t_s_firstyear.year_total > '0'::numeric) THEN 
(t_s_secyear.year_total / t_s_firstyear.year_total) ELSE NULL::numeric END))
                     ->  Nested Loop  (cost=0.00..513780.36 rows=1 width=320)
                           Join Filter: (t_s_firstyear.customer_id = 
t_s_secyear.customer_id)
                           ->  Nested Loop  (cost=0.00..394303.22 rows=2 
width=156)
                                 Join Filter: (t_s_firstyear.customer_id = 
t_w_firstyear.customer_id)
                                 ->  Nested Loop  (cost=0.00..262876.15 rows=8 
width=104)
                                       Join Filter: (t_s_firstyear.customer_id 
= t_c_firstyear.customer_id)
                                       ->  CTE Scan on year_total t_s_firstyear 
 (cost=0.00..131420.27 rows=40 width=52)
                                             Filter: ((year_total > 
'0'::numeric) AND (sale_type = 's'::text) AND (dyear = 2001))
                                       ->  CTE Scan on year_total t_c_firstyear 
 (cost=0.00..131420.27 rows=40 width=52)
                                             Filter: ((year_total > 
'0'::numeric) AND (sale_type = 'c'::text) AND (dyear = 2001))
                                 ->  CTE Scan on year_total t_w_firstyear  
(cost=0.00..131420.27 rows=40 width=52)
                                       Filter: ((year_total > '0'::numeric) AND 
(sale_type = 'w'::text) AND (dyear = 2001))
                           ->  CTE Scan on year_total t_s_secyear  
(cost=0.00..119472.98 rows=119 width=164)
                                 Filter: ((sale_type = 's'::text) AND (dyear = 
2002))
                     ->  CTE Scan on year_total t_c_secyear  
(cost=0.00..119472.98 rows=119 width=52)
                           Filter: ((sale_type = 'c'::text) AND (dyear = 2002))
               ->  CTE Scan on year_total t_w_secyear  (cost=0.00..119472.98 
rows=119 width=52)
                     Filter: ((sale_type = 'w'::text) AND (dyear = 2002))
(54 rows)
----------------------------------------------------------------------

On the other hands, once I turned off the nested-loop using SET 
enable_nestloop=off,
EXPLAIN displayed the following output, and query gets completed with 11.2sec.
This plan replaced a bunch of NestLoop on CTE Scan by HashJoin, therefore, JOIN
logic does not need to take full-scan on CTE multiple times.

-----------------------------------------------------------------------
Limit  (cost=1248761.93..1248761.93 rows=1 width=132)
   CTE year_total
     ->  Append  (cost=193769.66..496076.44 rows=4778919 width=220)
           ->  HashAggregate  (cost=193769.66..226692.26 rows=2633808 width=178)
                 Group Key: customer.c_customer_id, customer.c_first_name, 
customer.c_last_name, customer.c_preferred_cust_flag, customer.c_birth_country, 
customer.c_login, customer.c_email_address, date_dim.d_year
                 ->  Custom Scan (GpuJoin)  (cost=14554.84..108170.90 
rows=2633808 width=178)
                       Bulkload: On (density: 100.00%)
                       Depth 1: Logic: GpuHashJoin, HashKeys: 
(ss_sold_date_sk), JoinQual: (ss_sold_date_sk = d_date_sk), nrows_ratio: 
0.95623338
                       Depth 2: Logic: GpuHashJoin, HashKeys: (ss_customer_sk), 
JoinQual: (ss_customer_sk = c_customer_sk), nrows_ratio: 0.91441411
                       ->  Custom Scan (BulkScan) on store_sales  
(cost=0.00..96501.23 rows=2880323 width=38)
                       ->  Seq Scan on date_dim  (cost=0.00..2705.49 rows=73049 
width=16)
                       ->  Seq Scan on customer  (cost=0.00..4358.00 
rows=100000 width=156)
           ->  HashAggregate  (cost=125474.72..143301.10 rows=1426111 width=181)
                 Group Key: customer_1.c_customer_id, customer_1.c_first_name, 
customer_1.c_last_name, customer_1.c_preferred_cust_flag, 
customer_1.c_birth_country, customer_1.c_login, customer_1.c_email_address, 
date_dim_1.d_year
                 ->  Custom Scan (GpuJoin)  (cost=14610.07..79126.11 
rows=1426111 width=181)
                       Bulkload: On (density: 100.00%)
                       Depth 1: Logic: GpuHashJoin, HashKeys: 
(cs_bill_customer_sk), JoinQual: (c_customer_sk = cs_bill_customer_sk), 
nrows_ratio: 0.99446636
                       Depth 2: Logic: GpuHashJoin, HashKeys: 
(cs_sold_date_sk), JoinQual: (cs_sold_date_sk = d_date_sk), nrows_ratio: 
0.98929483
                       ->  Custom Scan (BulkScan) on catalog_sales  
(cost=0.00..65628.43 rows=1441543 width=41)
                       ->  Seq Scan on customer customer_1  (cost=0.00..4358.00 
rows=100000 width=156)
                       ->  Seq Scan on date_dim date_dim_1  (cost=0.00..2705.49 
rows=73049 width=16)
           ->  HashAggregate  (cost=69306.38..78293.88 rows=719000 width=181)
                 Group Key: customer_2.c_customer_id, customer_2.c_first_name, 
customer_2.c_last_name, customer_2.c_preferred_cust_flag, 
customer_2.c_birth_country, customer_2.c_login, customer_2.c_email_address, 
date_dim_2.d_year
                 ->  Custom Scan (GpuJoin)  (cost=14702.18..45938.88 
rows=719000 width=181)
                       Bulkload: On (density: 100.00%)
                       Depth 1: Logic: GpuHashJoin, HashKeys: 
(ws_bill_customer_sk), JoinQual: (c_customer_sk = ws_bill_customer_sk), 
nrows_ratio: 0.99973309
                       Depth 2: Logic: GpuHashJoin, HashKeys: 
(ws_sold_date_sk), JoinQual: (ws_sold_date_sk = d_date_sk), nrows_ratio: 
0.99946618
                       ->  Custom Scan (BulkScan) on web_sales  
(cost=0.00..32877.84 rows=719384 width=41)
                       ->  Seq Scan on customer customer_2  (cost=0.00..4358.00 
rows=100000 width=156)
                       ->  Seq Scan on date_dim date_dim_2  (cost=0.00..2705.49 
rows=73049 width=16)
   ->  Sort  (cost=752685.49..752685.50 rows=1 width=132)
         Sort Key: t_s_secyear.customer_id, t_s_secyear.customer_first_name, 
t_s_secyear.customer_last_name, t_s_secyear.customer_email_address
         ->  Hash Join  (cost=621264.91..752685.48 rows=1 width=132)
               Hash Cond: (t_s_secyear.customer_id = t_w_secyear.customer_id)
               Join Filter: (CASE WHEN (t_c_firstyear.year_total > 
'0'::numeric) THEN (t_c_secyear.year_total / t_c_firstyear.year_total) ELSE 
NULL::numeric END > CASE WHEN (t_w_firstyear.year_total > '0'::numeric) THEN 
(t_w_secyear.year_total / t_w_firstyear.year_total) ELSE NULL::numeric END)
               ->  Hash Join  (cost=501790.45..633210.98 rows=1 width=308)
                     Hash Cond: (t_s_secyear.customer_id = 
t_c_secyear.customer_id)
                     Join Filter: (CASE WHEN (t_c_firstyear.year_total > 
'0'::numeric) THEN (t_c_secyear.year_total / t_c_firstyear.year_total) ELSE 
NULL::numeric END > CASE WHEN (t_s_firstyear.year_total > '0'::numeric) THEN 
(t_s_secyear.year_total / t_s_firstyear.year_total) ELSE NULL::numeric END)
                     ->  Hash Join  (cost=382315.99..513736.47 rows=1 width=320)
                           Hash Cond: (t_s_firstyear.customer_id = 
t_s_secyear.customer_id)
                           ->  Hash Join  (cost=262841.53..394261.97 rows=2 
width=156)
                                 Hash Cond: (t_w_firstyear.customer_id = 
t_s_firstyear.customer_id)
                                 ->  CTE Scan on year_total t_w_firstyear  
(cost=0.00..131420.27 rows=40 width=52)
                                       Filter: ((year_total > '0'::numeric) AND 
(sale_type = 'w'::text) AND (dyear = 2001))
                                 ->  Hash  (cost=262841.43..262841.43 rows=8 
width=104)
                                       ->  Hash Join  
(cost=131420.77..262841.43 rows=8 width=104)
                                             Hash Cond: 
(t_s_firstyear.customer_id = t_c_firstyear.customer_id)
                                             ->  CTE Scan on year_total 
t_s_firstyear  (cost=0.00..131420.27 rows=40 width=52)
                                                   Filter: ((year_total > 
'0'::numeric) AND (sale_type = 's'::text) AND (dyear = 2001))
                                             ->  Hash  
(cost=131420.27..131420.27 rows=40 width=52)
                                                   ->  CTE Scan on year_total 
t_c_firstyear  (cost=0.00..131420.27 rows=40 width=52)
                                                         Filter: ((year_total > 
'0'::numeric) AND (sale_type = 'c'::text) AND (dyear = 2001))
                           ->  Hash  (cost=119472.98..119472.98 rows=119 
width=164)
                                 ->  CTE Scan on year_total t_s_secyear  
(cost=0.00..119472.98 rows=119 width=164)
                                       Filter: ((sale_type = 's'::text) AND 
(dyear = 2002))
                     ->  Hash  (cost=119472.98..119472.98 rows=119 width=52)
                           ->  CTE Scan on year_total t_c_secyear  
(cost=0.00..119472.98 rows=119 width=52)
                                 Filter: ((sale_type = 'c'::text) AND (dyear = 
2002))
               ->  Hash  (cost=119472.98..119472.98 rows=119 width=52)
                     ->  CTE Scan on year_total t_w_secyear  
(cost=0.00..119472.98 rows=119 width=52)
                           Filter: ((sale_type = 'w'::text) AND (dyear = 2002))
(61 rows)
------------------------------------------------------------------

SET enable_nestloop=off performed fine in this case. However, it seems to me
this restriction is too much.

In fact, cost of HashJoin underlying Sort node is:
    ->  Hash Join  (cost=621264.91..752685.48 rows=1 width=132)

On the other hands, NestedLoop on same place is:
    ->  Nested Loop  (cost=0.00..752732.26 rows=1 width=132)

Probably, small GUC adjustment may make optimizer to prefer HashJoin towards
these kind of queries.
Do you have a good idea?


====== QUERY: No.04 ========================

EXPLAIN ANALYZE
with year_total as (
 select c_customer_id customer_id
       ,c_first_name customer_first_name
       ,c_last_name customer_last_name
       ,c_preferred_cust_flag customer_preferred_cust_flag
       ,c_birth_country customer_birth_country
       ,c_login customer_login
       ,c_email_address customer_email_address
       ,d_year dyear
       
,sum(((ss_ext_list_price-ss_ext_wholesale_cost-ss_ext_discount_amt)+ss_ext_sales_price)/2)
 year_total
       ,'s' sale_type
 from customer
     ,store_sales
     ,date_dim
 where c_customer_sk = ss_customer_sk
   and ss_sold_date_sk = d_date_sk
 group by c_customer_id
         ,c_first_name
         ,c_last_name
         ,c_preferred_cust_flag
         ,c_birth_country
         ,c_login
         ,c_email_address
         ,d_year
 union all
 select c_customer_id customer_id
       ,c_first_name customer_first_name
       ,c_last_name customer_last_name
       ,c_preferred_cust_flag customer_preferred_cust_flag
       ,c_birth_country customer_birth_country
       ,c_login customer_login
       ,c_email_address customer_email_address
       ,d_year dyear
       
,sum((((cs_ext_list_price-cs_ext_wholesale_cost-cs_ext_discount_amt)+cs_ext_sales_price)/2)
 ) year_total
       ,'c' sale_type
 from customer
     ,catalog_sales
     ,date_dim
 where c_customer_sk = cs_bill_customer_sk
   and cs_sold_date_sk = d_date_sk
 group by c_customer_id
         ,c_first_name
         ,c_last_name
         ,c_preferred_cust_flag
         ,c_birth_country
         ,c_login
         ,c_email_address
         ,d_year
union all
 select c_customer_id customer_id
       ,c_first_name customer_first_name
       ,c_last_name customer_last_name
       ,c_preferred_cust_flag customer_preferred_cust_flag
       ,c_birth_country customer_birth_country
       ,c_login customer_login
       ,c_email_address customer_email_address
       ,d_year dyear
       
,sum((((ws_ext_list_price-ws_ext_wholesale_cost-ws_ext_discount_amt)+ws_ext_sales_price)/2)
 ) year_total
       ,'w' sale_type
 from customer
     ,web_sales
     ,date_dim
 where c_customer_sk = ws_bill_customer_sk
   and ws_sold_date_sk = d_date_sk
 group by c_customer_id
         ,c_first_name
         ,c_last_name
         ,c_preferred_cust_flag
         ,c_birth_country
         ,c_login
         ,c_email_address
         ,d_year
         )
  select
                  t_s_secyear.customer_id
                 ,t_s_secyear.customer_first_name
                 ,t_s_secyear.customer_last_name
                 ,t_s_secyear.customer_email_address
 from year_total t_s_firstyear
     ,year_total t_s_secyear
     ,year_total t_c_firstyear
     ,year_total t_c_secyear
     ,year_total t_w_firstyear
     ,year_total t_w_secyear
 where t_s_secyear.customer_id = t_s_firstyear.customer_id
   and t_s_firstyear.customer_id = t_c_secyear.customer_id
   and t_s_firstyear.customer_id = t_c_firstyear.customer_id
   and t_s_firstyear.customer_id = t_w_firstyear.customer_id
   and t_s_firstyear.customer_id = t_w_secyear.customer_id
   and t_s_firstyear.sale_type = 's'
   and t_c_firstyear.sale_type = 'c'
   and t_w_firstyear.sale_type = 'w'
   and t_s_secyear.sale_type = 's'
   and t_c_secyear.sale_type = 'c'
   and t_w_secyear.sale_type = 'w'
   and t_s_firstyear.dyear =  2001
   and t_s_secyear.dyear = 2001+1
   and t_c_firstyear.dyear =  2001
   and t_c_secyear.dyear =  2001+1
   and t_w_firstyear.dyear = 2001
   and t_w_secyear.dyear = 2001+1
   and t_s_firstyear.year_total > 0
   and t_c_firstyear.year_total > 0
   and t_w_firstyear.year_total > 0
   and case when t_c_firstyear.year_total > 0 then t_c_secyear.year_total / 
t_c_firstyear.year_total else null end
           > case when t_s_firstyear.year_total > 0 then t_s_secyear.year_total 
/ t_s_firstyear.year_total else null end
   and case when t_c_firstyear.year_total > 0 then t_c_secyear.year_total / 
t_c_firstyear.year_total else null end
           > case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total 
/ t_w_firstyear.year_total else null end
 order by t_s_secyear.customer_id
         ,t_s_secyear.customer_first_name
         ,t_s_secyear.customer_last_name
         ,t_s_secyear.customer_email_address
limit 100;


--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to