> On 2018-08-08 16:55:22 +1200, Thomas Munro wrote:
>> On Fri, Jul 27, 2018 at 8:10 PM, David Fetter <da...@fetter.org> wrote:
>> > On Fri, Jul 27, 2018 at 02:55:26PM +1200, Thomas Munro wrote:
>> >> On Thu, Jul 26, 2018 at 7:14 AM, David Fetter <da...@fetter.org> wrote:
>> >> > Please find attached the next version, which passes 'make check'.
>> >>
>> >> ... but not 'make check-world' (contrib/postgres_fdw's EXPLAIN is 
>> >> different).
>> >
>> > Please find attached a patch that does.
>> >
>> > It doesn't always pass make installcheck-world, but I need to sleep
>> > rather than investigate that at the moment.
>> 
>> One observation I wanted to share: CTE scans inhibit parallelism today
>> (something we might eventually want to fix with shared tuplestores).
>> This patch therefore allows parallelism in some WITH queries, which
>> seems like a very valuable thing.
> 
> Might be interesting to see how big a difference it makes for
> TPC-DS. Currently the results are bad (as in many queries don't finish
> in a relevant time) because it uses CTEs so widely, and there's often
> predicates outside the CTE that could be pushed down.

Now that the patch was committed, I played with TPCS-DS and found at
least one of their queries gets speedup. Query 2 runs 2 times faster
than 11. In 12, it seems that CTE wscs is pushed down.

with wscs as
 (select sold_date_sk
        ,sales_price
  from (select ws_sold_date_sk sold_date_sk
              ,ws_ext_sales_price sales_price
        from web_sales 
        union all
        select cs_sold_date_sk sold_date_sk
              ,cs_ext_sales_price sales_price
        from catalog_sales) as s1),
 wswscs as 
 (select d_week_seq,
        sum(case when (d_day_name='Sunday') then sales_price else null end) 
sun_sales,
        sum(case when (d_day_name='Monday') then sales_price else null end) 
mon_sales,
        sum(case when (d_day_name='Tuesday') then sales_price else  null end) 
tue_sales,
        sum(case when (d_day_name='Wednesday') then sales_price else null end) 
wed_sales,
        sum(case when (d_day_name='Thursday') then sales_price else null end) 
thu_sales,
        sum(case when (d_day_name='Friday') then sales_price else null end) 
fri_sales,
        sum(case when (d_day_name='Saturday') then sales_price else null end) 
sat_sales
 from wscs
     ,date_dim
 where d_date_sk = sold_date_sk
 group by d_week_seq)
 select d_week_seq1
       ,round(sun_sales1/sun_sales2,2)
       ,round(mon_sales1/mon_sales2,2)
       ,round(tue_sales1/tue_sales2,2)
       ,round(wed_sales1/wed_sales2,2)
       ,round(thu_sales1/thu_sales2,2)
       ,round(fri_sales1/fri_sales2,2)
       ,round(sat_sales1/sat_sales2,2)
 from
 (select wswscs.d_week_seq d_week_seq1
        ,sun_sales sun_sales1
        ,mon_sales mon_sales1
        ,tue_sales tue_sales1
        ,wed_sales wed_sales1
        ,thu_sales thu_sales1
        ,fri_sales fri_sales1
        ,sat_sales sat_sales1
  from wswscs,date_dim 
  where date_dim.d_week_seq = wswscs.d_week_seq and
        d_year = 1998) y,
 (select wswscs.d_week_seq d_week_seq2
        ,sun_sales sun_sales2
        ,mon_sales mon_sales2
        ,tue_sales tue_sales2
        ,wed_sales wed_sales2
        ,thu_sales thu_sales2
        ,fri_sales fri_sales2
        ,sat_sales sat_sales2
  from wswscs
      ,date_dim 
  where date_dim.d_week_seq = wswscs.d_week_seq and
        d_year = 1998+1) z
 where d_week_seq1=d_week_seq2-53
 order by d_week_seq1;

Here's the 12's plan:
                                                                                
  QUERY PLAN                                                                    
              
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=118929.39..118929.43 rows=13 width=228) (actual 
time=792.588..792.710 rows=2513 loops=1)
   Sort Key: wswscs.d_week_seq
   Sort Method: quicksort  Memory: 323kB
   CTE wswscs
     ->  Finalize GroupAggregate  (cost=110164.09..113672.71 rows=10447 
width=228) (actual time=766.232..768.415 rows=263 loops=1)
           Group Key: date_dim_2.d_week_seq
           ->  Gather Merge  (cost=110164.09..112601.89 rows=20894 width=228) 
(actual time=766.209..767.158 rows=789 loops=1)
                 Workers Planned: 2
                 Workers Launched: 2
                 ->  Sort  (cost=109164.06..109190.18 rows=10447 width=228) 
(actual time=763.059..763.078 rows=263 loops=3)
                       Sort Key: date_dim_2.d_week_seq
                       Sort Method: quicksort  Memory: 160kB
                       Worker 0:  Sort Method: quicksort  Memory: 160kB
                       Worker 1:  Sort Method: quicksort  Memory: 160kB
                       ->  Partial HashAggregate  (cost=108179.39..108466.69 
rows=10447 width=228) (actual time=762.202..762.889 rows=263 loops=3)
                             Group Key: date_dim_2.d_week_seq
                             ->  Parallel Hash Join  (cost=2371.82..74413.79 
rows=900416 width=20) (actual time=17.166..424.834 rows=717854 loops=3)
                                   Hash Cond: (catalog_sales.cs_sold_date_sk = 
date_dim_2.d_date_sk)
                                   ->  Parallel Append  (cost=0.00..69678.24 
rows=900416 width=10) (actual time=0.029..248.992 rows=720311 loops=3)
                                         ->  Parallel Seq Scan on catalog_sales 
 (cost=0.00..43411.73 rows=600673 width=10) (actual time=0.018..130.163 
rows=480516 loops=3)
                                         ->  Parallel Seq Scan on web_sales  
(cost=0.00..21764.43 rows=299743 width=10) (actual time=0.026..95.629 
rows=359692 loops=2)
                                   ->  Parallel Hash  (cost=1834.70..1834.70 
rows=42970 width=18) (actual time=16.610..16.610 rows=24350 loops=3)
                                         Buckets: 131072  Batches: 1  Memory 
Usage: 5056kB
                                         ->  Parallel Seq Scan on date_dim 
date_dim_2  (cost=0.00..1834.70 rows=42970 width=18) (actual time=0.020..7.617 
rows=24350 loops=3)
   ->  Hash Join  (cost=5007.74..5256.44 rows=13 width=228) (actual 
time=785.300..792.123 rows=2513 loops=1)
         Hash Cond: (wswscs.d_week_seq = date_dim.d_week_seq)
         ->  CTE Scan on wswscs  (cost=0.00..208.94 rows=10447 width=228) 
(actual time=766.236..766.263 rows=263 loops=1)
         ->  Hash  (cost=5007.58..5007.58 rows=13 width=232) (actual 
time=19.033..19.033 rows=2513 loops=1)
               Buckets: 4096 (originally 1024)  Batches: 1 (originally 1)  
Memory Usage: 288kB
               ->  Merge Join  (cost=5001.97..5007.58 rows=13 width=232) 
(actual time=17.739..18.210 rows=2513 loops=1)
                     Merge Cond: (((wswscs_1.d_week_seq - 53)) = 
date_dim.d_week_seq)
                     ->  Sort  (cost=2668.33..2669.24 rows=365 width=228) 
(actual time=9.906..9.924 rows=365 loops=1)
                           Sort Key: ((wswscs_1.d_week_seq - 53))
                           Sort Method: quicksort  Memory: 76kB
                           ->  Hash Join  (cost=2322.68..2652.79 rows=365 
width=228) (actual time=7.864..9.764 rows=365 loops=1)
                                 Hash Cond: (wswscs_1.d_week_seq = 
date_dim_1.d_week_seq)
                                 ->  CTE Scan on wswscs wswscs_1  
(cost=0.00..208.94 rows=10447 width=228) (actual time=0.001..2.287 rows=263 
loops=1)
                                 ->  Hash  (cost=2318.11..2318.11 rows=365 
width=4) (actual time=7.389..7.389 rows=365 loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 
21kB
                                       ->  Seq Scan on date_dim date_dim_1  
(cost=0.00..2318.11 rows=365 width=4) (actual time=3.876..7.348 rows=365 
loops=1)
                                             Filter: (d_year = 1999)
                                             Rows Removed by Filter: 72684
                     ->  Sort  (cost=2333.65..2334.56 rows=365 width=4) (actual 
time=7.824..7.930 rows=2514 loops=1)
                           Sort Key: date_dim.d_week_seq
                           Sort Method: quicksort  Memory: 42kB
                           ->  Seq Scan on date_dim  (cost=0.00..2318.11 
rows=365 width=4) (actual time=3.950..7.765 rows=365 loops=1)
                                 Filter: (d_year = 1998)
                                 Rows Removed by Filter: 72684
 Planning Time: 0.956 ms
 Execution Time: 794.072 ms
(50 rows)

Here's 11's plan:
                                                                          QUERY 
PLAN                                                                          
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=227105.32..227105.35 rows=13 width=228) (actual 
time=2019.071..2019.163 rows=2513 loops=1)
   Sort Key: wswscs.d_week_seq
   Sort Method: quicksort  Memory: 323kB
   CTE wscs
     ->  Append  (cost=0.00..88586.64 rows=2160976 width=10) (actual 
time=0.015..521.937 rows=2160932 loops=1)
           ->  Seq Scan on web_sales  (cost=0.00..25960.84 rows=719384 
width=10) (actual time=0.014..137.557 rows=719384 loops=1)
           ->  Seq Scan on catalog_sales  (cost=0.00..51820.92 rows=1441592 
width=10) (actual time=0.011..269.559 rows=1441548 loops=1)
   CTE wswscs
     ->  HashAggregate  (cost=132977.62..133264.03 rows=10415 width=228) 
(actual time=1996.856..1997.387 rows=263 loops=1)
           Group Key: date_dim_2.d_week_seq
           ->  Hash Join  (cost=3048.60..51941.02 rows=2160976 width=28) 
(actual time=36.414..1323.387 rows=2153563 loops=1)
                 Hash Cond: (wscs.sold_date_sk = date_dim_2.d_date_sk)
                 ->  CTE Scan on wscs  (cost=0.00..43219.52 rows=2160976 
width=18) (actual time=0.017..942.090 rows=2160932 loops=1)
                 ->  Hash  (cost=2135.49..2135.49 rows=73049 width=18) (actual 
time=35.870..35.871 rows=73049 loops=1)
                       Buckets: 131072  Batches: 1  Memory Usage: 4734kB
                       ->  Seq Scan on date_dim date_dim_2  (cost=0.00..2135.49 
rows=73049 width=18) (actual time=0.011..17.749 rows=73049 loops=1)
   ->  Hash Join  (cost=5006.47..5254.41 rows=13 width=228) (actual 
time=2012.260..2018.602 rows=2513 loops=1)
         Hash Cond: (wswscs.d_week_seq = date_dim.d_week_seq)
         ->  CTE Scan on wswscs  (cost=0.00..208.30 rows=10415 width=228) 
(actual time=1996.858..1996.876 rows=263 loops=1)
         ->  Hash  (cost=5006.31..5006.31 rows=13 width=232) (actual 
time=15.380..15.380 rows=2513 loops=1)
               Buckets: 4096 (originally 1024)  Batches: 1 (originally 1)  
Memory Usage: 288kB
               ->  Merge Join  (cost=5000.73..5006.31 rows=13 width=232) 
(actual time=14.272..14.683 rows=2513 loops=1)
                     Merge Cond: (((wswscs_1.d_week_seq - 53)) = 
date_dim.d_week_seq)
                     ->  Sort  (cost=2667.18..2668.09 rows=363 width=228) 
(actual time=7.374..7.393 rows=365 loops=1)
                           Sort Key: ((wswscs_1.d_week_seq - 53))
                           Sort Method: quicksort  Memory: 76kB
                           ->  Hash Join  (cost=2322.65..2651.75 rows=363 
width=228) (actual time=6.386..7.154 rows=365 loops=1)
                                 Hash Cond: (wswscs_1.d_week_seq = 
date_dim_1.d_week_seq)
                                 ->  CTE Scan on wswscs wswscs_1  
(cost=0.00..208.30 rows=10415 width=228) (actual time=0.000..0.707 rows=263 
loops=1)
                                 ->  Hash  (cost=2318.11..2318.11 rows=363 
width=4) (actual time=6.367..6.367 rows=365 loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 
21kB
                                       ->  Seq Scan on date_dim date_dim_1  
(cost=0.00..2318.11 rows=363 width=4) (actual time=3.000..6.330 rows=365 
loops=1)
                                             Filter: (d_year = 1999)
                                             Rows Removed by Filter: 72684
                     ->  Sort  (cost=2333.55..2334.45 rows=363 width=4) (actual 
time=6.890..6.975 rows=2514 loops=1)
                           Sort Key: date_dim.d_week_seq
                           Sort Method: quicksort  Memory: 42kB
                           ->  Seq Scan on date_dim  (cost=0.00..2318.11 
rows=363 width=4) (actual time=3.832..6.841 rows=365 loops=1)
                                 Filter: (d_year = 1998)
                                 Rows Removed by Filter: 72684
 Planning Time: 0.962 ms
 Execution Time: 2027.758 ms
(42 rows)

BTW, in my small TPC-DS environment (2GB), only two queries were not
finished within 30 minutes (query 4 and 11). My guess is these seem to
be suffered from statistic errors (I got this hint from Kaigai-san's
slide: https://www.slideshare.net/kaigai/tpcdspostgresql, especially
page 20. Unfortunately the slide is in Japanese).  And it leads to
choosing a bad plan: nested loop join. If I disable nested loop join,
these two queries finished quickly.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

Reply via email to