As Scott recommended, I did the following:
# set enable_nestloop = false;
# vacuum full analyze;

After this I re-ran the query and its execution time went down from 2
hours to 2 minutes. I attached the new query plan to this posting.
Is there any way to optimize it even further?
What should I do to make this query run fast without hurting the
performance of the other queries?
Thanks.

Oleg

-----Original Message-----
From: scott.marlowe [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, October 01, 2003 4:00 PM
To: Oleg Lebedev
Cc: Josh Berkus; [EMAIL PROTECTED]
Subject: Re: [PERFORM] TPC-R benchmarks


For troubleshooting, can you try it with "set enable_nestloop = false"
and 
rerun the query and see how long it takes?  

It looks like the estimates of rows returned is WAY off (estimate is too

low compared to what really comes back.)

Also, you might try to alter the table.column to have a higher target on

the rows p_partkey and ps_partkey and any others where the estimate is
so 
far off of the reality.

On Wed, 1 Oct 2003, Oleg Lebedev wrote:

> All right, my query just finished running with EXPLAIN ANALYZE. I show

> the plan below and also attached it as a file. Any ideas?
> 
>    ->  Sort  (cost=54597.49..54597.50 rows=1 width=121) (actual 
> time=6674562.03..6674562.15 rows=175 loops=1)
>          Sort Key: nation.n_name, date_part('year'::text,
> orders.o_orderdate)
>          ->  Aggregate  (cost=54597.45..54597.48 rows=1 width=121) 
> (actual time=6668919.41..6674522.48 rows=175 loops=1)
>                ->  Group  (cost=54597.45..54597.47 rows=3 width=121) 
> (actual time=6668872.68..6672136.96 rows=348760 loops=1)
>                      ->  Sort  (cost=54597.45..54597.46 rows=3
> width=121) (actual time=6668872.65..6669499.95 rows=348760 loops=1)
>                            Sort Key: nation.n_name, 
> date_part('year'::text, orders.o_orderdate)
>                            ->  Hash Join  (cost=54596.00..54597.42 
> rows=3
> width=121) (actual time=6632768.89..6650192.67 rows=348760 loops=1)
>                                  Hash Cond: ("outer".n_nationkey =
> "inner".s_nationkey)
>                                  ->  Seq Scan on nation 
> (cost=0.00..1.25 rows=25 width=33) (actual time=6.75..7.13 rows=25
> loops=1)
>                                  ->  Hash  (cost=54596.00..54596.00 
> rows=3
> width=88) (actual time=6632671.96..6632671.96 rows=0 loops=1)
>                                        ->  Nested Loop 
> (cost=0.00..54596.00 rows=3 width=88) (actual time=482.41..6630601.46 
> rows=348760 loops=1)
>                                              Join Filter: 
> ("inner".s_suppkey = "outer".l_suppkey)
>                                              ->  Nested Loop 
> (cost=0.00..54586.18 rows=3 width=80) (actual time=383.87..6594984.40 
> rows=348760 loops=1)
>                                                    ->  Nested Loop 
> (cost=0.00..54575.47 rows=4 width=68) (actual time=199.95..3580882.07 
> rows=348760 loops=1)
>                                                          Join Filter: 
> ("outer".p_partkey = "inner".ps_partkey)
>                                                          ->  Nested 
> Loop (cost=0.00..22753.33 rows=9343 width=49) (actual 
> time=146.85..3541433.10 rows=348760 loops=1)
>                                                                ->  Seq

> Scan on part  (cost=0.00..7868.00 rows=320 width=4) (actual 
> time=33.64..15651.90 rows=11637 loops=1)
> 
> Filter: (p_name ~~ '%green%'::text)
>                                                                ->  
> Index Scan using i_l_partkey on lineitem  (cost=0.00..46.15 rows=29 
> width=45) (actual time=10.71..302.67 rows=30 loops=11637)
>  
> Index
> Cond: ("outer".p_partkey = lineitem.l_partkey)
>                                                          ->  Index 
> Scan using pk_partsupp on partsupp  (cost=0.00..3.39 rows=1 width=19) 
> (actual time=0.09..0.09 rows=1 loops=348760)
>                                                                Index
> Cond: ((partsupp.ps_partkey = "outer".l_partkey) AND 
> (partsupp.ps_suppkey =
> "outer".l_suppkey))
>                                                    ->  Index Scan 
> using pk_orders on orders  (cost=0.00..3.01 rows=1 width=12) (actual 
> time=8.62..8.62 rows=1 loops=348760)
>                                                          Index Cond: 
> (orders.o_orderkey = "outer".l_orderkey)
>                                              ->  Index Scan using 
> pk_supplier on supplier  (cost=0.00..3.01 rows=1 width=8) (actual 
> time=0.08..0.08 rows=1 loops=348760)
>                                                    Index Cond: 
> ("outer".ps_suppkey = supplier.s_suppkey)  Total runtime: 6674724.23 
> msec (28 rows)
> 
> 
> -----Original Message-----
> From: Oleg Lebedev
> Sent: Wednesday, October 01, 2003 12:00 PM
> To: Josh Berkus; scott.marlowe
> Cc: [EMAIL PROTECTED]
> Subject: Re: [PERFORM] TPC-R benchmarks
> Importance: Low
> 
> 
> Sure, below is the query. I attached the plan to this posting.
> 
> select
>       nation,
>       o_year,
>       sum(amount) as sum_profit
> from
>       (
>               select
>                       n_name as nation,
>                       extract(year from o_orderdate) as o_year,
>                       l_extendedprice * (1 - l_discount) -
> ps_supplycost * l_quantity as amount
>               from
>                       part,
>                       supplier,
>                       lineitem,
>                       partsupp,
>                       orders,
>                       nation
>               where
>                       s_suppkey = l_suppkey
>                       and ps_suppkey = l_suppkey
>                       and ps_partkey = l_partkey
>                       and p_partkey = l_partkey
>                       and o_orderkey = l_orderkey
>                       and s_nationkey = n_nationkey
>                       and p_name like '%green%'
>       ) as profit
> group by
>       nation,
>       o_year
> order by
>       nation,
>       o_year desc;
> 
> 
> -----Original Message-----
> From: Josh Berkus [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, October 01, 2003 11:42 AM
> To: Oleg Lebedev; scott.marlowe
> Cc: [EMAIL PROTECTED]
> Subject: Re: [PERFORM] TPC-R benchmarks
> 
> 
> Oleg,
> 
> > The output of the query should contain about 200 rows. So, I guess 
> > the
> 
> > planer is off assuming that the query should return 1 row.
> 
> Oh, also did you post the query before?   Can you re-post it with the
> planner
> results?
> 
> 

*************************************

This e-mail may contain privileged or confidential material intended for the named 
recipient only.
If you are not the named recipient, delete this message and all attachments.
Unauthorized reviewing, copying, printing, disclosing, or otherwise using information 
in this e-mail is prohibited.
We reserve the right to monitor e-mail sent through our network. 

*************************************
->  Sort  (cost=360630.16..360630.16 rows=1 width=121) (actual time=119039.15..119039.27 rows=175 loops=1)
         Sort Key: nation.n_name, date_part('year'::text,
orders.o_orderdate)
         ->  Aggregate  (cost=360630.13..360630.15 rows=1 width=121) (actual time=113515.96..119037.50 rows=175 loops=1)
               ->  Group  (cost=360630.13..360630.14 rows=2 width=121) (actual time=113481.12..116728.38 rows=348760 loops=1)
                     ->  Sort  (cost=360630.13..360630.13 rows=2
width=121) (actual time=113481.10..114095.94 rows=348760 loops=1)
                           Sort Key: nation.n_name, date_part('year'::text, orders.o_orderdate)
                           ->  Hash Join  (cost=360628.72..360630.12 rows=2 width=121) (actual time=92430.19..95230.68 rows=348760 loops=1)
                                 Hash Cond: ("outer".n_nationkey =
"inner".s_nationkey)
                                 ->  Seq Scan on nation  (cost=0.00..1.25 rows=25 width=33) (actual time=9.55..9.97 rows=25 loops=1)
                                 ->  Hash  (cost=360628.72..360628.72 rows=2 width=88) (actual time=92420.44..92420.44 rows=0 loops=1)
                                       ->  Hash Join (cost=306669.69..360628.72 rows=2 width=88) (actual time=53474.51..91355.74 rows=348760 loops=1)
                                             Hash Cond: ("outer".o_orderkey = "inner".l_orderkey)
                                             ->  Seq Scan on orders (cost=0.00..46459.00 rows=1500000 width=12) (actual time=10.87..6298.62 rows=1500000 loops=1)
                                             ->  Hash (cost=306669.69..306669.69 rows=2 width=76) (actual time=53463.43..53463.43 rows=0 loops=1)
                                                   ->  Hash Join (cost=40618.91..306669.69 rows=2 width=76) (actual time=5633.60..52291.78 rows=348760 loops=1)
                                                         Hash Cond: ("outer".l_partkey = "inner".ps_partkey)
                                                         Join Filter: ("inner".s_suppkey = "outer".l_suppkey)
                                                         ->  Seq Scan on lineitem (cost=0.00..235620.15 rows=6001215 width=45) (actual time=21.10..25686.20 rows=6001215 loops=1)
                                                         ->  Hash (cost=40615.76..40615.76 rows=1261 width=31) (actual time=5612.06..5612.06 rows=0 loops=1)
                                                               ->  Hash Join  (cost=40174.00..40615.76 rows=1261 width=31) (actual time=5235.18..5528.55 rows=46548 loops=1)
                                                                     Hash
Cond: ("outer".s_suppkey = "inner".ps_suppkey)
                                                                     -> Seq Scan on supplier  (cost=0.00..351.00 rows=10000 width=8) (actual time=24.60..170.81 rows=10000 loops=1)
                                                                     -> Hash  (cost=40170.80..40170.80 rows=1280 width=23) (actual time=5210.24..5210.24 rows=0 loops=1)

->  Hash Join  (cost=7867.80..40170.80 rows=1280 width=23) (actual
time=1215.96..5108.35 rows=46548 loops=1)

Hash Cond: ("outer".ps_partkey = "inner".p_partkey)

->  Seq Scan on partsupp  (cost=0.00..26287.00 rows=800000 width=19)
(actual time=0.01..2784.12 rows=800000 loops=1)

->  Hash  (cost=7867.00..7867.00 rows=320 width=4) (actual
time=1215.38..1215.38 rows=0 loops=1)

    ->  Seq Scan on part  (cost=0.00..7867.00 rows=320 width=4) (actual time=11.64..1187.37 rows=11637 loops=1)

          Filter: (p_name ~~ '%green%'::text)
 Total runtime: 119058.34 msec
(31 rows)

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faqs/FAQ.html

Reply via email to