On Wed, Mar 23, 2011 at 4:51 PM, Vitalii Tymchyshyn <tiv...@gmail.com>wrote:

>  23.03.11 13:21, Adarsh Sharma написав(ла):
>
> Thank U all, for U'r Nice Support.
>
> Let me Conclude the results, below results are obtained after finding the
> needed queries :
>
> *First Option :
>
> *pdc_uima=# explain analyze select distinct(p.crawled_page_id)
> pdc_uima-# from page_content p left join clause2 c on (p.crawled_page_id =
> pdc_uima(# c.source_id) where (c.source_id is null);
>                                                                      QUERY
> PLAN
>
> -----------------------------------------------------------------------------------------------------------------------------------------------------
>  HashAggregate  (cost=100278.16..104104.75 rows=382659 width=8) (actual
> time=87927.000..87930.084 rows=72 loops=1)
>    ->  Nested Loop Anti Join  (cost=0.00..99320.46 rows=383079 width=8)
> (actual time=0.191..87926.546 rows=74 loops=1)
>          ->  Seq Scan on page_content p  (cost=0.00..87132.17 rows=428817
> width=8) (actual time=0.027..528.978 rows=428467 loops=1)
>          ->  Index Scan using idx_clause2_source_id on clause2 c
> (cost=0.00..18.18 rows=781 width=4) (actual time=0.202..0.202 rows=1
> loops=428467)
>                Index Cond: (p.crawled_page_id = c.source_id)
>  Total runtime: 87933.882 ms :-(
> (6 rows)
>
> *Second Option :
>
> *pdc_uima=# explain analyze select distinct(p.crawled_page_id) from
> page_content p
> pdc_uima-#  where NOT EXISTS (select 1 from  clause2 c where c.source_id =
> p.crawled_page_id);
>                                                                      QUERY
> PLAN
>
> -----------------------------------------------------------------------------------------------------------------------------------------------------
>  HashAggregate  (cost=100278.16..104104.75 rows=382659 width=8) (actual
> time=7047.259..7050.261 rows=72 loops=1)
>    ->  Nested Loop Anti Join  (cost=0.00..99320.46 rows=383079 width=8)
> (actual time=0.039..7046.826 rows=74 loops=1)
>          ->  Seq Scan on page_content p  (cost=0.00..87132.17 rows=428817
> width=8) (actual time=0.008..388.976 rows=428467 loops=1)
>          ->  Index Scan using idx_clause2_source_id on clause2 c
> (cost=0.00..18.18 rows=781 width=4) (actual time=0.013..0.013 rows=1
> loops=428467)
>                Index Cond: (c.source_id = p.crawled_page_id)
>  Total runtime: 7054.074 ms :-)
> (6 rows)
>
>
> Actually the plans are equal, so I suppose it depends on what were run
> first :). Slow query operates with data mostly on disk, while fast one with
> data in memory.
>
> yeah. maybe the easiest way, is to start a fresh session and fire the
queries.


>  Best regards, Vitalii Tymchyshyn
>



-- 
Regards,
Chetan Suttraway
EnterpriseDB <http://www.enterprisedb.com/>, The Enterprise
PostgreSQL<http://www.enterprisedb.com/>
 company.

Reply via email to