Allow me to demonstrate my pitiful SQL knowledge
I have tables documents and comments
If I run join and list doc_nums the query is quite fast
If I run join and use subselect the query is extremely slow
Can someone offer analysis

Thanks
Richard

dcc=# EXPLAIN ANALYZE select doc_num from documents limit 10;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..1.01 rows=10 width=13) (actual time=0.013..0.061 rows=10 loops=1) -> Seq Scan on documents (cost=0.00..909333.85 rows=9014885 width=13) (actual time=0.008..0.027 rows=10 loops=1)
 Total runtime: 0.125 ms
(3 rows)

dcc=#



dcc=# EXPLAIN ANALYZE select * from documents left outer join comments on (documents.doc_num = comments.doc_num) where documents.doc_num in ('105364107','105513059','105513095','105513112','105585627','102933195','014650340','014650361','014650362','105419865');

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=21.23..61.54 rows=10 width=444) (actual time=0.507..0.574 rows=10 loops=1)
   Hash Cond: ("outer".doc_num = "inner".doc_num)
-> Bitmap Heap Scan on documents (cost=20.03..60.28 rows=10 width=361) (actual time=0.397..0.432 rows=10 loops=1) Recheck Cond: ((doc_num = '105364107'::bpchar) OR (doc_num = '105513059'::bpchar) OR (doc_num = '105513095'::bpchar) OR (doc_num = '105513112'::bpchar) OR (doc_num = '105585627'::bpchar) OR (doc_num = '102933195'::bpchar) OR (doc_num = '014650340'::bpchar) OR (doc_num = '014650361'::bpchar) OR (doc_num = '014650362'::bpchar) OR (doc_num = '105419865'::bpchar)) -> BitmapOr (cost=20.03..20.03 rows=10 width=0) (actual time=0.383..0.383 rows=0 loops=1) -> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.059..0.059 rows=1 loops=1)
                     Index Cond: (doc_num = '105364107'::bpchar)
-> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.039..0.039 rows=1 loops=1)
                     Index Cond: (doc_num = '105513059'::bpchar)
-> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.034..0.034 rows=1 loops=1)
                     Index Cond: (doc_num = '105513095'::bpchar)
-> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.031..0.031 rows=1 loops=1)
                     Index Cond: (doc_num = '105513112'::bpchar)
-> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.034..0.034 rows=1 loops=1)
                     Index Cond: (doc_num = '105585627'::bpchar)
-> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.036..0.036 rows=1 loops=1)
                     Index Cond: (doc_num = '102933195'::bpchar)
-> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.036..0.036 rows=1 loops=1)
                     Index Cond: (doc_num = '014650340'::bpchar)
-> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.031..0.031 rows=1 loops=1)
                     Index Cond: (doc_num = '014650361'::bpchar)
-> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.031..0.031 rows=1 loops=1)
                     Index Cond: (doc_num = '014650362'::bpchar)
-> Bitmap Index Scan on documents_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.035..0.035 rows=1 loops=1)
                     Index Cond: (doc_num = '105419865'::bpchar)
-> Hash (cost=1.16..1.16 rows=16 width=83) (actual time=0.080..0.080 rows=16 loops=1) -> Seq Scan on comments (cost=0.00..1.16 rows=16 width=83) (actual time=0.005..0.037 rows=16 loops=1)
 Total runtime: 0.775 ms
(28 rows)

dcc=#



dcc=# EXPLAIN ANALYZE select * from documents left outer join comments on (documents.doc_num = comments.doc_num) where documents.doc_num in (select doc_num from documents limit 10);

QUERY PLAN

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--
Merge IN Join (cost=100000002.19..136154797.93 rows=10 width=654) (actual time
=23.534..2216180.550 rows=10 loops=1)
   Merge Cond: ("outer".doc_num = "inner".doc_num)
-> Merge Left Join (cost=0.00..36129585.92 rows=10083868 width=654) (actual
 time=23.239..2188733.430 rows=6696218 loops=1)
         Merge Cond: ("outer".doc_num = "inner".doc_num)
-> Index Scan using documents_pkey on documents (cost=0.00..35723277. 60 rows=10083868 width=569) (actual time=6.845..2107300.767 rows=6695853 loops=1
)
-> Index Scan using doc_num_idx on comments (cost=0.00..377203.50 row
s=311612 width=85) (actual time=16.368..6984.365 rows=243797 loops=1)
-> Sort (cost=100000002.19..100000002.22 rows=10 width=13) (actual time=0.2
52..0.293 rows=10 loops=1)
         Sort Key: "IN_subquery".doc_num
-> Limit (cost=100000000.00..100000001.92 rows=10 width=13) (actual t
ime=0.019..0.128 rows=10 loops=1)
-> Seq Scan on documents (cost=100000000.00..101940460.68 rows=
10083868 width=13) (actual time=0.010..0.055 rows=10 loops=1)
 Total runtime: 2216180.973 ms
(11 rows)

dcc=#

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
      choose an index scan if your joining column's datatypes do not
      match

Reply via email to