What is random_page_cost and seq_page_cost in your server? And how many rows does the table have?
On Tue, May 27, 2014 at 2:09 PM, Grzegorz Olszewski < grzegorz.olszew...@outlook.com> wrote: > Hi, > > I wonder why planner uses Seq Scan instead of Index Scan. > > Here is my table (partial): > content.contents > > -------------------------+-----------------------------+----------------------------------------------------------------- > id | bigint | niepusty > domyślnie nextval('content.contents_id_seq'::regclass) > version | integer | niepusty > date_published | timestamp without time zone | > moderation_status | character varying(50) | > publication_status | character varying(30) | > > And indexes (there are some other indexes too): > "contents_id_pkey" PRIMARY KEY, btree (id) > "contents_date_published_idx" btree (date_published) > "contents_moderation_status_idx" btree (moderation_status) > "contents_publication_status_idx" btree (publication_status) > > I tried also creating following indexes: > "contents_date_published_publication_status_moderation_statu_idx" > btree (date_published, publication_status, moderation_status) > "contents_publication_status_idx1" btree ((publication_status::text)) > "contents_moderation_status_idx1" btree ((moderation_status::text)) > > Then for this query (genrated by Hibernate): > explain (analyze, buffers) select count(*) as y0_ from content.contents > this_ inner join content.content_categories cat1_ on > this_.CONTENT_CATEGORY_ID=cat1_.ID where cat1_.name in ([...]) > and this_.date_published<='2014-05-26 12:23:31.557000 +02:00:00' > and (this_.PUBLICATION_STATUS is null or this_.PUBLICATION_STATUS<>'DRAFT') > and (this_.moderation_status is null or > this_.moderation_status<>'DANGEROUS') > and exists(select * from content.content_visibilities cv where > cv.content_id = this_.ID and cv.user_group_id in (1,2)); > > Planner creates such plan: > > QUERY > PLAN > > > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ > Hash Semi Join (cost=31706.84..106020.81 rows=21871 width=2076) (actual > time=1197.658..6012.406 rows=430218 loops=1) > Hash Cond: (this_.id = cv.content_id) > Buffers: shared hit=5 read=59031 written=3, temp read=47611 > written=47549 > -> Hash Join (cost=2.22..56618.11 rows=22881 width=2076) (actual time= > 0.163..1977.304 rows=430221 loops=1) > Hash Cond: (this_.content_category_id = cat1_.id) > Buffers: shared hit=1 read=46829 written=1 > -> Seq Scan on contents this_ (cost=0.00..54713.92 rows=446176 > width=2030) (actual time=0.048..915.724 rows=450517 loops=1) > Filter: ((date_published <= '2014-05-26 > 12:23:31.557'::timestamp without time zone) AND ((publication_status IS > NULL) OR ((publication_status)::text <> 'DRAFT'::text)) AND > ((moderation_status IS NULL) OR ((moderation_status)::text <> > 'DANGEROUS'::text))) > Rows Removed by Filter: 50 > Buffers: shared read=46829 written=1 > -> Hash (cost=2.17..2.17 rows=4 width=46) (actual > time=0.089..0.089 rows=4 loops=1) > Buckets: 1024 Batches: 1 Memory Usage: 1kB > Buffers: shared hit=1 > -> Seq Scan on content_categories cat1_ (cost=0.00..2.17 > rows=4 width=46) (actual time=0.053..0.076 rows=4 loops=1) > Filter: ((name)::text = ANY > ('{przeglad-prasy/rp,przeglad-prasy/parkiet,komunikat-z-rynku-pap-emitent,komunikat-z-rynku-pap-depesze}'::text[])) > Rows Removed by Filter: 74 > Buffers: shared hit=1 > -> Hash (cost=24435.09..24435.09 rows=443083 width=8) (actual > time=1197.146..1197.146 rows=447624 loops=1) > Buckets: 4096 Batches: 32 Memory Usage: 560kB > Buffers: shared hit=4 read=12202 written=2, temp written=1467 > -> Bitmap Heap Scan on content_visibilities cv > (cost=7614.55..24435.09 rows=443083 width=8) (actual time=61.034..647.729 > rows=447624 loops=1) > Recheck Cond: (user_group_id = ANY ('{1,2}'::bigint[])) > Buffers: shared hit=4 read=12202 written=2 > -> Bitmap Index Scan on > content_visibilities_user_group_id_idx (cost=0.00..7503.78 rows=443083 > width=0) (actual time=58.680..58.680 rows=447626 loops=1) > Index Cond: (user_group_id = ANY ('{1,2}'::bigint[])) > Buffers: shared hit=3 read=1226 > Total runtime: 6364.689 ms > (27 wierszy) > > The suspicious part is: > -> Seq Scan on contents this_ (cost=0.00..54713.92 rows=446176 > width=2030) (actual time=0.048..915.724 rows=450517 loops=1) > Filter: ((date_published <= '2014-05-26 > 12:23:31.557'::timestamp without time zone) AND ((publication_status IS > NULL) OR ((publication_status)::text <> 'DRAFT'::text)) AND > ((moderation_status IS NULL) OR ((moderation_status)::text <> > 'DANGEROUS'::text))) > > I don't understand why planner doesn't use indexes. The problem is there > are about 0.5M rows satisfying condition (almost every row in the table). > Could you please explain this behavior? > > I'm using PostgreSQL 9.2.8 on Ubuntu 12.04 LTS x86_64 > > Best regards, > Grzegorz Olszewski >