What is random_page_cost and seq_page_cost in your server?
And how many rows does the table have?


On Tue, May 27, 2014 at 2:09 PM, Grzegorz Olszewski <
grzegorz.olszew...@outlook.com> wrote:

> Hi,
>
> I wonder why planner uses Seq Scan instead of Index Scan.
>
> Here is my table (partial):
> content.contents
>
> -------------------------+-----------------------------+-----------------------------------------------------------------
>  id                      | bigint                      | niepusty
> domyślnie nextval('content.contents_id_seq'::regclass)
>  version                 | integer                     | niepusty
>  date_published          | timestamp without time zone |
>  moderation_status       | character varying(50)       |
>  publication_status      | character varying(30)       |
>
> And indexes (there are some other indexes too):
>     "contents_id_pkey" PRIMARY KEY, btree (id)
>     "contents_date_published_idx" btree (date_published)
>     "contents_moderation_status_idx" btree (moderation_status)
>     "contents_publication_status_idx" btree (publication_status)
>
> I tried also creating following indexes:
>     "contents_date_published_publication_status_moderation_statu_idx"
> btree (date_published, publication_status, moderation_status)
>     "contents_publication_status_idx1" btree ((publication_status::text))
>     "contents_moderation_status_idx1" btree ((moderation_status::text))
>
> Then for this query (genrated by Hibernate):
> explain (analyze, buffers) select count(*) as y0_ from content.contents
> this_ inner join content.content_categories cat1_ on
> this_.CONTENT_CATEGORY_ID=cat1_.ID where cat1_.name in ([...])
> and this_.date_published<='2014-05-26 12:23:31.557000 +02:00:00'
> and (this_.PUBLICATION_STATUS is null or this_.PUBLICATION_STATUS<>'DRAFT')
> and (this_.moderation_status is null or
> this_.moderation_status<>'DANGEROUS')
> and exists(select * from content.content_visibilities cv where
> cv.content_id = this_.ID and cv.user_group_id in (1,2));
>
> Planner creates such plan:
>
> QUERY
> PLAN
>
>
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Hash Semi Join  (cost=31706.84..106020.81 rows=21871 width=2076) (actual
> time=1197.658..6012.406 rows=430218 loops=1)
>    Hash Cond: (this_.id = cv.content_id)
>    Buffers: shared hit=5 read=59031 written=3, temp read=47611
> written=47549
>    ->  Hash Join  (cost=2.22..56618.11 rows=22881 width=2076) (actual time=
> 0.163..1977.304 rows=430221 loops=1)
>          Hash Cond: (this_.content_category_id = cat1_.id)
>          Buffers: shared hit=1 read=46829 written=1
>          ->  Seq Scan on contents this_  (cost=0.00..54713.92 rows=446176
> width=2030) (actual time=0.048..915.724 rows=450517 loops=1)
>                Filter: ((date_published <= '2014-05-26
> 12:23:31.557'::timestamp without time zone) AND ((publication_status IS
> NULL) OR ((publication_status)::text <> 'DRAFT'::text)) AND
> ((moderation_status IS NULL) OR ((moderation_status)::text <>
> 'DANGEROUS'::text)))
>                Rows Removed by Filter: 50
>                Buffers: shared read=46829 written=1
>          ->  Hash  (cost=2.17..2.17 rows=4 width=46) (actual
> time=0.089..0.089 rows=4 loops=1)
>                Buckets: 1024  Batches: 1  Memory Usage: 1kB
>                Buffers: shared hit=1
>                ->  Seq Scan on content_categories cat1_  (cost=0.00..2.17
> rows=4 width=46) (actual time=0.053..0.076 rows=4 loops=1)
>                      Filter: ((name)::text = ANY
> ('{przeglad-prasy/rp,przeglad-prasy/parkiet,komunikat-z-rynku-pap-emitent,komunikat-z-rynku-pap-depesze}'::text[]))
>                      Rows Removed by Filter: 74
>                      Buffers: shared hit=1
>    ->  Hash  (cost=24435.09..24435.09 rows=443083 width=8) (actual
> time=1197.146..1197.146 rows=447624 loops=1)
>          Buckets: 4096  Batches: 32  Memory Usage: 560kB
>          Buffers: shared hit=4 read=12202 written=2, temp written=1467
>          ->  Bitmap Heap Scan on content_visibilities cv
> (cost=7614.55..24435.09 rows=443083 width=8) (actual time=61.034..647.729
> rows=447624 loops=1)
>                Recheck Cond: (user_group_id = ANY ('{1,2}'::bigint[]))
>                Buffers: shared hit=4 read=12202 written=2
>                ->  Bitmap Index Scan on
> content_visibilities_user_group_id_idx  (cost=0.00..7503.78 rows=443083
> width=0) (actual time=58.680..58.680 rows=447626 loops=1)
>                      Index Cond: (user_group_id = ANY ('{1,2}'::bigint[]))
>                      Buffers: shared hit=3 read=1226
>  Total runtime: 6364.689 ms
> (27 wierszy)
>
> The suspicious part is:
>          ->  Seq Scan on contents this_  (cost=0.00..54713.92 rows=446176
> width=2030) (actual time=0.048..915.724 rows=450517 loops=1)
>                Filter: ((date_published <= '2014-05-26
> 12:23:31.557'::timestamp without time zone) AND ((publication_status IS
> NULL) OR ((publication_status)::text <> 'DRAFT'::text)) AND
> ((moderation_status IS NULL) OR ((moderation_status)::text <>
> 'DANGEROUS'::text)))
>
> I don't understand why planner doesn't use indexes. The problem is there
> are about 0.5M rows satisfying condition (almost every row in the table).
> Could you please explain this behavior?
>
> I'm using PostgreSQL 9.2.8 on Ubuntu 12.04 LTS x86_64
>
> Best regards,
> Grzegorz Olszewski
>

Reply via email to