I've also observed the same behaviour on a very large table (200GB data, 170GB for 2 indexes) ....
I have a table which has 6 small columns, let's call them (a, b, c, d, e, f), and about 1 billion rows. There is an index on (a, b, c, d) - not my idea; Hibernate requires primary keys for every table. If I do the following query:

    select max(c) from tbl where a=[constant literal] and b=[other constant literal];

.... then even with maxed-out analysis histograms, and no changes to any of the page_cost type stuff, it still desperately wants to do a full table scan, which is ... kinda slow.

Of course, a billion-row table is also rather suboptimal (our app collects a lot more data than it used to), so I'm bypassing Hibernate and sharding it all by time, so that the tables and indexes will be a manageable size, and will also be vacuum-free, as my aging-out process is now DROP TABLE :-)

Cheers
Dave

On Wed, Mar 17, 2010 at 8:01 PM, Eger, Patrick <pe...@automotive.com> wrote:
> I'm running 8.4.2 and have noticed a similar heavy preference for
> sequential scans and hash joins over index scans and nested loops. Our
> database can basically fit in cache 100%, so this may not be
> applicable to your situation, but the following params seemed to help
> us:
>
> seq_page_cost = 1.0
> random_page_cost = 1.01
> cpu_tuple_cost = 0.0001
> cpu_index_tuple_cost = 0.00005
> cpu_operator_cost = 0.000025
> effective_cache_size = 1000MB
> shared_buffers = 1000MB
>
> Might I suggest the Postgres developers reconsider these defaults for
> the 9.0 release, or perhaps provide a few sets of tuning params for
> different workloads in the default install/docs? The cpu_*_cost in
> particular seem to be way off, afaict.
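[A side note on Dave's max(c) query: older planners often cannot turn max() into an index probe on their own, but the classic manual rewrite as ORDER BY ... LIMIT 1 usually can walk the (a, b, c, d) index backwards. A sketch, assuming Dave's table and index names, with $1/$2 standing in for his constant literals:]

```sql
-- Hypothetical rewrite of: select max(c) from tbl where a=$1 and b=$2;
-- Spelling out the full index order lets even old planners use a
-- backward scan of the (a, b, c, d) index instead of a seq scan.
-- max() ignores NULLs, so exclude them explicitly: with DESC ordering,
-- PostgreSQL sorts NULLs first by default.
SELECT c
  FROM tbl
 WHERE a = $1
   AND b = $2
   AND c IS NOT NULL
 ORDER BY a DESC, b DESC, c DESC
 LIMIT 1;
```

[Since a and b are pinned by equality, the ORDER BY is semantically just "c DESC"; repeating the leading columns is only there to match the index ordering for planners that don't deduce this themselves.]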
> I may be dead wrong though, fwiw =)
>
> -----Original Message-----
> From: pgsql-performance-ow...@postgresql.org
> [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Christian Brink
> Sent: Wednesday, March 17, 2010 2:26 PM
> To: pgsql-performance@postgresql.org
> Subject: [PERFORM] Forcing index scan on query produces 16x faster
>
> I am running into a problem with a particular query. The execution plan
> cost shows that the Seq Scan is a better bet (cost=54020.49..54020.55)
> over the forced index 'enable_seqscan = false'
> (cost=1589703.87..1589703.93). But when I run the query both ways I get
> a vastly different result (below). It appears not to want to bracket the
> salesitems off of the 'id' foreign key unless I force it.
>
> Is there a way to rewrite or hint the planner to get me the better plan
> without resorting to 'enable_seqscan' manipulation (or am I missing
> something)?
>
> postream=> select version();
>                                   version
> ---------------------------------------------------------------------------
>  PostgreSQL 8.0.3 on i386-redhat-linux-gnu, compiled by GCC
>  i386-redhat-linux-gcc (GCC) 4.0.0 20050505 (Red Hat 4.0.0-4)
>
> postream=> SET enable_seqscan = false;
> SET
> postream=> EXPLAIN ANALYZE
> postream-> SELECT si.group1_id as name, sum(si.qty) as count, sum(si.amt) as amt
> postream-> FROM salesitems si, sales s, sysstrings
> postream-> WHERE si.id = s.id
> postream->   AND si.group1_id != ''
> postream->   AND si.group1_id IS NOT NULL
> postream->   AND NOT si.void
> postream->   AND NOT s.void
> postream->   AND NOT s.suspended
> postream->   AND s.tranzdate >= (cast('2010-02-15' as date) + cast(sysstrings.data as time))
> postream->   AND s.tranzdate < ((cast('2010-02-15' as date) + 1) + cast(sysstrings.data as time))
> postream->   AND sysstrings.id='net/Console/Employee/Day End Time'
> postream-> GROUP BY name;
>
>                                   QUERY PLAN
> ---------------------------------------------------------------------------
>  HashAggregate  (cost=1589703.87..1589703.93 rows=13 width=35) (actual time=33.414..33.442 rows=12 loops=1)
>    ->  Nested Loop  (cost=0.01..1588978.22 rows=96753 width=35) (actual time=0.284..22.115 rows=894 loops=1)
>          ->  Nested Loop  (cost=0.01..2394.31 rows=22530 width=4) (actual time=0.207..4.671 rows=225 loops=1)
>                ->  Index Scan using sysstrings_pkey on sysstrings  (cost=0.00..5.78 rows=1 width=175) (actual time=0.073..0.078 rows=1 loops=1)
>                      Index Cond: (id = 'net/Console/Employee/Day End Time'::text)
>                ->  Index Scan using sales_tranzdate_index on sales s  (cost=0.01..1825.27 rows=22530 width=12) (actual time=0.072..3.464 rows=225 loops=1)
>                      Index Cond: ((s.tranzdate >= ('2010-02-15'::date + ("outer".data)::time without time zone)) AND (s.tranzdate < ('2010-02-16'::date + ("outer".data)::time without time zone)))
>                      Filter: ((NOT void) AND (NOT suspended))
>          ->  Index Scan using salesitems_pkey on salesitems si  (cost=0.00..70.05 rows=30 width=39) (actual time=0.026..0.052 rows=4 loops=225)
>                Index Cond: (si.id = "outer".id)
>                Filter: ((group1_id <> ''::text) AND (group1_id IS NOT NULL) AND (NOT void))
>  Total runtime: 33.734 ms
> (12 rows)
>
> postream=> SET enable_seqscan = true;
> SET
> postream=> EXPLAIN ANALYZE
> postream-> SELECT si.group1_id as name, sum(si.qty) as count, sum(si.amt) as amt
> postream-> FROM salesitems si, sales s, sysstrings
> postream-> WHERE si.id = s.id
> postream->   AND si.group1_id != ''
> postream->   AND si.group1_id IS NOT NULL
> postream->   AND NOT si.void
> postream->   AND NOT s.void
> postream->   AND NOT s.suspended
> postream->   AND s.tranzdate >= (cast('2010-02-15' as date) + cast(sysstrings.data as time))
> postream->   AND s.tranzdate < ((cast('2010-02-15' as date) + 1) + cast(sysstrings.data as time))
> postream->   AND sysstrings.id='net/Console/Employee/Day End Time'
> postream-> GROUP BY name;
>
>                                   QUERY PLAN
> ---------------------------------------------------------------------------
>  HashAggregate  (cost=54020.49..54020.55 rows=13 width=35) (actual time=5564.929..5564.957 rows=12 loops=1)
>    ->  Hash Join  (cost=2539.63..53294.84 rows=96753 width=35) (actual time=5502.324..5556.262 rows=894 loops=1)
>          Hash Cond: ("outer".id = "inner".id)
>          ->  Seq Scan on salesitems si  (cost=0.00..30576.60 rows=885215 width=39) (actual time=0.089..3099.453 rows=901249 loops=1)
>                Filter: ((group1_id <> ''::text) AND (group1_id IS NOT NULL) AND (NOT void))
>          ->  Hash  (cost=2394.31..2394.31 rows=22530 width=4) (actual time=3.329..3.329 rows=0 loops=1)
>                ->  Nested Loop  (cost=0.01..2394.31 rows=22530 width=4) (actual time=0.217..2.749 rows=225 loops=1)
>                      ->  Index Scan using sysstrings_pkey on sysstrings  (cost=0.00..5.78 rows=1 width=175) (actual time=0.077..0.085 rows=1 loops=1)
>                            Index Cond: (id = 'net/Console/Employee/Day End Time'::text)
>                      ->  Index Scan using sales_tranzdate_index on sales s  (cost=0.01..1825.27 rows=22530 width=12) (actual time=0.074..1.945 rows=225 loops=1)
>                            Index Cond: ((s.tranzdate >= ('2010-02-15'::date + ("outer".data)::time without time zone)) AND (s.tranzdate < ('2010-02-16'::date + ("outer".data)::time without time zone)))
>                            Filter: ((NOT void) AND (NOT suspended))
>  Total runtime: 5565.262 ms
> (13 rows)
>
> --
> Christian Brink
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
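[On Christian's question about avoiding global 'enable_seqscan' manipulation: planner settings changed with SET LOCAL apply only to the current transaction and revert automatically at COMMIT or ROLLBACK, so the override never leaks to other sessions or queries. A sketch of that approach, with the report query itself elided:]

```sql
BEGIN;
-- Either discourage sequential scans outright for this one query ...
SET LOCAL enable_seqscan = off;
-- ... or, less bluntly, make random index I/O look cheaper, in the
-- spirit of Patrick's settings for a mostly-cached database:
-- SET LOCAL random_page_cost = 1.5;

SELECT ...;   -- the group1_id report query shown above

COMMIT;       -- SET LOCAL overrides revert here automatically
```

[This keeps the default planner behaviour for everything else the application runs, at the cost of wrapping the one problem query in an explicit transaction.]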