Silke Trissl wrote:
Sorry,
is there a way to tell Postgres which index to use when a query is
issued in 7.4.2?
PostgreSQL adjusts usage through global parameters, statistics, and
periodic ANALYZE. Please post an EXPLAIN ANALYZE (not just EXPLAIN)
for your query and people on this list can help you with your
specific problem.
here are the plans, but still I would like to tell Postgres to use an
index or the join method (like HINT in ORACLE).
First the vacuum
db=# vacuum full analyze;
VACUUM
Then the query for the first time with analyze
db=# EXPLAIN ANALYZE
db-# SELECT chain.pdb_id, chain.id FROM PDB_ENTRY, CHAIN
WHERE PDB_ENTRY.resolution > 0.0 and PDB_ENTRY.resolution < 1.7
AND PDB_ENTRY.id = CHAIN.pdb_id;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=1426.75..5210.52 rows=7533 width=8) (actual
time=77.712..399.108 rows=5798 loops=1)
Hash Cond: ("outer".pdb_id = "inner".id)
-> Seq Scan on "chain" (cost=0.00..3202.11 rows=67511 width=8)
(actual time=0.048..151.885 rows=67511 loops=1)
-> Hash (cost=1418.68..1418.68 rows=3226 width=4) (actual
time=77.062..77.062 rows=0 loops=1)
This seems to be at least one of the problems. The planner thinks there
are going to be 3000+ rows, but in reality there are 0.
-> Seq Scan on pdb_entry (cost=0.00..1418.68 rows=3226
width=4) (actual time=0.118..71.956 rows=3329 loops=1)
Filter: ((resolution > 0::double precision) AND
(resolution < 1.7::double precision))
Total runtime: 404.434 ms
(7 rows)
And then try to avoid the Hash Join
db=# SET ENABLE_hashjoin = OFF;
SET
db=# EXPLAIN ANALYZE
db-# SELECT chain.pdb_id, chain.id FROM PDB_ENTRY, CHAIN
WHERE PDB_ENTRY.resolution > 0.0 and PDB_ENTRY.resolution < 1.7
AND PDB_ENTRY.id = CHAIN.pdb_id;
QUERY
PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=9163.85..11100.74 rows=7533 width=8) (actual
time=606.505..902.740 rows=5798 loops=1)
Merge Cond: ("outer".id = "inner".pdb_id)
-> Index Scan using pdb_entry_pkey on pdb_entry
(cost=0.00..1516.03 rows=3226 width=4) (actual time=0.440..102.912
rows=3329 loops=1)
Filter: ((resolution > 0::double precision) AND (resolution <
1.7::double precision))
-> Sort (cost=9163.85..9332.63 rows=67511 width=8) (actual
time=605.838..694.190 rows=67501 loops=1)
Sort Key: "chain".pdb_id
-> Seq Scan on "chain" (cost=0.00..3202.11 rows=67511
width=8) (actual time=0.064..225.859 rows=67511 loops=1)
Total runtime: 911.024 ms
(8 rows)
And finally timewise the fastest method, but not costwise. Even for
almost full table joins, this method is the fastest.
db=# SET ENABLE_mergejoin = off;
SET
db=# EXPLAIN ANALYZE
db-# SELECT chain.pdb_id, chain.id FROM PDB_ENTRY, CHAIN
WHERE PDB_ENTRY.resolution > 0.0 and PDB_ENTRY.resolution < 1.7
AND PDB_ENTRY.id = CHAIN.pdb_id;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..23849.81 rows=7533 width=8) (actual
time=0.341..198.162 rows=5798 loops=1)
-> Seq Scan on pdb_entry (cost=0.00..1418.68 rows=3226 width=4)
(actual time=0.145..78.177 rows=3329 loops=1)
Filter: ((resolution > 0::double precision) AND (resolution <
1.7::double precision))
-> Index Scan using chain_pdb_id_ind on "chain" (cost=0.00..6.87
rows=6 width=8) (actual time=0.021..0.027 rows=2 loops=3329)
Index Cond: ("outer".id = "chain".pdb_id)
Total runtime: 204.105 ms
(6 rows)
I'm guessing the filter is more selective than postgres thinks it is (0
<> 1.7). You might try increasing the statistics of that column, you
might also try playing with your random_page_cost to make index scans
relatively cheaper (than seq scans).
It might be an issue that your effective_cache_size isn't quite right,
which makes postgres think most things are on disk, when in reality they
are in memory (which also makes index scans much cheaper).
Also, this query may sort itself out in time. As the tables grow, the
relative fraction that you desire probably decreases, which makes index
scans more attractive.
John
=:->
signature.asc
Description: OpenPGP digital signature