I see what you mean about the scan on feature_pkey taking a long
time.  I tried several things to remedy that.  I created an index on
feature (feature_id, type_id), which I don't think makes sense since
feature_id is the primary key, so adding another column really doesn't
help.  I also created an index on feature (type_id, feature_id), but the
planner doesn't use it.  There was also a pre-existing index on
feature (type_id) that the planner never used.
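For the record, the definitions I tried looked roughly like this (the
index names here are just illustrative, not necessarily what I used):

```sql
-- composite starting with the primary-key column; redundant with
-- feature_pkey, so it doesn't buy anything
CREATE INDEX feature_fid_tid_idx ON feature (feature_id, type_id);

-- reversed column order; the planner did not pick this one either
CREATE INDEX feature_tid_fid_idx ON feature (type_id, feature_id);
```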

One thing I tried that changed the query plan and improved performance
slightly (but still nowhere near what I need) was to add a partial index
on featureloc on (fmin,fmax) where srcfeature_id=6.  This is something I
could realistically do, since there are relatively few (fewer than 30)
srcfeature_ids that I am interested in, so putting a partial index in
place for each of them would not be a big deal.  Nevertheless, the
performance is still not there.  Here is the EXPLAIN ANALYZE for this:

                                                                           QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=156172.23..156200.11 rows=159 width=54) (actual time=63631.93..63631.93 
rows=1 loops=1)
   ->  Sort  (cost=156172.23..156176.21 rows=1594 width=54) (actual 
time=63631.93..63631.93 rows=1 loops=1)
         Sort Key: fl.fmin, fl.fmax, fl.strand, f.type_id, f.feature_id
         ->  Hash Join  (cost=135100.30..156087.46 rows=1594 width=54) (actual 
time=63631.29..63631.79 rows=1 loops=1)
               Hash Cond: ("outer".feature_id = "inner".feature_id)
               ->  Index Scan using featureloc_src_6 on featureloc fl  
(cost=0.00..18064.99 rows=101883 width=14) (actual time=26.11..430.00 rows=570 loops=1)
                     Index Cond: ((fmin <= 2585581) AND (fmax >= 2565581))
                     Filter: (srcfeature_id = 6)
               ->  Hash  (cost=134601.43..134601.43 rows=48347 width=40) (actual 
time=63182.86..63182.86 rows=0 loops=1)
                     ->  Index Scan using feature_pkey on feature f  
(cost=0.00..134601.43 rows=48347 width=40) (actual time=69.98..62978.27 rows=13825 loops=1)
                           Filter: (type_id = 219)
 Total runtime: 63632.28 msec
(12 rows)
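For completeness, the partial index used in the plan above
(featureloc_src_6) was created along these lines, one per srcfeature_id
of interest:

```sql
-- partial index covering only the rows for srcfeature_id = 6;
-- the planner can use it for the fmin/fmax range condition when the
-- query also constrains srcfeature_id = 6
CREATE INDEX featureloc_src_6 ON featureloc (fmin, fmax)
    WHERE srcfeature_id = 6;
```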

Any other ideas?


On Fri, 2003-07-11 at 09:38, Rod Taylor wrote:
> On Fri, 2003-07-11 at 13:17, Scott Cain wrote:
> > The problem (at least as it appears to me) is not that it is performing
> > a table scan instead of an index scan, it is that it is using the wrong
> > index.  Here is the output from EXPLAIN ANALYZE:
> > 
> >                                                                          QUERY PLAN
> > ------------------------------------------------------------------------------------------------------------------------------------------------------------
> >  Unique  (cost=494008.47..494037.59 rows=166 width=54) (actual 
> > time=114660.37..114660.38 rows=1 loops=1)
> >    ->  Sort  (cost=494008.47..494012.63 rows=1664 width=54) (actual 
> > time=114660.37..114660.37 rows=1 loops=1)
> >          Sort Key: fl.fmin, fl.fmax, fl.strand, f.type_id, f.feature_id
> >          ->  Nested Loop  (cost=0.00..493919.44 rows=1664 width=54) (actual 
> > time=2596.13..114632.90 rows=1 loops=1)
> >                ->  Index Scan using feature_pkey on feature f  
> > (cost=0.00..134601.43 rows=52231 width=40) (actual time=105.74..56048.87 
> > rows=13825 loops=1)
> >                      Filter: (type_id = 219)
> >                ->  Index Scan using featureloc_idx1 on featureloc fl  
> > (cost=0.00..6.87 rows=1 width=14) (actual time=4.23..4.23 rows=0 loops=13825)
> >                      Index Cond: ("outer".feature_id = fl.feature_id)
> >                      Filter: ((srcfeature_id = 6) AND (fmin <= 2585581) AND (fmax 
> > >= 2565581))
> >  Total runtime: 114660.91 msec
> >
> > The index it is using on featureloc (featureloc_idx1) is on the
> > foreign key feature_id.  It should instead be using another index,
> > featureloc_idx3, which is built on (srcfeature_id, fmin, fmax).
> Nope.  The optimizer is right in the decision to use featureloc_idx1.
> You will notice it is expecting to retrieve a single row from this
> index, but the featureloc_idx3 is bound to be larger (due to indexing
> more data), thus take more disk reads for the exact same information (or
> in this case, lack thereof).
> What is taking a long time is the scan on feature_pkey. It looks like it
> is throwing away a ton of rows that are not type_id = 219.  Either that,
> or you do a pile of deletes and haven't run REINDEX recently.
> Create an index consisting of (feature_id, type_id).  This will probably
> make a significant difference in execution time.
Scott Cain, Ph. D.                                         [EMAIL PROTECTED]
GMOD Coordinator                                           216-392-3087
Cold Spring Harbor Laboratory

