Re: [HACKERS] Query not using index, please explain.

Matthew Hagerty Thu, 08 Mar 2001 11:27:20 -0800
Richard,

Thanks for the response, I guess I should have included a little more 
information.  The table contains 3.5 million rows.  The indexes were 
created after the data was imported into the table and I had just run 
vacuum and vacuum analyze on the database before trying the queries and 
sending this question to hackers.

When I turned the seqscan variable off and ran the query with the 
'04-01-2000' date the results were literally instantaneous.  Turn the 
seqscan back on and it takes right around 3 minutes.  Also, the query for 
any date older than the '04-01-2000' returns zero rows.  The actual number 
of rows for the '04-01-2000' select is right around 8300.

Here is the table for more information:

pglog=# \d history_entries
        Table "history_entries"
  Attribute  |    Type     | Modifier
------------+-------------+----------
  domain     | varchar(80) |
  time_stamp | timestamp   |
  response   | integer     |
  transfered | integer     |
  reqtime    | integer     |
  entry      | text        |
Indices: hist_entries_domain,
          hist_entries_timestamp

I'm also having problems with this query:

select domain from history_entries group by domain;

To me, since there is an index on domain, it seems like this should be a 
rather fast thing to do?  It takes a *very* long time, no matter if I turn 
seqscan on or off.

pglog=# select version();
                                  version
-------------------------------------------------------------------------
  PostgreSQL 7.0.3 on i386-unknown-freebsdelf3.4, compiled by gcc 2.7.2.3
(1 row)

Thanks,
Matthew


At 07:18 PM 3/8/2001 +0000, you wrote:
>On Thu, Mar 08, 2001 at 01:49:42PM -0500, Matthew Hagerty wrote:
> > Greetings,
> >
> > I have a real simple table with a timestamp field.  The timestamp field 
> has
> > an index on it.  But, the index does not seem to be taken into account for
> > selects that return rows:
> >
> > pglog=# explain select time_stamp from history_entries where time_stamp <
> > '03-01-2000';
> > NOTICE:  QUERY PLAN:
> >
> > Index Scan using hist_entries_timestamp on
> > history_entries  (cost=0.00..12810.36 rows=3246 width=8)
> >
> > EXPLAIN
> > pglog=# explain select time_stamp from history_entries where time_stamp <
> > '04-01-2000';
> > NOTICE:  QUERY PLAN:
> >
> > Seq Scan on history_entries  (cost=0.00..160289.71 rows=138215 width=8)
> >
> > EXPLAIN
> > pglog=# set enable_seqscan to off;
> > SET VARIABLE
> > pglog=# explain select time_stamp from history_entries where time_stamp <
> > '04-01-2000';
> > NOTICE:  QUERY PLAN:
> >
> > Index Scan using hist_entries_timestamp on
> > history_entries  (cost=0.00..368241.51 rows=138215 width=8)
> >
> > EXPLAIN
> > pglog=# set enable_seqscan to on;
> > SET VARIABLE
> > pglog=#
> >
> > The query where the time_stamp < '03-01-2000' does not return any rows, 
> the
> > 04-01-2000 date does return rows.  When I disable seqscan the query is
> > almost instant, but with it on, it takes about 3 or 4 minutes.  Why can't
> > the query planner use the index in the later case?
>
>Well, it can, it just chooses not to.  Your second EXPLAIN shows that
>it thinks it's going to get 138215 rows from that select; it then
>calculates that it would be more expensive to use the index than simply
>to scan the table. Presumably it actually returns many fewer rows than
>that. Have you done a VACUUM ANALYZE recently? If you get plans this
>badly wrong immediately after a VACUUM ANALYZE, *then*'s the time to
>ask -hackers about it (FAQ item 4.9).
>
>Richard


---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Query not using index, please explain.

Reply via email to