Richard,
Thanks for the response, I guess I should have included a little more
information. The table contains 3.5 million rows. The indexes were
created after the data was imported into the table and I had just run
vacuum and vacuum analyze on the database before trying the queries and
sending this question to hackers.
When I turned the seqscan variable off and ran the query with the
'04-01-2000' date the results were literally instantaneous. Turn the
seqscan back on and it takes right around 3 minutes. Also, the query for
any date older than the '04-01-2000' returns zero rows. The actual number
of rows for the '04-01-2000' select is right around 8300.
Here is the table for more information:
pglog=# \d history_entries
Table "history_entries"
Attribute | Type | Modifier
------------+-------------+----------
domain | varchar(80) |
time_stamp | timestamp |
response | integer |
transfered | integer |
reqtime | integer |
entry | text |
Indices: hist_entries_domain,
hist_entries_timestamp
I'm also having problems with this query:
select domain from history_entries group by domain;
To me, since there is an index on domain, it seems like this should be a
rather fast thing to do? It takes a *very* long time, no matter if I turn
seqscan on or off.
pglog=# select version();
version
-------------------------------------------------------------------------
PostgreSQL 7.0.3 on i386-unknown-freebsdelf3.4, compiled by gcc 2.7.2.3
(1 row)
Thanks,
Matthew
At 07:18 PM 3/8/2001 +0000, you wrote:
>On Thu, Mar 08, 2001 at 01:49:42PM -0500, Matthew Hagerty wrote:
> > Greetings,
> >
> > I have a real simple table with a timestamp field. The timestamp field
> has
> > an index on it. But, the index does not seem to be taken into account for
> > selects that return rows:
> >
> > pglog=# explain select time_stamp from history_entries where time_stamp <
> > '03-01-2000';
> > NOTICE: QUERY PLAN:
> >
> > Index Scan using hist_entries_timestamp on
> > history_entries (cost=0.00..12810.36 rows=3246 width=8)
> >
> > EXPLAIN
> > pglog=# explain select time_stamp from history_entries where time_stamp <
> > '04-01-2000';
> > NOTICE: QUERY PLAN:
> >
> > Seq Scan on history_entries (cost=0.00..160289.71 rows=138215 width=8)
> >
> > EXPLAIN
> > pglog=# set enable_seqscan to off;
> > SET VARIABLE
> > pglog=# explain select time_stamp from history_entries where time_stamp <
> > '04-01-2000';
> > NOTICE: QUERY PLAN:
> >
> > Index Scan using hist_entries_timestamp on
> > history_entries (cost=0.00..368241.51 rows=138215 width=8)
> >
> > EXPLAIN
> > pglog=# set enable_seqscan to on;
> > SET VARIABLE
> > pglog=#
> >
> > The query where the time_stamp < '03-01-2000' does not return any rows,
> the
> > 04-01-2000 date does return rows. When I disable seqscan the query is
> > almost instant, but with it on, it takes about 3 or 4 minutes. Why can't
> > the query planner use the index in the later case?
>
>Well, it can, it just chooses not to. Your second EXPLAIN shows that
>it thinks it's going to get 138215 rows from that select; it then
>calculates that it would be more expensive to use the index than simply
>to scan the table. Presumably it actually returns many fewer rows than
>that. Have you done a VACUUM ANALYZE recently? If you get plans this
>badly wrong immediately after a VACUUM ANALYZE, *then*'s the time to
>ask -hackers about it (FAQ item 4.9).
>
>Richard
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])