Very Fast Version:
Recently my database stopped respecting one of my indexes, which took a query
that should run in subsecond response time and turning it into something that
with small data sets runs in the 7-10 minute range and with large data sets
runs in the 30 minute - eternity range.
Explain Analyze tells me that what used to be an Index Scan has become a Seq
Scan, doing a full table scan through 140 million records.
Any thoughts on why that happens?
+++
Longer version:
I have a very large database (billion+ records) that has begun not respecting
indexes on some queries, resulting in what used to be instant answers now
taking many minutes and I'm trying to figure out why.
I'll try to simplify the request this way:
I have a database of many hundreds of millions of email messages, and
information about those messages. There is a table that tells me when I
received emails and from what source. There is another table that tells what
URLs were seen in the bodies of those emails and there is a table that links
those two tables together. Since many emails can contain the same URL, and
many URLs can be seen in each email, we divide like this:
Tables
===
emails
link_urls
email_links
email.message_id = link_url.message_id
linkurl.urlid= email_links.urlid
One attribute of the URL is the hostname portion.
In my puzzle, I have a table of hostnames, and I want to get statistics about
which emails contained URLs that pointed to those hostnames.
A very simplified version of the query could be:
select email.stuff from email natural join link_url natural join email_links
where hostname = 'foo.bar.com';
We know that there were two emails that advertised foo.bar.com, so the
objective is to link the presence of foo.bar.com to a URLID, use the link_url
table to find the respective message_id and then go ask the email table for
details about those two messages.
My explain analyze of that query takes seven minutes to complete, despite the
fact that I have an index on message_id, and urlid, and machine where they
occur in each of the three tables.
We broke the query down into it's components, and confirmed that what we
consider the three natural pieces of the query each run blazing fast, each
using the proper indexes. But when we put them all together, the index gets
ignored, and we sequential scan a 140 million row table.
Which is slow.
===
If we break it into pieces ... a human might think of the pieces as:
explain analyze select urlid from email_links where machine = 'foo.bar.com';
QUERY PLAN
---
Bitmap Heap Scan on email_links (cost=97.10..9650.58 rows=2457 width=33)
(actual time=0.049..0.049 rows=1 loops=1)
Recheck Cond: (machine = 'foo.bar.com'::text)
- Bitmap Index Scan on hostdex (cost=0.00..96.49 rows=2457 width=0)
(actual time=0.039..0.039 rows=1 loops=1)
Index Cond: (machine = 'foo.bar.com'::text)
Total runtime: 0.066 ms
(5 rows)
=
= explain analyze select message_id from link_url where urlid =
'9de6440fcc089c654806bd4c853c76f1';
QUERY PLAN
---
Bitmap Heap Scan on link_url (cost=65.20..5680.14 rows=1437 width=4) (actual
time=0.074..0.083 rows=2 loops=1)
Recheck Cond: (urlid = '9de6440fcc089c654806bd4c853c76f1'::text)
- Bitmap Index Scan on link_url_hkey (cost=0.00..64.84 rows=1437 width=0)
(actual time=0.021..0.021 rows=2 loops=1)
Index Cond: (urlid = '9de6440fcc089c654806bd4c853c76f1'::text)
Total runtime: 0.109 ms
(5 rows)
=
explain analyze select stuff from email where message_id in (78085350,
78088168);
QUERY PLAN
---
Bitmap Heap Scan on email (cost=17.19..25.21 rows=2 width=8) (actual
time=0.068..0.077 rows=2 loops=1)
Recheck Cond: (message_id = ANY ('{78085350,78088168}'::integer[]))
- Bitmap Index Scan on email_pkey (cost=0.00..17.19 rows=2 width=0)
(actual time=0.054..0.054 rows=2 loops=1)
Index Cond: (message_id = ANY ('{78085350,78088168}'::integer[]))
Total runtime: 0.100 ms
(5 rows)
++
In my simple mind that should take (.1 + .109 + .066) = FAST.
Here's the Explain Analyze for the combined version:
explain analyze select email.stuff from email natural join link_url natural
join email_link where machine = 'foo.bar.com';
QUERY PLAN
---
Merge Join