-----Original Message-----
From: Mike Christensen [mailto:[email protected]]
Sent: Tuesday, June 05, 2012 4:28 PM
To: Dann Corbit
Cc: [email protected]
Subject: Re: [GENERAL] I'd like to learn a bit more about how indexes work
I'm aware of how "B-Trees" work, but I only understand them on the level of
traversing a single tree at a time. I'm curious how Postgres combines multiple
indexes with OR and AND clauses.
I've done some Google searches, however I can't find anything basic.
Everything I've found assumes you already have knowledge of terms such as "hash
join, aggregate join, etc".
At this point I'm not looking at learning how the optimizer works..
>>
"How the optimizer works" is the answer to your question.
The plan of attack for forming a query is a function of the optimizer.
One possible plan for " WHERE data = key1 OR data = key2 " is something along
the lines of:
SEEK("key1")
While key == key1 accumulate rows into the result set
GetNextRow()
SEEK("key2")
While key == key2 accumulate rows into the result set
GetNextRow()
However, if the table is tiny (suppose it is ten rows and fits into memory)
then a table scan might be cheaper.
Here at CONNX, I have written a hashed btree search that tends to be cheaper
than using a clustered index if there are no qualifiers on the join.
For instance
SELECT a.*, b.* from table1 a, table2 b WHERE a.unique_index = b.foreign_key
It will be faster to actually not use the index. Whereas if there are
additional where clause criteria such as:
SELECT a.*, b.* from table1 a, table2 b WHERE a.unique_index = b.foreign_key
AND a.unique_index IN (k1, k2, k3, k4,..., kn, kn+1)
It will probably be faster to use the index unless the list of items is a
substantial proportion of the possible data values.
The point is that there is not a simple formula that describes how data values
are retrieved from the database. The method of collection is decided by the
optimizer.
It isn't always cheaper to use an index. In fact, sometimes building an index
is a complete waste of time.
For instance, suppose that you have a column named 'sex' that can contain the
values 'F', 'M', and 'U'
If you built an index on that column it won't help you to find all the males
faster than a table scan because the data is not specific enough so that the
total number of pages of disk that are read would be more with the index than
if the index were not used.
So, I suggest that possibly the articles you do not want to read are the very
ones that will answer your questions.
On the other hand, it is not unlikely that I simply do not understand the
questions that you are asking.
<<
--
Sent via pgsql-general mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general