On Fri, 4 Apr 2008, Ow Mun Heng wrote:
select * from table
where A=X
and B = Y
and C = Z
and D = AA
and E = BB

With that kind of WHERE condition, Postgres will use a Bitmap Index Scan to combine your indices. If, however, postgres notices while looking at the statistics gathered during ANALYZE, that for one of your columns, you request a value that happens in a large percentage of the rows (like 20%), and this value has a rather random distribution, Postgres will not bother scanning the index, because it is very likely that all the pages would contain a row satisfying your condition anyway, so the time taken to scan this huge index and mark the bitmap would be lost because it would not allow a better selectivity, since all the pages would get selected for scan anyway. I would guess that Postgres uses Bitmap Index Scan only on your columns that have good selectivity (ie. lots of different values).

        So :

If you use conditions on (a,b) or (a,b,c) or (a,b,c,d) etc, you will benefit GREATLY from a multicolumn index on (a,b,c,d...). However, even if postgres can use some clever tricks, a multicolumn index on (a,b,c,d) will not be optimal for a condition on (b,c,d) for instance.

So, if you mostly use conditions on a left-anchored subset of (a,b,c,d,e), the multicolumn index will be a great tool. A multicolumn index on (a,b,c,d,e) is always slightly slower than an index on (a) if you only use a condition on (a), but it is immensely faster when you use a multicolumn condition.

Can you tell us more about what those columns mean and what you store in them, how many distinct values, etc ?

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Reply via email to