On 07/05/2011 06:15 AM, Marcus Engene wrote:
Though partial index solved another problem. Usually I'm a little bit annoyed with the optimizer and the developers religious "fix the planner instead of index hints". I must say that I'm willing to reconsider my usual stance to that.

We have a large table of products where status=20 is a rare intermediate status. I added a...

CREATE INDEX pond_item_common_x8 ON pond_item_common(pond_user, status)
WHERE status = 20;

...and a slow 5s select with users who had existing status=20 items became very fast. Planner, I guess, saw the 10000 status 20 clips (out of millions of items) instead of like 5 different values of status and thus ignoring the index. Super!

To my great amazement, the planner also managed to use the index when counting how many status=20 items there are in total:

I'm glad we got you to make a jump toward common ground with the database's intended use. There are many neat advanced ways to solve the sorts of problems people try to hammer with hints available in PostgreSQL, some of which don't even exist in other databases. It's kind of interesting to me how similarly one transition tends to happen to people who learn a lot about those options, enough that they can talk fully informed about things like how hints would have to work in PostgreSQL--for example: they'd have to consider all all these partial index possibilities. Once you go through all that, suddenly a lot of the people who do it realize that maybe hints aren't as important as good design and indexing--when you take advantages of all the features available to you--after all.

To help explain what happened to you here a little better, the planner tracks Most Common Values in the database, and it uses those statistics to make good decisions about the ones it finds. But when a value is really rare, it's never going to make it to that list, and therefore the planner is going to make a guess about how likely it is--likely a wrong one. By creating a partial index on that item, it's essentially adding that information--just how many rows are going to match a query looking for that value--so that it can be utilized the same way MCVs are. Adding partial indexes on sparse columns that are critical to a common report allow what I'm going to coin a new acronym for: those are part of the Most Important Values in that column. The MIV set is the MCV information plus information about the rare but critical columns. And the easiest way to expose that data to the planner is with a partial index.

I smell a blog post coming on this topic.

--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
Comprehensive and Customized PostgreSQL Training Classes:
http://www.2ndquadrant.us/postgresql-training/


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Reply via email to