On 11/11/10 9:13 AM, Mladen Gogala wrote:
Kevin Grittner wrote:
Mladen Gogala <mladen.gog...@vmsinfo.com> wrote:

create a definitive bias toward one type of execution plan.

We're talking about trying to support the exact opposite.
I understand this; that is precisely the reason for my intervention in this 
discussion of experts, of which I am not one.
This all
started because a database which was tuned for good response time
for relatively small queries against a "hot" portion of some tables
chose a bad plan for a weekend maintenance run against the full
tables. We're talking about the possibility of adapting the cost
factors based on table sizes as compared to available cache, to more
accurately model the impact of needing to do actual disk I/O for
such queries.
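Today that adjustment has to be made by hand. A minimal sketch of the 
session-level workaround for the weekend run (the table name and the values 
are illustrative assumptions, not recommendations):

  -- Tell the planner the full tables will not fit in cache, so random
  -- I/O will really hit disk; then run the maintenance work.
  SET effective_cache_size = '2GB';
  SET random_page_cost = 4.0;

  VACUUM ANALYZE big_table;        -- hypothetical maintenance target
  SELECT count(*) FROM big_table;  -- full-table query, planned accordingly

  RESET effective_cache_size;
  RESET random_page_cost;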
Kevin, in my experience the hardest thing to tune is a so-called mixed-type 
database. In practice, the workloads are usually separated: the OLTP database on 
one group of servers, the reporting database and data warehouse on another. 
Postgres 9.0 has made a great stride toward that kind of separation with its new 
replication facilities. Again, an optimizer that always picks the perfect plan is, 
at least in my opinion, less important than having manual control, the 
aforementioned "knobs and buttons," and the guarantee of the same plan for the 
same statement. Trying to make the optimizer smart enough for all types of loads 
is akin to searching for the Holy Grail. Inevitably, you will face some hard 
questions, like the one about the airspeed velocity of an unladen swallow, and the 
whole quest is likely to end in a pretty funny way, without producing the desired 
"optimizing genie in the CPU".
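For reference, the 9.0 setup for a separate read-only reporting standby looks 
roughly like this (host name, user, and segment count are illustrative 
assumptions):

  # Primary, postgresql.conf -- stream WAL to the reporting standby:
  wal_level = hot_standby
  max_wal_senders = 3
  wal_keep_segments = 128

  # Standby, postgresql.conf -- allow read-only reporting queries:
  hot_standby = on

  # Standby, recovery.conf:
  standby_mode = 'on'
  primary_conninfo = 'host=primary.example.com port=5432 user=replicator'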

What about rule-based configuration?  You provide a default configuration (what 
Postgres does now), and then allow one or more alternate configurations that 
are triggered when certain rules match.  The rules might be things like:

- Per user or group of users.  Set up a user or group for your
  maintenance task and you automatically get your own config
  (something close already exists; see the sketch below).

- A set of tables.  If you do a query that uses tables X, Y, and
  Z, this configuration applies to you.

- A regular expression applied to the SQL.  If the regexp matches,
  the configuration applies.

- System resource usage.  If some other process is gobbling memory,
  switch to a configuration with lower memory requirements.

- A time of day.  Maybe on weekends, different rules apply.

... and so on.  I don't know what the right parameters might be, but this would 
surely solve the original poster's problem.  It gives performance experts the 
tools they need for complex installations, without adding FUD to the lives of 
everyone else.
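
For the first rule, something close already exists: settings attached to a role 
or database apply automatically to every session that uses them. A sketch (the 
role, database, and values are hypothetical examples):

  CREATE ROLE weekend_maint LOGIN;
  ALTER ROLE weekend_maint SET random_page_cost = 4.0;
  ALTER ROLE weekend_maint SET work_mem = '256MB';

  -- The same idea per database:
  ALTER DATABASE reporting SET effective_cache_size = '2GB';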

Craig


This also is very different from trying to adapt queries to what
happens to be currently in cache. As already discussed on a recent
thread, the instability in plans and the failure to get to an
effective cache set make that a bad idea. The idea discussed here
would maintain a stable plan for a given query, it would just help
choose a good plan based on the likely level of caching.
Kevin, I am talking from the perspective of a DBA who works with production 
databases on a day-to-day basis. I am no expert, but I believe I speak for the 
users Postgres has to win over in order to make further inroads into corporate 
server rooms. Without such control and plan stability, it is hard for me to 
recommend wider use of PostgreSQL to my boss. Whatever solution is chosen, it 
needs to have "knobs and buttons" and produce plans that will not change when 
CPU usage goes up.
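For what it's worth, the closest thing to such knobs today is the family of 
per-session planner toggles; they don't pin a plan, but they do bias the choice 
deterministically. A sketch (the query is hypothetical):

  SET enable_nestloop = off;   -- discourage nested-loop joins
  SET enable_seqscan = off;    -- discourage sequential scans

  EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

  RESET enable_nestloop;
  RESET enable_seqscan;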


