Re: [PERFORM] How to force Nested Loop plan?

Tom Lane Sat, 30 Aug 2003 16:22:47 +0000

Rob Nagler <[EMAIL PROTECTED]> writes:
> Tom Lane writes:
>> The reason the planner does not much like this plan is that it's
>> estimating that quite a lot of rows will have to be hit in min_date_time
>> order before it finds enough rows with server_id = 21.


> Very interesting.  How does it know "quite a lot"?

It doesn't, because it has no cross-column-correlation stats.  The
default assumption is that there's no correlation.

> server_id is uniformly distributed over time.  There's
> no randomness.  There is at least one 21 record for every value of
> min_date_time.

That doesn't really tell me anything.  What's the proportion of 21
records out of the total table?

> 21 is a special server_id containing aggregate
> (denormalized) data for the other servers.  I thought about putting it
> in a separate table, but this would complicate the code as the data is
> identical to the non-aggregated case.

Hm.  If most of your queries are for id 21, an alternative approach is
to create single-column partial indexes:

        create index fooi on foo (min_date_time) where server_id = 21;

This reduces the cost of maintaining the index but also makes it useful
*only* for id = 21 queries.  On the plus side, you don't need to hack
the ORDER BY clause to get your queries to use it.  Your choice...

> What if the ORDER BY was:
>     ORDER BY aa_t.server_id DESC, cc_t.name ASC
> Would the planner do the right thing?

What do you consider the right thing?  cc_t.name doesn't seem connected
to this table at all --- or did I miss something?

> It is a NUMERIC(18).  It could be a bigint.  What would be the change
> in performance of this query if we changed it to bigint?

Hard to say.  I can tell you that the raw comparison operator is a lot
quicker for bigint than for numeric, but I don't have any hard numbers
about what percentage of total CPU time is involved.  You'd pretty much
have to try it for yourself to see what the effect is in your queries.

If you've been generically using NUMERIC(n) where you could be using
integer or bigint, then I think you've probably paid a high price
without knowing it.  I don't know what Oracle's cost tradeoffs are for
these datatypes, but I can tell you that Postgres's integer types are
way faster (and more compact) than our NUMERIC.

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
      joining column's datatypes do not match

Re: [PERFORM] How to force Nested Loop plan?

Reply via email to