"Ed L." <[EMAIL PROTECTED]> writes: > This seems to be saying the planner thinks its less expensive to do the > sequential scan, but why?
Because it thinks it's less expensive ;-) There are a couple of issues here.

One is overestimation of the number of rows matching the query. That is a statistical issue, and should be fixable by increasing the statistics target for the column. With an accurate rows estimate, the seqscan cost estimate would not change, but the indexscan cost estimate would decrease approximately proportionally.

Allowing for the row-estimation error, the indexscan cost estimate is still about 4x what it ought to be, which means that you could bring the estimated cost (for this query, anyway) in line with reality by setting random_page_cost = 1. Note however that we are defining "reality" as "the results of this one single experiment". You should realize in particular that the table is probably fully cached in memory during your test, which is a scenario in which random_page_cost actually is 1.0 (given reasonable assumptions about the behavior of the kernel's cache, anyway). If you optimize for this case, then you are likely to be pessimizing the behavior for larger tables that don't fit in memory.

My suggestion would be to lower random_page_cost to 3 or so, which would be enough to tip the decision to an indexscan for this case, though not to make the estimated cost truly correct. You should however run a larger set of experiments before doing anything, and realize that any random_page_cost setting is a compromise, because the model doesn't take all the variables into account. You can find much more about this issue in the pgsql-performance archives. (Sketches of the commands involved follow at the end of this message.)

			regards, tom lane
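For concreteness, raising the per-column statistics target and refreshing the stats looks roughly like this; the table and column names here are placeholders, not taken from the original report:

    -- Give the planner a finer-grained histogram for the filtered column
    -- ('orders' and 'customer_id' are hypothetical names):
    ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 100;
    -- Re-gather statistics so the new target takes effect:
    ANALYZE orders;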
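One common way to run the suggested experiments is to compare EXPLAIN ANALYZE output for the same query with seqscans disabled, which puts the indexscan's actual cost side by side with its estimate; the query below is a stand-in for the original one:

    -- Plan and timing as the planner currently chooses:
    EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

    -- Discourage seqscans for this session only, forcing the indexscan:
    SET enable_seqscan = off;
    EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

    -- Put the setting back before further testing:
    RESET enable_seqscan;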
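Once a value emerges from a broader set of experiments, random_page_cost can be tried per-session before being made permanent; a minimal sketch, assuming the value of 3 suggested above:

    -- Session-local, handy while experimenting:
    SET random_page_cost = 3;

    -- To make it stick for all sessions, set it in postgresql.conf
    -- (random_page_cost = 3) and reload the server configuration.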