On Mon, Sep 29, 2014 at 2:54 PM, Josh Berkus <j...@agliodbs.com> wrote:
> On 09/26/2014 01:06 AM, Simon Riggs wrote:
> > On 23 September 2014 00:56, Josh Berkus <j...@agliodbs.com> wrote:
> >
> >> We've hashed that out a bit, but frankly I think it's much more
> >> profitable to pursue fixing the actual problem than providing a
> >> workaround like "risk", such as:
> >>
> >> a) fixing n_distinct estimation
> >> b) estimating stacked quals using better math (i.e. not assuming total
> >>    randomness)
> >> c) developing some kind of correlation stats
> >>
> >> Otherwise we would be just providing users with another knob there's no
> >> rational way to set.
> >
> > I believe this is a serious issue for PostgreSQL users and one that
> > needs to be addressed.
> >
> > n_distinct can be fixed manually, so that is less of an issue.
>
> It's an issue for the 99.8% of our users who don't know what n_distinct
> is, let alone how to calculate it. Also, changing it requires an
> exclusive lock on the table. Of course, you and I have been over this
> issue before.

If 99.6% of our users don't have a problem with n_distinct in their
system, then (combined with your 99.8% figure) that would mean that only
50% of the people who do have the problem don't know how to solve it.
And those people can usually get excellent free help on the internet.

But if the problem is not with n_distinct, but rather with
most_common_freqs (which I encounter more often than problems with
n_distinct), all I can do is shrug and say "yeah, I know about that
problem. Either crank the statistics target up as high as it will go, or
it sucks to be you."

> One thing I'm wondering is why our estimator creates n_distinct as a
> % so seldom. Really, any time n_distinct is over 10K we should be
> estimating a % instead. Now, estimating that % has its own issues, but
> it does seem like a peculiar quirk of our stats model.
>
> Anyway, in the particular case I posted, fixing n_distinct to realistic
> numbers (%) fixed the query plan.

But wouldn't fixing the absolute number also have fixed the plan? If you
are going to set a number manually and then nail it in place so that
ANALYZE stops changing it, then I can certainly see how the fractional
method is desirable. But if the goal is not to do that, but rather to
have the correct value estimated in the first place, I don't really see
much benefit from converting the estimate into a fraction and then back
again.

> > The problem, as I see it, is different. We assume that if there are
> > 100 distinct values and you use LIMIT 1 that you would only need to
> > scan 1% of rows. We assume that the data is arranged in the table in a
> > very homogeneous layout. When data is not, and it seldom is, we get
> > problems.
> >
> > Simply put, assuming that LIMIT will reduce the size of all scans is
> > just way wrong. I've seen many plans where increasing the LIMIT
> > dramatically improves the plan.
> >
> > If we can at least agree it is a problem, we can try to move forwards.

I don't think anyone doubts there is a problem (or rather, many more
than one of them); there is just disagreement about the priority and
what can be done about it.

> That is certainly another problem. Does the correlation stat figure in
> the LIMIT calculation at all, currently? That's what the correlation
> stat is for, no?

I don't think correlation is up to the task as a complete solution,
although it might help a little. There is no way a simple correlation
can encode the fact that John retired 15 years ago and hasn't logged on
since, while Johannes was hired yesterday and never logged on before
then.

Cheers,

Jeff
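
P.S. For anyone following along who hasn't touched these knobs, here is a
minimal sketch of the manual overrides discussed above. The table and
column names (app_event, user_id, event_time) are made up for
illustration; only the ALTER TABLE / ANALYZE forms are real syntax.

    -- Pin n_distinct by hand.  A positive value is an absolute count of
    -- distinct values; a negative value between -1 and 0 is a fraction
    -- of the row count (the "%" form Josh mentions).  The override only
    -- takes effect at the next ANALYZE.
    ALTER TABLE app_event ALTER COLUMN user_id SET (n_distinct = -0.05);
    ANALYZE app_event;

    -- "Crank the statistics target up as high as it will go" for one
    -- column (10000 is the maximum), which enlarges most_common_vals,
    -- most_common_freqs and the histogram gathered by ANALYZE.
    ALTER TABLE app_event ALTER COLUMN user_id SET STATISTICS 10000;
    ANALYZE app_event;

And this is the shape of query where the LIMIT assumption Simon describes
bites (assuming an index on event_time exists): the planner assumes rows
for user_id = 42 are spread evenly through the table, so it expects an
ordered index scan on event_time to stop almost immediately. If that
user's rows all sit at the far end of the index, like John who last
logged on 15 years ago, the scan reads most of the table before it
returns anything.

    SELECT *
    FROM   app_event
    WHERE  user_id = 42
    ORDER  BY event_time DESC
    LIMIT  1;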