Re: [HACKERS] from_collapse_limit vs. geqo_threshold

Tom Lane Mon, 25 May 2009 15:15:49 -0700

Robert Haas <[email protected]> writes:
> On Thu, May 21, 2009 at 7:50 AM, Tom Lane <[email protected]> wrote:
>> Josh Berkus <[email protected]> writes:
>>> ... trying to remember why I wrote that ... what would happen if
>>> FROM_COLLAPSE_LIMIT was *more* than GEQO_THRESHOLD?
>> 
>> I think I wrote it, not you.  The point of the advice is to keep
>> subquery collapsation (hm, what's the right noun form?  Need caffeine)
>> from turning a non-GEQO query into a GEQO one, and thus subjecting
>> you to unpredictable plans.  Maybe the resulting plans would be better
>> on average, or maybe they wouldn't, but in any case they'd be
>> unpredictable.


> That's more or less what I figured, but my real world experience is
> that pulling up subqueries and using GEQO leads to plans that are
> random but tolerable, whereas not pulling up subqueries leads to plans
> that are almost uniformly bad.

I went back and looked at the CVS history to try to refresh my memory
about how we got here.  As best I can find, there were two steps:

1. The original commit of the ability to have subqueries at all,
during 7.1 development:

2000-09-29 14:21  tgl

        Subselects in FROM clause, per
        ISO syntax: FROM (SELECT ...) [AS] alias.  (Don't forget that an
        alias is required.)  Views reimplemented as expanding to
        subselect-in-FROM.  Grouping, aggregates, DISTINCT in views
        actually work now (he says optimistically).  No UNION support in
        subselects/views yet, but I have some ideas about that. 
        Rule-related permissions checking moved out of rewriter and into
        executor.  INITDB REQUIRED!

This introduced the ability to pull up subqueries, but with an arbitrary
limit of geqo_threshold/2 on the number of relations that would be
collected into a single planning problem.

2. During 7.4 development, we did this:

2003-01-25 18:10  tgl

        Allow the planner to collapse explicit inner JOINs together, rather
        than necessarily following the JOIN syntax to develop the query
        plan.  The old behavior is still available by setting GUC variable
        JOIN_COLLAPSE_LIMIT to 1.  Also create a GUC variable
        FROM_COLLAPSE_LIMIT to control the similar decision about when to
        collapse sub-SELECT lists into their parent lists.  (This behavior
        existed already, but the limit was always GEQO_THRESHOLD/2; now
        it's separately adjustable.)

The excuse for join_collapse_limit to exist at all is largely one of
backwards compatibility.  Up to then, we had not-infrequently suggested
that people could force a desired join order by writing an explicit JOIN
nest, and eliminating that escape hatch altogether didn't seem like a
good idea.  I think from_collapse_limit was added largely on grounds of
symmetry.

Now, as to why the original commit had the geqo_threshold/2 restriction:
it was obviously not based on field experience with flattening, because
we didn't have any.  What I think it *was* based on was that GEQO sucked
really badly back then, and I wanted to avoid having it kick in for
queries that it had never kicked in for in previous releases.  Some
quick comparisons say that 7.1 in GEQO mode was about 5X slower than
HEAD (despite its planning being a lot more simplistic), and tended to
find considerably worse plans.  Some of the significant improvements
since then:

2004-01-23 18:54  tgl

        Revise GEQO planner to make use of some heuristic knowledge about
        SQL, namely that it's good to join where there are join clauses
        rather than where there are not.  Also enable it to generate bushy
        plans at need, so that it doesn't fail in the presence of multiple
        IN clauses containing sub-joins.

2004-01-21 18:33  tgl

        Repair error apparently introduced in the initial
        coding of GUC: the default value for geqo_effort is supposed to be
        40, not 1.  The actual 'genetic' component of the GEQO algorithm
        has been practically disabled since 7.1 because of this mistake. 

Also, up to 7.0 there were some nasty memory leaks in the planner and
especially in GEQO, because we didn't have the memory context mechanism.
I think those were actually fixed as of 2000-09-29, but GEQO still had a
reputation for blowing out backend memory.

Now I'm still not exactly happy with GEQO, but it's surely a lot better
than it was in the fall of 2000.  So on the whole it does seem that the
current relationships between from_collapse_limit, join_collapse_limit,
and geqo_threshold are based on obsolete information and should be
revisited.  I don't have any data at hand to suggest specific new
default values, though.

                        regards, tom lane

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] from_collapse_limit vs. geqo_threshold

Reply via email to