On Friday, July 4, 2003, at 07:07 AM, Brian Tarbox wrote:
We had about 40 tables in the db, with joined queries on about 8-12 tables.
A while ago a tested a moderately complex schema on MySQL, Pg, and Oracle. I usually heavily normalize schemas and then define views as a denormalized API, which sends MySQL to the book of toys already. The views more often than not would join anywhere from 6-12 tables, using plain (as opposed to compound) foreign keys to primary key straight joins.
I noticed that Pg was more than an order of magnitude slower for joins > 8 tables than Oracle. I won't claim that none of this can have been due to lack of tuning. My point is the following though. After I dug in it turned out that of the 4 secs Pg needed to execute the query it spent 3.9 secs in the planner. The execution plan Pg came up with was pretty good - it just needed an extraordinary amount of time to arrive at it, spoiling its own results.
Asking this list I then learned how to tweak GEQO such that it would pick up the planning and do it faster than it would otherwise. I was able to get the planner time down to a quarter - still a multitude of the actual execution time.
I was told on this list that query planning suffers from combinatorial explosion very quickly - and I completely buy that. It's just - Oracle planned the same query in a fraction of a second, using the cost-based optimizer, on a slower machine. I've seen it plan 15-table joins in much less than a second, and I have no idea how it would do that. In addition, once you've prepared a query in Oracle, the execution plan is pre-compiled.
If I were a CS student I'd offer myself to the hall of humiliation and set out to write a fast query planner for Pg ...
-hilmar -- ------------------------------------------------------------- Hilmar Lapp email: lapp at gnf.org GNF, San Diego, Ca. 92121 phone: +1-858-812-1757 -------------------------------------------------------------
---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match