On Mon, Nov 14, 2016 at 9:57 AM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Robert Haas <robertmh...@gmail.com> writes: >> On Fri, Nov 4, 2016 at 6:52 AM, Ashutosh Bapat >> <ashutosh.ba...@enterprisedb.com> wrote: >>> Costing PartitionJoinPath needs more thought so that we don't end up >>> with bad overall plans. Here's an idea. Partition-wise joins are >>> better compared to the unpartitioned ones, because of the smaller >>> sizes of partitions. If we think of join as O(MN) operation where M >>> and N are sizes of unpartitioned tables being joined, partition-wise >>> join computes P joins each with average O(M/P * N/P) order where P is >>> the number of partitions, which is still O(MN) with constant factor >>> reduced by P times. I think, we need to apply similar logic to >>> costing. Let's say cost of a join is J(M, N) = S (M, N) + R (M, N) >>> where S and R are setup cost and joining cost (for M, N rows) resp. >>> Cost of partition-wise join would be P * J(M/P, N/P) = P * S(M/P, N/P) >>> + P * R(M/P, N/P). Each of the join methods will have different S and >>> R functions and may not be linear on the number of rows. So, >>> PartitionJoinPath costs are obtained from corresponding regular path >>> costs subjected to above transformation. This way, we will be >>> protected from choosing a PartitionJoinPath when it's not optimal. > >> I'm not sure that I really understand the stuff with big-O notation >> and M, N, and P. But I think what you are saying is that we could >> cost a PartitionJoinPath by costing some of the partitions (it might >> be a good idea to choose the biggest ones) and assuming the cost for >> the remaining ones will be roughly proportional. That does seem like >> a reasonable strategy to me. > > I'm not sure to what extent the above argument depends on the assumption > that join is O(MN), but I will point out that in no case of practical > interest for large tables is it actually O(MN). That would be true > only for the stupidest possible nested-loop join method. It would be > wise to convince ourselves that the argument holds for more realistic > big-O costs, eg hash join is more like O(M+N) if all goes well.

Yeah, I agree. To recap briefly, the problem we're trying to solve here is how to build a path for a partitionwise join without an explosion in the amount of memory the planner uses or the number of paths created. In the initial design, if there are N partitions per relation, the total number of paths generated by the planner increases by a factor of N+1, which gets ugly if, say, N = 1000, or even N = 100. To reign that in, we want to do a rough cut at costing the partitionwise join that will be good enough to let us throw away obviously inferior paths, and then work out the exact paths we're going to use only for partitionwise joins that are actually selected. I think costing one or a few of the larger sub-joins and assuming those costs are representative is probably a reasonable approach to that problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers