On Thu, Aug 2, 2018 at 4:25 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> I do not trust Ashutosh's patch because of the point you noted that it
> will kick in on ConvertRowtypeExprs that are not related to partitioning.
I don't remember making that point -- can you clarify?

> It's also got a rather fundamental conceptual (or at least documentation)
> error in that it says it's making pull_var_clause do certain things to
> all ConvertRowtypeExprs, but that's not what the code actually does.
> I think the tension around that has a lot to do with the fact that the
> patch went through so many revisions, and I don't have any faith that
> it's right even yet.  As you mentioned upthread, this might be insoluble
> without some representational change for converted whole-row Vars.

Insoluble is a strong word....

> What I'm thinking might be a more appropriate thing, at least for
> getting v11 out the door, is to refuse to generate partitionwise
> joins when any whole-row vars are involved.  (Perhaps that's not
> enough to dodge all the problems, though?)

It's enough to dodge the problem being discussed on this thread.

> Now, that's a bit of a problem for postgres_fdw, because it seems to
> insist on injecting WRVs even when the query text does not require any.
> Why is that, and can't we get rid of it?  It's horrid for performance
> even without any of these other considerations.  But if we fail to get
> rid of that in time for v11, it would mean that postgres_fdw can't
> participate in PWJ, which isn't great but I think we could live with it
> for now.

I don't quite know what you mean here -- postgres_fdw does use
whole-row vars for EPQ handling, which may be what you're thinking
about.

> BTW, what exactly do we think the production status of PWJ is, anyway?
> I notice that upthread it was stated that enable_partitionwise_join
> defaults to off, as indeed it still does, and we'd turn it on later
> when we'd gotten rid of some memory-hogging problems.  If that hasn't
> happened yet (and I don't see any open item about considering enabling
> PWJ by default for v11), then I have exactly no hesitation about
> lobotomizing PWJ as hard as we need to to mask this problem for v11.
> I'd certainly prefer a solution along those lines to either of these
> patches.

Yeah, that hasn't been done yet.  Partition-wise join probably needs a
good bit of work in a number of areas to do all the things that people
will want it to do -- see the thread on advanced partition-matching,
which is an improvement but still doesn't cover every case that could
come up.

In terms of memory usage, I think that we need some discussion of the
approach that should be taken.  I see a couple of possible things we
could do, not necessarily mutually exclusive.

1. We could try to avoid translating all of the parent's data
structures for every child RelOptInfo, either by rejiggering things so
that the child doesn't need all of that data, or by making it able to
use the parent's copy of the data.

2. We could try to recycle memory more quickly.  For example, if we're
planning a partition-wise join of A-B-C-D, first plan paths for
A1-B1-C1-D1 (and all proper subsets of those rels), then throw away
most of the memory, then plan paths for A2-B2-C2-D2, etc.

3. We could generate paths only for "representative" children (e.g.
the biggest ones) and use those to estimate the cost of the
partition-wise plan.  If that plan is selected, then go back and
figure out real paths for the other children.

All of these things help in different ways, and Ashutosh had code for
some version of all of them at various points, but the problem is
quite difficult.  If you have three tables with 1000 partitions each,
then without partition-wise join you need 3000 (partitions) + 3
(baserels) + 3 (level-2 joinrels) + 1 (level-3 joinrel) RelOptInfos.
If you do a partition-wise join, then you also get 3000 level-2
child-joinrels and 1000 level-3 child joinrels.  Those RelOptInfo
structures, and the paths attached to them, cannot help but take up
memory.  Perhaps that's OK, and we ought to just say "well, if you
want better plans, you're going to have to allow more memory for
planning".
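For what it's worth, the RelOptInfo arithmetic above generalizes: an
n-way join has one joinrel per subset of two or more relations, and
partition-wise join multiplies each of those by the partition count.
A rough back-of-the-envelope sketch (the function name is mine, and
this is just the counting from the paragraph above, not actual
planner behavior):

```python
# Rough count of RelOptInfo structures for an n-way join of tables
# that each have p partitions, with and without partition-wise join.
# This only reproduces the arithmetic above; real planner memory use
# also depends on the paths attached to each RelOptInfo.
from math import comb

def reloptinfo_count(n, p, partitionwise):
    # n baserels plus one child baserel per partition of each
    count = n + n * p
    # one joinrel per subset of size >= 2 of the n relations
    joinrels = sum(comb(n, k) for k in range(2, n + 1))
    count += joinrels
    if partitionwise:
        # each joinrel additionally gets p child joinrels
        count += joinrels * p
    return count

print(reloptinfo_count(3, 1000, False))  # 3007
print(reloptinfo_count(3, 1000, True))   # 7007
```

So for the three-table, 1000-partition example, partition-wise join
more than doubles the RelOptInfo count, and the growth is steeper as
the join gets wider.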
If not, we have to decide how hard we want to work in which areas and
how good the result needs to be in order to have this turned on by
default.

Honestly, I'm pretty impressed that we have added not one but two
members to the RelOptKind enum with as little collateral damage as
there has been.  This issue is annoying and the discussion has gone on
longer than anyone would probably prefer, but it's really pretty minor
in the grand scheme of things, at least IMHO.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company