On Fri, Jun 8, 2018 at 2:10 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: >> I don't understand this complaint. Before, the code took one >> AppendRelInfo, and according to you, it was clear what was supposed to >> happen. Now it takes an array of AppendRelInfos and, according to >> you, it's completely unclear. Yet that seems, to me at least, to be a >> straightforward generalization. If 1 AppendRelInfo is an adequate >> specification of one translations, why are N AppendRelInfos not an >> adequate specification of N translations? > > Because the relationships between the transforms are unclear. Are we > supposed to apply those N transformations to the expression in sequence? > It doesn't look to me like that's what the code does. I think --- I might > be wrong --- that the code is relying on the transformations to be > non-overlapping, that is a change made by any one of them cannot be > further affected by another one. This is, however, undocumented.
OK, I see. I guess that possible confusion didn't occur to me both because I was looking at the code, which I knew wouldn't handle overlapping transformations, and also because the intended use was for partition-wise join, where the problem can't come up because we're translating from the toplevel RTIs for the join to the set of RTIs appropriate for one child-join. But, it's certainly fine to add a comment. Regarding your original complaint that find_appinfos_by_relids will "linearly search the appinfo list to produce ... an array, which also has to be linearly searched", let me just note that there is a big difference between the probable lengths of those arrays. Given an N-way partition-wise join between tables with P partitions each, the entire appinfo list will contain N * P entries, but the entries produced by find_appinfos_by_relids() will at most of length P. There is a lot of user interest in having P grow into the thousands, or at the very least the hundreds, but the planner is probably going to be in a tough spot anyway if there are more than a few tens of tables and, in practice, I suspect that the number of compatibly partitioned tables in a query is not likely to grow even that large. Moreover, searching an array should be noticeably more cache-efficient than searching a list. So the overall efficiency of searching the array is probably 2 to 4 orders of magnitude better than searching the list. That being said, I don't mind a bit if you want to look for further speedups here, but if you do, keep in mind that a lot of queries won't even use partition-wise join, so all of the arrays will be of length 1. Even when partition-wise join is used, it is quite likely not to be used for every table in the query, in which case it will still be of length 1 in some cases. So pessimizing nappinfos = 1 even slightly is probably a regression overall. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company