On Fri, Mar 2, 2018 at 7:32 PM, David Rowley <david.row...@2ndquadrant.com> wrote: > Let's look at the following perhaps unlikely case. (I picked an > extreme case to demonstrate why this may be an inferior method) > > Given the table abc (...) partition by range (a,b,c), with the query: > > select * from abc where a >= 1 and a >= 2 and a >= 3 and b >= 1 and b >>= 2 and b = 3 and c >= 1 and c >= 2 and c = 3; > > We would likely still be parsing those clauses into some struct like > PartitionClauseInfo and would end up with some arrays or Lists with > the clauses segmented by partition key. > > It appears to me, for your method to work we'd need to try every > combination of the clauses matching each partition key, which in this > case is 3 * 3 * 3 searches. Amit's current method is 1 search, after > the clause reduction which is 3 + 3 + 3 (O(N) per partition key) [...] > With that considered, is it still a good idea to do it this way?
I dunno. What do you think? That case is indeed pretty unfortunate, but it's also pretty artificial. It's not obvious to me that we shouldn't care about it, but it's also not obvious to me that we should. If we have some bizarre cases that slip through the cracks or don't perform terribly well, maybe nobody would ever notice or care. On the other hand, maybe they would. I suppose in my ideal world, this could be handled by building a GREATEST or LEAST expression. In other words, if someone says foo >= 1 AND foo >= 2, instead of doing separate pruning steps, we'd just prune once based on foo >= GREATEST(1,2). But that doesn't really work, because there's no provision to tell MinMaxExpr from which opfamily we wish to draw the operator used to compare 1 and 2 and no guarantee that such an operator exists for the actual data types of 1 and 2. (Imagine that 1 and 2 of different data types; the relevant opfamily might have an operator that can compare a value of the same type as foo to 1 and similarly for 2, but no operator that can compare 1 and 2 to each other.) One thing that we could do is just only accept one clause for each column-strategy pairing, presumably either the first one or the last one. So in your example either a >= 1 or a >= 3 would be accepted and the others would be discarded for purposes of partition pruning. If a user complains, we could tell THEM to manually do the rewrite suggested in the previous step, and just write a >= GREATEST(1,2,3). (And of course if it's that simple, they might want to then pre-simplify to a >= 3!) Another alternative is to include some kind of additional type of step in the "pruning program" which can do this GREATEST/LEAST operation ... but that's adding quite a bit of complexity for what seems like it's pretty much a corner case, and as noted above, there's no guarantee that we even have the correct operator available. It should be fine if the partitioning column/expression and all of the constants being compared are of the same type, and in practice *most* of the time even when they're not, but we're going to have to have some handling for the strange cases -- and I think the only real choices are "try every combination and maybe be slow", "try 1 combination and maybe fail to prune everything that could have been pruned", and some intermediate possibilities along the same lines. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company