On Thu, 14 May 2020 at 03:48, Andy Fan <zhihui.fan1...@gmail.com> wrote:
> On Wed, May 13, 2020 at 8:04 PM Ashutosh Bapat <ashutosh.bapat....@gmail.com> wrote:
>> My impression about the one row stuff, is that there is too much
>> special casing around it. We should somehow structure the UniqueKey
>> data so that one row unique keys come naturally rather than special
>> cased. E.g every column in such a case is unique in the result so
>> create as many UniqueKeys are the number of columns
>
> This is the beginning state of the UniqueKey, later David suggested
> this as an optimization[1], I buy-in the idea and later I found it mean
> more than the original one [2], so I think onerow is needed actually.

Having the "onerow" flag was not how I intended it to work.

Here's an example of how I thought it should work:

Assume t1 has UniqueKeys on {a}

SELECT DISTINCT a,b FROM t1;

Here the DISTINCT can be a no-op due to "a" being unique within t1. Or
more basically, {a} is a subset of {a,b}.

The code which does this is relation_has_uniquekeys_for(), which
contains the code:

+ if (list_is_subset(ukey->exprs, exprs))
+     return true;

In this case, ukey->exprs is {a} and exprs is {a,b}. So, if the
UniqueKey's exprs are a subset of, in this case, the DISTINCT exprs
then relation_has_uniquekeys_for() returns true. Basically
list_is_subset({a}, {a,b}), Answer: "Yes".

For the onerow stuff, if we can prove the relation returns only a
single row, e.g. an aggregate without a GROUP BY, or there are
EquivalenceClasses with ec_has_const == true for each key of a unique
index, then why can't we just set the UniqueKeys to {}? That would
mean the code to determine if we can avoid performing an explicit
DISTINCT operation would be called with list_is_subset({}, {a,b}),
which is also true; in fact, an empty set is a subset of any set. Why
is there a need to special case that fact?

In light of those thoughts, can you explain why you think we need to
keep the onerow flag?

David
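
To make the subset argument concrete, here is a minimal standalone sketch. It is not the patch's code: exprs_is_subset() and check_examples() are hypothetical names, and expressions are modelled as plain column-name strings rather than planner expression nodes. It only shows that an empty key set passes the same subset test with no extra branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the patch's list_is_subset(): returns true if
 * every entry of "sub" also appears in "super".  Expressions are modelled
 * as column-name strings (an assumption made for illustration only).
 */
bool
exprs_is_subset(const char *const *sub, size_t nsub,
                const char *const *super, size_t nsuper)
{
    for (size_t i = 0; i < nsub; i++)
    {
        bool        found = false;

        for (size_t j = 0; j < nsuper; j++)
        {
            if (strcmp(sub[i], super[j]) == 0)
            {
                found = true;
                break;
            }
        }
        if (!found)
            return false;
    }

    /* With nsub == 0 the loop body never runs, so {} is a subset of anything. */
    return true;
}

/* Walks through the cases discussed above for SELECT DISTINCT a,b FROM t1. */
void
check_examples(void)
{
    const char *const distinct_exprs[] = {"a", "b"};
    const char *const ukey_a[] = {"a"};
    const char *const ukey_c[] = {"c"};

    /* UniqueKey {a}: DISTINCT on {a,b} can be a no-op. */
    assert(exprs_is_subset(ukey_a, 1, distinct_exprs, 2));

    /* UniqueKey {} (single-row relation): also true, no special case. */
    assert(exprs_is_subset(NULL, 0, distinct_exprs, 2));

    /* UniqueKey {c}: not covered by the DISTINCT exprs. */
    assert(!exprs_is_subset(ukey_c, 1, distinct_exprs, 2));
}
```

Calling check_examples() from a main() exercises all three cases; the empty-key case goes through the same loop as the others.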