I can see that it would be useful. But there are a couple of things about it that are messy.
I think the syntax is a bit clunky because it uses parentheses; a query would be difficult to read if people would like multiple columns, expressions, and aliases. I also think the semantics are messy. If you read https://www.postgresql.org/docs/9.5/static/sql-select.html#SQL-DISTINCT <https://www.postgresql.org/docs/9.5/static/sql-select.html#SQL-DISTINCT> you will see that “DISTINCT ON” is evaluated after the ORDER BY clause, whereas regular DISTINCT is evaluated before the ORDER BY clause. So there will be a bizarre interaction if DISTINCT ON is used with UNION and ORDER BY. By the way, good ol’ MySQL doesn’t have this problem; you can just write SELECT a, b, c FROM t GROUP BY a and it just picks the first value of b and c. This “feature” is one of my least favorite things about MySQL, so let’s not emulate it. :) Julian > On Feb 21, 2018, at 12:44 PM, Aman Sinha <[email protected]> wrote: > > The DISTINCT 'ON' clause is not supported but I am trying to see if there > are other people who have run into this. One of the use cases I have > intuitively maps to something like this: > > SELECT DISTINCT *ON (a)*, b, c FROM T > > Here suppose 'a' is an INT and b, c are some complex types such as > array. In my example, the values of b and c happen to be the same in all > rows belonging to the same group of 'a', so I just want the first row. > Since these are arrays, I cannot use MIN(b) GROUP BY a . Other > alternative is to create a new aggregate function that picks the first > value but it makes the syntax verbose especially with large number of such > columns. > > Incidentally, Postgres supports this [1] > > [1] > https://www.postgresql.org/docs/current/static/sql-select.html#SQL-DISTINCT > > > -Aman
