Re: DISTINCT ON clause

Julian Hyde Wed, 21 Feb 2018 13:31:33 -0800

I can see that it would be useful. But there are a couple of things about it 
that are messy.

I think the syntax is a bit clunky because it uses parentheses; a query would 
be difficult to read if people would like multiple columns, expressions, and 
aliases.

I also think the semantics are messy. If you read 
https://www.postgresql.org/docs/9.5/static/sql-select.html#SQL-DISTINCT 
<https://www.postgresql.org/docs/9.5/static/sql-select.html#SQL-DISTINCT> you 
will see that “DISTINCT ON” is evaluated after the ORDER BY clause, whereas 
regular DISTINCT is evaluated before the ORDER BY clause. So there will be a 
bizarre interaction if DISTINCT ON is used with UNION and ORDER BY.

By the way, good ol’ MySQL doesn’t have this problem; you can just write

 SELECT a, b, c FROM t GROUP BY a

and it just picks the first value of b and c. This “feature” is one of my least 
favorite things about MySQL, so let’s not emulate it. :)

Julian

> On Feb 21, 2018, at 12:44 PM, Aman Sinha <[email protected]> wrote:
> 
> The DISTINCT 'ON' clause is not supported but I am trying to see if there
> are other people who have run into this.  One of the use cases I have
> intuitively maps to something like this:
> 
>  SELECT DISTINCT *ON (a)*,  b, c  FROM T
> 
> Here  suppose 'a' is an INT  and b, c are some complex types such as
> array.  In my example, the values of  b and c happen to be the same in all
> rows belonging to the same group of 'a', so I just want the first row.
> Since these are arrays, I cannot use MIN(b) GROUP BY a .   Other
> alternative is to create a new aggregate function that picks the first
> value but it makes the syntax verbose especially with large number of such
> columns.
> 
> Incidentally, Postgres supports this [1]
> 
> [1]
> https://www.postgresql.org/docs/current/static/sql-select.html#SQL-DISTINCT
> 
> 
> -Aman

Re: DISTINCT ON clause

Reply via email to