On 12/8/22 06:48, David G. Johnston wrote:
On Wed, Dec 7, 2022 at 10:00 PM Vik Fearing <v...@postgresfriends.org> wrote:

On 12/7/22 04:22, David G. Johnston wrote:
On Mon, Dec 5, 2022 at 10:40 PM Vik Fearing <v...@postgresfriends.org>
wrote:

On 12/6/22 05:57, David G. Johnston wrote:
On Mon, Dec 5, 2022 at 9:48 PM Vik Fearing <v...@postgresfriends.org>
wrote:

I can imagine an optimization that would remove an ORDER BY clause
because it isn't needed for any other aggregate.


I'm referring to the query:

select any_value(v order by v) from (values (2),(1),(3)) as vals (v);
// produces 1, per the documented implementation-defined behavior.

Implementation-dependent.  It is NOT implementation-defined, per spec.

I really don't care all that much about the spec here given that ORDER BY
in an aggregate call is non-spec.


Well, this is demonstrably wrong.

<array aggregate function> ::=
    ARRAY_AGG <left paren>
      <value expression>
      [ ORDER BY <sort specification list> ]
      <right paren>


Demoable only by you and a few others...


The standard is publicly available. It is strange that we, being so open, hold ourselves to such a closed standard; but that is what we do.


We should update our documentation - the source of SQL Standard knowledge
for mere mortals.

https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-AGGREGATES

"Note: The ability to specify both DISTINCT and ORDER BY in an aggregate
function is a PostgreSQL extension."

Apparently only DISTINCT remains as our extension.


Using DISTINCT in an aggregate is also standard. What that note is saying is that the standard does not allow *both* to be used at the same time.

The standard defines these things for specific aggregates whereas we are much more generic about it and therefore have to deal with the combinations.

I have submitted a doc patch to clarify that.


You are de-facto creating a first_value aggregate (which is by definition
non-standard) whether you like it or not.


I am de jure creating an any_value aggregate (which is by definition
standard) whether you like it or not.


Yes, both statements seem true.  At least until we decide to start ignoring
a user's explicit order by clause.


I ran some tests and including an ORDER BY in an aggregate that doesn't care (like COUNT) is devastating for performance. I will be proposing a solution to that soon and I invite you to participate in that conversation when I do.
--
Vik Fearing



Reply via email to