Yes, if you want to do multiple layers of aggregation, use CTEs (WITH) or
nested sub-queries. For example, the following is I believe valid standard SQL,
and actually computes something useful:
WITH q1 AS
(SELECT deptno, job, AVG(sal) AS avg_sal
FROM emp
GROUP BY deptno, job)
WITH q2 AS
(SELECT deptno, AVG(avg_sal) AS avg_avg_sal
FROM q1
GROUP BY deptno)
SELECT AVG(avg_avg_sal)
FROM q2
GROUP BY ()
(You can omit the “GROUP BY ()” line, but I think it makes things clearer.)
Julian
> On Feb 10, 2022, at 12:17 PM, Justin Swanhart <[email protected]> wrote:
>
> I wish you could unsend emails :) Answering my own question, no, because
> that would return three rows with the average :D
>
> On Thu, Feb 10, 2022 at 3:16 PM Justin Swanhart <[email protected]> wrote:
>
>> Just out of curiosity, is the second level aggregation using AVG in a
>> window context? It the frame is the whole table and it aggregates over it?
>>
>> On Thu, Feb 10, 2022 at 3:12 PM Justin Swanhart <[email protected]>
>> wrote:
>>
>>> That is really neat about Oracle.
>>>
>>> The alternative in general is to use a subquery:
>>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno;
>>> becomes
>>> select avg(the_avg)
>>> from (select avg(sal) from emp group b deptno) an_alias;
>>>
>>> or
>>>
>>> with the_cte as (select avg(sal) x from emp group by deptno)
>>> select avg(x) from the_cte;
>>>
>>> On Thu, Feb 10, 2022 at 3:03 PM Julian Hyde <[email protected]>
>>> wrote:
>>>
>>>> Some databases, e.g. Oracle, allow TWO levels of nesting:
>>>>
>>>> SELECT avg(sal) FROM emp GROUP BY deptno;
>>>>
>>>> AVG(SAL)
>>>> ========
>>>> 1,566.67
>>>> 2,175.00
>>>> 2,916.65
>>>>
>>>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno;
>>>>
>>>> AVG(SUM(SAL))
>>>> =============
>>>> 9,675
>>>>
>>>> The first level aggregates by department (returning 3 records), and the
>>>> second level computes the grand total (returning 1 record). But that is an
>>>> exceptional case.
>>>>
>>>> Generally, any expression in the SELECT or HAVING clause of an aggregate
>>>> query is either ‘before’ or ‘after’ aggregation. Consider
>>>>
>>>> SELECT t.x + 1 AS a, 2 + SUM(t.y + 3) AS b
>>>> FROM t
>>>> GROUP BY t.x
>>>>
>>>> The expressions “t.y” and “t.y + 3” occur before aggregation; “t.x”,
>>>> “t.x + 1”, “SUM(t.y + 3)” and “2 + SUM(t.y + 3)” occur after aggregation.
>>>> SQL semantics rely heavily on this stratification. Allowing an extra level
>>>> of aggregation would mess it all up.
>>>>
>>>> Julian
>>>>
>>>>
>>>>
>>>>> On Feb 10, 2022, at 9:45 AM, Justin Swanhart <[email protected]>
>>>> wrote:
>>>>>
>>>>> This is a SQL limitation.
>>>>>
>>>>> mysql> select sum(1);
>>>>> +--------+
>>>>> | sum(1) |
>>>>> +--------+
>>>>> | 1 |
>>>>> +--------+
>>>>> 1 row in set (0.00 sec)
>>>>>
>>>>> mysql> select sum(sum(1));
>>>>> ERROR 1111 (HY000): Invalid use of group function
>>>>>
>>>>> On Thu, Feb 10, 2022 at 12:39 PM Gavin Ray <[email protected]>
>>>> wrote:
>>>>>
>>>>>> Went to test this query out and found that it can't be performed:
>>>>>>
>>>>>> SELECT
>>>>>> JSON_OBJECT(
>>>>>> KEY 'users'
>>>>>> VALUE JSON_ARRAYAGG(
>>>>>> JSON_OBJECT(
>>>>>> KEY 'name' VALUE "users"."name",
>>>>>> KEY 'todos' VALUE JSON_ARRAYAGG(
>>>>>> JSON_OBJECT(
>>>>>> KEY 'description' VALUE "todos"."description"
>>>>>> )
>>>>>> )
>>>>>> )
>>>>>> )
>>>>>> )
>>>>>> FROM
>>>>>> "users"
>>>>>> LEFT OUTER JOIN
>>>>>> "todos" ON "users"."id" = "todos"."user_id";
>>>>>>
>>>>>> Checking the source, seems this is a blanket policy, not a
>>>>>> datasource-specific thing.
>>>>>> From a functional perspective, it doesn't feel like it's much
>>>> different
>>>>>> from JOINs
>>>>>> But I don't understand relational theory or DB functionality in the
>>>> least,
>>>>>> so I'm not fit to judge.
>>>>>>
>>>>>> Just curious why Calcite doesn't allow this
>>>>>>
>>>>
>>>>