Oh wow, that fixed it!

I changed that line in the parser to be Expression(ExprContext.ACCEPT_ALL)
And now my nested JSON queries work!

The minimal reproduction is:

SELECT JSON_OBJECT(
    KEY 'id'
    VALUE (SELECT 1)
)

+----------+
|  EXPR$0  |
+----------+
| {"id":1} |
+----------+
1 row selected (0.01 seconds)

On Wed, Feb 23, 2022 at 6:30 PM Gavin Ray <[email protected]> wrote:

> Okay, I think maybe I have an idea what is going on
>
> The parser grammar for "JsonNameAndValue":
>
> https://github.com/apache/calcite/blob/82dd78a14f6aef2eeec2f9c94978d04b4acc5359/core/src/main/codegen/templates/Parser.jj#L6267-L6296
>
> Has:
> e = Expression(ExprContext.ACCEPT_NON_QUERY)
>
> This enum value appears to be the one that says it can't contain queries:
>
> https://github.com/apache/calcite/blob/82dd78a14f6aef2eeec2f9c94978d04b4acc5359/core/src/main/java/org/apache/calcite/sql/parser/SqlAbstractParserImpl.java#L328-L331
>
>     /**
>      * Accept only non-query expressions in this context.
>      */
>     ACCEPT_NON_QUERY,
>
> So it's not a bug, but Calcite's parser just doesn't take query
> expressions for JSON values.
>
> I guess that changing this might break a bunch of stuff, and probably is
> very difficult?
>
> On Wed, Feb 23, 2022 at 6:10 PM Gavin Ray <[email protected]> wrote:
>
>> Sorry, I realized this might be a time investment to try to
>> debug/reproduce
>> So I found a website that let me create the schema, rows, and query on
>> various DB's
>>
>> Here is an example on Oracle and MySQL that you can view/modify in your
>> browser:
>>
>> ORACLE:
>> https://dbfiddle.uk/?rdbms=oracle_21&fiddle=2eed69c44fa63adf9830213163ba73d0
>> MYSQL:
>> https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=f94c7957eae4f5ebe4c879c17fbe64ea
>>
>> On Wed, Feb 23, 2022 at 2:59 PM Gavin Ray <[email protected]> wrote:
>>
>>> I think this actually is a bug in Calcite's parser or it's interpretation
>>> I tested on H2, Oracle, and MySQL, the below is valid on those DB's but
>>> fails on Calcite's parser:
>>>
>>> =================
>>> H2/Oracle
>>> =================
>>> SELECT
>>>     "houses"."id" AS "id",
>>>     "houses"."address" AS "address",
>>>     (
>>>         SELECT
>>>             JSON_ARRAYAGG(JSON_OBJECT(
>>>                 KEY 'id' VALUE "users"."id",
>>>                 KEY 'name' VALUE "users"."name",
>>>                 KEY 'todos' VALUE (
>>>                     SELECT
>>>                         JSON_ARRAYAGG(JSON_OBJECT(
>>>                             KEY 'id' VALUE "todos"."id",
>>>                             KEY 'description' VALUE "todos"."description"
>>>                         ))
>>>                     FROM
>>>                         "todos"
>>>                     WHERE
>>>                         "todos"."user_id" = "users"."id"
>>>                 )
>>>             ))
>>>         FROM
>>>             "users"
>>>         WHERE
>>>             "users"."house_id" = "houses"."id"
>>>     ) "users"
>>> FROM
>>>     "houses"
>>> GROUP BY
>>>     "houses"."id",
>>>     "houses"."address"
>>>
>>> =================
>>> MySQL
>>> =================
>>> SELECT
>>>     `houses`.`id` AS `id`,
>>>     `houses`.`address` AS `address`,
>>>     (
>>>         SELECT
>>>             JSON_ARRAYAGG(JSON_OBJECT(
>>>                 'id', `users`.`id`,
>>>                 'name', `users`.`name`,
>>>                 'todos', (
>>>                     SELECT
>>>                         JSON_ARRAYAGG(JSON_OBJECT(
>>>                             'id', `todos`.`id`,
>>>                             'description', `todos`.`description`
>>>                         ))
>>>                     FROM
>>>                         `todos`
>>>                     WHERE
>>>                         `todos`.`user_id` = `users`.`id`
>>>                 )
>>>             ))
>>>         FROM
>>>             `users`
>>>         WHERE
>>>             `users`.`house_id` = `houses`.`id`
>>>     ) `users`
>>> FROM
>>>     `houses`
>>> GROUP BY
>>>     `houses`.`id`,
>>>     `houses`.`address`
>>>
>>> On Wed, Feb 23, 2022 at 12:52 PM Gavin Ray <[email protected]>
>>> wrote:
>>>
>>>> That's a valid point, let me see check what some common dialects will
>>>> accept for this sort of thing
>>>>
>>>> On Wed, Feb 23, 2022 at 12:36 PM Julian Hyde <[email protected]>
>>>> wrote:
>>>>
>>>>> It’s a parser error. That usually means that the user has made a
>>>>> mistake.
>>>>>
>>>>> Try your SQL on another DB with JSON support before you declare this a
>>>>> Calcite bug.
>>>>>
>>>>> Julian
>>>>>
>>>>> > On Feb 23, 2022, at 09:22, Gavin Ray <[email protected]> wrote:
>>>>> >
>>>>> > No dice still unfortunately =/
>>>>> >
>>>>> > If it's any easier, I put a self-contained single class reproduction
>>>>> on
>>>>> > Github:
>>>>> >
>>>>> https://github.com/GavinRay97/calcite-nested-json-subquery-bug/blob/main/src/test/java/com/example/AppTest.java
>>>>> >
>>>>> > Is it worth filing a JIRA ticket over you think?
>>>>> >
>>>>> >
>>>>> >> On Wed, Feb 23, 2022 at 3:45 AM Julian Hyde <[email protected]>
>>>>> wrote:
>>>>> >>
>>>>> >> Try ‘value ((‘ in place of ‘value (‘.
>>>>> >>
>>>>> >> Julian
>>>>> >>
>>>>> >>>> On Feb 21, 2022, at 9:33 AM, Gavin Ray <[email protected]>
>>>>> wrote:
>>>>> >>>
>>>>> >>> I hadn't thought about the fact that ORM's probably have to solve
>>>>> this
>>>>> >>> problem as well
>>>>> >>> That is a great suggestion, I will try to investigate some of the
>>>>> popular
>>>>> >>> ORM codebases and see if there are any tricks they are using.
>>>>> >>>
>>>>> >>> I seem to maybe be getting a tiny bit closer by using subqueries
>>>>> like
>>>>> >>> Julian suggested instead of operator calls
>>>>> >>> But if I may ask what is probably a very stupid question:
>>>>> >>>
>>>>> >>> What might the error message
>>>>> >>> "parse failed: Query expression encountered in illegal context
>>>>> >>> (state=,code=0)"
>>>>> >>>
>>>>> >>> Mean in the below query?
>>>>> >>>
>>>>> >>> The reason why I am confused is because the query runs if I remove
>>>>> the
>>>>> >>> innermost subquery ("todos")
>>>>> >>> But the innermost subquery is a direct copy-paste of the subquery
>>>>> above
>>>>> >> it,
>>>>> >>> so I know it MUST be valid
>>>>> >>>
>>>>> >>> As usual, thank you so much for your help/guidance Stamatis.
>>>>> >>>
>>>>> >>> select
>>>>> >>> "g0"."id" "id",
>>>>> >>> "g0"."address" "address",
>>>>> >>> (
>>>>> >>>   select json_arrayagg(json_object(
>>>>> >>>     key 'id' value "g1"."id",
>>>>> >>>     key 'todos' value (
>>>>> >>>       select json_arrayagg(json_object(
>>>>> >>>         key 'id' value "g2"."id",
>>>>> >>>         key 'description' value "g2"."description",
>>>>> >>>       ))
>>>>> >>>       from (
>>>>> >>>         select * from "todos"
>>>>> >>>         where "g1"."id" = "user_id"
>>>>> >>>         order by "id"
>>>>> >>>       ) "g2"
>>>>> >>>     )
>>>>> >>>   ))
>>>>> >>>   from (
>>>>> >>>     select * from "users"
>>>>> >>>     where "g0"."id" = "house_id"
>>>>> >>>     order by "id"
>>>>> >>>   ) "g1"
>>>>> >>> ) "users"
>>>>> >>> from "houses" "g0"
>>>>> >>> order by "g0"."id"
>>>>> >>>
>>>>> >>>> On Mon, Feb 21, 2022 at 8:07 AM Stamatis Zampetakis <
>>>>> [email protected]>
>>>>> >>>> wrote:
>>>>> >>>>
>>>>> >>>> Hi Gavin,
>>>>> >>>>
>>>>> >>>> A few more comments in case they help to get you a bit further on
>>>>> your
>>>>> >>>> work.
>>>>> >>>>
>>>>> >>>> The need to return the result as a single object is a common
>>>>> problem in
>>>>> >>>> object relational mapping (ORM) frameworks/APIS (JPA, Datanucleus,
>>>>> >>>> Hibernate, etc.). Apart from the suggestions so far maybe you
>>>>> could look
>>>>> >>>> into these frameworks as well for more inspiration.
>>>>> >>>>
>>>>> >>>> Moreover your approach of decomposing the query into individual
>>>>> parts is
>>>>> >>>> commonly known as the N+1 problem [1].
>>>>> >>>>
>>>>> >>>> Lastly, keep in mind that you can introduce custom UDF, UDAF
>>>>> functions
>>>>> >> if
>>>>> >>>> you need more flexibility on reconstructing the final result.
>>>>> >>>>
>>>>> >>>> Best,
>>>>> >>>> Stamatis
>>>>> >>>>
>>>>> >>>> [1]
>>>>> >>>>
>>>>> >>>>
>>>>> >>
>>>>> https://stackoverflow.com/questions/97197/what-is-the-n1-selects-problem-in-orm-object-relational-mapping
>>>>> >>>>
>>>>> >>>>> On Sun, Feb 13, 2022 at 3:59 AM Gavin Ray <[email protected]
>>>>> >
>>>>> >> wrote:
>>>>> >>>>>
>>>>> >>>>> Ah wait nevermind, got excited and spoke too soon. Looking at it
>>>>> more
>>>>> >>>>> closely, that data isn't correct.
>>>>> >>>>> At least it's in somewhat the right shape, ha!
>>>>> >>>>>
>>>>> >>>>>> On Sat, Feb 12, 2022 at 9:57 PM Gavin Ray <
>>>>> [email protected]>
>>>>> >> wrote:
>>>>> >>>>>
>>>>> >>>>>> After ~5 hours, I think I may have made some progress =)
>>>>> >>>>>>
>>>>> >>>>>> I have this, which currently works. The problem is that the
>>>>> nested
>>>>> >>>>> columns
>>>>> >>>>>> don't have names on them.
>>>>> >>>>>> Since I need to return a nested "Map<String, Object>", I have to
>>>>> >> figure
>>>>> >>>>>> out how to convert this query into a form that gives column
>>>>> names.
>>>>> >>>>>>
>>>>> >>>>>> But this is still great progress I think!
>>>>> >>>>>>
>>>>> >>>>>> SELECT
>>>>> >>>>>>   "todos".*,
>>>>> >>>>>>   ARRAY(
>>>>> >>>>>>       SELECT
>>>>> >>>>>>           "users".*,
>>>>> >>>>>>           ARRAY(
>>>>> >>>>>>               SELECT
>>>>> >>>>>>                   "todos".*
>>>>> >>>>>>               FROM
>>>>> >>>>>>                   "todos"
>>>>> >>>>>>           ) AS "todos"
>>>>> >>>>>>       FROM
>>>>> >>>>>>           "users"
>>>>> >>>>>>   ) AS "users"
>>>>> >>>>>> FROM
>>>>> >>>>>>   "todos"
>>>>> >>>>>> WHERE
>>>>> >>>>>>   "user_id" IN (
>>>>> >>>>>>       SELECT
>>>>> >>>>>>           "user_id"
>>>>> >>>>>>       FROM
>>>>> >>>>>>           "users"
>>>>> >>>>>>       WHERE
>>>>> >>>>>>           "house_id" IN (
>>>>> >>>>>>               SELECT
>>>>> >>>>>>                   "id"
>>>>> >>>>>>               FROM
>>>>> >>>>>>                   "houses"
>>>>> >>>>>>           )
>>>>> >>>>>>   );
>>>>> >>>>>>
>>>>> >>>>>>
>>>>> >>>>>>
>>>>> >>>>>>
>>>>> >>>>>
>>>>> >>>>
>>>>> >>
>>>>> +----+---------+------------------------+------------------------------------------------------------------------------+
>>>>> >>>>>> | id | user_id |      description       |
>>>>> >>>>>>                                           |
>>>>> >>>>>>
>>>>> >>>>>>
>>>>> >>>>>
>>>>> >>>>
>>>>> >>
>>>>> +----+---------+------------------------+------------------------------------------------------------------------------+
>>>>> >>>>>> | 1  | 1       | Take out the trash     | [{1, John, 1, [{1, 1,
>>>>> Take
>>>>> >>>> out
>>>>> >>>>>> the trash}, {2, 1, Watch my favorite show}, { |
>>>>> >>>>>> | 2  | 1       | Watch my favorite show | [{1, John, 1, [{1, 1,
>>>>> Take
>>>>> >>>> out
>>>>> >>>>>> the trash}, {2, 1, Watch my favorite show}, { |
>>>>> >>>>>> | 3  | 1       | Charge my phone        | [{1, John, 1, [{1, 1,
>>>>> Take
>>>>> >>>> out
>>>>> >>>>>> the trash}, {2, 1, Watch my favorite show}, { |
>>>>> >>>>>> | 4  | 2       | Cook dinner            | [{1, John, 1, [{1, 1,
>>>>> Take
>>>>> >>>> out
>>>>> >>>>>> the trash}, {2, 1, Watch my favorite show}, { |
>>>>> >>>>>> | 5  | 2       | Read a book            | [{1, John, 1, [{1, 1,
>>>>> Take
>>>>> >>>> out
>>>>> >>>>>> the trash}, {2, 1, Watch my favorite show}, { |
>>>>> >>>>>> | 6  | 2       | Organize office        | [{1, John, 1, [{1, 1,
>>>>> Take
>>>>> >>>> out
>>>>> >>>>>> the trash}, {2, 1, Watch my favorite show}, { |
>>>>> >>>>>> | 7  | 3       | Walk the dog           | [{1, John, 1, [{1, 1,
>>>>> Take
>>>>> >>>> out
>>>>> >>>>>> the trash}, {2, 1, Watch my favorite show}, { |
>>>>> >>>>>> | 8  | 3       | Feed the cat           | [{1, John, 1, [{1, 1,
>>>>> Take
>>>>> >>>> out
>>>>> >>>>>> the trash}, {2, 1, Watch my favorite show}, { |
>>>>> >>>>>>
>>>>> >>>>>>
>>>>> >>>>>
>>>>> >>>>
>>>>> >>
>>>>> +----+---------+------------------------+------------------------------------------------------------------------------+
>>>>> >>>>>>
>>>>> >>>>>> On Sat, Feb 12, 2022 at 4:13 PM Gavin Ray <
>>>>> [email protected]>
>>>>> >>>> wrote:
>>>>> >>>>>>
>>>>> >>>>>>> Nevermind, this is a standard term not something
>>>>> Calcite-specific it
>>>>> >>>>>>> seems!
>>>>> >>>>>>>
>>>>> >>>>>>> https://en.wikipedia.org/wiki/Correlated_subquery
>>>>> >>>>>>>
>>>>> >>>>>>> On Sat, Feb 12, 2022 at 3:46 PM Gavin Ray <
>>>>> [email protected]>
>>>>> >>>>> wrote:
>>>>> >>>>>>>
>>>>> >>>>>>>> Forgive my ignorance/lack of experience
>>>>> >>>>>>>>
>>>>> >>>>>>>> I am somewhat familiar with the ARRAY() function, but not
>>>>> sure I
>>>>> >> know
>>>>> >>>>>>>> the term "correlated"
>>>>> >>>>>>>> Searching the Calcite codebase for uses of "correlated" +
>>>>> "query", I
>>>>> >>>>>>>> found:
>>>>> >>>>>>>>
>>>>> >>>>>>>>
>>>>> >>>>>>>>
>>>>> >>>>>
>>>>> >>>>
>>>>> >>
>>>>> https://github.com/apache/calcite/blob/1d4f1b394bfdba03c5538017e12ab2431b137ca9/core/src/test/java/org/apache/calcite/test/SqlToRelConverterTest.java#L1603-L1612
>>>>> >>>>>>>>
>>>>> >>>>>>>> @Test void testCorrelatedSubQueryInJoin() {
>>>>> >>>>>>>>   final String sql = "select *\n"
>>>>> >>>>>>>>       + "from emp as e\n"
>>>>> >>>>>>>>       + "join dept as d using (deptno)\n"
>>>>> >>>>>>>>       + "where d.name = (\n"
>>>>> >>>>>>>>       + "  select max(name)\n"
>>>>> >>>>>>>>       + "  from dept as d2\n"
>>>>> >>>>>>>>       + "  where d2.deptno = d.deptno)";
>>>>> >>>>>>>>   sql(sql).withExpand(false).ok();
>>>>> >>>>>>>> }
>>>>> >>>>>>>>
>>>>> >>>>>>>> But I also see this, which says it is "uncorrelated" but
>>>>> seems very
>>>>> >>>>>>>> similar?
>>>>> >>>>>>>>
>>>>> >>>>>>>> @Test void testInUncorrelatedSubQuery() {
>>>>> >>>>>>>>   final String sql = "select empno from emp where deptno in"
>>>>> >>>>>>>>       + " (select deptno from dept)";
>>>>> >>>>>>>>   sql(sql).ok();
>>>>> >>>>>>>> }
>>>>> >>>>>>>>
>>>>> >>>>>>>> I wouldn't blame you for not answering such a basic question
>>>>> -- but
>>>>> >>>>> what
>>>>> >>>>>>>> exactly does "correlation" mean here?
>>>>> >>>>>>>>
>>>>> >>>>>>>> Thanks, as usual Julian
>>>>> >>>>>>>>
>>>>> >>>>>>>>
>>>>> >>>>>>>> On Sat, Feb 12, 2022 at 3:08 PM Julian Hyde <
>>>>> [email protected]
>>>>> >>>
>>>>> >>>>>>>> wrote:
>>>>> >>>>>>>>
>>>>> >>>>>>>>> Correlated ARRAY sub-query?
>>>>> >>>>>>>>>
>>>>> >>>>>>>>>> On Feb 12, 2022, at 10:40 AM, Gavin Ray <
>>>>> [email protected]>
>>>>> >>>>>>>>> wrote:
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Apologies for the delay in replying
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> This makes things clear and seems obvious now that you
>>>>> point it
>>>>> >>>> out.
>>>>> >>>>>>>>>> Thank you, Justin and Julian =)
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Let me ask another question (if I may) that I am struggling
>>>>> to
>>>>> >>>>> phrase
>>>>> >>>>>>>>>> easily.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> So with GraphQL, you might have a query like:
>>>>> >>>>>>>>>> - "Get houses"
>>>>> >>>>>>>>>> - "For each house get the user that lives in the house
>>>>> >>>>>>>>>> - "And for each user get their list of todos"
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> The result has to come back such that it's a single object
>>>>> for
>>>>> >>>> each
>>>>> >>>>>>>>> row
>>>>> >>>>>>>>>> ==============
>>>>> >>>>>>>>>> {
>>>>> >>>>>>>>>> houses: [{
>>>>> >>>>>>>>>>  address: "123 Main Street",
>>>>> >>>>>>>>>>  users: [{
>>>>> >>>>>>>>>>    name: "Joe",
>>>>> >>>>>>>>>>    todos: [{
>>>>> >>>>>>>>>>      description: "Take out trash"
>>>>> >>>>>>>>>>    }]
>>>>> >>>>>>>>>>  }]
>>>>> >>>>>>>>>> }
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> From a SQL perspective, the logical equivalent would be
>>>>> something
>>>>> >>>>>>>>> like:
>>>>> >>>>>>>>>> ==============
>>>>> >>>>>>>>>> SELECT
>>>>> >>>>>>>>>>  house.address,
>>>>> >>>>>>>>>>  (somehow nest users + double-nest todos under user)
>>>>> >>>>>>>>>> FROM
>>>>> >>>>>>>>>>  house
>>>>> >>>>>>>>>> JOIN
>>>>> >>>>>>>>>>  user ON user.house_id = house.id
>>>>> >>>>>>>>>>  todos ON todos.user_id = user.id
>>>>> >>>>>>>>>> WHERE
>>>>> >>>>>>>>>>  house.id = 1
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> I'm not familiar enough with SQL to have figured out a way
>>>>> to make
>>>>> >>>>>>>>> this
>>>>> >>>>>>>>>> kind of
>>>>> >>>>>>>>>> query using operators that are supported across most of the
>>>>> DB's
>>>>> >>>>>>>>> Calcite has
>>>>> >>>>>>>>>> adapters for.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Currently what I have done instead, on a tip from
>>>>> Gopalakrishna
>>>>> >>>>> Holla
>>>>> >>>>>>>>> from
>>>>> >>>>>>>>>> LinkedIn Coral team who has built GraphQL-on-Calcite, was
>>>>> to break
>>>>> >>>>> up
>>>>> >>>>>>>>> the
>>>>> >>>>>>>>>> query
>>>>> >>>>>>>>>> into individual parts and then do the join in-memory:
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> SELECT ... FROM users;
>>>>> >>>>>>>>>> SELECT ... FROM todos WHERE todos.user_id IN (ResultSet
>>>>> from prev
>>>>> >>>>>>>>> response);
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> However, the way I am doing this seems like it's probably
>>>>> very
>>>>> >>>>>>>>> inefficient.
>>>>> >>>>>>>>>> Because I do a series of nested loops to add the key to each
>>>>> >>>> object
>>>>> >>>>>>>>> in the
>>>>> >>>>>>>>>> parent ResultSet row:
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>
>>>>> >>>>>
>>>>> >>>>
>>>>> >>
>>>>> https://github.com/GavinRay97/GraphQLCalcite/blob/648e0ac4f6810a3c360d13a03e6597c33406de4b/src/main/kotlin/TableDataFetcher.kt#L135-L153
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Is there some better way of doing this?
>>>>> >>>>>>>>>> I would be eternally grateful for any advice.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> On Thu, Feb 10, 2022 at 3:27 PM Julian Hyde <
>>>>> >>>> [email protected]
>>>>> >>>>>>
>>>>> >>>>>>>>> wrote:
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>>> Yes, if you want to do multiple layers of aggregation, use
>>>>> CTEs
>>>>> >>>>>>>>> (WITH) or
>>>>> >>>>>>>>>>> nested sub-queries. For example, the following is I
>>>>> believe valid
>>>>> >>>>>>>>> standard
>>>>> >>>>>>>>>>> SQL, and actually computes something useful:
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> WITH q1 AS
>>>>> >>>>>>>>>>> (SELECT deptno, job, AVG(sal) AS avg_sal
>>>>> >>>>>>>>>>>  FROM emp
>>>>> >>>>>>>>>>>  GROUP BY deptno, job)
>>>>> >>>>>>>>>>> WITH q2 AS
>>>>> >>>>>>>>>>> (SELECT deptno, AVG(avg_sal) AS avg_avg_sal
>>>>> >>>>>>>>>>>  FROM q1
>>>>> >>>>>>>>>>>  GROUP BY deptno)
>>>>> >>>>>>>>>>> SELECT AVG(avg_avg_sal)
>>>>> >>>>>>>>>>> FROM q2
>>>>> >>>>>>>>>>> GROUP BY ()
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> (You can omit the “GROUP BY ()” line, but I think it makes
>>>>> things
>>>>> >>>>>>>>> clearer.)
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> Julian
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>>> On Feb 10, 2022, at 12:17 PM, Justin Swanhart <
>>>>> >>>>> [email protected]>
>>>>> >>>>>>>>>>> wrote:
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> I wish you could unsend emails :)  Answering my own
>>>>> question,
>>>>> >>>> no,
>>>>> >>>>>>>>> because
>>>>> >>>>>>>>>>>> that would return three rows with the average :D
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> On Thu, Feb 10, 2022 at 3:16 PM Justin Swanhart <
>>>>> >>>>>>>>> [email protected]>
>>>>> >>>>>>>>>>> wrote:
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>>> Just out of curiosity, is the second level aggregation
>>>>> using
>>>>> >>>> AVG
>>>>> >>>>>>>>> in a
>>>>> >>>>>>>>>>>>> window context?  It the frame is the whole table and it
>>>>> >>>>> aggregates
>>>>> >>>>>>>>> over
>>>>> >>>>>>>>>>> it?
>>>>> >>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>> On Thu, Feb 10, 2022 at 3:12 PM Justin Swanhart <
>>>>> >>>>>>>>> [email protected]>
>>>>> >>>>>>>>>>>>> wrote:
>>>>> >>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> That is really neat about Oracle.
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> The alternative in general is to use a subquery:
>>>>> >>>>>>>>>>>>>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno;
>>>>> >>>>>>>>>>>>>> becomes
>>>>> >>>>>>>>>>>>>> select avg(the_avg)
>>>>> >>>>>>>>>>>>>> from (select avg(sal) from emp group b deptno) an_alias;
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> or
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> with the_cte as (select avg(sal) x from emp group by
>>>>> deptno)
>>>>> >>>>>>>>>>>>>> select avg(x) from the_cte;
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> On Thu, Feb 10, 2022 at 3:03 PM Julian Hyde <
>>>>> >>>>>>>>> [email protected]>
>>>>> >>>>>>>>>>>>>> wrote:
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> Some databases, e.g. Oracle, allow TWO levels of
>>>>> nesting:
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> SELECT avg(sal) FROM emp GROUP BY deptno;
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> AVG(SAL)
>>>>> >>>>>>>>>>>>>>> ========
>>>>> >>>>>>>>>>>>>>> 1,566.67
>>>>> >>>>>>>>>>>>>>> 2,175.00
>>>>> >>>>>>>>>>>>>>> 2,916.65
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno;
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> AVG(SUM(SAL))
>>>>> >>>>>>>>>>>>>>> =============
>>>>> >>>>>>>>>>>>>>>     9,675
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> The first level aggregates by department (returning 3
>>>>> >>>> records),
>>>>> >>>>>>>>> and
>>>>> >>>>>>>>>>> the
>>>>> >>>>>>>>>>>>>>> second level computes the grand total (returning 1
>>>>> record).
>>>>> >>>> But
>>>>> >>>>>>>>> that
>>>>> >>>>>>>>>>> is an
>>>>> >>>>>>>>>>>>>>> exceptional case.
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> Generally, any expression in the SELECT or HAVING
>>>>> clause of
>>>>> >>>> an
>>>>> >>>>>>>>>>> aggregate
>>>>> >>>>>>>>>>>>>>> query is either ‘before’ or ‘after’ aggregation.
>>>>> Consider
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> SELECT t.x + 1 AS a, 2 + SUM(t.y + 3) AS b
>>>>> >>>>>>>>>>>>>>> FROM t
>>>>> >>>>>>>>>>>>>>> GROUP BY t.x
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> The expressions “t.y” and “t.y + 3” occur before
>>>>> aggregation;
>>>>> >>>>>>>>> “t.x”,
>>>>> >>>>>>>>>>>>>>> “t.x + 1”, “SUM(t.y + 3)” and “2 + SUM(t.y + 3)” occur
>>>>> after
>>>>> >>>>>>>>>>> aggregation.
>>>>> >>>>>>>>>>>>>>> SQL semantics rely heavily on this stratification.
>>>>> Allowing
>>>>> >>>> an
>>>>> >>>>>>>>> extra
>>>>> >>>>>>>>>>> level
>>>>> >>>>>>>>>>>>>>> of aggregation would mess it all up.
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> Julian
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>> On Feb 10, 2022, at 9:45 AM, Justin Swanhart <
>>>>> >>>>>>>>> [email protected]>
>>>>> >>>>>>>>>>>>>>> wrote:
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>> This is a SQL limitation.
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>> mysql> select sum(1);
>>>>> >>>>>>>>>>>>>>>> +--------+
>>>>> >>>>>>>>>>>>>>>> | sum(1) |
>>>>> >>>>>>>>>>>>>>>> +--------+
>>>>> >>>>>>>>>>>>>>>> |      1 |
>>>>> >>>>>>>>>>>>>>>> +--------+
>>>>> >>>>>>>>>>>>>>>> 1 row in set (0.00 sec)
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>> mysql> select sum(sum(1));
>>>>> >>>>>>>>>>>>>>>> ERROR 1111 (HY000): Invalid use of group function
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>> On Thu, Feb 10, 2022 at 12:39 PM Gavin Ray <
>>>>> >>>>>>>>> [email protected]>
>>>>> >>>>>>>>>>>>>>> wrote:
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>> Went to test this query out and found that it can't
>>>>> be
>>>>> >>>>>>>>> performed:
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>> SELECT
>>>>> >>>>>>>>>>>>>>>>> JSON_OBJECT(
>>>>> >>>>>>>>>>>>>>>>>    KEY 'users'
>>>>> >>>>>>>>>>>>>>>>>    VALUE JSON_ARRAYAGG(
>>>>> >>>>>>>>>>>>>>>>>        JSON_OBJECT(
>>>>> >>>>>>>>>>>>>>>>>            KEY 'name' VALUE "users"."name",
>>>>> >>>>>>>>>>>>>>>>>            KEY 'todos' VALUE JSON_ARRAYAGG(
>>>>> >>>>>>>>>>>>>>>>>                JSON_OBJECT(
>>>>> >>>>>>>>>>>>>>>>>                    KEY 'description' VALUE
>>>>> >>>>>>>>> "todos"."description"
>>>>> >>>>>>>>>>>>>>>>>                )
>>>>> >>>>>>>>>>>>>>>>>            )
>>>>> >>>>>>>>>>>>>>>>>        )
>>>>> >>>>>>>>>>>>>>>>>    )
>>>>> >>>>>>>>>>>>>>>>> )
>>>>> >>>>>>>>>>>>>>>>> FROM
>>>>> >>>>>>>>>>>>>>>>> "users"
>>>>> >>>>>>>>>>>>>>>>> LEFT OUTER JOIN
>>>>> >>>>>>>>>>>>>>>>> "todos" ON "users"."id" = "todos"."user_id";
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>> Checking the source, seems this is a blanket policy,
>>>>> not a
>>>>> >>>>>>>>>>>>>>>>> datasource-specific thing.
>>>>> >>>>>>>>>>>>>>>>> From a functional perspective, it doesn't feel like
>>>>> it's
>>>>> >>>> much
>>>>> >>>>>>>>>>>>>>> different
>>>>> >>>>>>>>>>>>>>>>> from JOINs
>>>>> >>>>>>>>>>>>>>>>> But I don't understand relational theory or DB
>>>>> >>>> functionality
>>>>> >>>>>>>>> in the
>>>>> >>>>>>>>>>>>>>> least,
>>>>> >>>>>>>>>>>>>>>>> so I'm not fit to judge.
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>> Just curious why Calcite doesn't allow this
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>
>>>>> >>>>>>>>>
>>>>> >>>>>
>>>>> >>>>
>>>>> >>
>>>>>
>>>>

Reply via email to