Well actually, there is one exception -- there's a small bug with the output being improperly escaped but that bug is already on JIRA and seems much easier to approach.
https://issues.apache.org/jira/browse/CALCITE-4989 On Wed, Feb 23, 2022 at 6:46 PM Gavin Ray <[email protected]> wrote: > Oh wow, that fixed it! > > I changed that line in the parser to be Expression(ExprContext.ACCEPT_ALL) > And now my nested JSON queries work! > > The minimal reproduction is: > > SELECT JSON_OBJECT( > KEY 'id' > VALUE (SELECT 1) > ) > > +----------+ > | EXPR$0 | > +----------+ > | {"id":1} | > +----------+ > 1 row selected (0.01 seconds) > > On Wed, Feb 23, 2022 at 6:30 PM Gavin Ray <[email protected]> wrote: > >> Okay, I think maybe I have an idea what is going on >> >> The parser grammar for "JsonNameAndValue": >> >> https://github.com/apache/calcite/blob/82dd78a14f6aef2eeec2f9c94978d04b4acc5359/core/src/main/codegen/templates/Parser.jj#L6267-L6296 >> >> Has: >> e = Expression(ExprContext.ACCEPT_NON_QUERY) >> >> This enum value appears to be the one that says it can't contain queries: >> >> https://github.com/apache/calcite/blob/82dd78a14f6aef2eeec2f9c94978d04b4acc5359/core/src/main/java/org/apache/calcite/sql/parser/SqlAbstractParserImpl.java#L328-L331 >> >> /** >> * Accept only non-query expressions in this context. >> */ >> ACCEPT_NON_QUERY, >> >> So it's not a bug, but Calcite's parser just doesn't take query >> expressions for JSON values. >> >> I guess that changing this might break a bunch of stuff, and probably is >> very difficult? >> >> On Wed, Feb 23, 2022 at 6:10 PM Gavin Ray <[email protected]> wrote: >> >>> Sorry, I realized this might be a time investment to try to >>> debug/reproduce >>> So I found a website that let me create the schema, rows, and query on >>> various DB's >>> >>> Here is an example on Oracle and MySQL that you can view/modify in your >>> browser: >>> >>> ORACLE: >>> https://dbfiddle.uk/?rdbms=oracle_21&fiddle=2eed69c44fa63adf9830213163ba73d0 >>> MYSQL: >>> https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=f94c7957eae4f5ebe4c879c17fbe64ea >>> >>> On Wed, Feb 23, 2022 at 2:59 PM Gavin Ray <[email protected]> wrote: >>> >>>> I think this actually is a bug in Calcite's parser or it's >>>> interpretation >>>> I tested on H2, Oracle, and MySQL, the below is valid on those DB's but >>>> fails on Calcite's parser: >>>> >>>> ================= >>>> H2/Oracle >>>> ================= >>>> SELECT >>>> "houses"."id" AS "id", >>>> "houses"."address" AS "address", >>>> ( >>>> SELECT >>>> JSON_ARRAYAGG(JSON_OBJECT( >>>> KEY 'id' VALUE "users"."id", >>>> KEY 'name' VALUE "users"."name", >>>> KEY 'todos' VALUE ( >>>> SELECT >>>> JSON_ARRAYAGG(JSON_OBJECT( >>>> KEY 'id' VALUE "todos"."id", >>>> KEY 'description' VALUE >>>> "todos"."description" >>>> )) >>>> FROM >>>> "todos" >>>> WHERE >>>> "todos"."user_id" = "users"."id" >>>> ) >>>> )) >>>> FROM >>>> "users" >>>> WHERE >>>> "users"."house_id" = "houses"."id" >>>> ) "users" >>>> FROM >>>> "houses" >>>> GROUP BY >>>> "houses"."id", >>>> "houses"."address" >>>> >>>> ================= >>>> MySQL >>>> ================= >>>> SELECT >>>> `houses`.`id` AS `id`, >>>> `houses`.`address` AS `address`, >>>> ( >>>> SELECT >>>> JSON_ARRAYAGG(JSON_OBJECT( >>>> 'id', `users`.`id`, >>>> 'name', `users`.`name`, >>>> 'todos', ( >>>> SELECT >>>> JSON_ARRAYAGG(JSON_OBJECT( >>>> 'id', `todos`.`id`, >>>> 'description', `todos`.`description` >>>> )) >>>> FROM >>>> `todos` >>>> WHERE >>>> `todos`.`user_id` = `users`.`id` >>>> ) >>>> )) >>>> FROM >>>> `users` >>>> WHERE >>>> `users`.`house_id` = `houses`.`id` >>>> ) `users` >>>> FROM >>>> `houses` >>>> GROUP BY >>>> `houses`.`id`, >>>> `houses`.`address` >>>> >>>> On Wed, Feb 23, 2022 at 12:52 PM Gavin Ray <[email protected]> >>>> wrote: >>>> >>>>> That's a valid point, let me see check what some common dialects will >>>>> accept for this sort of thing >>>>> >>>>> On Wed, Feb 23, 2022 at 12:36 PM Julian Hyde <[email protected]> >>>>> wrote: >>>>> >>>>>> It’s a parser error. That usually means that the user has made a >>>>>> mistake. >>>>>> >>>>>> Try your SQL on another DB with JSON support before you declare this >>>>>> a Calcite bug. >>>>>> >>>>>> Julian >>>>>> >>>>>> > On Feb 23, 2022, at 09:22, Gavin Ray <[email protected]> wrote: >>>>>> > >>>>>> > No dice still unfortunately =/ >>>>>> > >>>>>> > If it's any easier, I put a self-contained single class >>>>>> reproduction on >>>>>> > Github: >>>>>> > >>>>>> https://github.com/GavinRay97/calcite-nested-json-subquery-bug/blob/main/src/test/java/com/example/AppTest.java >>>>>> > >>>>>> > Is it worth filing a JIRA ticket over you think? >>>>>> > >>>>>> > >>>>>> >> On Wed, Feb 23, 2022 at 3:45 AM Julian Hyde < >>>>>> [email protected]> wrote: >>>>>> >> >>>>>> >> Try ‘value ((‘ in place of ‘value (‘. >>>>>> >> >>>>>> >> Julian >>>>>> >> >>>>>> >>>> On Feb 21, 2022, at 9:33 AM, Gavin Ray <[email protected]> >>>>>> wrote: >>>>>> >>> >>>>>> >>> I hadn't thought about the fact that ORM's probably have to >>>>>> solve this >>>>>> >>> problem as well >>>>>> >>> That is a great suggestion, I will try to investigate some of the >>>>>> popular >>>>>> >>> ORM codebases and see if there are any tricks they are using. >>>>>> >>> >>>>>> >>> I seem to maybe be getting a tiny bit closer by using subqueries >>>>>> like >>>>>> >>> Julian suggested instead of operator calls >>>>>> >>> But if I may ask what is probably a very stupid question: >>>>>> >>> >>>>>> >>> What might the error message >>>>>> >>> "parse failed: Query expression encountered in illegal context >>>>>> >>> (state=,code=0)" >>>>>> >>> >>>>>> >>> Mean in the below query? >>>>>> >>> >>>>>> >>> The reason why I am confused is because the query runs if I >>>>>> remove the >>>>>> >>> innermost subquery ("todos") >>>>>> >>> But the innermost subquery is a direct copy-paste of the subquery >>>>>> above >>>>>> >> it, >>>>>> >>> so I know it MUST be valid >>>>>> >>> >>>>>> >>> As usual, thank you so much for your help/guidance Stamatis. >>>>>> >>> >>>>>> >>> select >>>>>> >>> "g0"."id" "id", >>>>>> >>> "g0"."address" "address", >>>>>> >>> ( >>>>>> >>> select json_arrayagg(json_object( >>>>>> >>> key 'id' value "g1"."id", >>>>>> >>> key 'todos' value ( >>>>>> >>> select json_arrayagg(json_object( >>>>>> >>> key 'id' value "g2"."id", >>>>>> >>> key 'description' value "g2"."description", >>>>>> >>> )) >>>>>> >>> from ( >>>>>> >>> select * from "todos" >>>>>> >>> where "g1"."id" = "user_id" >>>>>> >>> order by "id" >>>>>> >>> ) "g2" >>>>>> >>> ) >>>>>> >>> )) >>>>>> >>> from ( >>>>>> >>> select * from "users" >>>>>> >>> where "g0"."id" = "house_id" >>>>>> >>> order by "id" >>>>>> >>> ) "g1" >>>>>> >>> ) "users" >>>>>> >>> from "houses" "g0" >>>>>> >>> order by "g0"."id" >>>>>> >>> >>>>>> >>>> On Mon, Feb 21, 2022 at 8:07 AM Stamatis Zampetakis < >>>>>> [email protected]> >>>>>> >>>> wrote: >>>>>> >>>> >>>>>> >>>> Hi Gavin, >>>>>> >>>> >>>>>> >>>> A few more comments in case they help to get you a bit further >>>>>> on your >>>>>> >>>> work. >>>>>> >>>> >>>>>> >>>> The need to return the result as a single object is a common >>>>>> problem in >>>>>> >>>> object relational mapping (ORM) frameworks/APIS (JPA, >>>>>> Datanucleus, >>>>>> >>>> Hibernate, etc.). Apart from the suggestions so far maybe you >>>>>> could look >>>>>> >>>> into these frameworks as well for more inspiration. >>>>>> >>>> >>>>>> >>>> Moreover your approach of decomposing the query into individual >>>>>> parts is >>>>>> >>>> commonly known as the N+1 problem [1]. >>>>>> >>>> >>>>>> >>>> Lastly, keep in mind that you can introduce custom UDF, UDAF >>>>>> functions >>>>>> >> if >>>>>> >>>> you need more flexibility on reconstructing the final result. >>>>>> >>>> >>>>>> >>>> Best, >>>>>> >>>> Stamatis >>>>>> >>>> >>>>>> >>>> [1] >>>>>> >>>> >>>>>> >>>> >>>>>> >> >>>>>> https://stackoverflow.com/questions/97197/what-is-the-n1-selects-problem-in-orm-object-relational-mapping >>>>>> >>>> >>>>>> >>>>> On Sun, Feb 13, 2022 at 3:59 AM Gavin Ray < >>>>>> [email protected]> >>>>>> >> wrote: >>>>>> >>>>> >>>>>> >>>>> Ah wait nevermind, got excited and spoke too soon. Looking at >>>>>> it more >>>>>> >>>>> closely, that data isn't correct. >>>>>> >>>>> At least it's in somewhat the right shape, ha! >>>>>> >>>>> >>>>>> >>>>>> On Sat, Feb 12, 2022 at 9:57 PM Gavin Ray < >>>>>> [email protected]> >>>>>> >> wrote: >>>>>> >>>>> >>>>>> >>>>>> After ~5 hours, I think I may have made some progress =) >>>>>> >>>>>> >>>>>> >>>>>> I have this, which currently works. The problem is that the >>>>>> nested >>>>>> >>>>> columns >>>>>> >>>>>> don't have names on them. >>>>>> >>>>>> Since I need to return a nested "Map<String, Object>", I have >>>>>> to >>>>>> >> figure >>>>>> >>>>>> out how to convert this query into a form that gives column >>>>>> names. >>>>>> >>>>>> >>>>>> >>>>>> But this is still great progress I think! >>>>>> >>>>>> >>>>>> >>>>>> SELECT >>>>>> >>>>>> "todos".*, >>>>>> >>>>>> ARRAY( >>>>>> >>>>>> SELECT >>>>>> >>>>>> "users".*, >>>>>> >>>>>> ARRAY( >>>>>> >>>>>> SELECT >>>>>> >>>>>> "todos".* >>>>>> >>>>>> FROM >>>>>> >>>>>> "todos" >>>>>> >>>>>> ) AS "todos" >>>>>> >>>>>> FROM >>>>>> >>>>>> "users" >>>>>> >>>>>> ) AS "users" >>>>>> >>>>>> FROM >>>>>> >>>>>> "todos" >>>>>> >>>>>> WHERE >>>>>> >>>>>> "user_id" IN ( >>>>>> >>>>>> SELECT >>>>>> >>>>>> "user_id" >>>>>> >>>>>> FROM >>>>>> >>>>>> "users" >>>>>> >>>>>> WHERE >>>>>> >>>>>> "house_id" IN ( >>>>>> >>>>>> SELECT >>>>>> >>>>>> "id" >>>>>> >>>>>> FROM >>>>>> >>>>>> "houses" >>>>>> >>>>>> ) >>>>>> >>>>>> ); >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>>> >>>> >>>>>> >> >>>>>> +----+---------+------------------------+------------------------------------------------------------------------------+ >>>>>> >>>>>> | id | user_id | description | >>>>>> >>>>>> | >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>>> >>>> >>>>>> >> >>>>>> +----+---------+------------------------+------------------------------------------------------------------------------+ >>>>>> >>>>>> | 1 | 1 | Take out the trash | [{1, John, 1, [{1, >>>>>> 1, Take >>>>>> >>>> out >>>>>> >>>>>> the trash}, {2, 1, Watch my favorite show}, { | >>>>>> >>>>>> | 2 | 1 | Watch my favorite show | [{1, John, 1, [{1, >>>>>> 1, Take >>>>>> >>>> out >>>>>> >>>>>> the trash}, {2, 1, Watch my favorite show}, { | >>>>>> >>>>>> | 3 | 1 | Charge my phone | [{1, John, 1, [{1, >>>>>> 1, Take >>>>>> >>>> out >>>>>> >>>>>> the trash}, {2, 1, Watch my favorite show}, { | >>>>>> >>>>>> | 4 | 2 | Cook dinner | [{1, John, 1, [{1, >>>>>> 1, Take >>>>>> >>>> out >>>>>> >>>>>> the trash}, {2, 1, Watch my favorite show}, { | >>>>>> >>>>>> | 5 | 2 | Read a book | [{1, John, 1, [{1, >>>>>> 1, Take >>>>>> >>>> out >>>>>> >>>>>> the trash}, {2, 1, Watch my favorite show}, { | >>>>>> >>>>>> | 6 | 2 | Organize office | [{1, John, 1, [{1, >>>>>> 1, Take >>>>>> >>>> out >>>>>> >>>>>> the trash}, {2, 1, Watch my favorite show}, { | >>>>>> >>>>>> | 7 | 3 | Walk the dog | [{1, John, 1, [{1, >>>>>> 1, Take >>>>>> >>>> out >>>>>> >>>>>> the trash}, {2, 1, Watch my favorite show}, { | >>>>>> >>>>>> | 8 | 3 | Feed the cat | [{1, John, 1, [{1, >>>>>> 1, Take >>>>>> >>>> out >>>>>> >>>>>> the trash}, {2, 1, Watch my favorite show}, { | >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>>> >>>> >>>>>> >> >>>>>> +----+---------+------------------------+------------------------------------------------------------------------------+ >>>>>> >>>>>> >>>>>> >>>>>> On Sat, Feb 12, 2022 at 4:13 PM Gavin Ray < >>>>>> [email protected]> >>>>>> >>>> wrote: >>>>>> >>>>>> >>>>>> >>>>>>> Nevermind, this is a standard term not something >>>>>> Calcite-specific it >>>>>> >>>>>>> seems! >>>>>> >>>>>>> >>>>>> >>>>>>> https://en.wikipedia.org/wiki/Correlated_subquery >>>>>> >>>>>>> >>>>>> >>>>>>> On Sat, Feb 12, 2022 at 3:46 PM Gavin Ray < >>>>>> [email protected]> >>>>>> >>>>> wrote: >>>>>> >>>>>>> >>>>>> >>>>>>>> Forgive my ignorance/lack of experience >>>>>> >>>>>>>> >>>>>> >>>>>>>> I am somewhat familiar with the ARRAY() function, but not >>>>>> sure I >>>>>> >> know >>>>>> >>>>>>>> the term "correlated" >>>>>> >>>>>>>> Searching the Calcite codebase for uses of "correlated" + >>>>>> "query", I >>>>>> >>>>>>>> found: >>>>>> >>>>>>>> >>>>>> >>>>>>>> >>>>>> >>>>>>>> >>>>>> >>>>> >>>>>> >>>> >>>>>> >> >>>>>> https://github.com/apache/calcite/blob/1d4f1b394bfdba03c5538017e12ab2431b137ca9/core/src/test/java/org/apache/calcite/test/SqlToRelConverterTest.java#L1603-L1612 >>>>>> >>>>>>>> >>>>>> >>>>>>>> @Test void testCorrelatedSubQueryInJoin() { >>>>>> >>>>>>>> final String sql = "select *\n" >>>>>> >>>>>>>> + "from emp as e\n" >>>>>> >>>>>>>> + "join dept as d using (deptno)\n" >>>>>> >>>>>>>> + "where d.name = (\n" >>>>>> >>>>>>>> + " select max(name)\n" >>>>>> >>>>>>>> + " from dept as d2\n" >>>>>> >>>>>>>> + " where d2.deptno = d.deptno)"; >>>>>> >>>>>>>> sql(sql).withExpand(false).ok(); >>>>>> >>>>>>>> } >>>>>> >>>>>>>> >>>>>> >>>>>>>> But I also see this, which says it is "uncorrelated" but >>>>>> seems very >>>>>> >>>>>>>> similar? >>>>>> >>>>>>>> >>>>>> >>>>>>>> @Test void testInUncorrelatedSubQuery() { >>>>>> >>>>>>>> final String sql = "select empno from emp where deptno in" >>>>>> >>>>>>>> + " (select deptno from dept)"; >>>>>> >>>>>>>> sql(sql).ok(); >>>>>> >>>>>>>> } >>>>>> >>>>>>>> >>>>>> >>>>>>>> I wouldn't blame you for not answering such a basic question >>>>>> -- but >>>>>> >>>>> what >>>>>> >>>>>>>> exactly does "correlation" mean here? >>>>>> >>>>>>>> >>>>>> >>>>>>>> Thanks, as usual Julian >>>>>> >>>>>>>> >>>>>> >>>>>>>> >>>>>> >>>>>>>> On Sat, Feb 12, 2022 at 3:08 PM Julian Hyde < >>>>>> [email protected] >>>>>> >>> >>>>>> >>>>>>>> wrote: >>>>>> >>>>>>>> >>>>>> >>>>>>>>> Correlated ARRAY sub-query? >>>>>> >>>>>>>>> >>>>>> >>>>>>>>>> On Feb 12, 2022, at 10:40 AM, Gavin Ray < >>>>>> [email protected]> >>>>>> >>>>>>>>> wrote: >>>>>> >>>>>>>>>> >>>>>> >>>>>>>>>> Apologies for the delay in replying >>>>>> >>>>>>>>>> >>>>>> >>>>>>>>>> This makes things clear and seems obvious now that you >>>>>> point it >>>>>> >>>> out. >>>>>> >>>>>>>>>> Thank you, Justin and Julian =) >>>>>> >>>>>>>>>> >>>>>> >>>>>>>>>> Let me ask another question (if I may) that I am >>>>>> struggling to >>>>>> >>>>> phrase >>>>>> >>>>>>>>>> easily. >>>>>> >>>>>>>>>> >>>>>> >>>>>>>>>> So with GraphQL, you might have a query like: >>>>>> >>>>>>>>>> - "Get houses" >>>>>> >>>>>>>>>> - "For each house get the user that lives in the house >>>>>> >>>>>>>>>> - "And for each user get their list of todos" >>>>>> >>>>>>>>>> >>>>>> >>>>>>>>>> The result has to come back such that it's a single object >>>>>> for >>>>>> >>>> each >>>>>> >>>>>>>>> row >>>>>> >>>>>>>>>> ============== >>>>>> >>>>>>>>>> { >>>>>> >>>>>>>>>> houses: [{ >>>>>> >>>>>>>>>> address: "123 Main Street", >>>>>> >>>>>>>>>> users: [{ >>>>>> >>>>>>>>>> name: "Joe", >>>>>> >>>>>>>>>> todos: [{ >>>>>> >>>>>>>>>> description: "Take out trash" >>>>>> >>>>>>>>>> }] >>>>>> >>>>>>>>>> }] >>>>>> >>>>>>>>>> } >>>>>> >>>>>>>>>> >>>>>> >>>>>>>>>> From a SQL perspective, the logical equivalent would be >>>>>> something >>>>>> >>>>>>>>> like: >>>>>> >>>>>>>>>> ============== >>>>>> >>>>>>>>>> SELECT >>>>>> >>>>>>>>>> house.address, >>>>>> >>>>>>>>>> (somehow nest users + double-nest todos under user) >>>>>> >>>>>>>>>> FROM >>>>>> >>>>>>>>>> house >>>>>> >>>>>>>>>> JOIN >>>>>> >>>>>>>>>> user ON user.house_id = house.id >>>>>> >>>>>>>>>> todos ON todos.user_id = user.id >>>>>> >>>>>>>>>> WHERE >>>>>> >>>>>>>>>> house.id = 1 >>>>>> >>>>>>>>>> >>>>>> >>>>>>>>>> I'm not familiar enough with SQL to have figured out a way >>>>>> to make >>>>>> >>>>>>>>> this >>>>>> >>>>>>>>>> kind of >>>>>> >>>>>>>>>> query using operators that are supported across most of >>>>>> the DB's >>>>>> >>>>>>>>> Calcite has >>>>>> >>>>>>>>>> adapters for. >>>>>> >>>>>>>>>> >>>>>> >>>>>>>>>> Currently what I have done instead, on a tip from >>>>>> Gopalakrishna >>>>>> >>>>> Holla >>>>>> >>>>>>>>> from >>>>>> >>>>>>>>>> LinkedIn Coral team who has built GraphQL-on-Calcite, was >>>>>> to break >>>>>> >>>>> up >>>>>> >>>>>>>>> the >>>>>> >>>>>>>>>> query >>>>>> >>>>>>>>>> into individual parts and then do the join in-memory: >>>>>> >>>>>>>>>> >>>>>> >>>>>>>>>> SELECT ... FROM users; >>>>>> >>>>>>>>>> SELECT ... FROM todos WHERE todos.user_id IN (ResultSet >>>>>> from prev >>>>>> >>>>>>>>> response); >>>>>> >>>>>>>>>> >>>>>> >>>>>>>>>> However, the way I am doing this seems like it's probably >>>>>> very >>>>>> >>>>>>>>> inefficient. >>>>>> >>>>>>>>>> Because I do a series of nested loops to add the key to >>>>>> each >>>>>> >>>> object >>>>>> >>>>>>>>> in the >>>>>> >>>>>>>>>> parent ResultSet row: >>>>>> >>>>>>>>>> >>>>>> >>>>>>>>>> >>>>>> >>>>>>>>> >>>>>> >>>>> >>>>>> >>>> >>>>>> >> >>>>>> https://github.com/GavinRay97/GraphQLCalcite/blob/648e0ac4f6810a3c360d13a03e6597c33406de4b/src/main/kotlin/TableDataFetcher.kt#L135-L153 >>>>>> >>>>>>>>>> >>>>>> >>>>>>>>>> Is there some better way of doing this? >>>>>> >>>>>>>>>> I would be eternally grateful for any advice. >>>>>> >>>>>>>>>> >>>>>> >>>>>>>>>> >>>>>> >>>>>>>>>> On Thu, Feb 10, 2022 at 3:27 PM Julian Hyde < >>>>>> >>>> [email protected] >>>>>> >>>>>> >>>>>> >>>>>>>>> wrote: >>>>>> >>>>>>>>>> >>>>>> >>>>>>>>>>> Yes, if you want to do multiple layers of aggregation, >>>>>> use CTEs >>>>>> >>>>>>>>> (WITH) or >>>>>> >>>>>>>>>>> nested sub-queries. For example, the following is I >>>>>> believe valid >>>>>> >>>>>>>>> standard >>>>>> >>>>>>>>>>> SQL, and actually computes something useful: >>>>>> >>>>>>>>>>> >>>>>> >>>>>>>>>>> WITH q1 AS >>>>>> >>>>>>>>>>> (SELECT deptno, job, AVG(sal) AS avg_sal >>>>>> >>>>>>>>>>> FROM emp >>>>>> >>>>>>>>>>> GROUP BY deptno, job) >>>>>> >>>>>>>>>>> WITH q2 AS >>>>>> >>>>>>>>>>> (SELECT deptno, AVG(avg_sal) AS avg_avg_sal >>>>>> >>>>>>>>>>> FROM q1 >>>>>> >>>>>>>>>>> GROUP BY deptno) >>>>>> >>>>>>>>>>> SELECT AVG(avg_avg_sal) >>>>>> >>>>>>>>>>> FROM q2 >>>>>> >>>>>>>>>>> GROUP BY () >>>>>> >>>>>>>>>>> >>>>>> >>>>>>>>>>> (You can omit the “GROUP BY ()” line, but I think it >>>>>> makes things >>>>>> >>>>>>>>> clearer.) >>>>>> >>>>>>>>>>> >>>>>> >>>>>>>>>>> Julian >>>>>> >>>>>>>>>>> >>>>>> >>>>>>>>>>> >>>>>> >>>>>>>>>>> >>>>>> >>>>>>>>>>>> On Feb 10, 2022, at 12:17 PM, Justin Swanhart < >>>>>> >>>>> [email protected]> >>>>>> >>>>>>>>>>> wrote: >>>>>> >>>>>>>>>>>> >>>>>> >>>>>>>>>>>> I wish you could unsend emails :) Answering my own >>>>>> question, >>>>>> >>>> no, >>>>>> >>>>>>>>> because >>>>>> >>>>>>>>>>>> that would return three rows with the average :D >>>>>> >>>>>>>>>>>> >>>>>> >>>>>>>>>>>> On Thu, Feb 10, 2022 at 3:16 PM Justin Swanhart < >>>>>> >>>>>>>>> [email protected]> >>>>>> >>>>>>>>>>> wrote: >>>>>> >>>>>>>>>>>> >>>>>> >>>>>>>>>>>>> Just out of curiosity, is the second level aggregation >>>>>> using >>>>>> >>>> AVG >>>>>> >>>>>>>>> in a >>>>>> >>>>>>>>>>>>> window context? It the frame is the whole table and it >>>>>> >>>>> aggregates >>>>>> >>>>>>>>> over >>>>>> >>>>>>>>>>> it? >>>>>> >>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>> On Thu, Feb 10, 2022 at 3:12 PM Justin Swanhart < >>>>>> >>>>>>>>> [email protected]> >>>>>> >>>>>>>>>>>>> wrote: >>>>>> >>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>> That is really neat about Oracle. >>>>>> >>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>> The alternative in general is to use a subquery: >>>>>> >>>>>>>>>>>>>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno; >>>>>> >>>>>>>>>>>>>> becomes >>>>>> >>>>>>>>>>>>>> select avg(the_avg) >>>>>> >>>>>>>>>>>>>> from (select avg(sal) from emp group b deptno) >>>>>> an_alias; >>>>>> >>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>> or >>>>>> >>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>> with the_cte as (select avg(sal) x from emp group by >>>>>> deptno) >>>>>> >>>>>>>>>>>>>> select avg(x) from the_cte; >>>>>> >>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>> On Thu, Feb 10, 2022 at 3:03 PM Julian Hyde < >>>>>> >>>>>>>>> [email protected]> >>>>>> >>>>>>>>>>>>>> wrote: >>>>>> >>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>> Some databases, e.g. Oracle, allow TWO levels of >>>>>> nesting: >>>>>> >>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>> SELECT avg(sal) FROM emp GROUP BY deptno; >>>>>> >>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>> AVG(SAL) >>>>>> >>>>>>>>>>>>>>> ======== >>>>>> >>>>>>>>>>>>>>> 1,566.67 >>>>>> >>>>>>>>>>>>>>> 2,175.00 >>>>>> >>>>>>>>>>>>>>> 2,916.65 >>>>>> >>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno; >>>>>> >>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>> AVG(SUM(SAL)) >>>>>> >>>>>>>>>>>>>>> ============= >>>>>> >>>>>>>>>>>>>>> 9,675 >>>>>> >>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>> The first level aggregates by department (returning 3 >>>>>> >>>> records), >>>>>> >>>>>>>>> and >>>>>> >>>>>>>>>>> the >>>>>> >>>>>>>>>>>>>>> second level computes the grand total (returning 1 >>>>>> record). >>>>>> >>>> But >>>>>> >>>>>>>>> that >>>>>> >>>>>>>>>>> is an >>>>>> >>>>>>>>>>>>>>> exceptional case. >>>>>> >>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>> Generally, any expression in the SELECT or HAVING >>>>>> clause of >>>>>> >>>> an >>>>>> >>>>>>>>>>> aggregate >>>>>> >>>>>>>>>>>>>>> query is either ‘before’ or ‘after’ aggregation. >>>>>> Consider >>>>>> >>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>> SELECT t.x + 1 AS a, 2 + SUM(t.y + 3) AS b >>>>>> >>>>>>>>>>>>>>> FROM t >>>>>> >>>>>>>>>>>>>>> GROUP BY t.x >>>>>> >>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>> The expressions “t.y” and “t.y + 3” occur before >>>>>> aggregation; >>>>>> >>>>>>>>> “t.x”, >>>>>> >>>>>>>>>>>>>>> “t.x + 1”, “SUM(t.y + 3)” and “2 + SUM(t.y + 3)” >>>>>> occur after >>>>>> >>>>>>>>>>> aggregation. >>>>>> >>>>>>>>>>>>>>> SQL semantics rely heavily on this stratification. >>>>>> Allowing >>>>>> >>>> an >>>>>> >>>>>>>>> extra >>>>>> >>>>>>>>>>> level >>>>>> >>>>>>>>>>>>>>> of aggregation would mess it all up. >>>>>> >>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>> Julian >>>>>> >>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>> On Feb 10, 2022, at 9:45 AM, Justin Swanhart < >>>>>> >>>>>>>>> [email protected]> >>>>>> >>>>>>>>>>>>>>> wrote: >>>>>> >>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>> This is a SQL limitation. >>>>>> >>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>> mysql> select sum(1); >>>>>> >>>>>>>>>>>>>>>> +--------+ >>>>>> >>>>>>>>>>>>>>>> | sum(1) | >>>>>> >>>>>>>>>>>>>>>> +--------+ >>>>>> >>>>>>>>>>>>>>>> | 1 | >>>>>> >>>>>>>>>>>>>>>> +--------+ >>>>>> >>>>>>>>>>>>>>>> 1 row in set (0.00 sec) >>>>>> >>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>> mysql> select sum(sum(1)); >>>>>> >>>>>>>>>>>>>>>> ERROR 1111 (HY000): Invalid use of group function >>>>>> >>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>> On Thu, Feb 10, 2022 at 12:39 PM Gavin Ray < >>>>>> >>>>>>>>> [email protected]> >>>>>> >>>>>>>>>>>>>>> wrote: >>>>>> >>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>>> Went to test this query out and found that it can't >>>>>> be >>>>>> >>>>>>>>> performed: >>>>>> >>>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>>> SELECT >>>>>> >>>>>>>>>>>>>>>>> JSON_OBJECT( >>>>>> >>>>>>>>>>>>>>>>> KEY 'users' >>>>>> >>>>>>>>>>>>>>>>> VALUE JSON_ARRAYAGG( >>>>>> >>>>>>>>>>>>>>>>> JSON_OBJECT( >>>>>> >>>>>>>>>>>>>>>>> KEY 'name' VALUE "users"."name", >>>>>> >>>>>>>>>>>>>>>>> KEY 'todos' VALUE JSON_ARRAYAGG( >>>>>> >>>>>>>>>>>>>>>>> JSON_OBJECT( >>>>>> >>>>>>>>>>>>>>>>> KEY 'description' VALUE >>>>>> >>>>>>>>> "todos"."description" >>>>>> >>>>>>>>>>>>>>>>> ) >>>>>> >>>>>>>>>>>>>>>>> ) >>>>>> >>>>>>>>>>>>>>>>> ) >>>>>> >>>>>>>>>>>>>>>>> ) >>>>>> >>>>>>>>>>>>>>>>> ) >>>>>> >>>>>>>>>>>>>>>>> FROM >>>>>> >>>>>>>>>>>>>>>>> "users" >>>>>> >>>>>>>>>>>>>>>>> LEFT OUTER JOIN >>>>>> >>>>>>>>>>>>>>>>> "todos" ON "users"."id" = "todos"."user_id"; >>>>>> >>>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>>> Checking the source, seems this is a blanket >>>>>> policy, not a >>>>>> >>>>>>>>>>>>>>>>> datasource-specific thing. >>>>>> >>>>>>>>>>>>>>>>> From a functional perspective, it doesn't feel like >>>>>> it's >>>>>> >>>> much >>>>>> >>>>>>>>>>>>>>> different >>>>>> >>>>>>>>>>>>>>>>> from JOINs >>>>>> >>>>>>>>>>>>>>>>> But I don't understand relational theory or DB >>>>>> >>>> functionality >>>>>> >>>>>>>>> in the >>>>>> >>>>>>>>>>>>>>> least, >>>>>> >>>>>>>>>>>>>>>>> so I'm not fit to judge. >>>>>> >>>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>>> Just curious why Calcite doesn't allow this >>>>>> >>>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>> >>>>>> >>>>>>>>>>> >>>>>> >>>>>>>>> >>>>>> >>>>>>>>> >>>>>> >>>>> >>>>>> >>>> >>>>>> >> >>>>>> >>>>>
