Nevermind, this is a standard term not something Calcite-specific it seems!
https://en.wikipedia.org/wiki/Correlated_subquery On Sat, Feb 12, 2022 at 3:46 PM Gavin Ray <[email protected]> wrote: > Forgive my ignorance/lack of experience > > I am somewhat familiar with the ARRAY() function, but not sure I know the > term "correlated" > Searching the Calcite codebase for uses of "correlated" + "query", I found: > > > https://github.com/apache/calcite/blob/1d4f1b394bfdba03c5538017e12ab2431b137ca9/core/src/test/java/org/apache/calcite/test/SqlToRelConverterTest.java#L1603-L1612 > > @Test void testCorrelatedSubQueryInJoin() { > final String sql = "select *\n" > + "from emp as e\n" > + "join dept as d using (deptno)\n" > + "where d.name = (\n" > + " select max(name)\n" > + " from dept as d2\n" > + " where d2.deptno = d.deptno)"; > sql(sql).withExpand(false).ok(); > } > > But I also see this, which says it is "uncorrelated" but seems very > similar? > > @Test void testInUncorrelatedSubQuery() { > final String sql = "select empno from emp where deptno in" > + " (select deptno from dept)"; > sql(sql).ok(); > } > > I wouldn't blame you for not answering such a basic question -- but what > exactly does "correlation" mean here? > > Thanks, as usual Julian > > > On Sat, Feb 12, 2022 at 3:08 PM Julian Hyde <[email protected]> > wrote: > >> Correlated ARRAY sub-query? >> >> > On Feb 12, 2022, at 10:40 AM, Gavin Ray <[email protected]> wrote: >> > >> > Apologies for the delay in replying >> > >> > This makes things clear and seems obvious now that you point it out. >> > Thank you, Justin and Julian =) >> > >> > Let me ask another question (if I may) that I am struggling to phrase >> > easily. >> > >> > So with GraphQL, you might have a query like: >> > - "Get houses" >> > - "For each house get the user that lives in the house >> > - "And for each user get their list of todos" >> > >> > The result has to come back such that it's a single object for each row >> > ============== >> > { >> > houses: [{ >> > address: "123 Main Street", >> > users: [{ >> > name: "Joe", >> > todos: [{ >> > description: "Take out trash" >> > }] >> > }] >> > } >> > >> > From a SQL perspective, the logical equivalent would be something like: >> > ============== >> > SELECT >> > house.address, >> > (somehow nest users + double-nest todos under user) >> > FROM >> > house >> > JOIN >> > user ON user.house_id = house.id >> > todos ON todos.user_id = user.id >> > WHERE >> > house.id = 1 >> > >> > I'm not familiar enough with SQL to have figured out a way to make this >> > kind of >> > query using operators that are supported across most of the DB's >> Calcite has >> > adapters for. >> > >> > Currently what I have done instead, on a tip from Gopalakrishna Holla >> from >> > LinkedIn Coral team who has built GraphQL-on-Calcite, was to break up >> the >> > query >> > into individual parts and then do the join in-memory: >> > >> > SELECT ... FROM users; >> > SELECT ... FROM todos WHERE todos.user_id IN (ResultSet from prev >> response); >> > >> > However, the way I am doing this seems like it's probably very >> inefficient. >> > Because I do a series of nested loops to add the key to each object in >> the >> > parent ResultSet row: >> > >> > >> https://github.com/GavinRay97/GraphQLCalcite/blob/648e0ac4f6810a3c360d13a03e6597c33406de4b/src/main/kotlin/TableDataFetcher.kt#L135-L153 >> > >> > Is there some better way of doing this? >> > I would be eternally grateful for any advice. >> > >> > >> > On Thu, Feb 10, 2022 at 3:27 PM Julian Hyde <[email protected]> >> wrote: >> > >> >> Yes, if you want to do multiple layers of aggregation, use CTEs (WITH) >> or >> >> nested sub-queries. For example, the following is I believe valid >> standard >> >> SQL, and actually computes something useful: >> >> >> >> WITH q1 AS >> >> (SELECT deptno, job, AVG(sal) AS avg_sal >> >> FROM emp >> >> GROUP BY deptno, job) >> >> WITH q2 AS >> >> (SELECT deptno, AVG(avg_sal) AS avg_avg_sal >> >> FROM q1 >> >> GROUP BY deptno) >> >> SELECT AVG(avg_avg_sal) >> >> FROM q2 >> >> GROUP BY () >> >> >> >> (You can omit the “GROUP BY ()” line, but I think it makes things >> clearer.) >> >> >> >> Julian >> >> >> >> >> >> >> >>> On Feb 10, 2022, at 12:17 PM, Justin Swanhart <[email protected]> >> >> wrote: >> >>> >> >>> I wish you could unsend emails :) Answering my own question, no, >> because >> >>> that would return three rows with the average :D >> >>> >> >>> On Thu, Feb 10, 2022 at 3:16 PM Justin Swanhart <[email protected]> >> >> wrote: >> >>> >> >>>> Just out of curiosity, is the second level aggregation using AVG in a >> >>>> window context? It the frame is the whole table and it aggregates >> over >> >> it? >> >>>> >> >>>> On Thu, Feb 10, 2022 at 3:12 PM Justin Swanhart <[email protected] >> > >> >>>> wrote: >> >>>> >> >>>>> That is really neat about Oracle. >> >>>>> >> >>>>> The alternative in general is to use a subquery: >> >>>>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno; >> >>>>> becomes >> >>>>> select avg(the_avg) >> >>>>> from (select avg(sal) from emp group b deptno) an_alias; >> >>>>> >> >>>>> or >> >>>>> >> >>>>> with the_cte as (select avg(sal) x from emp group by deptno) >> >>>>> select avg(x) from the_cte; >> >>>>> >> >>>>> On Thu, Feb 10, 2022 at 3:03 PM Julian Hyde <[email protected] >> > >> >>>>> wrote: >> >>>>> >> >>>>>> Some databases, e.g. Oracle, allow TWO levels of nesting: >> >>>>>> >> >>>>>> SELECT avg(sal) FROM emp GROUP BY deptno; >> >>>>>> >> >>>>>> AVG(SAL) >> >>>>>> ======== >> >>>>>> 1,566.67 >> >>>>>> 2,175.00 >> >>>>>> 2,916.65 >> >>>>>> >> >>>>>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno; >> >>>>>> >> >>>>>> AVG(SUM(SAL)) >> >>>>>> ============= >> >>>>>> 9,675 >> >>>>>> >> >>>>>> The first level aggregates by department (returning 3 records), and >> >> the >> >>>>>> second level computes the grand total (returning 1 record). But >> that >> >> is an >> >>>>>> exceptional case. >> >>>>>> >> >>>>>> Generally, any expression in the SELECT or HAVING clause of an >> >> aggregate >> >>>>>> query is either ‘before’ or ‘after’ aggregation. Consider >> >>>>>> >> >>>>>> SELECT t.x + 1 AS a, 2 + SUM(t.y + 3) AS b >> >>>>>> FROM t >> >>>>>> GROUP BY t.x >> >>>>>> >> >>>>>> The expressions “t.y” and “t.y + 3” occur before aggregation; >> “t.x”, >> >>>>>> “t.x + 1”, “SUM(t.y + 3)” and “2 + SUM(t.y + 3)” occur after >> >> aggregation. >> >>>>>> SQL semantics rely heavily on this stratification. Allowing an >> extra >> >> level >> >>>>>> of aggregation would mess it all up. >> >>>>>> >> >>>>>> Julian >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>>> On Feb 10, 2022, at 9:45 AM, Justin Swanhart <[email protected] >> > >> >>>>>> wrote: >> >>>>>>> >> >>>>>>> This is a SQL limitation. >> >>>>>>> >> >>>>>>> mysql> select sum(1); >> >>>>>>> +--------+ >> >>>>>>> | sum(1) | >> >>>>>>> +--------+ >> >>>>>>> | 1 | >> >>>>>>> +--------+ >> >>>>>>> 1 row in set (0.00 sec) >> >>>>>>> >> >>>>>>> mysql> select sum(sum(1)); >> >>>>>>> ERROR 1111 (HY000): Invalid use of group function >> >>>>>>> >> >>>>>>> On Thu, Feb 10, 2022 at 12:39 PM Gavin Ray <[email protected] >> > >> >>>>>> wrote: >> >>>>>>> >> >>>>>>>> Went to test this query out and found that it can't be performed: >> >>>>>>>> >> >>>>>>>> SELECT >> >>>>>>>> JSON_OBJECT( >> >>>>>>>> KEY 'users' >> >>>>>>>> VALUE JSON_ARRAYAGG( >> >>>>>>>> JSON_OBJECT( >> >>>>>>>> KEY 'name' VALUE "users"."name", >> >>>>>>>> KEY 'todos' VALUE JSON_ARRAYAGG( >> >>>>>>>> JSON_OBJECT( >> >>>>>>>> KEY 'description' VALUE >> "todos"."description" >> >>>>>>>> ) >> >>>>>>>> ) >> >>>>>>>> ) >> >>>>>>>> ) >> >>>>>>>> ) >> >>>>>>>> FROM >> >>>>>>>> "users" >> >>>>>>>> LEFT OUTER JOIN >> >>>>>>>> "todos" ON "users"."id" = "todos"."user_id"; >> >>>>>>>> >> >>>>>>>> Checking the source, seems this is a blanket policy, not a >> >>>>>>>> datasource-specific thing. >> >>>>>>>> From a functional perspective, it doesn't feel like it's much >> >>>>>> different >> >>>>>>>> from JOINs >> >>>>>>>> But I don't understand relational theory or DB functionality in >> the >> >>>>>> least, >> >>>>>>>> so I'm not fit to judge. >> >>>>>>>> >> >>>>>>>> Just curious why Calcite doesn't allow this >> >>>>>>>> >> >>>>>> >> >>>>>> >> >> >> >> >> >>
