Try ‘value ((‘ in place of ‘value (‘. Julian
> On Feb 21, 2022, at 9:33 AM, Gavin Ray <[email protected]> wrote: > > I hadn't thought about the fact that ORM's probably have to solve this > problem as well > That is a great suggestion, I will try to investigate some of the popular > ORM codebases and see if there are any tricks they are using. > > I seem to maybe be getting a tiny bit closer by using subqueries like > Julian suggested instead of operator calls > But if I may ask what is probably a very stupid question: > > What might the error message > "parse failed: Query expression encountered in illegal context > (state=,code=0)" > > Mean in the below query? > > The reason why I am confused is because the query runs if I remove the > innermost subquery ("todos") > But the innermost subquery is a direct copy-paste of the subquery above it, > so I know it MUST be valid > > As usual, thank you so much for your help/guidance Stamatis. > > select > "g0"."id" "id", > "g0"."address" "address", > ( > select json_arrayagg(json_object( > key 'id' value "g1"."id", > key 'todos' value ( > select json_arrayagg(json_object( > key 'id' value "g2"."id", > key 'description' value "g2"."description", > )) > from ( > select * from "todos" > where "g1"."id" = "user_id" > order by "id" > ) "g2" > ) > )) > from ( > select * from "users" > where "g0"."id" = "house_id" > order by "id" > ) "g1" > ) "users" > from "houses" "g0" > order by "g0"."id" > >> On Mon, Feb 21, 2022 at 8:07 AM Stamatis Zampetakis <[email protected]> >> wrote: >> >> Hi Gavin, >> >> A few more comments in case they help to get you a bit further on your >> work. >> >> The need to return the result as a single object is a common problem in >> object relational mapping (ORM) frameworks/APIS (JPA, Datanucleus, >> Hibernate, etc.). Apart from the suggestions so far maybe you could look >> into these frameworks as well for more inspiration. >> >> Moreover your approach of decomposing the query into individual parts is >> commonly known as the N+1 problem [1]. >> >> Lastly, keep in mind that you can introduce custom UDF, UDAF functions if >> you need more flexibility on reconstructing the final result. >> >> Best, >> Stamatis >> >> [1] >> >> https://stackoverflow.com/questions/97197/what-is-the-n1-selects-problem-in-orm-object-relational-mapping >> >>> On Sun, Feb 13, 2022 at 3:59 AM Gavin Ray <[email protected]> wrote: >>> >>> Ah wait nevermind, got excited and spoke too soon. Looking at it more >>> closely, that data isn't correct. >>> At least it's in somewhat the right shape, ha! >>> >>>> On Sat, Feb 12, 2022 at 9:57 PM Gavin Ray <[email protected]> wrote: >>> >>>> After ~5 hours, I think I may have made some progress =) >>>> >>>> I have this, which currently works. The problem is that the nested >>> columns >>>> don't have names on them. >>>> Since I need to return a nested "Map<String, Object>", I have to figure >>>> out how to convert this query into a form that gives column names. >>>> >>>> But this is still great progress I think! >>>> >>>> SELECT >>>> "todos".*, >>>> ARRAY( >>>> SELECT >>>> "users".*, >>>> ARRAY( >>>> SELECT >>>> "todos".* >>>> FROM >>>> "todos" >>>> ) AS "todos" >>>> FROM >>>> "users" >>>> ) AS "users" >>>> FROM >>>> "todos" >>>> WHERE >>>> "user_id" IN ( >>>> SELECT >>>> "user_id" >>>> FROM >>>> "users" >>>> WHERE >>>> "house_id" IN ( >>>> SELECT >>>> "id" >>>> FROM >>>> "houses" >>>> ) >>>> ); >>>> >>>> >>>> >>>> >>> >> +----+---------+------------------------+------------------------------------------------------------------------------+ >>>> | id | user_id | description | >>>> | >>>> >>>> >>> >> +----+---------+------------------------+------------------------------------------------------------------------------+ >>>> | 1 | 1 | Take out the trash | [{1, John, 1, [{1, 1, Take >> out >>>> the trash}, {2, 1, Watch my favorite show}, { | >>>> | 2 | 1 | Watch my favorite show | [{1, John, 1, [{1, 1, Take >> out >>>> the trash}, {2, 1, Watch my favorite show}, { | >>>> | 3 | 1 | Charge my phone | [{1, John, 1, [{1, 1, Take >> out >>>> the trash}, {2, 1, Watch my favorite show}, { | >>>> | 4 | 2 | Cook dinner | [{1, John, 1, [{1, 1, Take >> out >>>> the trash}, {2, 1, Watch my favorite show}, { | >>>> | 5 | 2 | Read a book | [{1, John, 1, [{1, 1, Take >> out >>>> the trash}, {2, 1, Watch my favorite show}, { | >>>> | 6 | 2 | Organize office | [{1, John, 1, [{1, 1, Take >> out >>>> the trash}, {2, 1, Watch my favorite show}, { | >>>> | 7 | 3 | Walk the dog | [{1, John, 1, [{1, 1, Take >> out >>>> the trash}, {2, 1, Watch my favorite show}, { | >>>> | 8 | 3 | Feed the cat | [{1, John, 1, [{1, 1, Take >> out >>>> the trash}, {2, 1, Watch my favorite show}, { | >>>> >>>> >>> >> +----+---------+------------------------+------------------------------------------------------------------------------+ >>>> >>>> On Sat, Feb 12, 2022 at 4:13 PM Gavin Ray <[email protected]> >> wrote: >>>> >>>>> Nevermind, this is a standard term not something Calcite-specific it >>>>> seems! >>>>> >>>>> https://en.wikipedia.org/wiki/Correlated_subquery >>>>> >>>>> On Sat, Feb 12, 2022 at 3:46 PM Gavin Ray <[email protected]> >>> wrote: >>>>> >>>>>> Forgive my ignorance/lack of experience >>>>>> >>>>>> I am somewhat familiar with the ARRAY() function, but not sure I know >>>>>> the term "correlated" >>>>>> Searching the Calcite codebase for uses of "correlated" + "query", I >>>>>> found: >>>>>> >>>>>> >>>>>> >>> >> https://github.com/apache/calcite/blob/1d4f1b394bfdba03c5538017e12ab2431b137ca9/core/src/test/java/org/apache/calcite/test/SqlToRelConverterTest.java#L1603-L1612 >>>>>> >>>>>> @Test void testCorrelatedSubQueryInJoin() { >>>>>> final String sql = "select *\n" >>>>>> + "from emp as e\n" >>>>>> + "join dept as d using (deptno)\n" >>>>>> + "where d.name = (\n" >>>>>> + " select max(name)\n" >>>>>> + " from dept as d2\n" >>>>>> + " where d2.deptno = d.deptno)"; >>>>>> sql(sql).withExpand(false).ok(); >>>>>> } >>>>>> >>>>>> But I also see this, which says it is "uncorrelated" but seems very >>>>>> similar? >>>>>> >>>>>> @Test void testInUncorrelatedSubQuery() { >>>>>> final String sql = "select empno from emp where deptno in" >>>>>> + " (select deptno from dept)"; >>>>>> sql(sql).ok(); >>>>>> } >>>>>> >>>>>> I wouldn't blame you for not answering such a basic question -- but >>> what >>>>>> exactly does "correlation" mean here? >>>>>> >>>>>> Thanks, as usual Julian >>>>>> >>>>>> >>>>>> On Sat, Feb 12, 2022 at 3:08 PM Julian Hyde <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> Correlated ARRAY sub-query? >>>>>>> >>>>>>>> On Feb 12, 2022, at 10:40 AM, Gavin Ray <[email protected]> >>>>>>> wrote: >>>>>>>> >>>>>>>> Apologies for the delay in replying >>>>>>>> >>>>>>>> This makes things clear and seems obvious now that you point it >> out. >>>>>>>> Thank you, Justin and Julian =) >>>>>>>> >>>>>>>> Let me ask another question (if I may) that I am struggling to >>> phrase >>>>>>>> easily. >>>>>>>> >>>>>>>> So with GraphQL, you might have a query like: >>>>>>>> - "Get houses" >>>>>>>> - "For each house get the user that lives in the house >>>>>>>> - "And for each user get their list of todos" >>>>>>>> >>>>>>>> The result has to come back such that it's a single object for >> each >>>>>>> row >>>>>>>> ============== >>>>>>>> { >>>>>>>> houses: [{ >>>>>>>> address: "123 Main Street", >>>>>>>> users: [{ >>>>>>>> name: "Joe", >>>>>>>> todos: [{ >>>>>>>> description: "Take out trash" >>>>>>>> }] >>>>>>>> }] >>>>>>>> } >>>>>>>> >>>>>>>> From a SQL perspective, the logical equivalent would be something >>>>>>> like: >>>>>>>> ============== >>>>>>>> SELECT >>>>>>>> house.address, >>>>>>>> (somehow nest users + double-nest todos under user) >>>>>>>> FROM >>>>>>>> house >>>>>>>> JOIN >>>>>>>> user ON user.house_id = house.id >>>>>>>> todos ON todos.user_id = user.id >>>>>>>> WHERE >>>>>>>> house.id = 1 >>>>>>>> >>>>>>>> I'm not familiar enough with SQL to have figured out a way to make >>>>>>> this >>>>>>>> kind of >>>>>>>> query using operators that are supported across most of the DB's >>>>>>> Calcite has >>>>>>>> adapters for. >>>>>>>> >>>>>>>> Currently what I have done instead, on a tip from Gopalakrishna >>> Holla >>>>>>> from >>>>>>>> LinkedIn Coral team who has built GraphQL-on-Calcite, was to break >>> up >>>>>>> the >>>>>>>> query >>>>>>>> into individual parts and then do the join in-memory: >>>>>>>> >>>>>>>> SELECT ... FROM users; >>>>>>>> SELECT ... FROM todos WHERE todos.user_id IN (ResultSet from prev >>>>>>> response); >>>>>>>> >>>>>>>> However, the way I am doing this seems like it's probably very >>>>>>> inefficient. >>>>>>>> Because I do a series of nested loops to add the key to each >> object >>>>>>> in the >>>>>>>> parent ResultSet row: >>>>>>>> >>>>>>>> >>>>>>> >>> >> https://github.com/GavinRay97/GraphQLCalcite/blob/648e0ac4f6810a3c360d13a03e6597c33406de4b/src/main/kotlin/TableDataFetcher.kt#L135-L153 >>>>>>>> >>>>>>>> Is there some better way of doing this? >>>>>>>> I would be eternally grateful for any advice. >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Feb 10, 2022 at 3:27 PM Julian Hyde < >> [email protected] >>>> >>>>>>> wrote: >>>>>>>> >>>>>>>>> Yes, if you want to do multiple layers of aggregation, use CTEs >>>>>>> (WITH) or >>>>>>>>> nested sub-queries. For example, the following is I believe valid >>>>>>> standard >>>>>>>>> SQL, and actually computes something useful: >>>>>>>>> >>>>>>>>> WITH q1 AS >>>>>>>>> (SELECT deptno, job, AVG(sal) AS avg_sal >>>>>>>>> FROM emp >>>>>>>>> GROUP BY deptno, job) >>>>>>>>> WITH q2 AS >>>>>>>>> (SELECT deptno, AVG(avg_sal) AS avg_avg_sal >>>>>>>>> FROM q1 >>>>>>>>> GROUP BY deptno) >>>>>>>>> SELECT AVG(avg_avg_sal) >>>>>>>>> FROM q2 >>>>>>>>> GROUP BY () >>>>>>>>> >>>>>>>>> (You can omit the “GROUP BY ()” line, but I think it makes things >>>>>>> clearer.) >>>>>>>>> >>>>>>>>> Julian >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> On Feb 10, 2022, at 12:17 PM, Justin Swanhart < >>> [email protected]> >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> I wish you could unsend emails :) Answering my own question, >> no, >>>>>>> because >>>>>>>>>> that would return three rows with the average :D >>>>>>>>>> >>>>>>>>>> On Thu, Feb 10, 2022 at 3:16 PM Justin Swanhart < >>>>>>> [email protected]> >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Just out of curiosity, is the second level aggregation using >> AVG >>>>>>> in a >>>>>>>>>>> window context? It the frame is the whole table and it >>> aggregates >>>>>>> over >>>>>>>>> it? >>>>>>>>>>> >>>>>>>>>>> On Thu, Feb 10, 2022 at 3:12 PM Justin Swanhart < >>>>>>> [email protected]> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> That is really neat about Oracle. >>>>>>>>>>>> >>>>>>>>>>>> The alternative in general is to use a subquery: >>>>>>>>>>>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno; >>>>>>>>>>>> becomes >>>>>>>>>>>> select avg(the_avg) >>>>>>>>>>>> from (select avg(sal) from emp group b deptno) an_alias; >>>>>>>>>>>> >>>>>>>>>>>> or >>>>>>>>>>>> >>>>>>>>>>>> with the_cte as (select avg(sal) x from emp group by deptno) >>>>>>>>>>>> select avg(x) from the_cte; >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Feb 10, 2022 at 3:03 PM Julian Hyde < >>>>>>> [email protected]> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Some databases, e.g. Oracle, allow TWO levels of nesting: >>>>>>>>>>>>> >>>>>>>>>>>>> SELECT avg(sal) FROM emp GROUP BY deptno; >>>>>>>>>>>>> >>>>>>>>>>>>> AVG(SAL) >>>>>>>>>>>>> ======== >>>>>>>>>>>>> 1,566.67 >>>>>>>>>>>>> 2,175.00 >>>>>>>>>>>>> 2,916.65 >>>>>>>>>>>>> >>>>>>>>>>>>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno; >>>>>>>>>>>>> >>>>>>>>>>>>> AVG(SUM(SAL)) >>>>>>>>>>>>> ============= >>>>>>>>>>>>> 9,675 >>>>>>>>>>>>> >>>>>>>>>>>>> The first level aggregates by department (returning 3 >> records), >>>>>>> and >>>>>>>>> the >>>>>>>>>>>>> second level computes the grand total (returning 1 record). >> But >>>>>>> that >>>>>>>>> is an >>>>>>>>>>>>> exceptional case. >>>>>>>>>>>>> >>>>>>>>>>>>> Generally, any expression in the SELECT or HAVING clause of >> an >>>>>>>>> aggregate >>>>>>>>>>>>> query is either ‘before’ or ‘after’ aggregation. Consider >>>>>>>>>>>>> >>>>>>>>>>>>> SELECT t.x + 1 AS a, 2 + SUM(t.y + 3) AS b >>>>>>>>>>>>> FROM t >>>>>>>>>>>>> GROUP BY t.x >>>>>>>>>>>>> >>>>>>>>>>>>> The expressions “t.y” and “t.y + 3” occur before aggregation; >>>>>>> “t.x”, >>>>>>>>>>>>> “t.x + 1”, “SUM(t.y + 3)” and “2 + SUM(t.y + 3)” occur after >>>>>>>>> aggregation. >>>>>>>>>>>>> SQL semantics rely heavily on this stratification. Allowing >> an >>>>>>> extra >>>>>>>>> level >>>>>>>>>>>>> of aggregation would mess it all up. >>>>>>>>>>>>> >>>>>>>>>>>>> Julian >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> On Feb 10, 2022, at 9:45 AM, Justin Swanhart < >>>>>>> [email protected]> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> This is a SQL limitation. >>>>>>>>>>>>>> >>>>>>>>>>>>>> mysql> select sum(1); >>>>>>>>>>>>>> +--------+ >>>>>>>>>>>>>> | sum(1) | >>>>>>>>>>>>>> +--------+ >>>>>>>>>>>>>> | 1 | >>>>>>>>>>>>>> +--------+ >>>>>>>>>>>>>> 1 row in set (0.00 sec) >>>>>>>>>>>>>> >>>>>>>>>>>>>> mysql> select sum(sum(1)); >>>>>>>>>>>>>> ERROR 1111 (HY000): Invalid use of group function >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Thu, Feb 10, 2022 at 12:39 PM Gavin Ray < >>>>>>> [email protected]> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Went to test this query out and found that it can't be >>>>>>> performed: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> SELECT >>>>>>>>>>>>>>> JSON_OBJECT( >>>>>>>>>>>>>>> KEY 'users' >>>>>>>>>>>>>>> VALUE JSON_ARRAYAGG( >>>>>>>>>>>>>>> JSON_OBJECT( >>>>>>>>>>>>>>> KEY 'name' VALUE "users"."name", >>>>>>>>>>>>>>> KEY 'todos' VALUE JSON_ARRAYAGG( >>>>>>>>>>>>>>> JSON_OBJECT( >>>>>>>>>>>>>>> KEY 'description' VALUE >>>>>>> "todos"."description" >>>>>>>>>>>>>>> ) >>>>>>>>>>>>>>> ) >>>>>>>>>>>>>>> ) >>>>>>>>>>>>>>> ) >>>>>>>>>>>>>>> ) >>>>>>>>>>>>>>> FROM >>>>>>>>>>>>>>> "users" >>>>>>>>>>>>>>> LEFT OUTER JOIN >>>>>>>>>>>>>>> "todos" ON "users"."id" = "todos"."user_id"; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Checking the source, seems this is a blanket policy, not a >>>>>>>>>>>>>>> datasource-specific thing. >>>>>>>>>>>>>>> From a functional perspective, it doesn't feel like it's >> much >>>>>>>>>>>>> different >>>>>>>>>>>>>>> from JOINs >>>>>>>>>>>>>>> But I don't understand relational theory or DB >> functionality >>>>>>> in the >>>>>>>>>>>>> least, >>>>>>>>>>>>>>> so I'm not fit to judge. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Just curious why Calcite doesn't allow this >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>> >>
