Re: Problems about subsets clause order for MATCH_RECOGNIZE
I don’t understand MATCH_RECOGNIZE well enough to give an opinion. Is there a query that gives different results on Oracle if you change the order of items in SUBSET?

It seems that the parser preserves the order of items in the subset, but the SqlToRelConverter does not, hence the line "subsets=[[[DOWN, STRT]]" in SqlToRelConverterTest.xml. I would be concerned if the parser re-ordered things, but I am not too concerned about SqlToRelConverter unless the semantics are wrong.

> On Dec 14, 2018, at 12:37 AM, bupt_ljy wrote:
>
> Hi all,
> It’s my first time to send emails to Calcite developers. It’s a really good
> project and many projects benefit from it.
> Now I’ve encountered a problem about the subsets for MATCH_RECOGNIZE in
> the testMatchRecognizeSubset1() testing. From the results, I can tell
> that "subset stdn = (strt, down)" will be explained to "SUBSET \"STDN\" =
> (\"DOWN\", \"STRT\")", which confuses me. IMO, it’ll affect the result of
> functions like “FIRST” and “LAST”, which may not be what I want, although it
> works fine with the “AVG” function.
> I’m not sure if this is a bug, or anyone can tell me how we arrive here?
>
> Best,
> Jiayi Liao
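One way to make the question above concrete is a small simulation. The sketch below is not Calcite or Oracle code. It assumes that a SUBSET is simply the union of the rows mapped to its member variables, taken in the order the rows appear in the match; that is a reading of the row pattern semantics, not something this thread confirms. The sample match, the prices and the helper names (`subset_rows`, `first`, `last`, `avg`) are all hypothetical.

```python
# Hypothetical match: rows in match order, each labelled with the
# pattern variable it was mapped to, plus a price value.
match = [("STRT", 10), ("DOWN", 8), ("DOWN", 6), ("UP", 9)]


def subset_rows(match, members):
    """Prices of rows mapped to any member variable, in match order.

    ASSUMPTION: a SUBSET is a union of rows ordered by the match,
    not by the order in which its members are listed.
    """
    wanted = set(members)
    return [price for var, price in match if var in wanted]


def first(match, members):
    return subset_rows(match, members)[0]


def last(match, members):
    return subset_rows(match, members)[-1]


def avg(match, members):
    rows = subset_rows(match, members)
    return sum(rows) / len(rows)


as_written = ("STRT", "DOWN")    # subset stdn = (strt, down)
as_converted = ("DOWN", "STRT")  # order seen after conversion

for fn in (first, last, avg):
    print(fn.__name__, fn(match, as_written), fn(match, as_converted))
```

Under that assumption FIRST, LAST and AVG return the same values for both listings. If an implementation instead concatenated rows member by member, FIRST and LAST would depend on the listing order, which is the case the original message worries about.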
Re: Sessionizing raw events / does Calcite support ARRAY_AGG?
Super! I was not aware of COLLECT. That sounds like an even better fit. Great timing on the release, for my sake :-)

Kenn

On Sat, Dec 15, 2018, 13:51 Julian Hyde wrote:

> As of https://issues.apache.org/jira/browse/CALCITE-2224 (to be
> released shortly in 1.18) Calcite supports the WITHIN GROUP clause
> (that allows you to specify the order in which values are supplied to
> an aggregate function) and the COLLECT aggregate function (similar to
> ARRAY_AGG but returns a nested relation rather than an array).
>
> LISTAGG (an aggregate function that concatenates its arguments into a
> string) and ARRAY_AGG are not implemented but would be straightforward
> follow-ups to that work.
>
> As I noted in CALCITE-2224, the SQL standard says that ARRAY_AGG may
> optionally have an ORDER BY clause inside its parentheses, e.g.
>
>   SELECT deptno, ARRAY_AGG(empno ORDER BY sal DESC) AS emps
>   FROM emp
>   GROUP BY deptno
>
> I think WITHIN GROUP could and should be used instead, viz
>
>   SELECT deptno, ARRAY_AGG(empno) WITHIN GROUP (ORDER BY sal DESC) AS emps
>   FROM emp
>   GROUP BY deptno
>
> because that is more consistent with other aggregate functions, and
> would allow us to supply ARRAY_AGG without extra parser work.
>
> Julian
>
> On Fri, Dec 14, 2018 at 5:58 PM Kenneth Knowles wrote:
> >
> > Hello!
> >
> > My use case is sessionizing raw events without an aggregation function.
> > Approximate code that I tried out:
> >
> >   SELECT ARRAY_AGG(ROW(...))
> >   FROM ...
> >   GROUP BY SESSION(...)
> >
> > (followed by UNNEST to get the raw events, tagged with session info, back
> > out into a stream)
> >
> > I get a parser error on the paren after ARRAY_AGG, presumably because it is
> > an identifier treated as a column name?
> >
> > So I was digging through Calcite's code and my conclusion is that there is
> > no implementation of ARRAY_AGG. Is there an alternative? Is there another
> > way to use Calcite's streaming extensions to do sessionization of raw
> > events?
> >
> > Kenn
Calcite-Master - Build # 986 - Still Failing
The Apache Jenkins build system has built Calcite-Master (build #986) Status: Still Failing Check console output at https://builds.apache.org/job/Calcite-Master/986/ to view the results.
[jira] [Created] (CALCITE-2742) Update RexImpTable to use DataContext to retrieve USER and SYSTEM_USER
Jacques Nadeau created CALCITE-2742:
---
Summary: Update RexImpTable to use DataContext to retrieve USER and SYSTEM_USER
Key: CALCITE-2742
URL: https://issues.apache.org/jira/browse/CALCITE-2742
Project: Calcite
Issue Type: Improvement
Reporter: Jacques Nadeau
Assignee: Jacques Nadeau

Right now, USER and SYSTEM_USER are hardcoded to "sa" and the value of the user.name system property, respectively. Let's update them to use the DataContext system like the other similar types of values.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Sessionizing raw events / does Calcite support ARRAY_AGG?
As of https://issues.apache.org/jira/browse/CALCITE-2224 (to be released shortly in 1.18) Calcite supports the WITHIN GROUP clause (that allows you to specify the order in which values are supplied to an aggregate function) and the COLLECT aggregate function (similar to ARRAY_AGG but returns a nested relation rather than an array).

LISTAGG (an aggregate function that concatenates its arguments into a string) and ARRAY_AGG are not implemented but would be straightforward follow-ups to that work.

As I noted in CALCITE-2224, the SQL standard says that ARRAY_AGG may optionally have an ORDER BY clause inside its parentheses, e.g.

  SELECT deptno, ARRAY_AGG(empno ORDER BY sal DESC) AS emps
  FROM emp
  GROUP BY deptno

I think WITHIN GROUP could and should be used instead, viz

  SELECT deptno, ARRAY_AGG(empno) WITHIN GROUP (ORDER BY sal DESC) AS emps
  FROM emp
  GROUP BY deptno

because that is more consistent with other aggregate functions, and would allow us to supply ARRAY_AGG without extra parser work.

Julian

On Fri, Dec 14, 2018 at 5:58 PM Kenneth Knowles wrote:
>
> Hello!
>
> My use case is sessionizing raw events without an aggregation function.
> Approximate code that I tried out:
>
>   SELECT ARRAY_AGG(ROW(...))
>   FROM ...
>   GROUP BY SESSION(...)
>
> (followed by UNNEST to get the raw events, tagged with session info, back
> out into a stream)
>
> I get a parser error on the paren after ARRAY_AGG, presumably because it is
> an identifier treated as a column name?
>
> So I was digging through Calcite's code and my conclusion is that there is
> no implementation of ARRAY_AGG. Is there an alternative? Is there another
> way to use Calcite's streaming extensions to do sessionization of raw
> events?
>
> Kenn
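For readers unfamiliar with ordered aggregation, the sketch below mimics in plain Python what the WITHIN GROUP query in Julian's message is meant to compute: per deptno, the empno values collected in descending sal order. It only illustrates the intended semantics and is not how Calcite implements COLLECT or WITHIN GROUP. The sample rows and the function name `collect_within_group` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows: (deptno, empno, sal). Not taken from the thread.
emp = [
    (10, 7782, 2450),
    (10, 7839, 5000),
    (20, 7369, 800),
    (20, 7788, 3000),
    (10, 7934, 1300),
]


def collect_within_group(rows):
    """Per deptno, collect empno values ordered by sal descending.

    Mirrors the intent of:
      SELECT deptno, ARRAY_AGG(empno) WITHIN GROUP (ORDER BY sal DESC)
      FROM emp GROUP BY deptno
    """
    groups = defaultdict(list)
    for deptno, empno, sal in rows:
        groups[deptno].append((sal, empno))
    return {
        deptno: [
            empno
            for _sal, empno in sorted(
                members, key=lambda m: m[0], reverse=True
            )
        ]
        for deptno, members in groups.items()
    }


emps_by_dept = collect_within_group(emp)
print(emps_by_dept)
```

The ordering is applied to the values fed into the aggregate inside each group, independently of any ordering of the query's output rows.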