Super! I was not aware of COLLECT. That sounds like an even better fit. Great timing on the release, for my sake :-)
Kenn On Sat, Dec 15, 2018, 13:51 Julian Hyde <jh...@apache.org wrote: > As of https://issues.apache.org/jira/browse/CALCITE-2224 (to be > released shortly in 1.18) Calcite supports the WITHIN GROUP clause > (that allows you to specify the order in which values are supplied to > an aggregate function) and the COLLECT aggregate function (similar to > ARRAY_AGG but returns a nested relation rather than an array). > > LISTAGG (an aggregate function that concatenates its arguments into a > string) and ARRAY_AGG are not implemented but would be straightforward > follow-ups to that work. > > As I noted in CALCITE-2224, the SQL standard says that ARRAY_AGG may > optionally have an ORDER BY clause inside its parentheses, e.g. > > SELECT deptno, ARRAY_AGG(empno ORDER BY sal DESC) AS emps > FROM emp > GROUP BY deptno > > I think WITHIN GROUP could and should be used instead, viz > > SELECT deptno, ARRAY_AGG(empno) WITHIN GROUP (ORDER BY sal DESC) AS emps > FROM emp > GROUP BY deptno > > because that is more consistent with other aggregate functions, and > would allow us to supply ARRAY_AGG without extra parser work. > > Julian > > > On Fri, Dec 14, 2018 at 5:58 PM Kenneth Knowles <k...@apache.org> wrote: > > > > Hello! > > > > My use case is sessionizing raw events without an aggregation function. > > Approximate code that I tried out: > > > > SELECT ARRAY_AGG(ROW(...)) > > FROM ... > > GROUP BY SESSION(...) > > > > (followed by UNNEST to get the raw events, tagged with session info, back > > out into a stream) > > > > I get a parser error on the paren after ARRAY_AGG, presumably because it > is > > an identifier treated as a column name? > > > > So I was digging through Calcite's code and my conclusion is that there > is > > no implementation of ARRAY_AGG. Is there an alternative? Is there another > > way to use Calcite's streaming extensions to do sessionization of raw > > events? > > > > Kenn >