[
https://issues.apache.org/jira/browse/CALCITE-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010316#comment-17010316
]
Feng Zhu commented on CALCITE-1824:
-----------------------------------
{quote}The problem with UNION ALL is that it duplicates the input relational
expression. So we would tend to implement a plan that evaluates it twice.
Times like this, I wish we had a {{ProjectMany}} operator (compare LINQ's
[SelectMany|https://docs.microsoft.com/en-us/dotnet/api/system.linq.enumerable.selectmany]).
{quote}
I think it is a good idea. We need a specific operator (e.g., _Expand_ in some
products) which applies a number of projections to every input row, hence we
can get multiple output rows for an input row to avoid duplicate data scan.
On the other hand, I also noticed that we do not support queries like (_lateral
view ..._) now. The operator will play an important role if we support it in
the future.
> GROUP_ID returns wrong result
> -----------------------------
>
> Key: CALCITE-1824
> URL: https://issues.apache.org/jira/browse/CALCITE-1824
> Project: Calcite
> Issue Type: Bug
> Reporter: Julian Hyde
> Priority: Major
>
> We implemented the {{GROUP_ID()}} function in CALCITE-512 but we got the
> specification wrong, and it returns the wrong result.
> {{GROUP_ID}} is not in the SQL standard. It is implemented only by Oracle.
> I mistakenly believed that {{GROUP_ID()}} is equivalent to {{GROUPING_ID(g1,
> ..., gn)}} (in a query with {{GROUP BY g1, ..., gn}}). In fact, {{GROUP_ID}}
> is useful only if you have duplicate grouping sets. If grouping sets are
> distinct, {{GROUP_ID()}} will always return zero.
> Example 1
> {code}SELECT deptno, job, GROUP_ID() AS g
> FROM Emp
> GROUP BY ROLLUP(deptno, job)
> DEPTNO JOB G
> ---------- --------- ----------
> 10 CLERK 0
> 10 MANAGER 0
> 10 PRESIDENT 0
> 10 0
> 20 CLERK 0
> 20 ANALYST 0
> 20 MANAGER 0
> 20 0
> 30 CLERK 0
> 30 MANAGER 0
> 30 SALESMAN 0
> 30 0
> 0
> {code} produces grouping sets (deptno, job), (deptno), (). These are
> distinct, so GROUP_ID() is 0 for all rows.
> Example 2
> {code}SELECT deptno, GROUP_ID() AS g
> FROM Emp
> GROUP BY GROUPING SETS (deptno, (), ());
> DEPTNO G
> ---------- ----------
> 10 0
> 20 0
> 30 0
> 0
> 1
> {code}
> As you can see, the grouping set () occurs twice. So there is one row in the
> result for each occurrence: the first occurrence has g = 0; the second has g
> = 1.
> In my fix for CALCITE-1069, I will change GROUP_ID() to always return 0. This
> is wrong, but nevertheless closer to the required behavior.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)