Ruben Q L created CALCITE-6468:
----------------------------------

             Summary: RelDecorrelator throws AssertionError if correlated variable is used as Aggregate group key
                 Key: CALCITE-6468
                 URL: https://issues.apache.org/jira/browse/CALCITE-6468
             Project: Calcite
          Issue Type: Bug
          Components: core
    Affects Versions: 1.37.0
            Reporter: Ruben Q L
            Assignee: Ruben Q L
             Fix For: 1.38.0

The problem can be reproduced with this query (a "simplified" version of TPC-DS query1):
{code:sql}
WITH agg_sal AS
  (SELECT deptno, sum(sal) AS total
   FROM emp
   GROUP BY deptno)
SELECT 1
FROM agg_sal s1
WHERE s1.total > (SELECT avg(total)
                  FROM agg_sal s2
                  WHERE s1.deptno = s2.deptno)
{code}

If we apply the subquery program and FilterAggregateTransposeRule, and then call the RelDecorrelator, it fails with:
{noformat}
java.lang.AssertionError
	at org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:581)
	at org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:495)
	...
{noformat}

The failure comes from this assert (RelDecorrelator.java:581):
{code}
assert newPos == newInputOutput.size();
{code}

The root cause seems to be that, a few lines earlier, when the correlating variables from {{corDefOutputs}} are processed, a value is inserted into {{mapNewInputToProjOutputs}}:
{code}
if (!frame.corDefOutputs.isEmpty()) {
  for (Map.Entry<CorDef, Integer> entry : frame.corDefOutputs.entrySet()) {
    RexInputRef.add2(projects, entry.getValue(), newInputOutput);
    corDefOutputs.put(entry.getKey(), newPos);
    mapNewInputToProjOutputs.put(entry.getValue(), newPos); // <-- HERE
    newPos++;
  }
}
{code}

However, this value was already in the map, because it had been inserted earlier as part of the group key processing:
{code}
for (int i = 0; i < oldGroupKeyCount; i++) {
  final int idx = groupKeyIndices.get(i);
  ...
  // add mapping of group keys.
  outputMap.put(idx, newPos);
  int newInputPos = requireNonNull(frame.oldToNewOutputs.get(idx));
  RexInputRef.add2(projects, newInputPos, newInputOutput);
  mapNewInputToProjOutputs.put(newInputPos, newPos); // <-- HERE added first
  newPos++;
}
{code}

Therefore, the unnecessary insertion into {{mapNewInputToProjOutputs}} and the subsequent increment of {{newPos}} while the {{CorDef}}s are processed lead to the mismatch: the same input field is projected twice, so {{newPos}} ends up larger than {{newInputOutput.size()}}.

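To make the mismatch concrete, here is a minimal standalone sketch of the position bookkeeping described above. It is not Calcite code: the class name {{NewPosMismatchDemo}}, the method {{unguardedNewPos}} and the three-field input are hypothetical, and plain integers stand in for the {{RexInputRef}} projections and the {{CorDef}} keys. It assumes a single group key and a single correlated variable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the bookkeeping only; not part of Calcite.
class NewPosMismatchDemo {

  /**
   * Mimics the three steps (group key, correlated variable, remaining fields)
   * without any guard in the correlated-variable step, and returns the final newPos.
   */
  static int unguardedNewPos(int fieldCount, int groupKeyInputPos, int corDefInputPos) {
    Map<Integer, Integer> mapNewInputToProjOutputs = new HashMap<>();
    List<Integer> projects = new ArrayList<>();
    int newPos = 0;

    // Step 1: the group key is projected first and recorded in the map.
    projects.add(groupKeyInputPos);
    mapNewInputToProjOutputs.put(groupKeyInputPos, newPos);
    newPos++;

    // Step 2: the correlated variable is projected unconditionally,
    // even if its input position was already recorded in step 1.
    projects.add(corDefInputPos);
    mapNewInputToProjOutputs.put(corDefInputPos, newPos);
    newPos++;

    // Step 3: remaining fields are projected only if not yet in the map.
    for (int i = 0; i < fieldCount; i++) {
      if (!mapNewInputToProjOutputs.containsKey(i)) {
        projects.add(i);
        mapNewInputToProjOutputs.put(i, newPos);
        newPos++;
      }
    }
    return newPos;
  }

  public static void main(String[] args) {
    // Distinct positions: newPos equals the field count.
    System.out.println(unguardedNewPos(3, 0, 1));
    // Correlated variable is also the group key: newPos exceeds the field count.
    System.out.println(unguardedNewPos(3, 0, 0));
  }
}
```

In this model, when the correlated variable and the group key share input position 0, field 0 is projected twice and the counter ends one higher than the number of input fields, which is the condition the assert rejects.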
Notice how, right before the assertion, when the remaining fields are processed, the code verifies that the value is not already contained in {{mapNewInputToProjOutputs}}:
{code}
// add the remaining fields
final int newGroupKeyCount = newPos;
for (int i = 0; i < newInputOutput.size(); i++) {
  if (!mapNewInputToProjOutputs.containsKey(i)) { // <-- HERE checked
    RexInputRef.add2(projects, i, newInputOutput);
    mapNewInputToProjOutputs.put(i, newPos);
    newPos++;
  }
}
{code}

Thus, the solution would probably be to apply the same logic when the {{CorDef}}s are processed, and to reuse the existing position when the field has already been projected:
{code}
if (!frame.corDefOutputs.isEmpty()) {
  for (Map.Entry<CorDef, Integer> entry : frame.corDefOutputs.entrySet()) {
    Integer pos = mapNewInputToProjOutputs.get(entry.getValue());
    if (pos == null) {
      RexInputRef.add2(projects, entry.getValue(), newInputOutput);
      corDefOutputs.put(entry.getKey(), newPos);
      mapNewInputToProjOutputs.put(entry.getValue(), newPos);
      newPos++;
    } else {
      corDefOutputs.put(entry.getKey(), pos);
    }
  }
}
{code}
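The effect of the proposed guard can be checked with a small standalone sketch. As before, this is a simplified model and not Calcite code: {{GuardedCorDefDemo}} and {{guardedPositions}} are hypothetical names, integers stand in for projections and {{CorDef}} keys, and a single group key and a single correlated variable are assumed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the guarded bookkeeping; not part of Calcite.
class GuardedCorDefDemo {

  /**
   * Returns {corDefOutputPos, newPos}: the output position assigned to the
   * correlated variable and the final value of the position counter.
   */
  static int[] guardedPositions(int fieldCount, int groupKeyInputPos, int corDefInputPos) {
    Map<Integer, Integer> mapNewInputToProjOutputs = new HashMap<>();
    int newPos = 0;

    // Group key: projected first and recorded in the map.
    mapNewInputToProjOutputs.put(groupKeyInputPos, newPos);
    newPos++;

    // Correlated variable: reuse the recorded position if the field
    // was already projected, otherwise project it at newPos.
    int corDefOutputPos;
    Integer pos = mapNewInputToProjOutputs.get(corDefInputPos);
    if (pos == null) {
      corDefOutputPos = newPos;
      mapNewInputToProjOutputs.put(corDefInputPos, newPos);
      newPos++;
    } else {
      corDefOutputPos = pos;
    }

    // Remaining fields: projected only if not yet in the map.
    for (int i = 0; i < fieldCount; i++) {
      if (!mapNewInputToProjOutputs.containsKey(i)) {
        mapNewInputToProjOutputs.put(i, newPos);
        newPos++;
      }
    }
    return new int[] {corDefOutputPos, newPos};
  }

  public static void main(String[] args) {
    int[] shared = guardedPositions(3, 0, 0);
    System.out.println("corDef output = " + shared[0] + ", newPos = " + shared[1]);
  }
}
```

With the guard, a correlated variable that coincides with the group key is mapped to the group key's output position and the counter is not incremented, so in this model the final {{newPos}} equals the number of input fields in both the shared and the distinct case.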