Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9005 )
Change subject: IMPALA-6314: Add run time scalar subquery check for uncorrelated subqueries ...................................................................... Patch Set 9: (3 comments) http://gerrit.cloudera.org:8080/#/c/9005/11/fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java File fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java: http://gerrit.cloudera.org:8080/#/c/9005/11/fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java@175 PS11, Line 175: if (!operand.type_.isNumericType() && !operand.type_.isNull() && > I'm a bit confused about how to organise this. Good question, thanks for the detailed explanation. The following alternative might be simpler, let's discuss: Modify Subquery.analyze(): * the type_ is always a scalar or struct type * in the line that checks returnsSingleRow(), do not create an ArrayType * setIsRuntimeScalar() based on returnsSingleRow() In InPredicate.analyze() and ExistsPredicate.nalyze(): * setIsRuntimeScalar(false) as necessary No need for StmtRewriter.rewriteNonScalarSubqueries() My idea is that by default all subqueries are treated as runtime scalar, unless statically know to only return a single row. For the exceptions (InPredicate and ExistsPredicate), we manually setIsRuntimeScalar() during their analysis. This seems to be more in line with what we really want. New Exprs will just work with scalar subqueries without special handling. Arguably, the old type_ is more semantically correct, but I don't think that needs to get in the way of the simplest solution. We can still check stmt.returnsSingleRow() instead of relying on the Subquery type_ in those places is where it matters. What do you think? http://gerrit.cloudera.org:8080/#/c/9005/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: http://gerrit.cloudera.org:8080/#/c/9005/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1116 PS9, Line 1116: if (inlineViewRef.getViewStmt().isRuntimeScalar()) { > The propagation of the conjunct 'id = 3' starts in InlineViewRef.analyze() The optimization is definitely intentional and it's not clear to me why this issue is specific to queries with the runtime cardinality check. I'm not yet satisfied with the quick fix. There is no good way to explain why it's there because we don't understand it, so future code readers will likewise not understand it. Give me some time to investigate this issue more deeply. http://gerrit.cloudera.org:8080/#/c/9005/11/fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java File fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java: http://gerrit.cloudera.org:8080/#/c/9005/11/fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java@1434 PS11, Line 1434: > Thanks for pointing to this test, I addressed the TODO and now the subquery Thanks! Works for me. -- To view, visit http://gerrit.cloudera.org:8080/9005 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0f52b93a60eeacedd242a2f17fa6b99c4fc38e06 Gerrit-Change-Number: 9005 Gerrit-PatchSet: 9 Gerrit-Owner: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Attila Jeges <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Vuk Ercegovac <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Thu, 22 Feb 2018 07:42:49 +0000 Gerrit-HasComments: Yes
