Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9005 )

Change subject: IMPALA-6314: Add run time scalar subquery check for 
uncorrelated subqueries

Patch Set 23:


File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java:

PS23, Line 425:       analysisResult_.stmt_.origSqlString_ = 
> This is pretty weird. See my comment in StatementBase.
Removed it from here.
Now I'm setting origSqlString_ in SelectStmt.analyze() and UnionStmt.analyze().

File fe/src/main/java/org/apache/impala/analysis/StatementBase.java:

PS23, Line 43:   protected String origSqlString_ = null;
> Why is this here and not in QueryStmt? I think it's cleaner to have it in Q
OK, moved it to QueryStmt.

So when I set it in AnalysisContext.analyze() after the first analysis phase 
(before any rewrites), wasn't it correctly set for all the statements?

File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java:

PS21, Line 1025:     if (isCorrelated && 
!item.getExpr().contains(Expr.IS_BUILTIN_AGG_FN)) {
> I looked into this and concluded that we should convert the type check to a
Turned it into a precondition.

To fix the error message I changed the condition in L1032 and introduced a new 
predicate called IS_UDA_FN.

File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java:

PS23, Line 747:     cardinalityCheckNode.computeStats(ctx_.getRootAnalyzer());
> Why do we need to computeStats() again?
I introduced it in PS2 as some kind of workaround, but it seems we no longer 
need it.
Removed this line.

File fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java:

PS23, Line 1163:     // Scalar subquery check is done at runtime, not at 
> not at parse-time -> not during analysis

PS23, Line 1164:     AnalyzesOk("select id from functional.alltypestiny t where 
int_col = " +
> Loop over cmpOperators and have a few different variants inside the loop
Created a loop.

PS23, Line 1174:     AnalyzesOk("select id from functional.alltypestiny t where 
int_col < " +
> Do we have a negative test that shows disallowing correlated scalar subquer
No, we don't have such tests, and currently the following query is allowed:

select * from functional.alltypes t1 where id < (select id from 
functional.alltypes t2 where t1.int_col = t2.int_col);

However, it raises the cardinality error, but I don't think we handle 
correlated subqueries correctly all the time.

PS23, Line 1395:     AnalyzesOk("select count(1) from functional.alltypes " +
> Also add a variant with (subquery) IS NULL. That variant is currently forbi
Wasn't sure where to put IS NULL, so I put it two different places.
In the meantime I had to modify IsNullPredicate.java, and noticed a weird 

 select count(1)
 from functional.alltypes
 where (select int_col from functional.alltypes) is null

The above query executes succesfully even though the subquery clearly returns 
multiple rows.

But when I use IS NOT NULL:

 select count(1)
 from functional.alltypes
 where (select int_col from functional.alltypes) is not null

Then the query fails with the expected error "Subquery must not return more 
than one row".

And when I refer to a column that is completely NULL, the results are the exact 
opposite, ie. IS NULL yields error and IS NOT NULL executes successfully.

I guess the explanation is that the IS NULL and IS NOT NULL predicates are 
pushed down to the scan node of the subquery.

I think the behavior is OK if we use the argument we already used before: "we 
only deduce information from the query statement, so the statement as a whole 
is correct".

File fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java:

PS23, Line 82:       if (node instanceof StatementBase) {
> No need for this. We're always going to have a StatementBase
Switched to QueryStmt as you suggested.

PS23, Line 917:     //TODO: make it work again or delete
              : //    testToSql("select * from functional.alltypes where not 
(id < 10 and " +
              : //            "(int_col in (select int_col from 
functional.alltypestiny)) and " +
              : //            "(string_col = (select max(string_col) from 
              : //        "SELECT * FROM functional.alltypes WHERE NOT (id < 10 
AND " +
              : //            "(int_col IN (SELECT int_col FROM 
functional.alltypestiny)) AND " +
              : //        "(string_col = (SELECT max(string_col) FROM 
> Please file an improvement JIRA for this case, and remove the test.
Removed the test and filed IMPALA-6840.

To view, visit http://gerrit.cloudera.org:8080/9005
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0f52b93a60eeacedd242a2f17fa6b99c4fc38e06
Gerrit-Change-Number: 9005
Gerrit-PatchSet: 23
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <atti...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Thu, 12 Apr 2018 16:39:05 +0000
Gerrit-HasComments: Yes

Reply via email to