wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/18099 )
Change subject: IMPALA-11049: Substitute order by elements when creating SortInfo ...................................................................... Patch Set 6: > > Patch Set 6: > > > > Hi Quanlong and Zoltan, there are almost 25 tests failed in > jenkins job with this patch. I checked these tests one by one, and > found at least two problems. > > 1. UNION and VALUES execute failed, such as: select 1+1,2,3 UNION > select 4,5,6 order by 2. After patch, 'orderByElements_' been > rewrite to SlotRef, in 'reAnalyze' phase, when execute substitute > again, SlotRef analyze failed due to 'Preconditions.checkState(rawPath_ > != null);', since example sql clause is not a real column, just a > 'NumericLiteral'. We can remove this check to slove this problem. > > 2. Another problem is hard to resloved, INTERSECT sql return > exception like this: 'IllegalStateException: Illegal reference to > non-materialized slot: tid=3 sid=39'. I also read the code. The > main reason is that when transform INTERSECT to a SelectStmt like > 'select count(*) xxx' in StmtRewriter.rewriteSetOperationStatement(), > all select list been materizlized in SelectStmt.SelectAnalyzer.expandStar(), > but SlotRef in 'orderByElements_' not been materizlized, and add to > this new SelectStmt's order by directly. When this new SelectStmt > execute analyze, there will be some SlotRef in 'materializedExprs_' > which are not 'materizlized'. Here is a example: > > select id, bool_col, tinyint_col, smallint_col, int_col, > bigint_col, float_col, double_col, date_string_col, string_col, > timestamp_col, year, month from alltypestiny where year=2009 and > month in (1,2,3) > > intersect > > select id, bool_col, tinyint_col, smallint_col, int_col, > bigint_col, float_col, double_col, date_string_col, string_col, > timestamp_col, year, month from alltypestiny where year=2009 and > month in (1,2) > > intersect > > (select id, bool_col, tinyint_col, smallint_col, int_col, > bigint_col, float_col, double_col, date_string_col, string_col, > timestamp_col, year, month from alltypestiny where year=2009 and > month=1) > > order by 1 limit 1; > > > > Besides, I'm not sure if there any other problems not been found, > so I think maybe we should just keep this patch adding check and > analyze in 'SimplifyCastExprRule', substitute order by elements > maybe need to modify a lot of code. How do you think? > > Thanks for digging into this! I tend to add checks and analyze the > exprs in need in the rules, which is we already done in > ConvertToCNFRule. I found that most expr rewrite rules already added 'isAnalyzed' check, but some of return directly like this: if (!expr.isAnalyzed()) return expr; Do you mean replace these 'return' by 'expr.analyze(analyzer)'? -- To view, visit http://gerrit.cloudera.org:8080/18099 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2780e04a6d5a32e224cd0470cf6f166a832363ec Gerrit-Change-Number: 18099 Gerrit-PatchSet: 6 Gerrit-Owner: wangsheng <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: wangsheng <[email protected]> Gerrit-Comment-Date: Thu, 27 Jan 2022 09:33:10 +0000 Gerrit-HasComments: No
