Alex Behm has posted comments on this change. Change subject: IMPALA-4792: Fix number of distinct values for a CASE with constant outputs ......................................................................
Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/5768/2/fe/src/main/java/org/apache/impala/analysis/CaseExpr.java File fe/src/main/java/org/apache/impala/analysis/CaseExpr.java: Line 384: // Otherwise, take a max over all the outputs as well as all the slotrefs that > The existing algorithm looks at all SlotRef descendants of the Expr. For th I agree. The behavior with incomplete stats is debatable. The old behavior doesn't really make sense in several cases (like case expr). I don't think we should try to preserve "compatibility" with it too much, especially since this Expr NDV information is not used for making critical plan choices (for now, there is almost no downside to trying to fix expr NDVs aggressively). Using the NDV of the 'when' exprs in any way seems wrong for case expr. I'm thinking we should only consider the 'then exprs' and if the NDV for one of them is unknown, then we should make the NDV of the case expr unknown as well. Might want to ask Marcel what he thinks. Line 388: boolean allOutputsKnown = true; > I will look into this. The way it works today is that Expr's analyze calls Agree. -- To view, visit http://gerrit.cloudera.org:8080/5768 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I21dbdaad8452b7e58c477612b47847dccd9d98d2 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com> Gerrit-HasComments: Yes