Alex Behm has posted comments on this change.

Change subject: IMPALA-5036: Parquet count star optimization
......................................................................

Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6812/5/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
File fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java:

Line 886:         prefix + "8InitZeroIN10impala_udf9BigIntValEEEvPNS2_15FunctionContextEPT_",
> You'd have to replace the uses of the agg slot with the zeroifnull() expres

Taras, I think the rewrite solution becomes more viable if we follow the approach where the AggInfo is passed directly into the HdfsScanNode. The scan can create an smap with two entries:

count(*) -> sum(num_rows_slot)
slotref -> zeroifnull(slotref)

where the slotref of the second entry is the agg slot from the first-level aggregation corresponding to count(*).

Once we return from init() in the scan node, we apply the agg optimization smap to all the AggInfos (local and merge).


http://gerrit.cloudera.org:8080/#/c/6812/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

Line 109:    *
> This comment doesn't explain the overall optimization. What I'm looking for

Taras, I think Dan is looking for a comment along the lines of what we have in applyParquetCountStartOptimization(). Maybe we can add a condensed version of that somewhere.

Dan, where do you expect this comment? Here? In SingleNodePlanner.createSelectPlan()? Somewhere else?


PS5, Line 140: This
             : // scan does additional analysis in init() to determine whether it is correct to apply
             : // the optimization.
> Okay. If it doesn't work out, I just think the comments needs to be clarifi

I think this approach will work out. We can use Analyzer.tableRefMap_ to determine how many table refs are in a query block.

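To make the two-entry smap from the BuiltinsDb.java comment concrete, here is a minimal sketch. It assumes nothing about the actual patch: the Expr class, substitute() and buildSmap() are hypothetical toy stand-ins and not Impala's planner classes, and the names agg_slot and num_rows_slot are placeholders. It only shows the shape of the rewrite, where count(*) is replaced by sum(num_rows_slot) and each use of the agg slot is wrapped in zeroifnull().

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CountStarSmapSketch {
    // Toy expression node: a name plus child expressions (hypothetical, not Impala's Expr).
    static final class Expr {
        final String name;
        final List<Expr> children;

        Expr(String name, Expr... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }

        @Override
        public String toString() {
            if (children.isEmpty()) return name;
            StringBuilder sb = new StringBuilder(name).append("(");
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append(children.get(i));
            }
            return sb.append(")").toString();
        }
    }

    // Replaces any subtree whose printed form is a key of the smap. The
    // replacement is returned as-is (not re-substituted), so the entry
    // agg_slot -> zeroifnull(agg_slot) cannot recurse forever.
    static Expr substitute(Expr e, Map<String, Expr> smap) {
        Expr hit = smap.get(e.toString());
        if (hit != null) return hit;
        Expr[] kids = new Expr[e.children.size()];
        for (int i = 0; i < kids.length; i++) {
            kids[i] = substitute(e.children.get(i), smap);
        }
        return new Expr(e.name, kids);
    }

    // The two entries described in the comment, with placeholder slot names.
    static Map<String, Expr> buildSmap() {
        Map<String, Expr> smap = new LinkedHashMap<>();
        smap.put("count(*)", new Expr("sum", new Expr("num_rows_slot")));
        smap.put("agg_slot", new Expr("zeroifnull", new Expr("agg_slot")));
        return smap;
    }

    public static void main(String[] args) {
        Map<String, Expr> smap = buildSmap();
        // An aggregate expression computing count(*).
        Expr countStar = new Expr("count", new Expr("*"));
        // A hypothetical expression that references the agg slot.
        Expr slotUser = new Expr("sum", new Expr("agg_slot"));
        System.out.println(substitute(countStar, smap));
        System.out.println(substitute(slotUser, smap));
    }
}
```

In this sketch the same map is applied to every expression, which mirrors the suggestion of applying one agg optimization smap to all the AggInfos after init() returns. How the real AggInfos consume the map is not shown here.
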
-- 
To view, visit http://gerrit.cloudera.org:8080/6812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokh...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <zams...@cloudera.com>
Gerrit-HasComments: Yes