Michael Ho has posted comments on this change. Change subject: IMPALA-4397,IMPALA-3259: reduce codegen time and memory ......................................................................
Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/exec/hdfs-scanner.h File be/src/exec/hdfs-scanner.h: PS4, Line 190: CODEGEN_WRITE_TUPLES_MAX_COLUMNS = 250; > I'd just be picking an arbitrary threshold in that case too, since I don't I agree that it's more actionable if it's based on number of tuples. However, isn't the longer compilation + optimization time with too many tuples a direct result of having large functions ? If so, instruction count seems to be a more robust estimate. For example, the generated functions can be different if we change the codegen function. In that case this threshold may not hold anymore. This threshold really shouldn't be arbitrary. The ideal case is that the planner can estimate a threshold of instruction count in which the cost of codegen is less than interpretation. Also, it seems that the planner should have all the information needed to make the call. Is there some unexpected difficulty for doing it in the planner ? http://gerrit.cloudera.org:8080/#/c/4956/4/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java: PS4, Line 823: if (!hasGrouping) { > I disable it for this particular exchange. This is a proactive thing only r This isn't exactly what I had in mind. If you look at ExchangeNode::Codegen(), we only codegen when is_mergeing_ is true. This is pretty much the same situation as the Agg node here. So, it would be nice to follow the same pattern for exchange node and convert if (is_merging_) to DCHECK(is_merging_); -- To view, visit http://gerrit.cloudera.org:8080/4956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com> Gerrit-Reviewer: Michael Ho <k...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-HasComments: Yes