Tim Armstrong has posted comments on this change. Change subject: IMPALA-4397,IMPALA-3259: reduce codegen time and memory ......................................................................
Patch Set 4: (16 comments) http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: Line 952: RETURN_IF_ERROR(OptimizeModule()); > Is there a reason why we don't call DestroyIRModule() when OptimizeModule() No reason, it's inconsistent. I don't think it's really necessary to destroy the IR module on the error path, since it will be cleaned up shortly anyway, so I removed it on those paths to simplify things. PS4, Line 986: memory_manager_->bytes_allocated()), : memory_manager_->bytes_allocated() > The code may be easier to read if this factored into a single variable inst Done Line 1039: if (!mem_tracker_->TryConsume(estimated_memory)) { > I wonder if it's better to skip optimization if we cannot reserve the memor Compiling the code can also be pretty CPU/memory intensive so it probably isn't even safe to do that. I think the codegen memory will only be greater than the runtime memory requirement in some pretty extreme cases. E.g. in the queries I looked at the memory calculation was at most a few MB per fragment, and usually less. PS4, Line 1041: Substitute("Codegen failed to reserve '$0' bytes for optimization", : estimated_memory), : estimated_memory); > These line wraps look a bit weird to me but not sure if it's a side effect I agree they look weird but clang-format really wants to format it like this. http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/codegen/llvm-codegen.h File be/src/codegen/llvm-codegen.h: PS4, Line 708: optimiser > nit: optimizer We generally use a mix of American and commonwealth English spellings in comments. I guess here we're inconsistent internally so I just changed it. PS4, Line 709: 512 bytes > Mind documenting how this is derived so we can update it accordingly in cas Elaborated a bit. It's very approximate (and will become very inaccurate once functions get large). http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/codegen/mcjit-mem-mgr.h File be/src/codegen/mcjit-mem-mgr.h: PS4, Line 36: memory > This is for tracking memory consumed by the compiled machine code, right ? Reworded to clarify. http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/exec/exec-node.h File be/src/exec/exec-node.h: PS4, Line 187: false > true ? Done http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/exec/hash-table.cc File be/src/exec/hash-table.cc: PS4, Line 762: if (ctxs.size() > Expr::CODEGEN_INLINE_EXPRS_THRESHOLD) { : // Avoid bloating function by inlining too many exprs into it. : expr_fn->addFnAttr(llvm::Attribute::NoInline); : } > I wonder if this can be more generalized by putting it in LlvmCodegen::Fina Not a bad idea, but doesn't address the main scenario this patch helps with, which is when many small functions are inlined into a large function. http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/exec/hdfs-scanner.h File be/src/exec/hdfs-scanner.h: PS4, Line 190: CODEGEN_WRITE_TUPLES_MAX_COLUMNS = 250; > This seems a bit adhoc to me. May be better to track the size of IR functio I'd just be picking an arbitrary threshold in that case too, since I don't really have a good way to accurately estimate the cost. The advantage of doing it this way is that it's easier to understand why it was disabled and gives a user an actionable way to work around it. PS4, Line 190: const > constexpr There isn't a semantic difference between static const and static constexpr when it's an integer literal. http://en.cppreference.com/w/cpp/language/constant_expression http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: PS4, Line 1911: : > Is it necessary to remove it ? For comparison, please see FirstValUpdate(). I added it back in. It shouldn't be necessary to instantiate it, but this makes it more consistent. http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/exprs/expr.h File be/src/exprs/expr.h: PS4, Line 311: CODEGEN_INLINE_EXPRS_THRESHOLD > Wouldn't instruction count be a more precise estimation ? It's a pretty arbitrary threshold anyway (afaik optimisation time is only loosely correlated with instruction count) so I like that this is a little simpler and easier to understand. PS4, Line 311: const > constexpr ? I don't feel strongly but I don't think there's any advantage to constexpr when it's a simple literal. http://gerrit.cloudera.org:8080/#/c/4956/4/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: Line 494: 24: required bool disable_codegen > please move up, into the common/non-union part of this struct (field 8?) an Done http://gerrit.cloudera.org:8080/#/c/4956/4/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java: PS4, Line 823: if (!hasGrouping) mergeAggNode.setDisableCodegen(true); > Sure. This seems to set things up so the planner can selectively disable co I disable it for this particular exchange. This is a proactive thing only right? Since we don't actually codegen non-merging exchanges. -- To view, visit http://gerrit.cloudera.org:8080/4956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
