Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/19181 )
Change subject: IMPALA-11470: Add Cache For Codegen Functions ...................................................................... Patch Set 3: (14 comments) Some initial feedback. I still want to do a deeper dive on the cache key and double check caching mechanism. Overall makes sense. http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen-cache.h File be/src/codegen/llvm-codegen-cache.h: http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen-cache.h@95 PS3, Line 95: CodeGenCacheKeyConstructor() {} nit: default constructor/destructor don't need to be explicitly declared. http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.h File be/src/codegen/llvm-codegen.h: http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.h@191 PS3, Line 191: CodeGenCacheKeyConstructor* codegen_cache_key_constructor() { This getter is never used, I'd be tempted to omit it. Leaves us more flexibility to move it around later. http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.h@807 PS3, Line 807: FragmentState* state_; Not clear why this is no longer const. http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.h@933 PS3, Line 933: std::unique_ptr<CodeGenCacheKeyConstructor> codegen_cache_key_constructor_; Why is this a unique_ptr? You could inline the object and it would be implicitly initialized. I'd also consider moving it to be a member of CodeGenCache. Or rename it to something distinct from CodeGenCache if we think it might be useful as a general key, like CodeGenKeyConstructor. http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.cc@97 PS3, Line 97: using std::shared_ptr; This doesn't seem to be used. http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/exprs/scalar-expr.h File be/src/exprs/scalar-expr.h: http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/exprs/scalar-expr.h@254 PS3, Line 254: bool AllowCodegenCache() { return allow_codegen_cache_; } nit: const function. http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/exec-env.h File be/src/runtime/exec-env.h: http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/exec-env.h@156 PS3, Line 156: bool codegen_cache_enabled() { return codegen_cache_ != nullptr; } Should be const. All these getters should be const, and might be a good opportunity to switch to std::unique_ptr. http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/exec-env.cc File be/src/runtime/exec-env.cc: http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/exec-env.cc@104 PS3, Line 104: "Specify the capacity of the codegen cache. If set to 0, codegen cache is disabled."); What kind of numeric parsing does this support? Does Impala have something standard? http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/fragment-state.h File be/src/runtime/fragment-state.h: http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/fragment-state.h@75 PS3, Line 75: bool codegen_cache_enabled() { This should be a const function. http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/fragment-state.h@92 PS3, Line 92: bool AllowCodegenCache() { return allow_codegen_cache_; } This should be a const function. http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/fragment-state.h@231 PS3, Line 231: /// Indicates whether codegen cache is allowed. Normally if the fragment contains "Normally" implies there are circumstances where a fragment containing udf functions would be allowed in the cache. If there are, we should list them. http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/krpc-data-stream-sender.cc File be/src/runtime/krpc-data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/krpc-data-stream-sender.cc@93 PS3, Line 93: exchange_hash_seed_ = KrpcDataStreamSender::EXCHANGE_HASH_SEED_CONST; This reverts any improvements from IMPALA-8005. Can we replace query_id with something that would be consistent for the fragment across distinct queries? http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/query-state.h@154 PS3, Line 154: bool codegen_cache_enabled(); This seems like it should be a const function. http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/query-state.h@468 PS3, Line 468: bool disable_codegen_cache_ = false; We're going to enable the codegen cache by default? This seems like something we should at least talk about a gradual rollout. If we did decide to switch, I'd change the name to something affirmative too (enable_codegen_cache). -- To view, visit http://gerrit.cloudera.org:8080/19181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If42c78a7f51fd582e5fe331fead494dadf544eb1 Gerrit-Change-Number: 19181 Gerrit-PatchSet: 3 Gerrit-Owner: Yida Wu <[email protected]> Gerrit-Reviewer: Abhishek Rawat <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Yida Wu <[email protected]> Gerrit-Comment-Date: Fri, 04 Nov 2022 23:51:43 +0000 Gerrit-HasComments: Yes
