Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19181 )

Change subject: IMPALA-11470: Add Cache For Codegen Functions
......................................................................


Patch Set 3:

(14 comments)

Some initial feedback. I still want to do a deeper dive on the cache key and 
double check caching mechanism. Overall makes sense.

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen-cache.h
File be/src/codegen/llvm-codegen-cache.h:

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen-cache.h@95
PS3, Line 95:   CodeGenCacheKeyConstructor() {}
nit: default constructor/destructor don't need to be explicitly declared.


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.h@191
PS3, Line 191:   CodeGenCacheKeyConstructor* codegen_cache_key_constructor() {
This getter is never used, I'd be tempted to omit it. Leaves us more 
flexibility to move it around later.


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.h@807
PS3, Line 807:   FragmentState* state_;
Not clear why this is no longer const.


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.h@933
PS3, Line 933:   std::unique_ptr<CodeGenCacheKeyConstructor> 
codegen_cache_key_constructor_;
Why is this a unique_ptr? You could inline the object and it would be 
implicitly initialized.

I'd also consider moving it to be a member of CodeGenCache. Or rename it to 
something distinct from CodeGenCache if we think it might be useful as a 
general key, like CodeGenKeyConstructor.


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.cc@97
PS3, Line 97: using std::shared_ptr;
This doesn't seem to be used.


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/exprs/scalar-expr.h
File be/src/exprs/scalar-expr.h:

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/exprs/scalar-expr.h@254
PS3, Line 254:   bool AllowCodegenCache() { return allow_codegen_cache_; }
nit: const function.


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/exec-env.h
File be/src/runtime/exec-env.h:

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/exec-env.h@156
PS3, Line 156:   bool codegen_cache_enabled() { return codegen_cache_ != 
nullptr; }
Should be const. All these getters should be const, and might be a good 
opportunity to switch to std::unique_ptr.


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/exec-env.cc@104
PS3, Line 104:     "Specify the capacity of the codegen cache. If set to 0, 
codegen cache is disabled.");
What kind of numeric parsing does this support? Does Impala have something 
standard?


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/fragment-state.h
File be/src/runtime/fragment-state.h:

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/fragment-state.h@75
PS3, Line 75:   bool codegen_cache_enabled() {
This should be a const function.


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/fragment-state.h@92
PS3, Line 92:   bool AllowCodegenCache() { return allow_codegen_cache_; }
This should be a const function.


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/fragment-state.h@231
PS3, Line 231:   /// Indicates whether codegen cache is allowed. Normally if 
the fragment contains
"Normally" implies there are circumstances where a fragment containing udf 
functions would be allowed in the cache. If there are, we should list them.


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/krpc-data-stream-sender.cc@93
PS3, Line 93:       exchange_hash_seed_ = 
KrpcDataStreamSender::EXCHANGE_HASH_SEED_CONST;
This reverts any improvements from IMPALA-8005. Can we replace query_id with 
something that would be consistent for the fragment across distinct queries?


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/query-state.h@154
PS3, Line 154:   bool codegen_cache_enabled();
This seems like it should be a const function.


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/query-state.h@468
PS3, Line 468:   bool disable_codegen_cache_ = false;
We're going to enable the codegen cache by default? This seems like something 
we should at least talk about a gradual rollout. If we did decide to switch, 
I'd change the name to something affirmative too (enable_codegen_cache).



--
To view, visit http://gerrit.cloudera.org:8080/19181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c78a7f51fd582e5fe331fead494dadf544eb1
Gerrit-Change-Number: 19181
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Fri, 04 Nov 2022 23:51:43 +0000
Gerrit-HasComments: Yes

Reply via email to