Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20211 )

Change subject: IMPALA-12253: Improve the integration of codegen cache and 
async codegen
......................................................................


Patch Set 2:

(21 comments)

http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG@9
PS2, Line 9: the
Nit: "the" not needed.


http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG@10
PS2, Line 10: the
Nit: "the" not needed.


http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG@11
PS2, Line 11:
Nit: is enabled


http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG@13
PS2, Line 13: the
Nit: "the" not needed.


http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG@18
PS2, Line 18: reducing
eliminating?


http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG@18
PS2, Line 18: of
Nit: for?


http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG@19
PS2, Line 19: if cache exists
"if an appropriate cache entry exists" could be better.


http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG@21
PS2, Line 21: Did the perf a-b test with async codegen and codegen cache 
enabled,
Shouldn't the performance improve?


http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen-cache-test.cc
File be/src/codegen/llvm-codegen-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen-cache-test.cc@599
PS2, Line 599: NULL
Nit: 'nullptr' would be better.


http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen-cache-test.cc@608
PS2, Line 608: NULL
Nit: 'nullptr' would be better.


http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.h@351
PS2, Line 351: PrepareFinalizeModule() must be called before FinalizeModule()
This should come in the second sentence, after "This should be called after all 
functions to JIT have been added to the module via AddFunctionToJit() ..."


http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.h@897
PS2, Line 897: be
Nit: no need for "be".


http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.h@910
PS2, Line 910:   /// If true, the module has been prepared and is ready for 
compilation.
Could add that it is set in PrepareFinalizeModule().


http://gerrit.cloudera.org:8080/#/c/20211/1/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/20211/1/be/src/codegen/llvm-codegen.cc@1384
PS1, Line 1384:     DCHECK(cache_key_);
> Done
'!cache_key_->empty()' is not checked in the new version, is it?

Also, I think 'cache_key_ != nullptr' is more readable as it's more explicit. 
I, at least, have always used this longer form in Impala code.


http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.cc@1277
PS2, Line 1277:   DCHECK(!is_prepared_);
We could also keep "DCHECK(!is_compiled_);" just in case.


http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.cc@1279
PS2, Line 1279:   DCHECK(can_return != nullptr);
I'd put it after the previous DCHECK(s) before setting 'is_prepared_', so all 
precondition checking is done first.


http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.cc@1340
PS2, Line 1340:     DCHECK(cache_key_);
I don't think this checks that 'cache_key_' is not empty, CodeGenCacheKey 
doesn't have operator bool().
This will only check that 'cache_key_' does not hold a NULL pointer, which is 
never the case because of L1327.


http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.cc@1382
PS2, Line 1382:   bool codegen_cache_enabled = state_->codegen_cache_enabled() 
&& codegen_cache_enabled_;
In PrepareFinalizeModule() you didn't extract it into a variable - I think both 
approaches are ok but it's better if it's the same in both functions.


http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.cc@1384
PS2, Line 1384:     DCHECK(cache_key_);
See comment on L1340.
If you only want to make sure it's not NULL, my opinion is that writing 
'cache_key_ != nullptr' explicitly is better.


http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/runtime/fragment-state.cc
File be/src/runtime/fragment-state.cc:

http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/runtime/fragment-state.cc@131
PS2, Line 131:   Status status = 
llvm_codegen->PrepareFinalizeModule(&can_return);
With this change, the work performed by PrepareFinalizeModule() is always done 
synchronously even if codegen caching is disabled.

Separating PrepareFinalizeModule() (and running it synchronously) is only 
useful for performance if codegen caching is enabled.

This could cause a performance regression when codegen caching is disabled and 
async codegen is enabled because before this change this work was also done 
asynchronously. We should run a performance test with these parameters. If 
there is a significant regression, we should put PrepareFinalizeModule() into 
the async part when codegen caching is disabled. If there's only a small 
regression, it may not be worth complicating the code because of this.


http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/service/fe-support.cc
File be/src/service/fe-support.cc:

http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/service/fe-support.cc@244
PS2, Line 244: bool can_return = false;
             :     status = codegen->PrepareFinalizeModule(&can_return);
             :     if (!can_return) {
             :       status = codegen->FinalizeModule();
             :     }
This pattern is repeated many times in this change (the only place where it is 
used differently is fragment-state.cc). We may consider creating a convenience 
function for it.



--
To view, visit http://gerrit.cloudera.org:8080/20211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic5ae4b342ff8ef1c3b7ce35c927baa8b59d72908
Gerrit-Change-Number: 20211
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Thu, 17 Aug 2023 17:25:35 +0000
Gerrit-HasComments: Yes

Reply via email to