Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/20211 )
Change subject: IMPALA-12253: Improve the integration of codegen cache and async codegen ...................................................................... Patch Set 2: (21 comments) http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG@9 PS2, Line 9: the Nit: "the" not needed. http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG@10 PS2, Line 10: the Nit: "the" not needed. http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG@11 PS2, Line 11: Nit: is enabled http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG@13 PS2, Line 13: the Nit: "the" not needed. http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG@18 PS2, Line 18: reducing eliminating? http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG@18 PS2, Line 18: of Nit: for? http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG@19 PS2, Line 19: if cache exists "if an appropriate cache entry exists" could be better. http://gerrit.cloudera.org:8080/#/c/20211/2//COMMIT_MSG@21 PS2, Line 21: Did the perf a-b test with async codegen and codegen cache enabled, Shouldn't the performance improve? http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen-cache-test.cc File be/src/codegen/llvm-codegen-cache-test.cc: http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen-cache-test.cc@599 PS2, Line 599: NULL Nit: 'nullptr' would be better. http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen-cache-test.cc@608 PS2, Line 608: NULL Nit: 'nullptr' would be better. http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.h File be/src/codegen/llvm-codegen.h: http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.h@351 PS2, Line 351: PrepareFinalizeModule() must be called before FinalizeModule() This should come in the second sentence, after "This should be called after all functions to JIT have been added to the module via AddFunctionToJit() ..." http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.h@897 PS2, Line 897: be Nit: no need for "be". http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.h@910 PS2, Line 910: /// If true, the module has been prepared and is ready for compilation. Could add that it is set in PrepareFinalizeModule(). http://gerrit.cloudera.org:8080/#/c/20211/1/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: http://gerrit.cloudera.org:8080/#/c/20211/1/be/src/codegen/llvm-codegen.cc@1384 PS1, Line 1384: DCHECK(cache_key_); > Done '!cache_key_->empty()' is not checked in the new version, is it? Also, I think 'cache_key_ != nullptr' is more readable as it's more explicit. I, at least, have always used this longer form in Impala code. http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.cc@1277 PS2, Line 1277: DCHECK(!is_prepared_); We could also keep "DCHECK(!is_compiled_);" just in case. http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.cc@1279 PS2, Line 1279: DCHECK(can_return != nullptr); I'd put it after the previous DCHECK(s) before setting 'is_prepared_', so all precondition checking is done first. http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.cc@1340 PS2, Line 1340: DCHECK(cache_key_); I don't think this checks that 'cache_key_' is not empty, CodeGenCacheKey doesn't have operator bool(). This will only check that 'cache_key_' does not hold a NULL pointer, which is never the case because of L1327. http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.cc@1382 PS2, Line 1382: bool codegen_cache_enabled = state_->codegen_cache_enabled() && codegen_cache_enabled_; In PrepareFinalizeModule() you didn't extract it into a variable - I think both approaches are ok but it's better if it's the same in both functions. http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/codegen/llvm-codegen.cc@1384 PS2, Line 1384: DCHECK(cache_key_); See comment on L1340. If you only want to make sure it's not NULL, my opinion is that writing 'cache_key_ != nullptr' explicitly is better. http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/runtime/fragment-state.cc File be/src/runtime/fragment-state.cc: http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/runtime/fragment-state.cc@131 PS2, Line 131: Status status = llvm_codegen->PrepareFinalizeModule(&can_return); With this change, the work performed by PrepareFinalizeModule() is always done synchronously even if codegen caching is disabled. Separating PrepareFinalizeModule() (and running it synchronously) is only useful for performance if codegen caching is enabled. This could cause a performance regression when codegen caching is disabled and async codegen is enabled because before this change this work was also done asynchronously. We should run a performance test with these parameters. If there is a significant regression, we should put PrepareFinalizeModule() into the async part when codegen caching is disabled. If there's only a small regression, it may not be worth complicating the code because of this. http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/service/fe-support.cc File be/src/service/fe-support.cc: http://gerrit.cloudera.org:8080/#/c/20211/2/be/src/service/fe-support.cc@244 PS2, Line 244: bool can_return = false; : status = codegen->PrepareFinalizeModule(&can_return); : if (!can_return) { : status = codegen->FinalizeModule(); : } This pattern is repeated many times in this change (the only place where it is used differently is fragment-state.cc). We may consider creating a convenience function for it. -- To view, visit http://gerrit.cloudera.org:8080/20211 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic5ae4b342ff8ef1c3b7ce35c927baa8b59d72908 Gerrit-Change-Number: 20211 Gerrit-PatchSet: 2 Gerrit-Owner: Yida Wu <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Yida Wu <[email protected]> Gerrit-Comment-Date: Thu, 17 Aug 2023 17:25:35 +0000 Gerrit-HasComments: Yes
