Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/19181 )
Change subject: IMPALA-11470: Add Cache For Codegen Functions ...................................................................... Patch Set 11: (6 comments) This patch changed the codegen_cache_debug_mode to codegen_cache_mode with a new enum type TCodeGenCacheMode. By default using NORMAL, which is using the full key, according to the local tpch and tpcds test, the performance seems no significant regression compared to using the hash code (OPTIMAL mode). Also changed a way to disable codegen cache for udfs and added more e2e testcases. http://gerrit.cloudera.org:8080/#/c/19181/9//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19181/9//COMMIT_MSG@34 PS9, Line 34: Codegen cache is disabled if a fragment uses a native udf, because > Just to clarify, we meant reusing the codegend module when the UDF module i Thanks Csaba and Daniel. Yes, it should only happen to the native udf. I tested test_java_udfs, test_ir_functions, test_generic_java_udfs and test_native_functions in test_udfs.py, only test_native_functions would crash. Added an e2e testcase which would crash if we don't disable the native udf for codegen cache. Also, I did a test on the addGlobalMapping(), and it doesn't affect the bitcode, so I think it is the root cause of the crash, and this should be the only place using addGlobalMapping(). I changed the logic, still avoid using codegen cache for native udf, but in the LoadFunction(), it looks good in the test, won't crash. The builtin function should not be affected because in my understanding they can't be reloaded during runtime. Please review to see if the fix is alright. May allow the native udf for cache in the next patch. http://gerrit.cloudera.org:8080/#/c/19181/9/be/src/codegen/llvm-codegen-cache.h File be/src/codegen/llvm-codegen-cache.h: http://gerrit.cloudera.org:8080/#/c/19181/9/be/src/codegen/llvm-codegen-cache.h@38 PS9, Line 38: struct HashCode) bytes. If the : /// cache is in NORMAL mode, the content of the key will be after the hash code, : /// which could be a combination of several keys, otherwise, in OPTIMAL mode, the : /// CodeGenCacheKey to be stored to the cache would only contain a hash code to > I wonder why the full key is stored in CodeGenCacheKey in debug mode - woul One main idea for the debug mode is to provide the user a "correct" way using the codegen cache, while providing more information in the logs to track the hash code. I changed the mode of codegen cache, added four modes, NORMAL/OPTIMAL/NORMAL_DEBUG/OPTIMAL_DEBUG, the debug mode allows more logs for now, we can add more things to do for debugging later. http://gerrit.cloudera.org:8080/#/c/19181/8/be/src/codegen/llvm-codegen-cache.cc File be/src/codegen/llvm-codegen-cache.cc: http://gerrit.cloudera.org:8080/#/c/19181/8/be/src/codegen/llvm-codegen-cache.cc@162 PS8, Line 162: alue(), codegen->function_n > IMHO Impala should never allow wrong results at all. Thanks Qifan. I changed the default value of the codegen cache mode. Added NORMAL and OPTIMAL. Set the default to NORMAL, which uses the full key, should not have the collision issue. In the simple tpch and tpcds test, it seems no significant difference, but would consume much more memory. We can optimize the OPTIMAL mode later for ensure the safety. http://gerrit.cloudera.org:8080/#/c/19181/9/be/src/codegen/llvm-codegen-cache.cc File be/src/codegen/llvm-codegen-cache.cc: http://gerrit.cloudera.org:8080/#/c/19181/9/be/src/codegen/llvm-codegen-cache.cc@185 PS9, Line 185: pair<unordered_set<CodeGenCacheKey::HashCode>::iterator, bool> key_to_insert_it; : { > For example, we do concurrent lookups / inserts without any additional sync Thanks Joe for the clarification, so it seems there is no need to take care of synchronization in the client side, maybe one concern the inefficiency of insertion on the same entry(key), in that case, it might be good to keep the current logic in Store() to avoid too many evictions due to concurrent insertion on the same key to some extent. http://gerrit.cloudera.org:8080/#/c/19181/4/tests/custom_cluster/test_codegen_cache.py File tests/custom_cluster/test_codegen_cache.py: http://gerrit.cloudera.org:8080/#/c/19181/4/tests/custom_cluster/test_codegen_cache.py@27 PS4, Line 27: @SkipIfNotHdfsMinicluster.scheduling > Seems a bit difficult to detect because it is fragment level disabled inste Added negative e2e testcases for udfs. http://gerrit.cloudera.org:8080/#/c/19181/9/tests/custom_cluster/test_codegen_cache.py File tests/custom_cluster/test_codegen_cache.py: http://gerrit.cloudera.org:8080/#/c/19181/9/tests/custom_cluster/test_codegen_cache.py@71 PS9, Line 71: "select * from functional.alltypes where bigint_col> 0") > Need to add more cases. Done -- To view, visit http://gerrit.cloudera.org:8080/19181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If42c78a7f51fd582e5fe331fead494dadf544eb1 Gerrit-Change-Number: 19181 Gerrit-PatchSet: 11 Gerrit-Owner: Yida Wu <[email protected]> Gerrit-Reviewer: Abhishek Rawat <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Reviewer: Yida Wu <[email protected]> Gerrit-Comment-Date: Mon, 28 Nov 2022 03:43:36 +0000 Gerrit-HasComments: Yes
