Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/19181 )
Change subject: IMPALA-11470: Add Cache For Codegen Functions ...................................................................... Patch Set 9: (12 comments) Thanks Qifan. http://gerrit.cloudera.org:8080/#/c/19181/8/be/src/codegen/llvm-codegen-cache.h File be/src/codegen/llvm-codegen-cache.h: http://gerrit.cloudera.org:8080/#/c/19181/8/be/src/codegen/llvm-codegen-cache.h@69 PS8, Line 69: d as the has > It seems this struct is used only for the operator() defined below. Can the Added comments. The structure is used for customized unordered set. It is a traditional way that I can find, I will try if I can write it in a more fancy way. But even not, this way should be okay. http://gerrit.cloudera.org:8080/#/c/19181/8/be/src/codegen/llvm-codegen-cache.h@70 PS8, Line 70: ruct H > Better to match the type for HashState.h1 (uint64_t). Done http://gerrit.cloudera.org:8080/#/c/19181/8/be/src/codegen/llvm-codegen-cache.h@76 PS8, Line 76: : /// Used as the comparator function in unordered_set for struct HashCode. : struct HashCodeEqual { : bool operator()(const HashCode& lhs, const HashCode& rhs) const { : > same as the above comment for struct HashCodeHash. Added comments. Same as above. http://gerrit.cloudera.org:8080/#/c/19181/8/be/src/codegen/llvm-codegen-cache.cc File be/src/codegen/llvm-codegen-cache.cc: http://gerrit.cloudera.org:8080/#/c/19181/8/be/src/codegen/llvm-codegen-cache.cc@162 PS8, Line 162: override the existing entry > It is possible to use a strategy to retain a better entry than just taking For normal procedure, the caller would call a Lookup() before StoreInternal(), in most of the cases, we won't override the existing entry with the same key. There could be a way to avoid this, for example, the caller holds a lock during Lookup() and StoreInternal(). But it could harm the Lookup() performance (which is more usedthan Store()) and may not be necessary if the case is rare and we ensure thread-safe for the Store(). Therefore we do not expect a replacement on the entry with the same key other than the very first moment when we don't have the cache entry. http://gerrit.cloudera.org:8080/#/c/19181/8/be/src/codegen/llvm-codegen-cache.cc@171 PS8, Line 171: codegen_cache_entries_in_use_->Increment(1); : codegen_cache_entries_in_use_bytes_->Increment(mem_charge); > This math is not correct if the old entry is replaced by a new one in hash When the old entry with the same key is replaced, it will be evicted, so the math for the old entry would be taken care of in the eviction callback CodeGenCache::EvictionCallback::EvictedEntry(). In the Store(), we only need to care about the new entry. The case is covered by unit test LlvmCodeGenCacheTest::TestBasicFunction. http://gerrit.cloudera.org:8080/#/c/19181/8/be/src/codegen/llvm-codegen-cache.cc@183 PS8, Line 183: key_to_insert_it > nit keys_to_insert_it Done http://gerrit.cloudera.org:8080/#/c/19181/8/be/src/codegen/llvm-codegen-cache.cc@185 PS8, Line 185: ler to avoid multiple handles fo > Can you please explain this logic maybe with some examples? Added comments. I think one case is that when we run two same queries at the same time right after the system goes up, so there is no cache at all, two look up at the same time, and fail to get any cache, so both try to insert the codegen to the cache. Since they have the same key, we assume they have the same entry, it won't hurt if we only allow the first one to go in. http://gerrit.cloudera.org:8080/#/c/19181/8/be/src/codegen/llvm-codegen-cache.cc@188 PS8, Line 188: read is allowed > nit. keys_to_insert Done http://gerrit.cloudera.org:8080/#/c/19181/8/be/src/codegen/llvm-codegen.h File be/src/codegen/llvm-codegen.h: http://gerrit.cloudera.org:8080/#/c/19181/8/be/src/codegen/llvm-codegen.h@194 PS8, Line 194: suc > nit. This may not sound like a good name. Done http://gerrit.cloudera.org:8080/#/c/19181/8/be/src/codegen/llvm-codegen.h@715 PS8, Line 715: reduc > nit. reduce Done http://gerrit.cloudera.org:8080/#/c/19181/8/be/src/codegen/llvm-codegen.h@766 PS8, Line 766: /// The generation of the string is simply the concatenation of all the names by > May need to briefly describe how such a name list is obtained, in particula Done http://gerrit.cloudera.org:8080/#/c/19181/8/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/19181/8/common/thrift/ImpalaService.thrift@747 PS8, Line 747: // insert a full key for the cache and allow more statistics, otherwise, a hashcode > of mere 64 bytes Added comments. Is it 128 bits? The size of CodeGenCacheKey::HashCode. -- To view, visit http://gerrit.cloudera.org:8080/19181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If42c78a7f51fd582e5fe331fead494dadf544eb1 Gerrit-Change-Number: 19181 Gerrit-PatchSet: 9 Gerrit-Owner: Yida Wu <[email protected]> Gerrit-Reviewer: Abhishek Rawat <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Reviewer: Yida Wu <[email protected]> Gerrit-Comment-Date: Fri, 18 Nov 2022 02:54:27 +0000 Gerrit-HasComments: Yes
