Zihao Ye has posted comments on this change. ( http://gerrit.cloudera.org:8080/19532 )
Change subject: IMPALA-11904: Data cache support dumping for reloading ...................................................................... Patch Set 9: (4 comments) Hi Joe, thanks for the code review. I've fixed some issues, and I still work about the custom cluster test. In the meantime, it would be better if we could solve IMPALA-10971 first, as this issue affects the correctness of some metrics (cache size and number of entries) and could cause tests to fail. If you don't have time to deal with it, you could assign it to me, and I'll fix it as soon as possible, thanks. http://gerrit.cloudera.org:8080/#/c/19532/9/be/src/runtime/io/data-cache.cc File be/src/runtime/io/data-cache.cc: http://gerrit.cloudera.org:8080/#/c/19532/9/be/src/runtime/io/data-cache.cc@131 PS9, Line 131: DEFINE_bool(data_cache_enable_loading, false, : "(Experimental) With loading enabled, data cache will try to load the previously dump" : " file (created by the dumping feature enabled by the 'data_cache_enable_dumping') " : "before create a empty cache. if loading fails, it will proceed with regular " : "initialization."); > I'm thinking that we can combine data_cache_enable_dumping and data_cache_e Done http://gerrit.cloudera.org:8080/#/c/19532/9/be/src/runtime/io/data-cache.cc@684 PS9, Line 684: // Check if there is enough space available at this point in time. : uint64_t available_bytes; : RETURN_IF_ERROR(FileSystemUtil::GetSpaceAvailable(path_, &available_bytes)); : if (available_bytes < capacity_) { : const string& err = Substitute("Insufficient space for $0. Required $1. Only $2 is " : "available", path_, PrettyPrinter::PrintBytes(capacity_), : PrettyPrinter::PrintBytes(available_bytes)); : LOG(ERROR) << err; : return Status(err); : } > I think this check can fire if we have the cache on its own device with the Done. The current check now takes into account the size of existing cached data, but it relies on TOTAL_BYTES, as mentioned earlier. It is better to first solve IMPALA-10971 to prevent any potential issues. http://gerrit.cloudera.org:8080/#/c/19532/9/be/src/runtime/io/data-cache.cc@1056 PS9, Line 1056: Status DataCache::Partition::Load() { > I think we want to delete files that are not listed in the metadata. It sho Done http://gerrit.cloudera.org:8080/#/c/19532/9/be/src/util/cache/cache.h File be/src/util/cache/cache.h: http://gerrit.cloudera.org:8080/#/c/19532/9/be/src/util/cache/cache.h@288 PS9, Line 288: Traverse > Nit: Since we are exporting all entries as a single vector, let's call this Done -- To view, visit http://gerrit.cloudera.org:8080/19532 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id867f4fc7343898e4906332c3caa40eb57a03101 Gerrit-Change-Number: 19532 Gerrit-PatchSet: 9 Gerrit-Owner: Zihao Ye <[email protected]> Gerrit-Reviewer: David Rorke <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Zihao Ye <[email protected]> Gerrit-Comment-Date: Wed, 10 May 2023 07:59:50 +0000 Gerrit-HasComments: Yes
