Adar Dembo has posted comments on this change. Change subject: KUDU-680. bloomfile: switch to a threadlocal cache ......................................................................
Patch Set 4: (9 comments) http://gerrit.cloudera.org:8080/#/c/6569/4/src/kudu/cfile/bloomfile.cc File src/kudu/cfile/bloomfile.cc: Line 175 I imagine mem_consumption_ was here to handle the change in memory consumption in InitOnce(). But, now there are no calls to memory_footprint_excluding_reader(), which means the memory consumption of BloomFileReader isn't being tracked by anyone. Seems like it should be, in CFileSet perhaps? Line 267: // If we didn't hit in the cache, make a new cache entry and instantiate a reader. Based on how you're using the ThreadLocalCache, I think a single LookupOrInsertNew() that returns a pair (object + was it created or looked up) would be cleaner than separate Lookup() and InsertNew() methods. It'd also reduce the number of modulo calculations from 5 to 4 (one from each iteration in Lookup() and one from InsertNew()). http://gerrit.cloudera.org:8080/#/c/6569/4/src/kudu/util/mt-threadlocal-test.cc File src/kudu/util/mt-threadlocal-test.cc: Line 329: TEST_F(ThreadLocalTest, TestThreadLocalCache) { What about a multi-threaded test to verify that it's really a thread-local cache? That is, if you insert with the same key across threads, you don't wind up with the same item. http://gerrit.cloudera.org:8080/#/c/6569/4/src/kudu/util/threadlocal_cache.h File src/kudu/util/threadlocal_cache.h: PS4, Line 30: // This only ensures that the ThreadLocalCache can properly destruct : // entries when the thread exits. I don't think I understand this. Suppose ThreadLocalCache were templatized on T (the type of element from Lookup() and InsertNew()). Wouldn't that also guarantee that T's destructor runs on every element when the ThreadLocalCache is destroyed? Line 53: static constexpr int kItemCapacity = 4; It must also be <= sizeof last_hit_. Line 66: for (int i = 0; i < kItemCapacity; i++, last_hit_++) { Would it be more performant to: 1. Only modify last_hit_ on L69. 2. Calculate the index on L67 as (last_hit_ + i) % kItemCapacity. This has the added clarity advantage of mutating last_hit_ only on a real hit. PS4, Line 67: % As per L52, didn't you intend to use bit shifting here? PS4, Line 80: % Not bit shifting? Line 103: using EntryPair = std::pair<Key, std::unique_ptr<ThreadLocalCacheable>>; Would it be possible to avoid allocations entirely? Perhaps by templatizing the cache on T, and declaring the array as being of type T, or pair<Key, T>, or pair<Key, optional<T>> (if you don't want to force T to have some sort of default initialization). -- To view, visit http://gerrit.cloudera.org:8080/6569 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id8efd7f52eb376de2a9c445458827721806d9da8 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
