Adar Dembo has posted comments on this change.

Change subject: KUDU-680. bloomfile: switch to a threadlocal cache
......................................................................


Patch Set 4:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/6569/4/src/kudu/cfile/bloomfile.cc
File src/kudu/cfile/bloomfile.cc:

Line 175
I imagine mem_consumption_ was here to handle the change in memory consumption 
in InitOnce(). But now there are no calls to 
memory_footprint_excluding_reader(), which means the memory consumption of 
BloomFileReader isn't being tracked by anyone. Seems like it should be, in 
CFileSet perhaps?


Line 267:   // If we didn't hit in the cache, make a new cache entry and 
instantiate a reader.
Based on how you're using the ThreadLocalCache, I think a single 
LookupOrInsertNew() that returns a pair (the object, plus whether it was 
created or merely found) would be cleaner than separate Lookup() and 
InsertNew() methods. It'd also reduce the number of modulo calculations from 
5 to 4 (one per iteration in Lookup() plus one in InsertNew()).
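To illustrate what I mean, here's a rough sketch of the combined API. All names (TinyCache, etc.) are illustrative, not the actual patch code, and the eviction policy is just a stand-in:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Sketch of a combined lookup/insert: one pass over the slots either
// finds an existing entry or claims a slot, returning the object plus a
// flag saying whether it was newly created.
template <typename Key, typename T>
class TinyCache {
 public:
  static constexpr int kItemCapacity = 4;

  // Returns {object, inserted}: 'inserted' is true if the entry was
  // created by this call, false if it was found by lookup.
  std::pair<T*, bool> LookupOrInsertNew(const Key& key) {
    for (int i = 0; i < kItemCapacity; i++) {
      int idx = (last_hit_ + i) % kItemCapacity;  // one modulo per probe
      if (entries_[idx].second && entries_[idx].first == key) {
        last_hit_ = idx;
        return { entries_[idx].second.get(), false };
      }
    }
    // Miss: evict the slot just past the last hit and build a new entry.
    int idx = (last_hit_ + 1) % kItemCapacity;
    entries_[idx] = { key, std::unique_ptr<T>(new T()) };
    last_hit_ = idx;
    return { entries_[idx].second.get(), true };
  }

 private:
  std::array<std::pair<Key, std::unique_ptr<T>>, kItemCapacity> entries_;
  uint8_t last_hit_ = 0;
};
```

The caller then branches on the bool instead of doing a second probe through InsertNew().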


http://gerrit.cloudera.org:8080/#/c/6569/4/src/kudu/util/mt-threadlocal-test.cc
File src/kudu/util/mt-threadlocal-test.cc:

Line 329: TEST_F(ThreadLocalTest, TestThreadLocalCache) {
What about a multi-threaded test to verify that it's really a thread-local 
cache? That is, if you insert with the same key across threads, you don't wind 
up with the same item.
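Something along these lines (plain std::thread rather than the actual test harness; GetThreadLocalEntry is a hypothetical stand-in for a cache lookup):

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Every thread looks up the same key; if the cache is truly per-thread,
// each thread must observe a distinct object.
const int* GetThreadLocalEntry(int key) {
  thread_local int entry = 0;  // stand-in for a per-thread cache entry
  (void)key;                   // same key in every thread
  return &entry;
}

std::size_t CountDistinctEntries(int num_threads) {
  std::mutex mu;
  std::set<const int*> seen;
  std::atomic<int> arrived{0};
  std::vector<std::thread> threads;
  for (int t = 0; t < num_threads; t++) {
    threads.emplace_back([&] {
      const int* p = GetThreadLocalEntry(/*key=*/123);
      {
        std::lock_guard<std::mutex> l(mu);
        seen.insert(p);
      }
      // Keep all threads alive until everyone has recorded its pointer,
      // so one thread's TLS can't be recycled for a later thread.
      arrived.fetch_add(1);
      while (arrived.load() < num_threads) std::this_thread::yield();
    });
  }
  for (auto& th : threads) th.join();
  return seen.size();  // == num_threads if the cache is thread-local
}
```

The barrier at the end of each thread body matters: without it, an early-exiting thread's storage could be reused and the pointers could collide spuriously.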


http://gerrit.cloudera.org:8080/#/c/6569/4/src/kudu/util/threadlocal_cache.h
File src/kudu/util/threadlocal_cache.h:

PS4, Line 30: // This only ensures that the ThreadLocalCache can properly 
destruct
            : // entries when the thread exits.
I don't think I understand this. Suppose ThreadLocalCache were templatized on T 
(the type of element from Lookup() and InsertNew()). Wouldn't that also 
guarantee that T's destructor runs on every element when the ThreadLocalCache 
is destroyed?


Line 53:   static constexpr int kItemCapacity = 4;
It must also be <= sizeof last_hit_.


Line 66:     for (int i = 0; i < kItemCapacity; i++, last_hit_++) {
Would it be more performant to:
1. Only modify last_hit_ on L69.
2. Calculate the index on L67 as (last_hit_ + i) % kItemCapacity?

It also has a clarity advantage: last_hit_ is mutated only on a real hit.
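Roughly this shape (illustrative standalone struct, not the patch's actual member layout):

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kItemCapacity = 4;

// Probe starting from the last hit, computing each index as
// (last_hit_ + i) % kItemCapacity, and write last_hit_ back only when
// the key is actually found.
struct SmallCache {
  std::array<int, kItemCapacity> keys{};
  std::array<int, kItemCapacity> values{};
  uint8_t last_hit_ = 0;

  int* Lookup(int key) {
    for (int i = 0; i < kItemCapacity; i++) {
      int idx = (last_hit_ + i) % kItemCapacity;
      if (keys[idx] == key) {
        last_hit_ = idx;  // mutated only on a real hit
        return &values[idx];
      }
    }
    return nullptr;  // miss: last_hit_ untouched
  }
};
```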


PS4, Line 67: %
As per L52, kItemCapacity is a power of two, so didn't you intend to replace 
the modulo with a bit mask (& (kItemCapacity - 1)) here?


PS4, Line 80: %
Not a bit mask here either?
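For reference, the standard trick: when the capacity is a power of two, x % capacity is equivalent to a bitwise AND with (capacity - 1), avoiding the division.

```cpp
#include <cassert>

constexpr int kItemCapacity = 4;  // must be a power of two for this trick
static_assert((kItemCapacity & (kItemCapacity - 1)) == 0,
              "kItemCapacity must be a power of two");

// Equivalent to x % kItemCapacity for non-negative x, without a division.
inline int WrapIndex(int x) {
  return x & (kItemCapacity - 1);
}
```

(Worth a static_assert like the one above so a future capacity change can't silently break the masking.)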


Line 103:   using EntryPair = std::pair<Key, 
std::unique_ptr<ThreadLocalCacheable>>;
Would it be possible to avoid allocations entirely? Perhaps by templatizing the 
cache on T, and declaring the array as being of type T, or pair<Key, T>, or 
pair<Key, optional<T>> (if you don't want to force T to have some sort of 
default initialization).
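For instance, something like this sketch (using C++17 std::optional for brevity; the codebase would presumably use its own optional-like mechanism, and InlineCache is an illustrative name):

```cpp
#include <array>
#include <cassert>
#include <optional>
#include <utility>

// Entries live inline in the array as pair<Key, optional<T>>: no
// per-entry heap allocation, and optional<> means T need not be
// default-constructed until a slot is actually used.
template <typename Key, typename T, int kCapacity = 4>
class InlineCache {
 public:
  T* Lookup(const Key& key) {
    for (auto& e : entries_) {
      if (e.second && e.first == key) return &*e.second;
    }
    return nullptr;
  }

  template <typename... Args>
  T* InsertNew(const Key& key, Args&&... args) {
    auto& slot = entries_[next_victim_];
    next_victim_ = (next_victim_ + 1) % kCapacity;
    slot.first = key;
    // Constructed in place inside the array; no heap allocation.
    slot.second.emplace(std::forward<Args>(args)...);
    return &*slot.second;
  }

 private:
  std::array<std::pair<Key, std::optional<T>>, kCapacity> entries_{};
  int next_victim_ = 0;
};
```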


-- 
To view, visit http://gerrit.cloudera.org:8080/6569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id8efd7f52eb376de2a9c445458827721806d9da8
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes
