Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/20607 )
Change subject: KUDU-613: Introduce SLRU cache ...................................................................... Patch Set 17: (16 comments) A few more high-level observations. http://gerrit.cloudera.org:8080/#/c/20607/16//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20607/16//COMMIT_MSG@22 PS16, Line 22: Lookups/sec It would be nice to mentions the following details: * What type of build it was? * What sort of machine it was: CPU cores and frequency? It's crucial to measure this for RELEASE builds, we aren't interested to look at other types of builds here, I guess. http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache-bench.cc File src/kudu/util/cache-bench.cc: http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache-bench.cc@92 PS16, Line 92: ret+= " SLRU"; nit for here and elsewhere: separate variable and the operator by space http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/cache.h File src/kudu/util/cache.h: http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/cache.h@520 PS17, Line 520: // Returns true if probationary segment contains key. : // Returns false if not. : // Only used for SLRU cache. : virtual bool ProbationaryContains(const Slice& key) = 0; : : // Returns true if protected segment contains key. : // Returns false if not. : // Only used for SLRU cache. : virtual bool ProtectedContains(const Slice& key) = 0; Similar to SetSegmentMetrics(), this doesn't seem to be a place for these methods. http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache.h File src/kudu/util/cache.h: http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache.h@58 PS16, Line 58: Protects against random/long reads polluting cache. nit: consider either making this part of the comment more generic (as it should be at this level) or dropping this part altogether http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache.h@91 PS16, Line 91: protected_segment nit: should be rather named 'in_protected_segment'? http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache.h@116 PS16, Line 116: sanitize style nit: since this isn't an accessor-like method, its name should be capitalized http://gerrit.cloudera.org:8080/#/c/20607/14/src/kudu/util/cache.h File src/kudu/util/cache.h: http://gerrit.cloudera.org:8080/#/c/20607/14/src/kudu/util/cache.h@228 PS14, Line 228: // ensure this is safe, e.g. by destructing the entity that owned the : // original metrics. : kReset, : }; : // Set the cache metrics to update corresponding counters accordingly. : virtual void SetMetrics(std::unique_ptr<CacheMetrics> metrics, : ExistingMetricsPolicy metrics_policy) = 0; : : // Sets probationary and protected segments metrics to update corresponding counters accordingly. : // : // An upgrade of one entry will increment the number of evictions from the probationary segment by : // one and the number of insertions into the protected segment by one. : // : // A downgrade of one entry will increment the number of evictions from the protected segment by : // one and the number of insertions into the probationary segment by one. : // : // To calculate the number of inserts for the SLRU cache, subtract the number of evictions : // of the protected segment from the number of inserts of the probationary segment. : // > This will be used in the block cache to set the metrics for both segments o OK. But even with that, instead of polluting the interface of the base Cache class you could make the BlockCache class to be a template by the type of the underlying cache implementation, and remove both SetMetrics() and SetSegmentMetrics() from the common Cache interface. http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/nvm_cache.h File src/kudu/util/nvm_cache.h: http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/nvm_cache.h@46 PS16, Line 46: // Number of lookups for this entry. Used for upgrading to protected segment in SLRU cache. : // Resets to 0 when moved between probationary and protected segments in both directions. : uint32_t lookups; Is this really needed? I didn't see NVM-based implementation of SLRU cache, so this looks a bit strange given you decided not to converge on {S}LRUHandle structure http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.h File src/kudu/util/slru_cache.h: http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.h@40 PS16, Line 40: NewSLRUCache Instead, is it possible to use the same approach as in cache.cc where it's just NewCache<> specialized for particular set of template parameters? http://gerrit.cloudera.org:8080/#/c/20607/14/src/kudu/util/slru_cache.h File src/kudu/util/slru_cache.h: http://gerrit.cloudera.org:8080/#/c/20607/14/src/kudu/util/slru_cache.h@40 PS14, Line 40: Cache > The ShardedSLRUCache is local to the slru_cache.cc file. But for the logic ShardedSLRUCache inherits from Cache, so the returned ShardedSLRUCache* could be used in all scenarios that accept Cache*. Meanwhile, all the methods specific to ShardedSLRUCache could be moved out from the Cache interface (e.g. SetSegmentMetrics()). http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.cc File src/kudu/util/slru_cache.cc: http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.cc@57 PS16, Line 57: class SLRUCacheShard { Did you consider using the CacheShard template, templatized by EvictionPolicy instead of duplicating significant amount of the code? Particular methods might be added using std::enable_if, and particular implementations might be specialized. You could take a look at how std::enable_if is used in current Kudu code. http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.cc@401 PS16, Line 401: Cache::RLHandle* e; : e = table_.Remove(key, hash); nit here and below: define 'e' along with assigning the value at the same line http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc File src/kudu/util/slru_cache.cc: http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@57 PS17, Line 57: SLRUCacheShard Does it make sense to explicitly forbid copying and assigning of this class' instances? http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@434 PS17, Line 434: lookups_threshold Why to pass this as a parameter instead of having it as a constant for SLRUCacheShardPair instance, similar to the capacity settings? http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@441 PS17, Line 441: std::unique_ptr<SLRUCacheShard> probationary_shard_; : std::unique_ptr<SLRUCacheShard> protected_shard_; What's the reason of wrapping instances of SLRUCacheShard into std::unique_ptr? FWIW, these two are always instantiated in the constructor of the SLRUCacheShardPair class, and cannot be nullptr in any case, so why not to have them on the stack? http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@460 PS17, Line 460: probationary_shard_.reset(nullptr); : protected_shard_.reset(nullptr); This isn't needed. -- To view, visit http://gerrit.cloudera.org:8080/20607 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I45531534a2049dd38c002f4dc7982df9fd46e0bb Gerrit-Change-Number: 20607 Gerrit-PatchSet: 17 Gerrit-Owner: Mahesh Reddy <[email protected]> Gerrit-Reviewer: Abhishek Chennaka <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Attila Bukor <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy <[email protected]> Gerrit-Reviewer: Marton Greber <[email protected]> Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 10 May 2024 20:40:01 +0000 Gerrit-HasComments: Yes
