Mahesh Reddy has posted comments on this change. ( http://gerrit.cloudera.org:8080/20607 )
Change subject: KUDU-613: Introduce SLRU cache ...................................................................... Patch Set 18: (29 comments) http://gerrit.cloudera.org:8080/#/c/20607/16//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20607/16//COMMIT_MSG@22 PS16, Line 22: s and 2.6GHz > It would be nice to mentions the following details: This was a DEBUG build run locally on macOS. 6 cores and 2.6 GHz. I'll try to run it on a release build on a linux machine, but having some issues with va1022 right now. http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache-bench.cc File src/kudu/util/cache-bench.cc: http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache-bench.cc@92 PS16, Line 92: ret += " SLRU"; > nit for here and elsewhere: separate variable and the operator by space Done http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/cache.h File src/kudu/util/cache.h: http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/cache.h@520 PS17, Line 520: // Returns true if probationary segment contains key. : // Returns false if not. : // Only used for SLRU cache. : virtual bool ProbationaryContains(const Slice& key) = 0; : : // Returns true if protected segment contains key. : // Returns false if not. : // Only used for SLRU cache. : virtual bool ProtectedContains(const Slice& key) = 0; > SLRUCacheShard is the right place for these, even without any extras, I don I guess the issue stems from returning a pointer to Cache rather than ShardedSLRUCache. If we return a pointer to Cache, then all methods used in the SLRU cache will have to be defined in the base class and overridden in the appropriate derived class. If we return a pointer to ShardedSLRUCache, then we don't have to define all these methods in the base class, but then the integration with block_cache and the cache tests is more difficult as they all accept a pointer to Cache for the old LRU implementation, but they don't accept a pointer to the new ShardedSLRUCache. http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache.h File src/kudu/util/cache.h: http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache.h@58 PS16, Line 58: > nit: consider either making this part of the comment more generic (as it sh Done http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache.h@91 PS16, Line 91: in_protected_segm > nit: should be rather named 'in_protected_segment'? Done http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache.h@116 PS16, Line 116: Sanitize > style nit: since this isn't an accessor-like method, its name should be cap Done http://gerrit.cloudera.org:8080/#/c/20607/14/src/kudu/util/cache.h File src/kudu/util/cache.h: http://gerrit.cloudera.org:8080/#/c/20607/14/src/kudu/util/cache.h@228 PS14, Line 228: // ensure this is safe, e.g. by destructing the entity that owned the : // original metrics. : kReset, : }; : // Set the cache metrics to update corresponding counters accordingly. : virtual void SetMetrics(std::unique_ptr<CacheMetrics> metrics, : ExistingMetricsPolicy metrics_policy) = 0; : : // Sets probationary and protected segments metrics to update corresponding counters accordingly. : // : // An upgrade of one entry will increment the number of evictions from the probationary segment by : // one and the number of insertions into the protected segment by one. : // : // A downgrade of one entry will increment the number of evictions from the protected segment by : // one and the number of insertions into the probationary segment by one. : // : // To calculate the number of inserts for the SLRU cache, subtract the number of evictions : // of the protected segment from the number of inserts of the probationary segment. : // > OK. But even with that, instead of polluting the interface of the base Cac SetMetrics() is used in other areas of the code that use a cache (file_cache, ttl_cache), so that can't be removed from the Cache interface. Similar to my comment from ProbationaryContains and ProtectedContains, this has to do with returning a pointer to Cache instead of ShardedSLRUCache so all methods will have to be defined here. http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/nvm_cache.h File src/kudu/util/nvm_cache.h: http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/nvm_cache.h@46 PS16, Line 46: Slice key() const { : return Slice(kv_data, key_length); : } > Is this really needed? I didn't see NVM-based implementation of SLRU cache Removed it for now, a follow up patch with the NVM-based implementation of SLRU cache will include this instead. http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache-test.cc File src/kudu/util/slru_cache-test.cc: http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache-test.cc@413 PS17, Line 413: evicted to probationary segment > If the probationary segment doesn't have enough free space for this new ent Yep that's right. I guess the entry being upgraded from the probationary segment will create some space in the probationary segment, but if the entry being evicted from the protected segment is larger than the entry being upgraded, then more entries from the probationary segment will have to be evicted to create space. Added a test case to test for this behavior. http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.h File src/kudu/util/slru_cache.h: http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.h@40 PS16, Line 40: NewSLRUCache > Instead, is it possible to use the same approach as in cache.cc where it's I guess then the question is what are the template parameters? It makes sense in cache.cc since there are multiple possible eviction policies (FIFO and LRU). This file is just for DRAM and for SLRU eviction policy so I don't think specializing this for template parameters makes sense since there's only one combination of parameters. Let me know if you think otherwise or if there are other benefits to doing this. http://gerrit.cloudera.org:8080/#/c/20607/14/src/kudu/util/slru_cache.h File src/kudu/util/slru_cache.h: http://gerrit.cloudera.org:8080/#/c/20607/14/src/kudu/util/slru_cache.h@40 PS14, Line 40: Cache > ShardedSLRUCache inherits from Cache, so the returned ShardedSLRUCache* cou Tried to make this change, but the cache in block_cache is a unique pointer to Cache. Given the old LRU cache returns Cache*, it makes the most sense to keep this as Cache* as it doesn't accept ShardedSLRUCache*. http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.cc File src/kudu/util/slru_cache.cc: http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.cc@57 PS16, Line 57: // A single shard of s > Did you consider using the CacheShard template, templatized by EvictionPoli It's definitely possible, however, the cache file would be even more polluted with SLRUCacheShard specific methods (ProtectedInsert(), InsertAndReturnEvicted(), ReInsert()). I also removed the use of the mutex in the SLRUCacheShard level methods since the ShardPair has a mutex now, so all methods on the shard level with a mutex would need a new method just for SLRUCacheShard. So it's definitely possible, but it would pollute the cache.cc file even more. Let me know if you think it's worth the tradeoff. http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.cc@254 PS16, Line 254: > What guarantees do we have to be sure that 'table_' cannot be modified unde Only SLRUCacheShardPair::Lookup() calls this method, and within it everything is protected by a mutex. So until that mutex is released, no other ShardPair method can access the underlying shards and thus the underlying table within each shard. I'll add a note about the use of the mutex in the pair level methods. http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.cc@267 PS16, Line 267: return reinterpret_cast<Cache > ditto: what if Insert() is running concurrently and modifying the 'table_' Its only usage in the slru_cache.cc is protected by a mutex, similar to the explanation above where it's used within a ShardPair method that's protected by a mutex. Other than that, it's only used in unit tests through ProbationaryContains and ProtectedContains.. http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.cc@401 PS16, Line 401: } : } > nit here and below: define 'e' along with assigning the value at the same l Done http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc File src/kudu/util/slru_cache.cc: http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@57 PS17, Line 57: ingle shard of > Does it make sense to explicitly forbid copying and assigning of this class Done http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@273 PS17, Line 273: } : > nit: could these be joined, adding a comment on what's the intention of thi Done http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@434 PS17, Line 434: on_callback); > Why to pass this as a parameter instead of having it as a constant for SLRU Done http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@441 PS17, Line 441: : private: > What's the reason of wrapping instances of SLRUCacheShard into std::unique_ Good point, removed the use of unique pointers here. http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@460 PS17, Line 460: probationary_shard_(SLRUCacheShard(mem_tracker, : > This isn't needed. Done http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@471 PS17, Line 471: // > What is the purpose of this extra scope? Removed it http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@490 PS17, Line 490: seg > nit: delete 'and' Done http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@501 PS17, Line 501: C > What is the purpose of this extra scope? Removed it http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@543 PS17, Line 543: } else { : protected_shard_.Release(handle); : } : } : : void SLRUCacheShardPair::Erase(const Slice& key, uint32_t hash) { : > What guards against the following scenarios: I don't think having the atomic bool protects us from the scenarios you described. The only shard agnostic behavior happens when the lookup handle in thread A is the last reference to the entry. Since in both scenarios you described thread B calls Lookup() thus adding another reference to the entry, both the contents and the metrics of the SLRU cache will stay consistent. http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@552 PS17, Line 552: } > What is the purpose of this extra scope? Removed it http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@561 PS17, Line 561: boo > What is the purpose of this extra scope? Removed it http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@581 PS17, Line 581: > nit: I'm not sure I understand the proposed suggestion here. A clarification would be great, thanks! http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@601 PS17, Line 601: } : shards_.clear(); : } : : UniquePendingHandle Allocate(Slice key, int val_len, int charge) override { : int key_len = key.size(); : D > Is this really needed? Wouldn't the contained unique_ptr be destroyed auto The mem tracker doesn't get properly updated unless we explicitly clear each pointer to shard within the vector. http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@694 PS17, Line 694: } : > nit: could these lines be joined? Done -- To view, visit http://gerrit.cloudera.org:8080/20607 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I45531534a2049dd38c002f4dc7982df9fd46e0bb Gerrit-Change-Number: 20607 Gerrit-PatchSet: 18 Gerrit-Owner: Mahesh Reddy <[email protected]> Gerrit-Reviewer: Abhishek Chennaka <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Attila Bukor <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy <[email protected]> Gerrit-Reviewer: Marton Greber <[email protected]> Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Thu, 16 May 2024 21:17:03 +0000 Gerrit-HasComments: Yes
