GitHub user PavelVesely added a comment to the discussion: What large-scale data or ML challenges are you facing?
> A common requirement in cache replacement algorithms is tracking element frequency and recency. Frequency can be computed directly with a count-min sketch, which has proven effective in LFU-based algorithms, including variants with windows and decay. However, I'm uncertain about how to properly design and use a sketch to track recency. Previously, I calculated it by combining a count-min sketch with the length of a virtual queue.

Tracking recency for cache replacement policies sounds like an interesting problem, but I'm not quite sure about the setting or which properties are desirable. If a cache simply stores all the items, then it can also store a timestamp per item (to get recency for LRU) or a frequency (for LFU), at the cost of additional memory proportional to the cache size.

- Is the goal to use additional memory *sublinear* in the cache size?
- What kind of error is tolerable for recency? For instance, is it acceptable to report an item accessed 1 s ago as the least recent, while another item has gone 1.5 s without an access?

GitHub link: https://github.com/apache/datasketches-rust/discussions/64#discussioncomment-15509769

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]
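The reply above contrasts exact per-item timestamps (linear memory) with the question's goal of sketching recency sublinearly. One plausible construction, not something the datasketches-rust library provides, is a count-min-style table in which each cell keeps the maximum logical timestamp of any item hashed to it; the minimum over an item's cells then upper-bounds its true last-access time, since collisions can only make an item look *more* recent. A minimal Rust sketch, with the `RecencySketch` name and the `depth`/`width` parameters chosen here purely for illustration:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical count-min-style "recency sketch": each cell stores the
/// largest logical timestamp of any item hashing to it. A query takes the
/// minimum over the item's cells, which upper-bounds the true last-access
/// time (collisions only make items appear more recent, never less).
struct RecencySketch {
    depth: usize,
    width: usize,
    cells: Vec<u64>, // depth * width timestamps; 0 means "never seen"
    clock: u64,      // global logical clock, incremented on every access
}

impl RecencySketch {
    fn new(depth: usize, width: usize) -> Self {
        RecencySketch { depth, width, cells: vec![0; depth * width], clock: 0 }
    }

    // One cell index per row, seeded by the row number.
    fn index<T: Hash>(&self, item: &T, row: usize) -> usize {
        let mut h = DefaultHasher::new();
        row.hash(&mut h);
        item.hash(&mut h);
        row * self.width + (h.finish() as usize) % self.width
    }

    /// Record an access: bump the clock and raise the item's cells to it.
    fn access<T: Hash>(&mut self, item: &T) {
        self.clock += 1;
        for row in 0..self.depth {
            let i = self.index(item, row);
            self.cells[i] = self.cells[i].max(self.clock);
        }
    }

    /// Estimated last-access timestamp (an upper bound on the true one).
    fn last_access<T: Hash>(&self, item: &T) -> u64 {
        (0..self.depth)
            .map(|row| self.cells[self.index(item, row)])
            .min()
            .unwrap_or(0)
    }
}
```

Because the estimate only errs toward "more recent", an eviction policy driven by it would never evict something that was genuinely just touched; the cost is that stale items can be shielded by collisions, which wider rows mitigate.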
