Ben Manes created HIVE-14983:

             Summary: Evaluate TinyLFU cache
                 Key: HIVE-14983
             Project: Hive
          Issue Type: Improvement
          Components: llap
            Reporter: Ben Manes

The ORC low-level cache is either FIFO or LRFU, with the latter being the 
default. Least-Recently-Freq-Used is an O(lg n) policy that tries to recency 
with frequency for the working set. It uses a heap for frequency ordering and a 
linked list for recency ordering.

[TinyLFU|] is an O(1) policy that uses a 
sketch to probabilistically estimate an entry's frequency. Instead of focusing 
on eviction, the policy filters out low-value entries from entering the cache. 
It retains a larger history than the working set by retaining the frequency in 
a compact counter array (e.g. [4-bit CountMin 
 Simulations show that a small LRU window + TinyLFU + SLRU main cache has the 
[best performance|].

If the project is ready to adopt Java 8 then it could use 
[Caffeine|], the successor to Guava's 
cache. It provides improved 
[concurrency|] in addition 
to the higher hit rate.

But I think extracting the project's CountMin Sketch for a custom 
implementation would be a win. The result would be simpler code and an improved 
hit rates.

This message was sent by Atlassian JIRA

Reply via email to