Michael Ho created IMPALA-8690:
----------------------------------

             Summary: Better eviction algorithm for data cache
                 Key: IMPALA-8690
                 URL: https://issues.apache.org/jira/browse/IMPALA-8690
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
    Affects Versions: Impala 3.3.0
            Reporter: Michael Ho


The current data cache implementation caches every data access regardless of 
the access pattern. Its LRU eviction algorithm is not scan-resistant: when a 
user scans a big fact table, many heavily accessed entries are inevitably 
evicted. We should adopt a better eviction algorithm (e.g. LRFU or another 
well-known one from the literature). It would be nice to evaluate candidates 
against some users' traces now that IMPALA-8542 is fixed.
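
To illustrate the problem, here is a minimal toy simulation, not Impala's data 
cache code. All class and function names are hypothetical, the LRFU variant 
uses a simplified linear-scan eviction instead of the heap usually described 
for it, and the decay value is an arbitrary assumption. It warms a small cache 
with a few hot keys, runs a one-pass scan over many distinct keys, and counts 
how many hot keys are still cached afterwards.

```python
from collections import OrderedDict


class LRUCache:
    """Plain LRU: every miss is inserted, the least recently used entry is evicted."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def access(self, key):
        hit = key in self.entries
        if hit:
            self.entries.move_to_end(key)  # mark as most recently used
        else:
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
            self.entries[key] = True
        return hit


class LRFUCache:
    """Simplified LRFU: each entry keeps a combined recency/frequency (CRF) score.

    Every access adds 1 to the score, and the score halves every 1/decay ticks.
    decay=0.01 is an assumed value chosen only for this illustration.
    """

    def __init__(self, capacity, decay=0.01):
        self.capacity = capacity
        self.decay = decay
        self.clock = 0
        self.entries = {}  # key -> (crf, time of last access)

    def _crf_now(self, key):
        crf, last = self.entries[key]
        return crf * 2.0 ** (-self.decay * (self.clock - last))

    def access(self, key):
        self.clock += 1
        hit = key in self.entries
        if hit:
            self.entries[key] = (1.0 + self._crf_now(key), self.clock)
        else:
            if len(self.entries) >= self.capacity:
                # O(n) scan for the lowest score; a real implementation would use a heap.
                victim = min(self.entries, key=self._crf_now)
                del self.entries[victim]
            self.entries[key] = (1.0, self.clock)
        return hit


def surviving_hot_keys(cache, hot_keys, scan_len):
    """Warm the cache with hot_keys, run a one-pass scan, count hot keys still cached."""
    for _ in range(10):
        for key in hot_keys:
            cache.access(key)
    for i in range(scan_len):
        cache.access(("scan", i))  # each scanned block is touched exactly once
    return sum(1 for key in hot_keys if key in cache.entries)


if __name__ == "__main__":
    hot = ["a", "b", "c", "d"]
    print("LRU :", surviving_hot_keys(LRUCache(8), hot, 100))
    print("LRFU:", surviving_hot_keys(LRFUCache(8), hot, 100))
```

With a capacity of 8, four hot keys and a 100-key scan, the LRU cache keeps 
none of the hot keys, while the LRFU sketch keeps all four because the 
once-touched scan keys always have the lowest score. Real traces would be 
needed to pick the policy and its parameters.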

In the short run, we probably need a workaround (e.g. query hints to disable 
caching for certain tables). I will file a separate JIRA for it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
