Ádám Szita created HIVE-24736:
---------------------------------

             Summary: Make buffer tracking in LLAP cache with BP wrapper more 
accurate
                 Key: HIVE-24736
                 URL: https://issues.apache.org/jira/browse/HIVE-24736
             Project: Hive
          Issue Type: Improvement
          Components: llap
            Reporter: Ádám Szita
            Assignee: Ádám Szita


HIVE-22492 introduced thread-local buffers in which LlapCacheableBuffer 
instances are staged before entering LRFU's heap, so that lock contention is 
reduced.

This is a nice performance improvement, but it comes at the cost of losing 
exact accounting of LLAP buffer instances. For example, if a user issues a 
purge command, not all of the cache space is freed as one would expect, because 
purge only considers buffers that the policy knows about. In this case LLAP's 
iomem servlet shows that the LRFU policy is empty, while a table may still have 
its full content loaded.

Also, for text-based tables, cache load uses a set of -OrcEncode threads that 
are ephemeral in nature. Buffers attached to these threads' thread-local 
structures are ultimately lost. In an edge case we could load lots of data into 
the cache by reading many distinct smaller text tables whose buffers never 
reach the LRFU policy, and the cache hit ratio will suffer as a consequence 
(the memory manager will give up asking LRFU to evict and will free random 
buffers).
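
To illustrate the effect described above, here is a toy model of thread-local 
staging. It is not Hive code: StagingDemo, BATCH_SIZE, and the String "buffers" 
are hypothetical stand-ins, and the flush threshold of 8 is an assumption made 
only for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Toy model of thread-local staging; every name here is hypothetical. */
class StagingDemo {
    // Assumed flush threshold, chosen only for this example.
    static final int BATCH_SIZE = 8;

    // Stand-in for the eviction policy's shared structure.
    static final ConcurrentLinkedQueue<String> policy = new ConcurrentLinkedQueue<>();

    // Each thread stages buffers privately to avoid contending on the policy.
    static final ThreadLocal<List<String>> staged =
        ThreadLocal.withInitial(ArrayList::new);

    /** Stages a buffer and hands the batch to the policy once it is full. */
    static void cache(String buffer) {
        List<String> local = staged.get();
        local.add(buffer);
        if (local.size() >= BATCH_SIZE) {
            policy.addAll(local);
            local.clear();
        }
    }

    /** Runs one short-lived thread that caches n buffers and then exits. */
    static void loadOnEphemeralThread(int n) {
        Thread t = new Thread(() -> {
            for (int i = 0; i < n; i++) {
                cache("buf-" + i);
            }
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

In this model, a thread that caches fewer than BATCH_SIZE buffers before 
exiting leaves the policy empty, even though the buffers would still occupy 
cache memory in a real system.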

I propose that we track the amount of data stored in the BP wrapper 
thread-locals and flush them into the heap as the first step of a purge 
request. This will improve supportability.
We should also replace the ephemeral OrcEncode threads with a thread pool, 
which could serve as a small performance improvement on its own by saving the 
time and memory spent on thread lifecycle management.
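
A minimal sketch of the tracking-and-flush idea follows. It is an illustration 
under assumptions, not the actual patch: TrackedStaging, Staging, stagedBytes, 
and the batch size are hypothetical, buffers are modeled by their sizes only, 
and the registry keeps staging areas of dead threads alive, which a real 
implementation would need to clean up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: thread-local staging that purge can see and drain. */
class TrackedStaging {
    static final int BATCH_SIZE = 8; // assumed flush threshold

    /** Per-thread staging area, locked on itself so purge can drain it. */
    static final class Staging {
        final List<Long> sizes = new ArrayList<>();
    }

    // Registry of every staging area, including those of finished threads.
    final Set<Staging> allStagings = ConcurrentHashMap.newKeySet();
    final ThreadLocal<Staging> local = ThreadLocal.withInitial(() -> {
        Staging s = new Staging();
        allStagings.add(s);
        return s;
    });

    // Bytes currently staged outside the policy.
    final AtomicLong stagedBytes = new AtomicLong();

    // Stand-in for the policy's heap; guarded by synchronized (policy).
    final List<Long> policy = new ArrayList<>();

    /** Stages a buffer of the given size on the calling thread. */
    void cache(long bufferSize) {
        Staging s = local.get();
        synchronized (s) {
            s.sizes.add(bufferSize);
            stagedBytes.addAndGet(bufferSize);
            if (s.sizes.size() >= BATCH_SIZE) {
                drain(s);
            }
        }
    }

    /** Moves staged entries into the policy; caller must hold s's monitor. */
    private void drain(Staging s) {
        long moved = 0;
        synchronized (policy) {
            for (long size : s.sizes) {
                policy.add(size);
                moved += size;
            }
        }
        s.sizes.clear();
        stagedBytes.addAndGet(-moved);
    }

    /** Flushes all staging areas first, then empties the policy. */
    long purge() {
        for (Staging s : allStagings) {
            synchronized (s) {
                drain(s);
            }
        }
        long freed = 0;
        synchronized (policy) {
            for (long size : policy) {
                freed += size;
            }
            policy.clear();
        }
        return freed;
    }
}
```

Because purge() drains every registered staging area before clearing the 
policy, buffers staged by threads that have already exited are still counted 
and released in this sketch.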



--
This message was sent by Atlassian Jira
(v8.3.4#803005)