[
https://issues.apache.org/jira/browse/PHOENIX-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17574410#comment-17574410
]
Kadir Ozdemir commented on PHOENIX-6761:
----------------------------------------
As implied by the description section, using a separate full cache per Phoenix
connection is overkill, and the current cache design is inefficient and
expensive. To address these issues, we propose eliminating connection-level
caching and implementing the CQSI-level cache with a well-known, efficient,
thread-safe caching library: Guava Cache from Google. Note that Phoenix
already uses Guava Cache to cache CQSI objects.
The current cache implementation uses the total memory footprint of the cached
table objects to decide when to evict. Guava Cache supports this use case by
allowing a weight to be assigned to each cache entry and a maximum total
weight to be set for the cache; eviction is triggered when the total weight
exceeds the maximum.
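In Guava terms this is CacheBuilder.maximumWeight() combined with a Weigher
that reports each entry's estimated size. As a concept illustration only
(pure JDK, not Phoenix or Guava code), weight-bounded LRU eviction looks
roughly like this:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

// Concept sketch of weight-bounded LRU eviction -- the behavior Guava Cache
// provides natively via CacheBuilder.maximumWeight() and a Weigher.
// Illustrative only; not Phoenix or Guava code.
public class WeightedLruSketch<K, V> {
    // Access-order LinkedHashMap: iteration starts at the least recently used entry.
    private final LinkedHashMap<K, V> map = new LinkedHashMap<>(16, 0.75f, true);
    private final ToLongFunction<V> weigher; // e.g. estimated byte size of a table object
    private final long maxWeight;
    private long totalWeight;

    public WeightedLruSketch(long maxWeight, ToLongFunction<V> weigher) {
        this.maxWeight = maxWeight;
        this.weigher = weigher;
    }

    public synchronized void put(K key, V value) {
        V old = map.put(key, value);
        if (old != null) {
            totalWeight -= weigher.applyAsLong(old);
        }
        totalWeight += weigher.applyAsLong(value);
        // Evict least recently used entries until back under the weight cap.
        Iterator<Map.Entry<K, V>> it = map.entrySet().iterator();
        while (totalWeight > maxWeight && it.hasNext()) {
            Map.Entry<K, V> eldest = it.next();
            totalWeight -= weigher.applyAsLong(eldest.getValue());
            it.remove();
        }
    }

    public synchronized V get(K key) {
        return map.get(key); // touches LRU order
    }

    public synchronized int size() {
        return map.size();
    }
}
```

Guava additionally makes these operations thread safe without a global lock,
which the sketch above does not attempt.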
Currently, tables with a zero update-cache frequency are retrieved from the
server each time they are accessed for a query or mutation, even within the
same Phoenix connection. After every retrieval from the server, the old table
ref is unnecessarily removed from the cache and a new one is inserted.
Another obvious improvement is that tables with a zero update-cache frequency
should not be inserted into the cache at all.
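The two observations above amount to a guard on the cache insert and lookup
paths. A minimal sketch with hypothetical names (shouldCache and isFresh are
illustrative, not Phoenix APIs):

```java
// Hypothetical guard methods for the cache insert and lookup paths.
// Names and shapes are illustrative only, not actual Phoenix code.
public class CachePolicySketch {

    /** A table with a zero update-cache frequency is always re-resolved from
     *  the server, so caching it only causes churn: skip insertion entirely. */
    public static boolean shouldCache(long updateCacheFrequencyMs) {
        return updateCacheFrequencyMs > 0;
    }

    /** A cached entry is usable only if it was resolved within the table's
     *  update-cache frequency window; otherwise go back to the server. */
    public static boolean isFresh(long resolvedTimeMs, long nowMs,
                                  long updateCacheFrequencyMs) {
        return updateCacheFrequencyMs > 0
                && nowMs - resolvedTimeMs < updateCacheFrequencyMs;
    }
}
```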
> Phoenix Client Side Metadata Caching Improvement
> ------------------------------------------------
>
> Key: PHOENIX-6761
> URL: https://issues.apache.org/jira/browse/PHOENIX-6761
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Kadir Ozdemir
> Assignee: Kadir Ozdemir
> Priority: Major
>
> CQSI maintains a client-side metadata cache (schemas, tables, and functions)
> that evicts the least recently used table entries when the cache size grows
> beyond the configured limit.
> Each time a Phoenix connection is created, the client-side metadata cache
> maintained by the CQSI object creating this connection is cloned for the
> connection. Thus, we have two levels of caches, one at the Phoenix connection
> level and the other at the CQSI level.
> When a Phoenix client needs to update the client-side cache, it updates both
> caches (on the connection object and on the CQSI object). The Phoenix client
> first attempts to retrieve a table from the connection-level cache. If the
> table is not there, the client does not fall back to the CQSI-level cache;
> instead, it retrieves the object from the server and then updates both the
> connection-level and CQSI-level caches.
> PMetaDataCache provides caching for tables, schemas and functions but it
> maintains separate caches internally, one cache for each type of metadata.
> The cache for tables is actually a cache of PTableRef objects. A PTableRef
> holds a reference to the table object as well as the estimated size of the
> table object, the create time, the last access time, and the resolved time.
> The create time is set to the last access time value provided when the
> PTableRef object is inserted into the cache. The resolved time is also
> provided at insertion. Both the create time and the resolved time are final
> fields (i.e., they are never updated). PTableRef provides a setter method to
> update the last access time, and PMetaDataCache updates the last access time
> whenever the table is retrieved from the cache.
> The LRU eviction policy is implemented using the last access time. No
> eviction policy is implemented for schemas and functions. The configuration
> parameter for the cache update frequency is
> phoenix.default.update.cache.frequency; it can be set at the cluster or
> table level. When it is set to zero, the cache is not used.
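For reference, the cluster-wide default is passed as a client property, while
the table-level override uses the UPDATE_CACHE_FREQUENCY table property in
DDL (the DDL form is shown as commonly documented; the values here are
arbitrary examples):

```java
import java.util.Properties;

// Illustrative client configuration; the frequency value (5 minutes, in
// milliseconds) is an arbitrary example.
public class CacheFreqConfig {
    public static Properties clientProps() {
        Properties props = new Properties();
        // Cluster-wide default: re-resolve table metadata from the server
        // at most once per window; zero would disable the cache.
        props.setProperty("phoenix.default.update.cache.frequency", "300000");
        return props;
    }
    // Table-level override (DDL), e.g.:
    //   ALTER TABLE my_table SET UPDATE_CACHE_FREQUENCY = 300000;
}
```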
> Obviously, the purpose of cache eviction is to limit the memory consumed by
> the cache. The expected behavior is that when a table is removed from the
> cache, the table (PTableImpl) object is also garbage collected. However,
> this does not actually happen, because multiple caches hold references to
> the same object while each cache maintains its own table refs and thus its
> own access times. This means the access time for the same table may differ
> from one cache to another, and while one cache evicts an object, another
> cache may still hold on to it.
> Although each individual cache implements an LRU eviction policy, the
> overall memory eviction policy for the actual table objects behaves more
> like an age-based cache. If a table is frequently accessed through the
> connection-level caches, the last access time maintained by the
> corresponding table ref objects will be updated, but these updates are not
> visible to the CQSI-level cache, where the table refs keep their original
> create and access times.
> Since every object inserted into the local cache of a connection object is
> also inserted into the cache on the CQSI object, the CQSI-level cache grows
> faster than the caches on the connection objects. When the cache reaches its
> maximum size, each newly inserted table evicts one of the existing tables.
> Since the access times of these tables are not updated in the CQSI-level
> cache, the table that has stayed in the cache the longest is likely to be
> evicted, regardless of whether it is frequently accessed via the
> connection-level caches. This defeats the purpose of an LRU cache.
> Another problem with the current cache is the choice of its internal data
> structures and its eviction implementation. The table refs in the cache are
> maintained in a hash map from a table key (a pair of tenant id and table
> name) to a table ref. When the size of a cache (the total byte size of the
> table objects it refers to) reaches its configured limit, the overage that
> adding a new table would cause is computed. Then all the table refs in the
> cache are cloned into both a priority queue and a new cache. The queue
> orders its elements (table refs) by access time. The table refs that should
> not be evicted are removed from the queue, leaving only the table refs to be
> evicted; those remaining refs are then removed from the new cache, which
> replaces the old one. It is clear that this is an expensive operation in
> terms of memory allocations and CPU time. Worse, once the cache reaches its
> limit, nearly every insertion causes an eviction, so this expensive
> operation is repeated for each such insertion.
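The eviction path described above can be sketched as follows (simplified and
illustrative, not the actual Phoenix source; a table ref is reduced here to an
{estimatedSize, lastAccessTime} pair):

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Simplified sketch of the clone-and-prune eviction described above (not the
// actual Phoenix source): on overage, every entry is cloned into a priority
// queue ordered by last access time, victims are drained, and a freshly
// built map replaces the old cache -- O(n) allocations and O(n log n) time
// per insert once the cache is full.
public class CloneAndPruneEvictionSketch {

    /** Entry value: {estimatedSize, lastAccessTime}. Returns the new cache. */
    public static Map<String, long[]> evict(Map<String, long[]> cache, long overage) {
        PriorityQueue<Map.Entry<String, long[]>> byAccessTime = new PriorityQueue<>(
                Comparator.comparingLong((Map.Entry<String, long[]> e) -> e.getValue()[1]));
        byAccessTime.addAll(cache.entrySet());              // clone all refs into the queue
        Map<String, long[]> newCache = new HashMap<>(cache); // ... and into a new cache
        long freed = 0;
        while (freed < overage && !byAccessTime.isEmpty()) {
            // Least recently used first.
            Map.Entry<String, long[]> victim = byAccessTime.poll();
            freed += victim.getValue()[0];
            newCache.remove(victim.getKey());
        }
        return newCache; // replaces the old cache
    }
}
```

Even in this reduced form, every eviction round copies the full entry set
twice, which is the cost the comment above proposes to eliminate.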
> Since Phoenix connections are supposed to be short-lived, maintaining a
> separate cache for each connection object, and especially cloning the entire
> cache content (and then pruning the entries belonging to other tenants when
> the connection is tenant-specific), is not justified. The cost of the clone
> operation by itself would offset the gain of not accessing the CQSI-level
> cache, as the number of such accesses per connection should be small given
> short-lived Phoenix connections.
> Moreover, the impact of Phoenix connection leaks (connections that are never
> closed by applications) and of simply long-lived connections is exacerbated,
> since such connections hold references to a large set of table objects.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)