[jira] [Commented] (PHOENIX-6883) Phoenix metadata caching redesign

Istvan Toth (Jira) Tue, 21 Feb 2023 00:26:40 -0800


    [ 
https://issues.apache.org/jira/browse/PHOENIX-6883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17691456#comment-17691456
 ]


Istvan Toth commented on PHOENIX-6883:
--------------------------------------

Sounds great.

Two possible issues come to my mind:
- When a global index table changes, we may have to invalidate its cached 
metadata on the RSs that serve the base table (not sure if we already handle 
this)
- We cannot really guarantee that the RS invalidation message reaches all RSs. 
We could add/keep a long-ish metadata TTL on the RSs, for eventual consistency 
in case the invalidation fails.
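
A minimal sketch of the TTL fallback suggested above, assuming a hypothetical 
per-RS cache (TtlMetadataCache, its loader and its clock are illustrative 
names, not Phoenix APIs). Explicit invalidation drops an entry immediately, 
and the TTL bounds how long an entry can stay stale if an invalidation 
message never arrives.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical sketch, not Phoenix code: a region-server-side metadata cache
// with explicit invalidation plus a TTL as a safety net.
final class TtlMetadataCache<V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAtMillis;

        Entry(V value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;          // injected so tests can control time
    private final Function<String, V> loader;  // assumed to read fresh metadata

    TtlMetadataCache(long ttlMillis, LongSupplier clock, Function<String, V> loader) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        this.loader = loader;
    }

    // Returns the cached value, reloading it if it is missing or older than the TTL.
    V get(String tableName) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(tableName);
        if (entry == null || now - entry.loadedAtMillis >= ttlMillis) {
            entry = new Entry<>(loader.apply(tableName), now);
            entries.put(tableName, entry);
        }
        return entry.value;
    }

    // Called when an invalidation message arrives for the given table.
    void invalidate(String tableName) {
        entries.remove(tableName);
    }
}
```

With a long TTL the reload cost stays low in the normal case, while a lost 
invalidation is corrected at the latest when the entry expires.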

I assume that the RS cache invalidation would be initiated by the SYSCAT RSs, 
not the client. 

> Phoenix metadata caching redesign
> ---------------------------------
>
>                 Key: PHOENIX-6883
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6883
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Kadir Ozdemir
>            Priority: Major
>
> PHOENIX-6761 improves the client side metadata caching by eliminating the 
> separate cache for each connection. This improvement results in memory and 
> compute savings since it eliminates copying the CQSI level cache every time a 
> Phoenix connection is created, and also replaces the inefficient CQSI level 
> cache implementation with Guava Cache from Google. 
> Despite this improvement, the overall metadata caching architecture calls for 
> a redesign. This is because every operation in Phoenix needs to make multiple 
> RPCs to metadata servers for the SYSTEM.CATALOG table (please see 
> PHOENIX-6860) to ensure the latest metadata changes are visible to clients. 
> These constant RPCs make the region servers serving SYSTEM.CATALOG a hot spot 
> and thus lead to poor performance and availability issues.
> The UPDATE_CACHE_FREQUENCY configuration parameter specifies how frequently 
> the client cache is updated. However, setting this parameter to a non-zero 
> value results in stale caching. Stale caching can cause data integrity 
> issues. For example, if an index table creation is not visible to the client, 
> Phoenix would skip updating the index table in the write path. That is why 
> this parameter is typically set to zero. However, this defeats the purpose of 
> client side metadata caching.
> The redesign of the metadata caching architecture directly addresses this 
> issue by making sure that the client metadata cache is always used (that 
> is, UPDATE_CACHE_FREQUENCY is set to NEVER) while still ensuring data 
> integrity. This is achieved by three main changes. 
> The first change is to introduce server side metadata caching in all region 
> servers. Currently, server side metadata caching is used only on the region 
> servers serving SYSTEM.CATALOG. This metadata caching should be strongly 
> consistent, so that metadata updates include invalidating the corresponding 
> entries in the server side caches. This would ensure that the server cache 
> does not become stale.
> The second change is that the Phoenix client passes the LAST_DDL_TIMESTAMP 
> table attribute along with scan and mutation operations to the server regions 
> (more accurately to the Phoenix coprocessors). Then the Phoenix coprocessors 
> would check the timestamp on a given operation against the timestamp in 
> their server side cache to validate that the client did not use stale 
> metadata when it prepared the operation. If the client did use stale metadata 
> then the coprocessor would return an exception (this exception can be called 
> StaleClientMetadataCacheException) to the client.
> The third change is that upon receiving StaleClientMetadataCacheException the 
> Phoenix client makes an RPC call to the metadata server to update the client 
> cache, reconstructs the operation with the updated cache, and retries the 
> operation.
> This redesign would require updating client and server metadata caches only 
> when metadata is stale instead of updating the client metadata cache for each 
> (scan or mutation) operation. This would eliminate hot spotting on the 
> metadata servers and thus the poor performance and availability issues that 
> it causes.
>  
>  
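
The second and third changes in the quoted description can be sketched 
roughly as follows. This is only an illustration under stated assumptions: 
MetadataServer, RegionServerCheck and Client are hypothetical stand-ins, not 
Phoenix classes. The exception name comes from the description, but making it 
unchecked and treating "client timestamp older than server timestamp" as 
stale are assumptions of this sketch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Name taken from the issue description; the unchecked type is an assumption.
class StaleClientMetadataCacheException extends RuntimeException {
    StaleClientMetadataCacheException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for the servers holding SYSTEM.CATALOG.
// Tables must be registered with applyDdl before they are looked up.
final class MetadataServer {
    private final Map<String, Long> lastDdl = new ConcurrentHashMap<>();
    int lookups = 0; // counts metadata RPCs; not thread safe, sketch only

    void applyDdl(String table, long timestamp) {
        lastDdl.put(table, timestamp);
    }

    long lastDdlTimestamp(String table) {
        lookups++;
        return lastDdl.get(table);
    }
}

// Hypothetical coprocessor-side check against the server side cache.
final class RegionServerCheck {
    private final MetadataServer catalog;
    private final Map<String, Long> serverCache = new ConcurrentHashMap<>();

    RegionServerCheck(MetadataServer catalog) {
        this.catalog = catalog;
    }

    // Invoked as part of a metadata update so the server cache does not go stale.
    void invalidate(String table) {
        serverCache.remove(table);
    }

    void validate(String table, long clientTimestamp) {
        long serverTimestamp = serverCache.computeIfAbsent(table, catalog::lastDdlTimestamp);
        if (clientTimestamp < serverTimestamp) {
            throw new StaleClientMetadataCacheException("stale metadata for " + table);
        }
    }
}

// Hypothetical client: uses its cache, refreshes and retries once when told it is stale.
final class Client {
    private final MetadataServer catalog;
    private final RegionServerCheck server;
    private final Map<String, Long> clientCache = new ConcurrentHashMap<>();
    int attempts = 0;

    Client(MetadataServer catalog, RegionServerCheck server) {
        this.catalog = catalog;
        this.server = server;
    }

    void execute(String table) {
        attempts++;
        long timestamp = clientCache.computeIfAbsent(table, catalog::lastDdlTimestamp);
        try {
            server.validate(table, timestamp);
        } catch (StaleClientMetadataCacheException e) {
            // Refresh the client cache, rebuild the operation, and retry once.
            attempts++;
            long fresh = catalog.lastDdlTimestamp(table);
            clientCache.put(table, fresh);
            server.validate(table, fresh);
        }
    }
}
```

In this sketch the metadata server is contacted only on a cache miss or after 
a DDL change, not on every operation, which is the effect the description 
aims for.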



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
