[ https://issues.apache.org/jira/browse/PHOENIX-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15301559#comment-15301559 ]
James Taylor commented on PHOENIX-2941:
---------------------------------------

The new UPDATE_CACHE_FREQUENCY property[1] available in 4.7 greatly reduces the RPC traffic, but I think there are some simple changes we can make that'll propagate schema changes in an acceptable manner:
- Set a reasonable default UPDATE_CACHE_FREQUENCY value for tables (~5 minutes).
- When any MetaDataEntityNotFoundException is thrown, make an RPC to get the latest table definition. This would cover the case of one client adding a table, column, view, schema, sequence, or hinted index while another client does not yet have this info in its client-side cache.
- This would not cover one client dropping a column or view and another client accessing it (if the second client has the information cached) until the metadata expires from the client-side cache. It would cover the case of a table being dropped, since we get an HBase exception in this case. In my experience, dropping metadata is not that common, as it causes backward-compatibility issues (often this operation would be disallowed on production systems). Even if writes to deleted columns occur, they typically don't cause harm, as the data won't be retrievable once the cache expires. FWIW, transactional tables have a different means of propagating metadata changes, relying on the transaction manager and some read/write fences, so we can handle the drop through this mechanism in this case.
- Get rid of the server-side metadata cache completely.
- Make the SYSTEM.CATALOG table transactional.
- Have a separate SYSTEM.VIEW table that stores views.

The last two items are somewhat orthogonal, but they're enabled by getting rid of the server-side metadata cache.

Thoughts?
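
To make the first three items concrete, here is a minimal sketch of how a client-side cache with a refresh interval and a refetch-on-not-found rule might behave. It is not Phoenix code: the Catalog, CachedTable and Client classes and the hasColumn method are hypothetical stand-ins, the "RPC" is just a method call that bumps a counter, and time is passed in explicitly to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class MetadataCacheSketch {

    /** Hypothetical stand-in for the server-side catalog; counts simulated RPCs. */
    static class Catalog {
        final Map<String, Set<String>> tables = new HashMap<>();
        int rpcCount = 0;

        Set<String> fetchColumns(String table) {
            rpcCount++;
            Set<String> columns = tables.get(table);
            if (columns == null) {
                throw new IllegalStateException("table not found: " + table);
            }
            // Return a copy so later catalog changes do not leak into client caches.
            return new HashSet<>(columns);
        }
    }

    /** Snapshot of a table definition plus the time it was fetched. */
    static class CachedTable {
        final Set<String> columns;
        final long fetchedAtMillis;

        CachedTable(Set<String> columns, long fetchedAtMillis) {
            this.columns = columns;
            this.fetchedAtMillis = fetchedAtMillis;
        }
    }

    /** Hypothetical client; updateCacheFrequencyMillis plays the role of UPDATE_CACHE_FREQUENCY. */
    static class Client {
        private final Catalog catalog;
        private final long updateCacheFrequencyMillis;
        private final Map<String, CachedTable> cache = new HashMap<>();

        Client(Catalog catalog, long updateCacheFrequencyMillis) {
            this.catalog = catalog;
            this.updateCacheFrequencyMillis = updateCacheFrequencyMillis;
        }

        private CachedTable refresh(String table, long nowMillis) {
            CachedTable entry = new CachedTable(catalog.fetchColumns(table), nowMillis);
            cache.put(table, entry);
            return entry;
        }

        boolean hasColumn(String table, String column, long nowMillis) {
            CachedTable entry = cache.get(table);
            boolean refreshed = false;
            // Item 1: refetch only when the entry is missing or older than the interval.
            if (entry == null || nowMillis - entry.fetchedAtMillis >= updateCacheFrequencyMillis) {
                entry = refresh(table, nowMillis);
                refreshed = true;
            }
            if (entry.columns.contains(column)) {
                // Item 3: a column dropped elsewhere still resolves here until expiry.
                return true;
            }
            // Item 2: on "not found", make one RPC for the latest definition and recheck.
            if (!refreshed) {
                entry = refresh(table, nowMillis);
            }
            return entry.columns.contains(column);
        }
    }

    public static void main(String[] args) {
        Catalog catalog = new Catalog();
        Set<String> columns = new HashSet<>();
        columns.add("A");
        catalog.tables.put("T", columns);

        Client client = new Client(catalog, 5 * 60 * 1000L); // ~5 minutes
        System.out.println("A visible: " + client.hasColumn("T", "A", 0L));

        catalog.tables.get("T").add("B"); // another client adds a column
        System.out.println("B visible after add: " + client.hasColumn("T", "B", 1_000L));

        catalog.tables.get("T").remove("A"); // another client drops a column
        System.out.println("A visible before expiry: " + client.hasColumn("T", "A", 2_000L));
        System.out.println("A visible after expiry: " + client.hasColumn("T", "A", 400_000L));
        System.out.println("simulated RPCs: " + catalog.rpcCount);
    }
}
```

The asymmetry is the point of the sketch: an added column is picked up on the first miss at the cost of one extra RPC, while a dropped column stays visible to the stale client until its cached entry ages out.
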
[1] https://phoenix.apache.org/#Altering

> Alternative means of propagating schema changes
> -----------------------------------------------
>
>                 Key: PHOENIX-2941
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2941
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Nick Dimiduk
>
> The current approach to propagating schema changes (ie, add column) involves
> maintaining a
> [GlobalCache|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/cache/GlobalCache.java]
> of table schema on both clients and in RS coprocessors. This schema
> information is versioned, and query timestamp is used to determine when the
> cache is considered stale and needs updated. This causes problems for users
> who specify a timestamp either via connection settings (ie, PHOENIX-2607) or
> using the ROW_TIMESTAMP feature. Presumably this will also negatively impact
> users of the Tephra transaction system as it uses the cell timestamp to store
> transaction id.
> We need some other means of propagating schema changes throughout the
> cluster. One approach might be a ZK node for each table that can notify
> coprocessors (and clients?) that their cache is stale.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)