[
https://issues.apache.org/jira/browse/PHOENIX-5315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bharath Vissapragada reassigned PHOENIX-5315:
---------------------------------------------
Assignee: Bharath Vissapragada
> Cross cluster replication of the base table only should be sufficient
> ---------------------------------------------------------------------
>
> Key: PHOENIX-5315
> URL: https://issues.apache.org/jira/browse/PHOENIX-5315
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Andrew Kyle Purtell
> Assignee: Bharath Vissapragada
> Priority: Major
>
> When replicating Phoenix tables using the HBase cross cluster replication
> facility, it should be sufficient (and, for correctness and to avoid race
> conditions and inconsistencies, necessary) to replicate the base table only.
> On the sink cluster, the replication client's application of mutations from
> the replication stream to the local base table should trigger all necessary
> index update operations. To the extent that this does not happen today due to
> implementation details, those details should be reworked.
> This also has important efficiency benefits: no matter how many indexes are
> defined for a base table, only the base table updates need be replicated
> (presuming the Phoenix schema is synchronized across all sites by some other
> external means).
> This would likely comprise multiple components, so we should use this issue
> as an umbrella. We'd need:
> # A Phoenix implementation of HBase's ReplicationEndpoint that tails the WAL
> like a normal replication endpoint. However, rather than writing to HBase's
> replication sink APIs (which create HBase RPCs to a remote cluster), it
> should write to a new Phoenix Endpoint coprocessor.
> # An HBase coprocessor Endpoint hook that takes in a request from a remote
> cluster (containing both the WALEdit's data and the WALKey's annotated
> metadata telling the remote cluster what tenant_id, logical tablename, and
> timestamp the data is associated with). Ideally the API's message format
> should be configurable, and could be either a protobuf or an Avro schema
> similar to the one described by PHOENIX-5443. The endpoint hook would take
> the metadata + data and regenerate a complete set of Phoenix mutations, both
> data and indexes, just as the phoenix client did for the original SQL
> statement that generated the source-side edits. These mutations would be
> written to the remote cluster by the normal Phoenix write path.
> (Unfortunately, HBase uses the term "endpoint" to mean both a replication
> plugin AND a stored-procedure-like coprocessor hook. To be clear, item 1 is a
> replication plugin and item 2 is a coprocessor hook.)
>
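The two-component flow described above can be sketched as a self-contained
simulation. This is a hypothetical, heavily simplified model, not the real
HBase/Phoenix APIs: the names ReplicationMessage and PhoenixSinkEndpoint are
illustrative only, and a real implementation would extend HBase's
ReplicationEndpoint on the source side and expose a coprocessor Service on the
sink side. The point it demonstrates is that only the base-table edit crosses
the wire, while the sink regenerates both data and index mutations from the
shipped metadata.

```java
import java.util.*;

// (1) Source-side message: stands in for a shipped WALEdit plus the WALKey's
// annotated metadata (tenant_id, logical table name, timestamp). In the real
// design this would be a protobuf or Avro payload (see PHOENIX-5443).
final class ReplicationMessage {
    final String tenantId;
    final String logicalTable;
    final long timestamp;
    final Map<String, String> rowData; // column -> value, stands in for WALEdit cells

    ReplicationMessage(String tenantId, String logicalTable, long timestamp,
                       Map<String, String> rowData) {
        this.tenantId = tenantId;
        this.logicalTable = logicalTable;
        this.timestamp = timestamp;
        this.rowData = rowData;
    }
}

// (2) Sink-side endpoint: regenerates the full set of Phoenix mutations
// (data + indexes) from the message, just as the Phoenix client would have
// for the original SQL statement, then applies them via the local write path.
final class PhoenixSinkEndpoint {
    // logical table -> indexed columns; schema assumed synchronized out of band
    private final Map<String, List<String>> indexedColumns;
    final List<String> appliedMutations = new ArrayList<>();

    PhoenixSinkEndpoint(Map<String, List<String>> indexedColumns) {
        this.indexedColumns = indexedColumns;
    }

    void replicate(ReplicationMessage msg) {
        // Base-table mutation first.
        appliedMutations.add("DATA:" + msg.logicalTable + "/" + msg.tenantId
                + "@" + msg.timestamp);
        // Then one mutation per index covering a replicated column.
        for (String col : indexedColumns.getOrDefault(msg.logicalTable,
                Collections.emptyList())) {
            if (msg.rowData.containsKey(col)) {
                appliedMutations.add("INDEX:" + msg.logicalTable + "." + col
                        + "=" + msg.rowData.get(col) + "@" + msg.timestamp);
            }
        }
    }
}

public class ReplicationSketch {
    public static void main(String[] args) {
        // Table T1 has one index, on column v1 (hypothetical schema).
        PhoenixSinkEndpoint sink = new PhoenixSinkEndpoint(
                Map.of("T1", List.of("v1")));
        // The source endpoint would tail the WAL and ship this message;
        // here we construct it directly.
        sink.replicate(new ReplicationMessage("tenantA", "T1", 1000L,
                Map.of("v1", "x")));
        System.out.println(sink.appliedMutations);
        // prints [DATA:T1/tenantA@1000, INDEX:T1.v1=x@1000]
    }
}
```

Note that the index mutation is derived entirely on the sink; nothing about
the index ever appears in the replication stream, which is the efficiency
argument made above.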
--
This message was sent by Atlassian Jira
(v8.3.4#803005)