[ https://issues.apache.org/jira/browse/PHOENIX-7493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ritesh reassigned PHOENIX-7493:
-------------------------------

    Assignee: Ritesh  (was: Jacob Isaac)

> Graceful Failover with Phoenix HA
> ---------------------------------
>
>                 Key: PHOENIX-7493
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7493
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Kadir Ozdemir
>            Assignee: Ritesh
>            Priority: Major
>
> Phoenix HA (PHOENIX-6491) suggests a best-effort failover process for the
> failover HA policy. The first step is to make both clusters’ roles Standby,
> and then wait for replication to finish (best-effort). The final step is to
> make the other cluster’s role Active.
> When the cluster role is set to Standby, the dual-cluster Phoenix client does
> not allow read/write operations on a standby cluster. This helps drain
> replication data from the previously Active cluster to the previously Standby
> cluster. However, in practice a cluster may receive changes without using the
> Phoenix dual client. For example, data can be inserted through MapReduce jobs
> which do not use the Phoenix JDBC client. Another example is that the
> previously Active cluster could be receiving replication data from a third
> cluster.
> This means pausing writes at the Phoenix client is not sufficient for a
> graceful failover operation. Here graceful means consistent failover between
> two healthy clusters. A consistent failover can be achieved only when the
> replication data is completely sent to the soon-to-be Active cluster.
> To ensure that all incoming data is paused before the failover event, we need
> to stop writing to the cluster at the server side. To achieve this, a Phoenix
> coprocessor can also maintain and watch cluster role changes and stop writes
> when an Active cluster becomes Standby, as the dual Phoenix client does. In
> order to eliminate the ambiguity about which cluster was previously Active, a
> new HA role called ActiveToStandby is introduced.
> Neither the Phoenix client nor the server allows write operations on an
> ActiveToStandby cluster.
> With the above changes, graceful failover is achieved by the following steps:
> # Change the Active cluster’s role to ActiveToStandby.
> # Wait until the replication data is drained.
> # Change the Standby cluster’s role to Active, and the ActiveToStandby
> cluster’s role to Standby.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
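
The three failover steps listed in the description can be sketched as a small role state machine. This is only an illustration under stated assumptions: ClusterRole, HaPair, and GracefulFailover are hypothetical names, not Phoenix classes, and the replicationDrained check stands in for whatever mechanism the real implementation uses to detect that replication has caught up.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical role enum; the names mirror the roles in the description, not Phoenix's own types. */
enum ClusterRole {
    ACTIVE(true),
    STANDBY(false),
    ACTIVE_TO_STANDBY(false); // writes are rejected while replication drains

    private final boolean writable;

    ClusterRole(boolean writable) {
        this.writable = writable;
    }

    boolean allowsWrites() {
        return writable;
    }
}

/** Hypothetical in-memory stand-in for the role record of a two-cluster HA pair. */
final class HaPair {
    ClusterRole cluster1 = ClusterRole.ACTIVE;
    ClusterRole cluster2 = ClusterRole.STANDBY;
}

final class GracefulFailover {
    /**
     * Runs the three steps from the description against an in-memory pair.
     * Returns true if the roles were swapped, false if replication was not
     * observed as drained within maxPolls checks. The loop polls without
     * sleeping; a real implementation would wait between checks.
     */
    static boolean failover(HaPair pair, BooleanSupplier replicationDrained, int maxPolls) {
        if (pair.cluster1 != ClusterRole.ACTIVE || pair.cluster2 != ClusterRole.STANDBY) {
            throw new IllegalStateException("expected cluster1 Active and cluster2 Standby");
        }
        // Step 1: stop writes on the Active cluster while recording that it was Active.
        pair.cluster1 = ClusterRole.ACTIVE_TO_STANDBY;

        // Step 2: wait until the replication data is drained.
        for (int i = 0; i < maxPolls; i++) {
            if (replicationDrained.getAsBoolean()) {
                // Step 3: promote the Standby cluster and demote the old Active one.
                pair.cluster2 = ClusterRole.ACTIVE;
                pair.cluster1 = ClusterRole.STANDBY;
                return true;
            }
        }
        // Not drained: cluster1 stays ActiveToStandby. How to recover from this
        // state is outside what the description covers.
        return false;
    }
}
```

In this sketch no cluster accepts writes between step 1 and step 3, which is the window in which replication is expected to drain.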