yuzegao opened a new pull request, #3295: URL: https://github.com/apache/kvrocks/pull/3295
## Summary

This PR implements graceful failover for Kvrocks cluster, allowing a master node to safely transfer control to a slave node while ensuring data consistency and minimizing service disruption.

## Background

Based on [GitHub Discussion #3218](https://github.com/apache/kvrocks/discussions/3218), this feature enables controlled master-to-slave failover with:

- Data consistency guarantee (waits for replication sync)
- Write blocking during critical phases
- Configurable timeout
- State machine-based async execution

## Implementation

### Architecture

- **Independent module**: `ClusterFailover` class, parallel to `SlotMigrator`
- **Async execution**: Dedicated background thread for failover process
- **State machine**: 8 states (`none` → `started` → `check_slave` → `pause_write` → `wait_sync` → `switching` → `success`/`failed`)

### Key Features

1. **Slave Validation**: Checks connection status, replication speed, and lag before proceeding
2. **Write Blocking**: Blocks write requests during `pause_write`, `wait_sync`, and `switching` states (returns `TRYAGAIN`)
3. **Replication Sync**: Waits for slave to catch up to target sequence number
4. **Takeover**: Sends `CLUSTERX TAKEOVER` command to slave with authentication support
5. **Slot Redirection**: Marks all slots as migrated, redirects clients via `MOVED` errors

### Commands

- `CLUSTERX FAILOVER <slave-node-id> [timeout]` - Initiate failover (default timeout: 1000ms)
- `CLUSTER INFO` - Now includes `cluster_failover_state:<state>`

### Files Changed

**New Files**:

- `src/cluster/cluster_failover.h` / `cluster_failover.cc` - Core implementation (325 lines)
- `tests/gocase/integration/failover/failover_test.go` - Test suite (926 lines)
- `GRACEFUL_FAILOVER_DESIGN.md` - Design document

**Modified Files**:

- `src/server/server.{h,cc}` - Added `ClusterFailover` member and `GetSlaveReplicationOffset()`
- `src/cluster/cluster.{h,cc}` - Write blocking check, `SetMySlotsMigrated()`, `OnTakeOver()`, state reset
- `src/commands/cmd_cluster.cc` - `FAILOVER` and `TAKEOVER` command handlers

## Testing

Comprehensive test suite with **20 sub-test cases** (100% pass rate):

- Normal flow (basic, custom timeout, authentication)
- Failure scenarios (non-existent node, non-slave, invalid timeout, lag timeout, auth failure)
- Concurrency (cannot start when in progress, restart after failure)
- Write blocking (write blocked, read not blocked)
- State query and transitions
- Integration (data consistency, state reset after SETNODES)

## Compatibility

✅ **Backward compatible**: New feature, no breaking changes. Only active when `cluster-enabled=yes`. Existing clusters unaffected.

--

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
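
For readers following the state machine and the write-blocking rule described in the PR, here is a minimal C++ sketch of how the eight states and the `TRYAGAIN` gate might fit together. It is not the code in `cluster_failover.cc`: the names `FailoverState`, `StateName`, `InProgress`, `BlocksWrites`, `Advance`, and `CountBlockingSteps` are all hypothetical, and the sketch assumes that any in-progress step can fall through to `failed` and that the `none` → `started` transition is triggered separately by the `CLUSTERX FAILOVER` command.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum mirroring the eight states listed in the PR description.
enum class FailoverState {
  kNone, kStarted, kCheckSlave, kPauseWrite, kWaitSync, kSwitching, kSuccess, kFailed
};

// State names as they would appear after `cluster_failover_state:` in CLUSTER INFO.
inline std::string StateName(FailoverState s) {
  switch (s) {
    case FailoverState::kNone:       return "none";
    case FailoverState::kStarted:    return "started";
    case FailoverState::kCheckSlave: return "check_slave";
    case FailoverState::kPauseWrite: return "pause_write";
    case FailoverState::kWaitSync:   return "wait_sync";
    case FailoverState::kSwitching:  return "switching";
    case FailoverState::kSuccess:    return "success";
    case FailoverState::kFailed:     return "failed";
  }
  return "unknown";
}

// True while a failover is running (neither idle nor in a terminal state).
inline bool InProgress(FailoverState s) {
  return s != FailoverState::kNone && s != FailoverState::kSuccess &&
         s != FailoverState::kFailed;
}

// Per the PR, writes are rejected with TRYAGAIN only in these three states.
inline bool BlocksWrites(FailoverState s) {
  return s == FailoverState::kPauseWrite || s == FailoverState::kWaitSync ||
         s == FailoverState::kSwitching;
}

// Moves one step along the chain. Assumption: a failed step in any
// in-progress state leads to kFailed; idle and terminal states do not move.
inline FailoverState Advance(FailoverState s, bool step_ok) {
  if (!InProgress(s)) return s;
  if (!step_ok) return FailoverState::kFailed;
  switch (s) {
    case FailoverState::kStarted:    return FailoverState::kCheckSlave;
    case FailoverState::kCheckSlave: return FailoverState::kPauseWrite;
    case FailoverState::kPauseWrite: return FailoverState::kWaitSync;
    case FailoverState::kWaitSync:   return FailoverState::kSwitching;
    case FailoverState::kSwitching:  return FailoverState::kSuccess;
    default:                         return s;
  }
}

// Walks the happy path from kStarted and counts the steps that block writes.
inline int CountBlockingSteps() {
  int blocked = 0;
  FailoverState s = FailoverState::kStarted;
  while (InProgress(s)) {
    if (BlocksWrites(s)) ++blocked;
    s = Advance(s, /*step_ok=*/true);
  }
  assert(s == FailoverState::kSuccess);  // the happy path must end in success
  return blocked;
}
```

Keeping the write gate as a pure function of the state means a command handler only needs the current state to decide whether to reply `TRYAGAIN`, which fits the description of reads staying unblocked while writes wait.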
