apurtell commented on code in PR #2461:
URL: https://github.com/apache/phoenix/pull/2461#discussion_r3290062508


##########
src/main/tla/ConsistentFailover/markdown/Admin.md:
##########
@@ -0,0 +1,151 @@
+# Admin -- Operator-Initiated Actions
+
+**Source:** [`Admin.tla`](../Admin.tla)
+
+## Overview
+
+`Admin` models the human operator (Admin actor) who drives failover and abort 
via the `PhoenixHAAdminTool` CLI, which delegates to `HAGroupStoreManager` 
coprocessor endpoints. This module contains four actions: `AdminStartFailover` 
(initiate failover), `AdminAbortFailover` (abort an in-progress failover), 
`AdminGoOffline` (take a standby cluster offline), and `AdminForceRecover` 
(force-recover from OFFLINE). The last two are gated on the 
`UseOfflinePeerDetection` feature flag and model the proactive design for peer 
OFFLINE detection using `PhoenixHAAdminTool update --force`.
+
+These are the only actions in the specification that represent deliberate 
human decisions rather than automated system behavior. All four 
(`AdminStartFailover`, `AdminAbortFailover`, `AdminGoOffline`, 
`AdminForceRecover`) receive no fairness in the liveness specifications. The 
admin is genuinely non-deterministic. The admin might never initiate a 
failover, or might abort every failover attempt. Imposing fairness on admin 
actions would force unrealistic guarantees about human behavior.
+
+### Modeling Choice: Direct ZK Writes
+
+Unlike the peer-reactive transitions in [HAGroupStore.md](HAGroupStore.md), 
admin actions are direct ZK writes -- they are not watcher-dependent. The admin 
CLI writes directly to the local ZK znode via the coprocessor endpoint. No 
`zkPeerConnected` or `zkLocalConnected` guard is needed because the admin tool 
manages its own ZK connection independently of the `HAGroupStoreClient` watcher 
infrastructure. `AdminGoOffline` and `AdminForceRecover` also use the `--force` 
path which writes directly to ZK, so no `zkLocalConnected` guard is needed for 
those actions either.
+
+## Implementation Traceability
+
+| TLA+ Action | Java Source |
+|---|---|
+| `AdminStartFailover(c)` | 
`HAGroupStoreManager.initiateFailoverOnActiveCluster()` L375-400 |

Review Comment:
   Agreed, I think file level citation is sufficient. The robots like to pin to 
line numbers but I can ask them to be removed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to