hanahmily opened a new pull request, #1111:
URL: https://github.com/apache/skywalking-banyandb/pull/1111

   ### `AwaitRevisionApplied` cluster fan-out
   
   Phase 1's `AwaitRevisionApplied` confirmed only the receiving liaison's 
local cache. Phase 2 Step 2.2 extends it: the receiving liaison now probes a 
frozen watched set of (self + every peer liaison + every data node) in parallel 
and returns `applied=true` only when the min over all members reaches 
`min_revision`. On timeout it returns role-prefixed laggards (`liaison-<name>` 
/ `data-<name>`) so logs are unambiguous when liaisons and data nodes share a 
host.
   
   Builds on:
   - #1108 — per-cluster-member `NodeSchemaStatusService`.
   - #1109 — `queue.Client.NewNodeSchemaStatusClient(node)` for borrowing pub's 
connection pool.
   
   **Membership snapshot**
   
   - Self: `curNode.Metadata.Name`, resolved at call time so PreRun has 
populated `curNode` by then.
   - Peer liaisons: `tir1Client.GetRouteTable().Active` (tier1 = 
liaison-↔-liaison routing per `api/proto/banyandb/database/v1/rpc.proto:946`).
   - Data nodes: `tir2Client.GetRouteTable().Active`.
   - Dedup by `Metadata.Name` with first-seen tier winning. A hybrid host 
running both `Role_ROLE_LIAISON` and `Role_ROLE_DATA` is probed exactly once.
   - Standalone deployments → empty route tables → watched set degenerates to 
`{self}`, same probe loop. No special-case branch.
   
   **Per-member probe**
   
   - Self: in-process `barrierCacheReader.GetMaxModRevision()` — no goroutine, 
no RPC.
   - Peer: `clusterv1.NodeSchemaStatusService.GetMaxRevision` over the 
`*grpc.ClientConn` borrowed from `queue.Client.NewNodeSchemaStatusClient` — no 
parallel connection mesh, inherits pub's auth / TLS / health-check / 
circuit-breaker.
   - `sync.WaitGroup` + indexed result slice. Per-probe context inherits the 
*call-wide* deadline (shared, not divided across N members) so wall-clock stays 
bounded by `req.timeout` regardless of fan-out width.
   
   **Outcome rules**
   
   - `min(rev across all members) >= min_revision` → `applied=true` (early 
exit, typically 1–3 iterations).
   - `codes.Unimplemented` from a peer → treated as ready (assume `max_revision 
= ∞`). A v0.13 liaison rolling onto a v0.11 / v0.12 data-node fleet does not 
deadlock all barrier callers.
   - Any other gRPC error (`Unavailable`, `Internal`, single-probe deadline, …) 
→ per-iteration laggard only. Members appear in the response's `laggards` list 
only on overall call timeout.
   - Snapshot finds zero members AND no self → fail fast with 
`codes.Unavailable: "no active cluster members"` rather than parking.
   - Backoff cadence is unchanged from Phase 1 (10ms init × 1.5, cap 500ms, 
default 5s timeout). The §5.0 sequence-diagram math still holds: ≤ ~10 probes 
per member per 5s budget; most calls converge sub-100 ms.
   
   **Test fixtures (`barrier_cluster_test.go`)**
   
   `fakeNodeStatusClient` implements `clusterv1.NodeSchemaStatusServiceClient` 
with knobs for `maxRev`, optional gRPC error, optional pre-response delay, and 
an optional call-counter ref. `fakeQueueClient` embeds `queue.Client` (nil 
interface; unused methods panic at runtime if a test accidentally invokes them) 
and only overrides `GetRouteTable` / `NewNodeSchemaStatusClient`. 
`clusterFixture.build()` wires `newBarrierServiceCluster` with all four 
closures.
   
   **Tests added (8 of 11 in the locked plan; 3 frozen-snapshot tests defer to 
the next PR)**
   
   - `TestFanOut_AllNodesReady_ReturnsApplied` — happy path
   - `TestFanOut_OneLaggard_ReportsOnTimeout` — single laggard names exactly 
that peer
   - `TestFanOut_AllLaggards_ListsAll` — every member behind, every member 
surfaces
   - `TestFanOut_NodeRPCErrors_CountedAsLaggard` — `Internal` is transient, 
doesn't fail the whole call
   - `TestFanOut_EmptyActiveSet_ReturnsUnavailable` — fail-fast contract
   - `TestFanOut_ProbeIsShortUnary_NoServerWait` — slow probe must not stretch 
wall-clock past the call deadline
   - `TestFanOut_BackoffBounded` — multi-iteration polling stays inside the 
10ms–500ms envelope
   - `TestFanOut_NodeReturnsUnimplemented_TreatedAsReady` — cross-version 
policy regression
   
   **Backwards compatibility**
   
   `newBarrierService(cacheProvider)` (single-arg) is kept and used by Phase 1 
unit tests; without `peerLiaisons`/`dataNodes`/`selfName`, 
`AwaitRevisionApplied` falls back to the legacy in-process loop. Production 
wiring uses `newBarrierServiceCluster` so the cluster path is always exercised 
in real deployments.
   
   **Local CI** (mirrors `.github/workflows/ci.yml`): green on every phase — 
`make license-check / check-req / build / lint / check`, every unit-test 
package, both integration suites (`./test/integration/standalone/...` 28 specs, 
`./test/integration/distributed/...` 6 specs). Distributed run took 12m57s and 
passed 6/6 specs cleanly on first try (no flakes this time).
   
   **Checklist (per `.github/PULL_REQUEST_TEMPLATE`)**
   
   - [x] Non-trivial feature; design doc: refs #1091 (Phase 1 umbrella), #1108 
(Step 2.1), #1109 (Step 2.2 prep).
   - [x] Documentation: function-level comments describe the contract; sequence 
diagram in `.omc/plans/banyandb-schema-tbd-plan.md` §5.0 is the canonical 
reference for the wire flow.
   - [x] Tests added: 8 unit tests in `barrier_cluster_test.go`; existing 19 
barrier tests stay green (Phase 1 fallback path is preserved).
   - [ ] UI-related — N/A.
   - [ ] Closes/resolves/fixes existing issue — N/A; tracking continues under 
#1091's umbrella.
   
   **Out of scope (deferred to follow-ups)**
   
   - Mid-call leave / eviction / late-join handling (frozen-snapshot semantics) 
and the `NodeLaggard.reason` proto field — Phase 2.2 snapshot-semantics PR.
   - Cluster fan-out for `AwaitSchemaApplied` and `AwaitSchemaDeleted` — Steps 
2.3 / 2.4.
   - Distributed integration specs §6.12a/b/d — gated on `PauseDataNodeWatch` 
from Step 1.0 (deferred per the locked plan).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to