hanahmily opened a new pull request, #1109:
URL: https://github.com/apache/skywalking-banyandb/pull/1109
### Phase 2.2 prep: borrow `NodeSchemaStatusServiceClient` over `queue.Client`'s existing connection pool
The Phase 2 cluster-barrier fan-out (Step 2.2, follow-up PR) probes peer
liaisons and data nodes via the per-node `clusterv1.NodeSchemaStatusService`
that landed in #1108. Rather than open a parallel connection mesh, the fan-out
borrows the `*grpc.ClientConn` that `pub` already maintains for each tier —
same conns, same auth, same health-checks, same lifecycle.
This PR is the enabling shim. No callers yet; the fan-out itself lands in a
follow-up.
**What changes**
- `banyand/queue/queue.go`: add `NewNodeSchemaStatusClient(node)` to the
`Client` interface, mirroring the existing `NewChunkedSyncClient` shape. Also
add a sentinel `ErrNotImplemented` for backends with no peer to dial.
- `banyand/queue/pub/pub.go`: implement by calling `connMgr.GetClient(node)`
and wrapping `c.conn` with `clusterv1.NewNodeSchemaStatusServiceClient`. ~10
LOC, identical pattern to `NewChunkedSyncClientWithConfig` (line 543).
- `banyand/queue/local.go`: stub returns `ErrNotImplemented` so the
standalone path naturally degrades to in-process self-probing without a type
switch on the queue client.
- `banyand/queue/pipeline_mock.go`: regenerated locally. The file is gitignored
(per `.gitignore` line 60), so it is not committed; CI and developer builds
regenerate it on first build.
- `CHANGES.md`: open a "Schema consistency (Phase 2 in progress)" sub-group
under `## 0.11.0`. Phase 2 work continues to land in 0.11.0 until that section
closes; subsequent Step 2.2 / 2.3 / 2.4 PRs accrete sub-bullets here.
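
To illustrate how the sentinel lets a caller degrade without a type switch, here is a minimal, self-contained sketch. It does not use the real BanyanDB or gRPC types: `StatusClient`, the trimmed `Client` interface, `localClient`, `pubClient`, `probeRevision`, and the return signature of `NewNodeSchemaStatusClient` are all hypothetical stand-ins chosen for illustration. Only the method name and `ErrNotImplemented` come from this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotImplemented stands in for the sentinel added in queue.go; it is
// returned by backends that have no peer to dial.
var ErrNotImplemented = errors.New("not implemented")

// StatusClient is a hypothetical stand-in for
// clusterv1.NodeSchemaStatusServiceClient.
type StatusClient interface {
	MaxRevision() (int64, error)
}

// Client is a trimmed, hypothetical stand-in for queue.Client. The return
// types here are an assumption, not the PR's actual signature.
type Client interface {
	NewNodeSchemaStatusClient(node string) (StatusClient, error)
}

// localClient models the standalone backend: there is no peer to dial.
type localClient struct{}

func (localClient) NewNodeSchemaStatusClient(string) (StatusClient, error) {
	return nil, ErrNotImplemented
}

// fixedStatus is a fake peer that always reports the same revision.
type fixedStatus int64

func (f fixedStatus) MaxRevision() (int64, error) { return int64(f), nil }

// pubClient models the distributed backend; a map lookup replaces the real
// connection-manager lookup.
type pubClient struct{ peers map[string]int64 }

func (p pubClient) NewNodeSchemaStatusClient(node string) (StatusClient, error) {
	rev, ok := p.peers[node]
	if !ok {
		return nil, fmt.Errorf("node %q: no active connection", node)
	}
	return fixedStatus(rev), nil
}

// probeRevision falls back to the in-process revision when the backend
// reports ErrNotImplemented, so the caller never inspects the concrete type.
func probeRevision(c Client, node string, selfRevision int64) (int64, error) {
	sc, err := c.NewNodeSchemaStatusClient(node)
	if errors.Is(err, ErrNotImplemented) {
		return selfRevision, nil
	}
	if err != nil {
		return 0, err
	}
	return sc.MaxRevision()
}

func main() {
	fmt.Println(probeRevision(localClient{}, "data-0", 7))
	fmt.Println(probeRevision(pubClient{peers: map[string]int64{"data-0": 42}}, "data-0", 7))
	fmt.Println(probeRevision(pubClient{}, "data-1", 7))
}
```

The design point is that `errors.Is` on a single sentinel keeps the standalone and distributed paths behind one interface; any other error still propagates to the caller.
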
**Why ride pub's existing pools instead of opening a separate mesh**
- Avoids doubling the per-node connection count.
- Inherits the existing auth interceptor chain, TLS reloader, and circuit
breaker.
- Inherits node add/remove events from `KindNode` schema watch, so the
barrier's view of cluster membership is already up to date with whatever pub
sees.
- The barrier RPCs are tiny (`GetMaxRevisionRequest{}` is empty; per-call ~1
ms turnaround), so HTTP/2 stream contention with normal queue traffic should be
negligible at expected fan-out widths (≤ ~10 polls × N peers per 5-second
budget). This will be re-evaluated at Step 2.8 if metrics surface contention.
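
For a sense of scale, 10 polls per peer within a 5-second budget works out to roughly 2 × N small RPCs per second across N peers. The sketch below shows one way such a bounded polling fan-out could look. It is not the Step 2.2 implementation, which lands in a follow-up; `probeFunc`, `awaitRevision`, and `runDemo` are hypothetical names, and the probes are in-process fakes rather than real `GetMaxRevision` calls.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// probeFunc is a hypothetical stand-in for one GetMaxRevision RPC to a peer.
type probeFunc func(ctx context.Context) (int64, error)

// awaitRevision polls every peer concurrently, once immediately and then once
// per interval, until the peer reports at least target or ctx expires. It
// returns the set of peers that reached the target within the budget.
func awaitRevision(ctx context.Context, peers map[string]probeFunc, target int64, interval time.Duration) map[string]bool {
	var mu sync.Mutex
	var wg sync.WaitGroup
	reached := make(map[string]bool, len(peers))
	for name, probe := range peers {
		wg.Add(1)
		go func(name string, probe probeFunc) {
			defer wg.Done()
			ticker := time.NewTicker(interval)
			defer ticker.Stop()
			for {
				if rev, err := probe(ctx); err == nil && rev >= target {
					mu.Lock()
					reached[name] = true
					mu.Unlock()
					return
				}
				select {
				case <-ctx.Done():
					return // budget exhausted for this peer
				case <-ticker.C:
				}
			}
		}(name, probe)
	}
	wg.Wait()
	return reached
}

// runDemo uses a scaled-down budget (200 ms, 10 ms interval) so it finishes
// quickly; the PR's figures would correspond to 5 s and about 500 ms.
func runDemo() map[string]bool {
	calls := 0
	peers := map[string]probeFunc{
		// lagging catches up on its third poll.
		"lagging": func(context.Context) (int64, error) { calls++; return int64(calls), nil },
		// stuck never advances past revision 0.
		"stuck": func(context.Context) (int64, error) { return 0, nil },
	}
	ctx, cancel := context.WithTimeout(context.Background(), 200*time.Millisecond)
	defer cancel()
	return awaitRevision(ctx, peers, 3, 10*time.Millisecond)
}

func main() {
	fmt.Println(runDemo())
}
```
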
**Local CI** (mirrors `.github/workflows/ci.yml`): green on all phases —
`make license-check / check-req / build / lint / check`, every unit-test
package (`banyand/...`, `bydbctl/...`, `pkg/...`, `fodc/...`), both integration
suites (`./test/integration/standalone/...` 28 specs,
`./test/integration/distributed/...` 6 specs). Two flaky failures passed on
retry (`health_check_test.go:145`, an HTTP-server-port race; `query_gate.go:190`,
the distributed schema race that Phase 2.2 itself is meant to close). Both are
unrelated to this prep change.
**Checklist (per `.github/PULL_REQUEST_TEMPLATE`)**
- [x] Non-trivial feature; design doc: refs #1091 (Phase 1 umbrella) and
#1108 (Phase 2 Step 2.1).
- [x] Documentation: interface comments describe contract; CHANGES.md
updated under existing 0.11.0 section.
- [ ] Tests added: this PR is a pure-internal interface addition with no
behavioural change. The fan-out follow-up brings 11 unit tests + distributed
integration.
- [ ] UI-related — N/A.
- [ ] Closes/resolves/fixes existing issue — N/A; tracking continues under
#1091's umbrella.
**Out of scope (deferred to follow-ups)**
- Cluster-barrier fan-out for `AwaitRevisionApplied` — Step 2.2.
- Fan-out for `AwaitSchemaApplied` / `AwaitSchemaDeleted` — Steps 2.3 / 2.4.
- Distributed integration specs §6.12a/b/d — gated on `PauseDataNodeWatch`
helper from Step 1.0 (not yet implemented).