(skywalking-banyandb) 02/02: docs(changes): record data-node NodeSchemaStatusService exposure in CHANGES.md

hanahmily Thu, 07 May 2026 17:00:08 -0700

This is an automated email from the ASF dual-hosted git repository.

hanahmily pushed a commit to branch phase-2-cp5-march
in repository https://gitbox.apache.org/repos/asf/skywalking-banyandb.git


commit 53933c0b6a1f8828fe8e09ced4c5c2555ebe701e
Author: Hongtao Gao <[email protected]>
AuthorDate: Thu May 7 23:59:47 2026 +0000

    docs(changes): record data-node NodeSchemaStatusService exposure in CHANGES.md
    
    Refines the §6.12 spec authoring entry now that the liaison-pause
    path is the only working approach (the deferred reason is the
    global notifiedModRevision watermark race, not data-node service
    exposure as previously noted), and adds two new bullets covering
    the queue.Server.SetNodeSchemaStatusRepo decoupling and the
    GetMaxRevision aggregation repair.
    
    via [HAPI](https://hapi.run)
    
    Co-Authored-By: HAPI <[email protected]>
    Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
---
 CHANGES.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/CHANGES.md b/CHANGES.md
index 4141adca1..db242ee10 100644
--- a/CHANGES.md
+++ b/CHANGES.md
@@ -45,8 +45,10 @@ Release Notes.
   - Add observability for the schema-consistency cluster (Step 2.7, §A17): 
`schema_await_revision_applied_duration_seconds{result}`, 
`schema_await_schema_applied_duration_seconds{result}`, 
`schema_await_schema_deleted_duration_seconds{result}` track barrier latency by 
outcome (`applied` / `timeout` / `invalid_argument` / `error`); 
`schema_barrier_laggard_nodes_total{barrier,role,node}` decodes the 
`<role>-<Metadata.Name>` laggard identifier so dashboards can break out which 
member fell b [...]
   - Add the schema-barrier CP-6 SLO load harness (Step 2.8) under 
`test/load/schema_barrier/`, runnable via `make load-test-barrier`. The harness 
brings up an in-process 3 data node + 1 liaison cluster, drives 100 concurrent 
`AwaitRevisionApplied` callers + 10 `Group.Update` ops/sec, and reports p50 / 
p95 / p99 / max from client-side per-call duration after a 1-minute warm-up + 
5-minute measurement window. Client-side latency is bounded above by the 
server-side histogram so the SLO check [...]
   - Land `pkg/test/setup.PauseDataNodeWatch` / `ResumeDataNodeWatch` (Step 1.0 
follow-up): the helpers replace the `ErrWatchControlNotImplemented` stub with a 
working hook into `property.SchemaRegistry` so cluster-only specs can drive a 
single data node to fall behind the cluster while the rest stays in sync. The 
data node's `handleWatchEvent`, `processInitialResourceFromProperty`, and 
`handleDeletion` paths each gate events into a per-registry queue while paused; 
resume drains the queue [...]
-  - Extend the watch-control binding to liaison processes 
(`pkg/test/setup.startLiaisonNode`) and add `helpers.SharedContext.LiaisonAddr` 
so cluster-only specs can pause the receiving liaison's own `SchemaRegistry`. 
The cluster barrier's `selfName` probe reads through that SR, so pausing it 
surfaces a laggard via the public `AwaitX` RPCs without needing 
`NodeSchemaStatusService` exposed on data-node ports — the in-process 
distributed harness does not currently host that service on data n [...]
-  - Author §6.12 cluster-barrier integration specs 
(`test/cases/schema/barrier_cluster.go`): §6.12b (`AwaitSchemaApplied`) and 
§6.12c (`AwaitSchemaDeleted`) pin the public-API contract that a paused 
receiving liaison surfaces a non-empty `laggards` list and that resume drains 
the queue so the per-key barrier converges. §6.12a (`AwaitRevisionApplied`) and 
§6.12d (cross-barrier recovery) are checked in as `PIt` (pending) — the queue 
replay runs and the per-key barriers converge, but the gl [...]
+  - Extend the watch-control binding to liaison processes 
(`pkg/test/setup.startLiaisonNode`) and add `helpers.SharedContext.LiaisonAddr` 
so cluster-only specs can pause the receiving liaison's own `SchemaRegistry`. 
The cluster barrier's `selfName` probe reads through that SR, so pausing it 
surfaces a laggard via the public `AwaitX` RPCs.
+  - Author §6.12 cluster-barrier integration specs 
(`test/cases/schema/barrier_cluster.go`): §6.12b (`AwaitSchemaApplied`) and 
§6.12c (`AwaitSchemaDeleted`) pin the public-API contract that a paused 
receiving liaison surfaces a non-empty `laggards` list and that resume drains 
the queue so the per-key barrier converges. §6.12a (`AwaitRevisionApplied`) and 
§6.12d (cross-barrier recovery) are checked in as `PIt` (pending): the 
laggard-detection assertion passes but the post-resume `AwaitRev [...]
+  - Expose `cluster.v1.NodeSchemaStatusService` on data-node gRPC ports. 
Decouple the registration in `banyand/queue/sub/server.go`'s `Serve()` so 
`fodc.v1.GroupLifecycleService` (liaison-only by design) and 
`NodeSchemaStatusService` (per-node by design) are gated independently: the new 
`queue.Server.SetNodeSchemaStatusRepo(metadata.Service)` setter wires the 
per-node service without dragging along the liaison-shaped 
`GroupLifecycleService`. Liaison startup (`pkg/cmdsetup/liaison.go`) ca [...]
+  - Repair the `GetMaxRevision` aggregation on the per-node 
`NodeSchemaStatusService` (`banyand/metadata/schema/property/node_status.go`). 
The previous implementation returned `min(schemaCache.notifiedModRevision, 
NodeRepoRegistry.LatestModRevision)`, but `LatestModRevision` aggregated 
per-service `schemaRepo` watermarks via `min` — and each `schemaRepo` only 
advances on events for its own catalog (`pkg/schema/init.go:72` filters by 
`g.Catalog`), so the min was perpetually pinned to the  [...]
 
 ### Bug Fixes

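The `GetMaxRevision` entry in the diff above turns on one property of `min` aggregation: when each per-catalog watermark advances only on events for its own catalog, the minimum across them stays pinned to the least-advanced one. The Go sketch below illustrates that property only. `repoWatermark`, `observe`, `minRevision`, and the catalog names are hypothetical stand-ins rather than BanyanDB code, and the sketch does not reproduce the actual repair, which the truncated changelog line does not show.

```go
package main

import "fmt"

// repoWatermark is a hypothetical stand-in for one per-catalog schemaRepo.
// It is not the BanyanDB type; it only models "advance on own-catalog events".
type repoWatermark struct {
	catalog  string
	revision int64
}

// observe advances the watermark only when the event belongs to this
// repo's catalog, mirroring the catalog filter described in the changelog.
func (r *repoWatermark) observe(catalog string, revision int64) {
	if catalog == r.catalog && revision > r.revision {
		r.revision = revision
	}
}

// minRevision aggregates the watermarks with min, as the changelog says
// the previous LatestModRevision did. An empty slice yields 0.
func minRevision(repos []*repoWatermark) int64 {
	if len(repos) == 0 {
		return 0
	}
	m := repos[0].revision
	for _, r := range repos[1:] {
		if r.revision < m {
			m = r.revision
		}
	}
	return m
}

func main() {
	// Two repos with illustrative catalog names.
	repos := []*repoWatermark{{catalog: "stream"}, {catalog: "measure"}}

	// Deliver three events, all for the "stream" catalog, to every repo.
	for rev := int64(1); rev <= 3; rev++ {
		for _, r := range repos {
			r.observe("stream", rev)
		}
	}

	// The "measure" repo never saw an event it accepts, so the min stays put.
	fmt.Println(repos[0].revision, repos[1].revision, minRevision(repos))
}
```

In this model the minimum cannot move until every catalog has seen an event of its own, however far the others have advanced.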