mochengqian opened a new pull request, #967:
URL: https://github.com/apache/dubbo-go-pixiu/pull/967

   ## Summary
   
   Closes #957.
   
   `ClusterStore.prepareClusterConfig` rebuilt `Config.ConsistentHash.Hash` 
eagerly on every `AddCluster` / `UpdateCluster` / `SetEndpoint` / 
`DeleteEndpoint`. After #932, that field is only consumed by the **legacy** 
(non-snapshot) balancer path — all bundled balancers are snapshot-aware and 
read the snapshot's own healthy hash 
(`EndpointSnapshot.HealthyConsistentHash`). So the eager rebuild is dead work 
on the common path, and for a large Maglev table it is milliseconds of wasted 
CPU on every endpoint churn (e.g. Nacos instances flapping).
   
   This PR builds the Config-level hash lazily, only when a legacy balancer 
first needs it.
   
   ## Change
   
   - `prepareClusterConfig` invalidates the hash (`c.ConsistentHash.Hash = 
nil`) instead of rebuilding it. Setting it to nil keeps the legacy path correct 
after endpoint changes — the next legacy pick rebuilds from current endpoints 
rather than serving a stale ring.
   - New `ClusterConfig.EnsureConsistentHash()` builds the hash on first use 
(nil-check) and reuses it after.
   - The legacy branch of `pickEndpoint` calls `EnsureConsistentHash()` under 
the existing `legacyPickMu`, before the `config := *context.Config` copy. The 
in-`Handler` nil fallback in Maglev/RingHash stays as a safety net.
   
   ### Why a nil-check, not `sync.Once`
   
   The issue suggested a value `consistentHashOnce sync.Once` field. That trips 
`go vet` copylocks at the legacy path's existing `config := *context.Config` 
shallow copy (verified). Since the legacy path is already serialized by 
`legacyPickMu`, a plain nil-check is sufficient and vet-clean. A fresh 
`ClusterConfig` from `CloneStore` (whose `Hash` interface does not survive the 
yaml round-trip, so it deserializes to nil) rebuilds independently on its first 
legacy pick — the same fresh-store-rebuilds-independently property the issue 
asked for.
   
   ## Risk
   
   Per the issue: only code that reads `c.ConsistentHash.Hash` *before* a 
balancer `Handler` runs would now see nil. Confirmed the only non-test readers 
are `maglev_hash.go` / `ring_hash.go` (both inside `Handler`, both with a 
build-on-the-fly nil fallback) and `ConsistentHashForHealthyEndpoints` 
(snapshot path, has its own fallback). No xDS / adapter / configcenter / 
hotreload code reads the field.
   
   ## Benchmark
   
   `BenchmarkSetEndpoint` (added), 64-endpoint Maglev cluster, 
`MaglevTableSize=65537`, metadata-only re-registration (mirrors discovery 
churn):
   
   | | ns/op | B/op | allocs/op |
   |---|---|---|---|
   | before | 1087 | 1721 | 3 |
   | after | 413 | 481 | 3 |
   
   ~62% faster, ~72% less memory per mutation. The win scales with 
`MaglevTableSize`.
   
   ## Test plan
   
   - [x] New `TestClusterManager_SetEndpointDefersConsistentHashRebuild` — 
SetEndpoint churn triggers **zero** hash builds (counting 
`ConsistentHashInitFunc`); the legacy path's `EnsureConsistentHash` builds 
**exactly once** across repeated calls.
   - [x] Updated the two existing consistent-hash assertions 
(`SetEndpoint...RebuildsConsistentHash`, 
`DeleteEndpoint...RepairsRuntimeAndConsistentHash`) to trigger the lazy build 
before inspecting `Config.ConsistentHash.Hash` directly.
   - [x] `go test -race ./pkg/model/... ./pkg/server/... 
./pkg/cluster/loadbalancer/...`
   - [x] `go vet` clean on all three packages (confirms no copylocks regression)
   - [x] pre-push local-CI passed


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to