mochengqian opened a new pull request, #923:
URL: https://github.com/apache/dubbo-go-pixiu/pull/923

   ### Description
   
   Implements Issue #905 Step 4 only: switch cluster lookup to O(1).
   
   Issue #905 step progress:
   - [x] Add tests and benchmarks
   - [x] Fix current correctness issues
   - [x] Tighten runtime consistency
   - [x] Switch cluster lookup to O(1)
   - [ ] Introduce healthy endpoint snapshots
   - [ ] Optimize simple LB hot path
   - [ ] Optimize consistent-hash LB last
   
   ### Why
   
   `ClusterManager` read paths still resolve clusters by scanning 
`store.Config`.
   
   After Step 3, `clustersMap` is now maintained as the coherent runtime 
cluster source: write paths keep `clustersMap[name].Config` aligned with the 
current cluster config, repair stale runtime entries during store swap, and 
stop replaced runtime clusters.
   
   That makes this PR the narrow next step: move cluster lookup on request read 
paths from O(n) config scanning to O(1) runtime map lookup.
   
   ### Root Cause:
   
   `ClusterManager.getCluster()` still iterated over `store.Config` by cluster 
name.
   
   That means `PickEndpoint()`, `PickNextEndpoint()`, and `GetEndpointByID()` 
paid linear lookup cost before reaching load balancing or endpoint selection 
logic.
   
   ### PR Goal:
   
   Use `clustersMap` as the runtime lookup source for cluster read paths.
   
   Keep public behavior unchanged while making cluster name resolution O(1).
   
   ### Success Criteria:
   
   `PickEndpoint()` no longer linearly scans `store.Config` to find a cluster.
   
   `PickNextEndpoint()` no longer linearly scans `store.Config` to find a 
cluster.
   
   `GetEndpointByID()` follows the same runtime lookup path.
   
   Missing clusters still return `nil`.
   
   Existing load-balancer behavior is unchanged.
   
   ### Paths this PR should affect:
   
   pkg/server/cluster_manager.go
   
   pkg/server/cluster_manager_test.go
   
   ### Revise
   
   Changing `getCluster`: lookup `cm.store.clustersMap[clusterName]` and return 
`runtimeCluster.Config`.
   
   Adding regression tests: proves `PickEndpoint`, `PickNextEndpoint`, and 
`GetEndpointByID` use runtime map state even when `store.Config` contains stale 
config data.
   
   ### Benchmark Evidence
   
   Command:
   
   `go test ./pkg/server -run '^$' -bench '^BenchmarkClusterLookupSerial$' 
-benchmem -count=3`
   
   The table below uses the average of 3 local benchmark runs, comparing 
`upstream/develop` with this branch:
   
   | Cluster count | Before avg | After avg | Meaning |
   | --- | ---: | ---: | --- |
   | 1 | ~3.36 ns/op | ~4.70 ns/op | map lookup has tiny fixed overhead |
   | 32 | ~18.38 ns/op | ~6.75 ns/op | no longer grows with cluster scan cost |
   | 256 | ~115.60 ns/op | ~6.73 ns/op | lookup cost stays nearly flat |
   | 1024 | ~780.87 ns/op | ~7.23 ns/op | O(n) scan removed |
   
   Both before and after remain `0 B/op, 0 allocs/op`.
   
   The single-cluster case is slightly slower because a map lookup has more 
fixed overhead than scanning one item, but the issue target is multi-cluster 
request-path scalability. At 1024 clusters, lookup drops from hundreds of ns/op 
to single-digit ns/op and stays effectively flat as cluster count grows.
   
   ### Tests
   
   `go test ./pkg/server`
   
   `go test ./pkg/cluster/...`
   
   `go test -race ./pkg/server -run 'TestClusterManager'`
   
   `go test ./pkg/server -run '^$' -bench '^BenchmarkClusterLookupSerial$' 
-benchmem -count=3`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to