NeverENG opened a new pull request, #3373:
URL: https://github.com/apache/dubbo-go/pull/3373
### Description
Fixes # issue3354
## What this PR does
Fixes #3354. Hardens application-level **service-app mapping** (`interface
-> app names`)
registration so it is correct under concurrent providers, and gives it a
proper retry policy.
完善应用级 service-app mapping 的写入一致性、重试与去重。
## Background
The mapping value is a comma-separated set of application names stored under
a single
interface key, shared by all providers of that interface. Registration is
therefore a
read-modify-write, and the previous implementations had several reliability
gaps.
## Changes
### 1. Optimistic concurrency across all backends (no more lost updates)
Concurrent appends no longer clobber each other:
- **etcd**: `Get`+`Put` → `GetValAndRev` + `UpdateWithRev` (CAS on
`ModRevision`), `Create` for first write.
- **zookeeper**: keeps versioned `SetContent`, now surfaces version
conflicts instead of swallowing them.
- **nacos**: adds `CasMd5` optimistic lock.
Each backend wraps its native conflict (`ErrCompareFail` / `ErrBadVersion` /
`ErrNodeExists` /
nacos publish failure) into a shared `report.ErrMappingCASConflict` sentinel
via `%w`.
### 2. Graded retry (was: fixed loop, no backoff)
`registerWithRetry` retries **only** CAS conflicts (`errors.Is`) with
exponential backoff + jitter,
and returns permanent errors (network/auth) immediately instead of burning
the whole retry budget.
原来任何错误都空转重试 10 次且无 sleep,现在按错误类型分级重试。
### 3. Extract shared logic + fix two hidden bugs
- `report.MergeServiceAppMapping`: whole-element dedup. Fixes the
`strings.Contains` substring
false positive (registering `order` was wrongly treated as present when
`order-service` existed)
and the leading-comma bug (`"" + "," + app` → `",app"`).
- `report.DecodeServiceAppNames`: parse into a set, skipping empty elements.
### 4. Listener cleanup
- **zookeeper**: implemented removal via `CacheListener.RemoveKeyListeners`
(was a silent `return nil`
that leaked listeners).
- **etcd**: documents the mapping listener as unsupported instead of
silently succeeding.
### 5. Tests
- Unit tests for the merge/decode helpers (incl. the substring and
empty-value regressions).
- A concurrency test that **reproduces the lost-update bug** with the naive
read-modify-write and
**proves CAS preserves every writer** (200 writers / 20 concurrent
readers). Passes under `-race`.
## Known limitation (documented in code)
Nacos `CasMd5` is an optimistic **UPDATE** and cannot guard the first
**INSERT** (Nacos has no
create-if-absent primitive), so the initial concurrent registration of a
brand-new interface can
still race. **etcd and zookeeper are not affected.** Left as a documented
limitation; can be revisited
if Nacos exposes a SETNX-style primitive.
## Test
go test -race ./metadata/report/... ./metadata/mapping/...
### Checklist
- [x] I confirm the target branch is `develop`
- [x] Code has passed local testing
- [x] I have added tests that prove my fix is effective or that my feature
works
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]