[PR] [opt](cloud) cache cluster id per query and drop redundant locks on getBackendId hot path [doris]

via GitHub Mon, 25 May 2026 08:25:38 -0700


liaoxin01 opened a new pull request, #63636:
URL: https://github.com/apache/doris/pull/63636


   ## Proposed changes
   
   Issue Number: close #xxx
   
   ### Problem
   
   `CloudReplica.getBackendId` was the dominant FE hotspot in cloud-mode 
query/load planning. On a profiled query it accounted for **65.6% of total FE 
CPU samples** (async-profiler). Every replica in the plan re-ran the full 
cluster-id resolution pipeline even though the resolved cluster id is identical 
for every tablet in the same request.
   
   Call breakdown from the flame graph:
   
   ```
   Tablet.getNormalReplicaBackendPathMap            68.6%
   └─ CloudReplica.getBackendId                     65.6%
      └─ getCurrentClusterId                        61.5%
         ├─ getCloudClusterIdByName                 35.6%
         │  ├─ getCloudClusterNames                 24.7%   (rw-lock + 
ArrayList + stream.sorted)
         │  └─ waitForAutoStart                      9.7%   
(StopWatch.start/stop even on NORMAL)
         ├─ getPhysicalCluster                      14.5%
         │  └─ getComputeGroupByName                14.5%   (rw-lock around 
ConcurrentHashMap)
         └─ getCloudStatusByName                     8.3%
            └─ getCloudStatusByIdNoLock             14.0%   (two stream passes, 
String.valueOf per BE)
   ```
   
   `ReadLock.unlock` consistently outweighed `ReadLock.lock` in the profile -- 
a classic cache-line bouncing signature, meaning the read-lock CAS was already 
operating in the non-linear regime where adding concurrent threads 
disproportionately hurts throughput. With high tablet counts the call rate 
(`tablets × replicas × concurrent_queries`) easily reached 100k+ lock/unlock 
per second per FE, putting the FE one nudge away from a metastable collapse.
   
   ### Changes
   
   - **`CloudReplica.getCurrentClusterId`** becomes `public static` so callers 
can resolve once per request. Adds `getBackendIdWithClusterId(String)` that 
bypasses the per-replica pipeline.
   - **`OlapTableSink.createLocation`** and 
**`FrontendServiceImpl.{createPartition, replacePartition}`** resolve the 
cluster id once before iterating tablets and pass it down via the new 
`CloudTablet.getNormalReplicaBackendPathMap(String)` overload. For a 10k-tablet 
query this collapses 10k full pipelines into 1.
   - **`CloudSystemInfoService.getComputeGroupByName`**: drop the rw-lock 
around `ConcurrentHashMap` reads, merge `containsKey+get` into a single `get`, 
guard the debug log behind `isDebugEnabled` (the map `toString` is expensive).
   - **`CloudSystemInfoService.containsCloudCluster`** (new): cheap existence 
check that replaces `getCloudClusterNames().contains(name)`. Avoids an 
ArrayList copy + stream filter + natural sort + collect under a read lock for a 
single existence query.
   - **`CloudSystemInfoService.getCloudStatusByIdNoLock`**: single-pass loop 
with a precomputed `NORMAL` constant in place of two stream pipelines that 
re-evaluated `String.valueOf(NORMAL)` per backend.
   - **`CloudSystemInfoService.waitForAutoStart`**: fast-path return when the 
cluster is already `NORMAL`, skipping the `withTemporaryNereidsTimeout` wrap 
and the `StopWatch.start/stop` inside `waitForClusterToResume` (the while loop 
never executed in that state but the wrapping still ran per call -- ~3% of 
total FE CPU in the profile).
   
   ### Behavior
   
   Semantics preserved on all paths:
   - `waitForAutoStart` NORMAL fast-path: the original code already had 
`existAliveBe = true` as initializer, so `waitForClusterToResume` was a no-op 
for NORMAL clusters.
   - `getComputeGroupByName` without rw-lock: the brief window between a 
rename's two map updates can now return `null`; same as the existing read-only 
paths everywhere else in this class that already access these maps without 
locks.
   - `containsCloudCluster` matches `getCloudClusterNames().contains(name)` for 
non-empty `name` (empty-name filtering in `getCloudClusterNames` was for the 
returned list, not for `.contains` semantics).
   
   ### Further comments
   
   The cluster id resolution is hoisted only on the no-BE-endpoint paths (the 
hot ones in the profile). The endpoint-resolved path in `FrontendServiceImpl` 
already has its own resolution logic and is untouched.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior:
       - [ ] Yes
       - [x] No
       - [ ] I don't know
   2. Has unit tests been added:
       - [ ] Yes
       - [x] No Need
       - [ ] Same as the original logic
   3. Has document been added or modified:
       - [ ] Yes
       - [x] No Need
   4. Does it need to update dependencies:
       - [ ] Yes
       - [x] No
   5. Are there any changes that cannot be rolled back:
       - [ ] Yes (If Yes, please explain WHY)
       - [x] No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[PR] [opt](cloud) cache cluster id per query and drop redundant locks on getBackendId hot path [doris]

Reply via email to