Copilot commented on code in PR #61289:
URL: https://github.com/apache/doris/pull/61289#discussion_r2924616074
##########
fe/fe-core/src/main/java/org/apache/doris/cloud/catalog/CloudReplica.java:
##########
@@ -73,11 +73,22 @@ public class CloudReplica extends Replica implements
GsonPostProcessable {
private static final Random rand = new Random();
+ // Intern pool for cluster ID strings to avoid millions of duplicate
String instances.
+ // Typically only a handful of distinct cluster IDs exist in the system.
+ private static final ConcurrentHashMap<String, String> CLUSTER_ID_POOL =
new ConcurrentHashMap<>();
+
+ private static String internClusterId(String clusterId) {
+ if (clusterId == null) {
+ return null;
+ }
+ String existing = CLUSTER_ID_POOL.putIfAbsent(clusterId, clusterId);
+ return existing != null ? existing : clusterId;
+ }
Review Comment:
The static `CLUSTER_ID_POOL` is unbounded and will retain all distinct
cluster IDs for the lifetime of the JVM. If cluster IDs can be unbounded (e.g.,
come from external inputs or can churn over time), this becomes a memory leak
and can negate the intended savings. Consider using a bounded cache (e.g., max
size + eviction), or weak-value/weak-key interning (if available in the
codebase) so unused IDs can be reclaimed; at minimum, document/enforce that
cluster IDs are from a small, fixed set.
##########
fe/fe-core/src/main/java/org/apache/doris/cloud/catalog/CloudReplica.java:
##########
@@ -99,6 +110,34 @@ public CloudReplica(long replicaId, Long backendId,
ReplicaState state, long ver
this.idx = idx;
}
+ private ConcurrentHashMap<String, Long> getOrCreatePrimaryMap() {
+ ConcurrentHashMap<String, Long> map = primaryClusterToBackend;
+ if (map == null) {
+ synchronized (this) {
+ map = primaryClusterToBackend;
+ if (map == null) {
+ map = new ConcurrentHashMap<>(2);
+ primaryClusterToBackend = map;
+ }
+ }
+ }
+ return map;
+ }
+
+ private Map<String, Pair<Long, Long>> getOrCreateSecondaryMap() {
+ Map<String, Pair<Long, Long>> map = secondaryClusterToBackends;
+ if (map == null) {
+ synchronized (this) {
+ map = secondaryClusterToBackends;
+ if (map == null) {
+ map = new ConcurrentHashMap<>(2);
+ secondaryClusterToBackends = map;
+ }
+ }
+ }
+ return map;
Review Comment:
`secondaryClusterToBackends` is typed as `Map`, so after Gson
deserialization it may be a non-concurrent implementation (e.g.,
`LinkedTreeMap`). In that case it will be non-null and
`getOrCreateSecondaryMap()` will not replace it, and subsequent concurrent
reads/writes can become unsafe. A concrete fix is to (1) declare the field as
`ConcurrentHashMap<String, Pair<Long, Long>>` (and have
`getOrCreateSecondaryMap()` return `ConcurrentHashMap`), and/or (2) in
`gsonPostProcess()` normalize any deserialized map into a `ConcurrentHashMap`
(similar to what’s done for `primaryClusterToBackend`).
```suggestion
private ConcurrentHashMap<String, Pair<Long, Long>>
getOrCreateSecondaryMap() {
Map<String, Pair<Long, Long>> map = secondaryClusterToBackends;
if (map instanceof ConcurrentHashMap) {
return (ConcurrentHashMap<String, Pair<Long, Long>>) map;
}
synchronized (this) {
map = secondaryClusterToBackends;
if (map instanceof ConcurrentHashMap) {
return (ConcurrentHashMap<String, Pair<Long, Long>>) map;
}
ConcurrentHashMap<String, Pair<Long, Long>> concurrentMap = new
ConcurrentHashMap<>(2);
if (map != null) {
concurrentMap.putAll(map);
}
secondaryClusterToBackends = concurrentMap;
return concurrentMap;
}
```
##########
fe/fe-core/src/main/java/org/apache/doris/cloud/catalog/CloudReplica.java:
##########
@@ -73,11 +73,22 @@ public class CloudReplica extends Replica implements
GsonPostProcessable {
private static final Random rand = new Random();
+ // Intern pool for cluster ID strings to avoid millions of duplicate
String instances.
+ // Typically only a handful of distinct cluster IDs exist in the system.
+ private static final ConcurrentHashMap<String, String> CLUSTER_ID_POOL =
new ConcurrentHashMap<>();
+
+ private static String internClusterId(String clusterId) {
+ if (clusterId == null) {
+ return null;
+ }
+ String existing = CLUSTER_ID_POOL.putIfAbsent(clusterId, clusterId);
+ return existing != null ? existing : clusterId;
+ }
+
private Map<String, List<Long>> memClusterToBackends = null;
// clusterId, secondaryBe, changeTimestamp
- private Map<String, Pair<Long, Long>> secondaryClusterToBackends
- = new ConcurrentHashMap<String, Pair<Long, Long>>();
+ private volatile Map<String, Pair<Long, Long>> secondaryClusterToBackends;
Review Comment:
`secondaryClusterToBackends` is typed as `Map`, so after Gson
deserialization it may be a non-concurrent implementation (e.g.,
`LinkedTreeMap`). In that case it will be non-null and
`getOrCreateSecondaryMap()` will not replace it, and subsequent concurrent
reads/writes can become unsafe. A concrete fix is to (1) declare the field as
`ConcurrentHashMap<String, Pair<Long, Long>>` (and have
`getOrCreateSecondaryMap()` return `ConcurrentHashMap`), and/or (2) in
`gsonPostProcess()` normalize any deserialized map into a `ConcurrentHashMap`
(similar to what’s done for `primaryClusterToBackend`).
```suggestion
private volatile ConcurrentHashMap<String, Pair<Long, Long>>
secondaryClusterToBackends;
```
##########
fe/fe-core/src/main/java/org/apache/doris/cloud/catalog/CloudReplica.java:
##########
@@ -655,14 +716,24 @@ public void gsonPostProcess() throws IOException {
this.getId(), this.primaryClusterToBackends,
this.primaryClusterToBackend);
}
if (primaryClusterToBackends != null) {
+ ConcurrentHashMap<String, Long> map = getOrCreatePrimaryMap();
for (Map.Entry<String, List<Long>> entry :
primaryClusterToBackends.entrySet()) {
String clusterId = entry.getKey();
List<Long> beIds = entry.getValue();
if (beIds != null && !beIds.isEmpty()) {
- primaryClusterToBackend.put(clusterId, beIds.get(0));
+ map.put(internClusterId(clusterId), beIds.get(0));
}
}
this.primaryClusterToBackends = null;
}
Review Comment:
This eagerly allocates `primaryClusterToBackend` whenever
`primaryClusterToBackends` is non-null, even if all entries have empty/null
backend lists (no effective data to migrate). To preserve the memory-saving
goal, consider deferring `getOrCreatePrimaryMap()` until the first time you
actually encounter a non-empty `beIds` (i.e., allocate only when you’re about
to `put`).
##########
fe/fe-core/src/main/java/org/apache/doris/cloud/catalog/CloudReplica.java:
##########
@@ -316,8 +356,9 @@ private long getBackendIdImpl(String clusterId) throws
ComputeGroupException {
backendId = memClusterToBackends.get(clusterId).get(indexRand);
}
- if (!replicaEnough && !allowColdRead &&
primaryClusterToBackend.containsKey(clusterId)) {
- backendId = primaryClusterToBackend.get(clusterId);
+ ConcurrentHashMap<String, Long> priMap = primaryClusterToBackend;
+ if (!replicaEnough && !allowColdRead && priMap != null &&
priMap.containsKey(clusterId)) {
+ backendId = priMap.get(clusterId);
Review Comment:
This does two hash lookups (`containsKey` then `get`). Since
`ConcurrentHashMap` disallows null values, you can do a single `get` and check
for null (or use `getOrDefault`) to reduce overhead on a hot path.
```suggestion
if (!replicaEnough && !allowColdRead && priMap != null) {
Long primaryBackendId = priMap.get(clusterId);
if (primaryBackendId != null) {
backendId = primaryBackendId;
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]