github-actions[bot] commented on code in PR #63298:
URL: https://github.com/apache/doris/pull/63298#discussion_r3290506272


##########
fe/fe-core/src/main/java/org/apache/doris/catalog/MaterializedIndex.java:
##########
@@ -109,23 +112,54 @@ public Tablet getTablet(long tabletId) {
         return idToTablets.get(tabletId);
     }
 
-    public void clearTabletsForRestore() {
+    public synchronized void clearTabletsForRestore() {
         idToTablets.clear();
-        tablets.clear();
+        tablets = new ArrayList<>(); // volatile write
     }
 
-    public void addTablet(Tablet tablet, TabletMeta tabletMeta) {
+    public synchronized void addTablet(Tablet tablet, TabletMeta tabletMeta) {
         addTablet(tablet, tabletMeta, false);
     }
 
-    public void addTablet(Tablet tablet, TabletMeta tabletMeta, boolean 
isRestore) {
-        idToTablets.put(tablet.getId(), tablet);
-        tablets.add(tablet);
+    // Writers are synchronized on this index to prevent concurrent 
lost-update:
+    // some callers (e.g. InternalCatalog.createTablets) do NOT hold the 
OlapTable
+    // write lock when adding tablets.
+    // Copy-on-write keeps readers CME-safe without locking; for bulk creation 
use
+    // appendTablets(...) so the per-index tablets list is copied once per 
batch
+    // instead of once per tablet.
+    public synchronized void addTablet(Tablet tablet, TabletMeta tabletMeta, 
boolean isRestore) {
+        appendTabletsInternal(Collections.singletonList(tablet));
         if (!isRestore) {
             Env.getCurrentInvertedIndex().addTablet(tablet.getId(), 
tabletMeta);
         }
     }
 
+    // Bulk-publish: append the given tablets to this index's tablets list in a
+    // single copy-on-write (O(existing + batch) instead of O(n^2) over n
+    // single-tablet adds inside a synchronized block).
+    //
+    // Does NOT touch TabletInvertedIndex. Bulk-creation callers register 
tablets
+    // in TabletInvertedIndex eagerly inside their per-tablet loop because
+    // Tablet.addReplica(...) (non-restore) requires the tablet to already be
+    // present in the inverted index; only the per-index list copy is expensive
+    // enough to be worth batching.
+    public synchronized void appendTablets(Collection<Tablet> newTablets) {
+        appendTabletsInternal(newTablets);
+    }
+
+    private void appendTabletsInternal(Collection<Tablet> newTablets) {
+        if (newTablets.isEmpty()) {
+            return;
+        }
+        List<Tablet> next = new ArrayList<>(tablets.size() + 
newTablets.size());
+        next.addAll(tablets);
+        for (Tablet tablet : newTablets) {
+            idToTablets.put(tablet.getId(), tablet);
+            next.add(tablet);

Review Comment:
   This still mutates `idToTablets` in place while `getTablet()` reads the same 
plain `HashMap` without synchronization. The PR makes the ordered `tablets` 
list a volatile copy-on-write snapshot, but lookup readers such as 
scheduler/report/proc paths use `getTablet()` and do not get that protection; 
at the same time the new comments explicitly document writers that do not 
necessarily hold the `OlapTable` write lock. A concurrent 
`appendTablets()`/`clearTabletsForRestore()` can therefore race with 
`getTablet()` on `HashMap`, which is exactly the kind of metadata reader race 
this change is trying to remove. Please publish `idToTablets` with the same 
snapshot discipline (for example a volatile copied map updated together with 
`tablets`) or otherwise make all reads/writes consistently 
synchronized/concurrent.



##########
fe/fe-core/src/main/java/org/apache/doris/catalog/LocalTablet.java:
##########
@@ -31,16 +31,16 @@
 import org.apache.logging.log4j.Logger;
 
 import java.util.ArrayList;
+import java.util.Collections;
 import java.util.Comparator;
-import java.util.Iterator;
 import java.util.List;
 import java.util.stream.LongStream;
 
 public class LocalTablet extends Tablet {
     private static final Logger LOG = LogManager.getLogger(LocalTablet.class);
 
     @SerializedName(value = "rs", alternate = {"replicas"})
-    private List<Replica> replicas;
+    private volatile List<Replica> replicas;
     @SerializedName(value = "lastCheckTime")

Review Comment:
   Making `replicas` volatile is not enough while methods still read the field 
directly more than once instead of capturing one snapshot. For example, 
`getRemoteDataSize()` iterates `replicas` looking for `cooldownReplicaId`, then 
rereads `replicas` for `stream().max(...).get()`; a concurrent 
`deleteReplica*()` can publish an empty/different list between those reads and 
make this throw or compute from a different snapshot. Other direct-field 
readers such as `getReplicaByBackendId()`, `equals()`, and 
`readyToBeRepaired()` also bypass the `getReplicas()` snapshot convention. 
Please update these readers to use a single local `List<Replica> snapshot = 
replicas` (or `getReplicas()`) per method so the copy-on-write guarantee 
actually covers all replica readers, not only external iteration.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to