dataroaring commented on code in PR #62117:
URL: https://github.com/apache/doris/pull/62117#discussion_r3064355862


##########
fe/fe-foundation/src/test/java/org/apache/doris/foundation/util/ConcurrentLong2ObjectHashMapTest.java:
##########
@@ -392,6 +392,70 @@ void testConcurrentIterationDuringModification() throws 
Exception {
         Assertions.assertEquals(0, errors.get());
     }
 
+    // ---- Deadlock-safety tests ----
+
+    @Test
+    void testForEachWithMutatingCallbackDoesNotDeadlock() throws Exception {
+        // Before the fix, forEach held a read lock and the callback calling 
put() would try
+        // to acquire a write lock on the same segment, causing a 
read-to-write upgrade deadlock.
+        ConcurrentLong2ObjectHashMap<String> map = new 
ConcurrentLong2ObjectHashMap<>(1);
+        map.put(1L, "one");
+        map.put(2L, "two");
+
+        // Run in a separate thread with a timeout so the test fails fast 
instead of hanging
+        ExecutorService executor = Executors.newSingleThreadExecutor();
+        Future<?> future = executor.submit(() -> {
+            map.forEach((ConcurrentLong2ObjectHashMap.LongObjConsumer<String>) 
(k, v) -> {
+                map.put(k + 1000L, v + "-copy");
+            });
+        });
+        // Should complete within 5 seconds; if it deadlocks, get() will time 
out
+        future.get(5, java.util.concurrent.TimeUnit.SECONDS);
+        executor.shutdown();

Review Comment:
   Addressed in commit 7f590a53. The test now uses try/catch/finally: on 
timeout or exception the Future is cancelled, and `executor.shutdownNow()` is 
called in the finally block to prevent thread leaks.
   
   — ThinkOps 🤖



##########
fe/fe-foundation/src/test/java/org/apache/doris/foundation/util/ConcurrentLong2LongHashMapTest.java:
##########
@@ -401,6 +401,65 @@ void testConcurrentIterationDuringModification() throws 
Exception {
         Assertions.assertEquals(0, errors.get());
     }
 
+    // ---- Deadlock-safety tests ----
+
+    @Test
+    void testForEachWithMutatingCallbackDoesNotDeadlock() throws Exception {
+        ConcurrentLong2LongHashMap map = new ConcurrentLong2LongHashMap(1);
+        map.put(1L, 100L);
+        map.put(2L, 200L);
+
+        ExecutorService executor = Executors.newSingleThreadExecutor();
+        Future<?> future = executor.submit(() -> {
+            map.forEach((ConcurrentLong2LongHashMap.LongLongConsumer) (k, v) 
-> {
+                map.put(k + 1000L, v + 1);
+            });
+        });
+        future.get(5, java.util.concurrent.TimeUnit.SECONDS);
+        executor.shutdown();

Review Comment:
   Addressed in commit 7f590a53. Same fix applied: try/catch/finally with 
`future.cancel(true)` on exception and `executor.shutdownNow()` in finally.
   
   — ThinkOps 🤖



##########
fe/fe-foundation/src/main/java/org/apache/doris/foundation/util/ConcurrentLong2ObjectHashMap.java:
##########
@@ -289,36 +316,48 @@ public void putAll(Map<? extends Long, ? extends V> m) {
     // to ensure the check-then-act is atomic within a segment's write lock.
 
     public V computeIfAbsent(long key, LongFunction<? extends V> 
mappingFunction) {
+        checkNotInCallback();
         Segment<V> seg = segmentFor(key);
         seg.lock.writeLock().lock();
         try {
             V val = seg.map.get(key);
             if (val != null || seg.map.containsKey(key)) {
                 return val;
             }
-            V newValue = mappingFunction.apply(key);
-            if (newValue != null) {
-                seg.map.put(key, newValue);
+            inCallback.set(Boolean.TRUE);
+            try {
+                V newValue = mappingFunction.apply(key);
+                if (newValue != null) {
+                    seg.map.put(key, newValue);
+                }
+                return newValue;
+            } finally {
+                inCallback.set(Boolean.FALSE);
             }

Review Comment:
   Addressed in commit 7f590a53. Changed from `inCallback.set(Boolean.FALSE)` 
to `inCallback.remove()` in all finally blocks to fully clear the ThreadLocal 
entry.
   
   — ThinkOps 🤖



##########
fe/fe-foundation/src/main/java/org/apache/doris/foundation/util/ConcurrentLong2LongHashMap.java:
##########
@@ -302,13 +326,20 @@ public long addTo(long key, long increment) {
     }
 
     public long computeIfAbsent(long key, LongUnaryOperator mappingFunction) {
+        checkNotInCallback();
         Segment seg = segmentFor(key);
         seg.lock.writeLock().lock();
         try {
             if (seg.map.containsKey(key)) {
                 return seg.map.get(key);
             }
-            long newValue = mappingFunction.applyAsLong(key);
+            inCallback.set(Boolean.TRUE);
+            long newValue;
+            try {
+                newValue = mappingFunction.applyAsLong(key);
+            } finally {
+                inCallback.set(Boolean.FALSE);
+            }

Review Comment:
   Addressed in commit 7f590a53. All `set(Boolean.FALSE)` calls replaced with 
`inCallback.remove()` in both map implementations.
   
   — ThinkOps 🤖



##########
fe/fe-foundation/src/main/java/org/apache/doris/foundation/util/ConcurrentLong2ObjectHashMap.java:
##########
@@ -53,13 +53,25 @@
  * interfaces (computeIfAbsent, computeIfPresent, compute, merge, putIfAbsent, 
replace, remove)
  * are overridden to ensure atomicity within a segment.
  *
+ * <p><b>Callback restriction:</b> The mapping/remapping functions passed to 
{@code computeIfAbsent},
+ * {@code computeIfPresent}, {@code compute}, and {@code merge} <em>must 
not</em> attempt to update
+ * any other mappings of this map. This restriction is enforced at runtime: 
reentrant access from a
+ * callback throws {@link IllegalStateException}. Violation may also cause 
deadlock if callbacks
+ * attempt cross-segment updates from multiple threads.

Review Comment:
   Addressed in commit 64db7d27. `checkNotInCallback()` is now called at the 
top of all read methods too: `get`, `getOrDefault`, `containsKey`, 
`containsValue`, `size`, and `isEmpty`. A dedicated test 
`testReentrantReadFromCallbackThrowsIllegalStateException` verifies that 
`get()` from within a `compute` callback throws ISE.
   
   — ThinkOps 🤖



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to