Copilot commented on code in PR #62117:
URL: https://github.com/apache/doris/pull/62117#discussion_r3034931847


##########
fe/fe-foundation/src/test/java/org/apache/doris/foundation/util/ConcurrentLong2LongHashMapTest.java:
##########
@@ -401,6 +401,65 @@ void testConcurrentIterationDuringModification() throws 
Exception {
         Assertions.assertEquals(0, errors.get());
     }
 
+    // ---- Deadlock-safety tests ----
+
+    @Test
+    void testForEachWithMutatingCallbackDoesNotDeadlock() throws Exception {
+        ConcurrentLong2LongHashMap map = new ConcurrentLong2LongHashMap(1);
+        map.put(1L, 100L);
+        map.put(2L, 200L);
+
+        ExecutorService executor = Executors.newSingleThreadExecutor();
+        Future<?> future = executor.submit(() -> {
+            map.forEach((ConcurrentLong2LongHashMap.LongLongConsumer) (k, v) 
-> {
+                map.put(k + 1000L, v + 1);
+            });
+        });
+        future.get(5, java.util.concurrent.TimeUnit.SECONDS);
+        executor.shutdown();

Review Comment:
   The executor is shut down only after `future.get(...)` completes 
successfully. If the get times out or throws, the executor thread may leak and 
interfere with the rest of the test suite. Use try/finally, cancel the future 
on failure, and call `shutdownNow()` (and optionally `awaitTermination`).
   ```suggestion
           try {
               future.get(5, java.util.concurrent.TimeUnit.SECONDS);
           } catch (Exception e) {
               future.cancel(true);
               throw e;
           } finally {
               executor.shutdownNow();
               executor.awaitTermination(5, 
java.util.concurrent.TimeUnit.SECONDS);
           }
   ```



##########
fe/fe-foundation/src/main/java/org/apache/doris/foundation/util/ConcurrentLong2ObjectHashMap.java:
##########
@@ -289,36 +316,48 @@ public void putAll(Map<? extends Long, ? extends V> m) {
     // to ensure the check-then-act is atomic within a segment's write lock.
 
     public V computeIfAbsent(long key, LongFunction<? extends V> 
mappingFunction) {
+        checkNotInCallback();
         Segment<V> seg = segmentFor(key);
         seg.lock.writeLock().lock();
         try {
             V val = seg.map.get(key);
             if (val != null || seg.map.containsKey(key)) {
                 return val;
             }
-            V newValue = mappingFunction.apply(key);
-            if (newValue != null) {
-                seg.map.put(key, newValue);
+            inCallback.set(Boolean.TRUE);
+            try {
+                V newValue = mappingFunction.apply(key);
+                if (newValue != null) {
+                    seg.map.put(key, newValue);
+                }
+                return newValue;
+            } finally {
+                inCallback.set(Boolean.FALSE);
             }

Review Comment:
   Using `ThreadLocal` and setting it back to `Boolean.FALSE` can still leave 
an entry in long-lived thread pools (and may delay cleanup after the map is 
GC'd). Consider using `inCallback.remove()` in the finally block instead of 
`set(Boolean.FALSE)` so the per-thread value is cleared entirely when leaving 
the callback.



##########
fe/fe-foundation/src/test/java/org/apache/doris/foundation/util/ConcurrentLong2ObjectHashMapTest.java:
##########
@@ -392,6 +392,70 @@ void testConcurrentIterationDuringModification() throws 
Exception {
         Assertions.assertEquals(0, errors.get());
     }
 
+    // ---- Deadlock-safety tests ----
+
+    @Test
+    void testForEachWithMutatingCallbackDoesNotDeadlock() throws Exception {
+        // Before the fix, forEach held a read lock and the callback calling 
put() would try
+        // to acquire a write lock on the same segment, causing a 
read-to-write upgrade deadlock.
+        ConcurrentLong2ObjectHashMap<String> map = new 
ConcurrentLong2ObjectHashMap<>(1);
+        map.put(1L, "one");
+        map.put(2L, "two");
+
+        // Run in a separate thread with a timeout so the test fails fast 
instead of hanging
+        ExecutorService executor = Executors.newSingleThreadExecutor();
+        Future<?> future = executor.submit(() -> {
+            map.forEach((ConcurrentLong2ObjectHashMap.LongObjConsumer<String>) 
(k, v) -> {
+                map.put(k + 1000L, v + "-copy");
+            });
+        });
+        // Should complete within 5 seconds; if it deadlocks, get() will time 
out
+        future.get(5, java.util.concurrent.TimeUnit.SECONDS);
+        executor.shutdown();

Review Comment:
   The executor is only shut down on the success path. If `future.get(...)` 
times out or throws, the single-thread executor can remain alive and keep the 
test JVM from exiting. Wrap the submit/get in try/finally, cancel the Future on 
failure, and use `shutdownNow()` (optionally `awaitTermination`) to ensure 
cleanup.
   ```suggestion
           Future<?> future = null;
           try {
               future = executor.submit(() -> {
                   
map.forEach((ConcurrentLong2ObjectHashMap.LongObjConsumer<String>) (k, v) -> {
                       map.put(k + 1000L, v + "-copy");
                   });
               });
               // Should complete within 5 seconds; if it deadlocks, get() will 
time out
               future.get(5, java.util.concurrent.TimeUnit.SECONDS);
           } finally {
               if (future != null && !future.isDone()) {
                   future.cancel(true);
               }
               executor.shutdownNow();
               executor.awaitTermination(5, 
java.util.concurrent.TimeUnit.SECONDS);
           }
   ```



##########
fe/fe-foundation/src/main/java/org/apache/doris/foundation/util/ConcurrentLong2LongHashMap.java:
##########
@@ -52,12 +52,20 @@
  * <p><b>Important:</b> All compound operations from both {@link Long2LongMap} 
and {@link Map}
  * interfaces (computeIfAbsent, computeIfPresent, compute, merge, mergeLong, 
putIfAbsent,
  * replace, remove) are overridden to ensure atomicity within a segment.
+ *
+ * <p><b>Callback restriction:</b> The mapping/remapping functions passed to 
{@code computeIfAbsent},
+ * {@code computeIfPresent}, {@code compute}, {@code merge}, and {@code 
mergeLong} <em>must not</em>
+ * attempt to update any other mappings of this map. This restriction is 
enforced at runtime:
+ * reentrant access from a callback throws {@link IllegalStateException}. 
Violation may also cause
+ * deadlock if callbacks attempt cross-segment updates from multiple threads.

Review Comment:
   The callback guard only prevents reentrant *writes*. A compute/merge 
callback can still call read APIs (`get`, `containsKey`, etc.) for other keys, 
which can deadlock between segments when two threads each hold a segment 
write-lock and try to read the other segment. To truly turn these deadlocks 
into fail-fast errors, consider guarding read methods too (or strengthen the 
documented restriction to forbid any map access from callbacks, not just 
updates).
   ```suggestion
    * access this map in any way while executing, including reads such as 
{@code get},
    * {@code containsKey}, or any attempt to update other mappings. Reentrant 
callback access is
    * rejected with {@link IllegalStateException}; any map access from a 
callback is unsupported
    * because cross-segment reads or writes from multiple threads may deadlock.
   ```



##########
fe/fe-foundation/src/main/java/org/apache/doris/foundation/util/ConcurrentLong2LongHashMap.java:
##########
@@ -302,13 +326,20 @@ public long addTo(long key, long increment) {
     }
 
     public long computeIfAbsent(long key, LongUnaryOperator mappingFunction) {
+        checkNotInCallback();
         Segment seg = segmentFor(key);
         seg.lock.writeLock().lock();
         try {
             if (seg.map.containsKey(key)) {
                 return seg.map.get(key);
             }
-            long newValue = mappingFunction.applyAsLong(key);
+            inCallback.set(Boolean.TRUE);
+            long newValue;
+            try {
+                newValue = mappingFunction.applyAsLong(key);
+            } finally {
+                inCallback.set(Boolean.FALSE);
+            }

Review Comment:
   `ThreadLocal` values set back to `Boolean.FALSE` still leave an entry in the 
thread's ThreadLocalMap (which can be undesirable with long-lived thread 
pools). Consider calling `inCallback.remove()` in the finally block instead of 
`set(Boolean.FALSE)` to fully clear the thread-local state on exit.



##########
fe/fe-foundation/src/main/java/org/apache/doris/foundation/util/ConcurrentLong2ObjectHashMap.java:
##########
@@ -53,13 +53,25 @@
  * interfaces (computeIfAbsent, computeIfPresent, compute, merge, putIfAbsent, 
replace, remove)
  * are overridden to ensure atomicity within a segment.
  *
+ * <p><b>Callback restriction:</b> The mapping/remapping functions passed to 
{@code computeIfAbsent},
+ * {@code computeIfPresent}, {@code compute}, and {@code merge} <em>must 
not</em> attempt to update
+ * any other mappings of this map. This restriction is enforced at runtime: 
reentrant access from a
+ * callback throws {@link IllegalStateException}. Violation may also cause 
deadlock if callbacks
+ * attempt cross-segment updates from multiple threads.

Review Comment:
   The reentrancy guard only blocks *writes* from compute/merge callbacks. 
Callbacks can still call `get/containsKey/...` on other keys, which can 
deadlock across segments (e.g., T1 holds segA write-lock then reads segB; T2 
holds segB write-lock then reads segA). If the intent is to fail-fast instead 
of risking deadlock, consider also checking `inCallback` in read methods (or 
explicitly documenting that callbacks must not access the map at all, even for 
reads).
   ```suggestion
    * {@code computeIfPresent}, {@code compute}, and {@code merge} <em>must 
not</em> access this map
    * reentrantly in any way, including both updates and reads (for example 
{@code get},
    * {@code containsKey}, or any other operation on another key). Runtime 
checks reject reentrant
    * mutating access from a callback by throwing {@link 
IllegalStateException}; however, even
    * read-only cross-segment access from callbacks can deadlock when multiple 
threads hold locks on
    * different segments. Callbacks must therefore treat this map as off-limits 
for the duration of
    * the callback.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to