fresh-borzoni commented on PR #145:
URL: https://github.com/apache/fluss-rust/pull/145#issuecomment-3735044414

     Made more thorough audit:
     
     ### Fix 1: `ready()` (lines 173-212) - DashMap iterator across await
     The DashMap iterator is held while calling `bucket_ready().await`. Since 
`bucket_ready()` internally awaits on Mutex locks, the DashMap shard guard 
stays alive across async suspension points, blocking concurrent operations on 
the same shard.
   
     **Fix:** Snapshot the data into `Vec<(TablePath, Vec<(BucketId, 
Arc<Mutex<...>>)>)>` first, then iterate and await on the snapshot after all 
DashMap guards are dropped.
   
     ### Fix 2: `bucket_ready()` (lines 214-244) - signature change to work 
with snapshot
     Changed from taking `&BucketAndWriteBatches` (which borrows from DashMap 
entry) to taking an owned `Vec<(BucketId, Arc<Mutex<...>>)>` snapshot. This 
ensures no DashMap references survive to the await points inside the method.
   
     ### Fix 3: `drain_batches_for_one_node()` (lines 290-365) - Mutex held 
across loop
     The `nodes_drain_index` Mutex is locked once at the start and held 
throughout the entire loop, including across `deque.lock().await` calls. This 
unnecessarily serializes all drain operations (even for different nodes) and 
creates lock ordering issues.
   
     **Fix:** Split into three short critical sections: (1) lock and read start 
index, (2) do drain loop with no lock, (3) lock and write back final index.
   
     ### Fix 4: `await_flush_completion()` (lines 437-444) - RwLock guard 
across await
     The RwLock read guard on `incomplete_batches` is held while awaiting on 
result handles. This blocks any concurrent `remove_incomplete_batches()` writes 
for the entire flush duration, causing writer starvation or potential deadlock.
   
     **Fix:** Clone handles into a Vec before awaiting - same pattern as fixes 
above.
   
     All four fixes follow the "snapshot/clone Arc, drop guard, then await" 
pattern established in this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to