neilconway commented on code in PR #20672:
URL: https://github.com/apache/datafusion/pull/20672#discussion_r2884372912


##########
datafusion/physical-plan/src/spill/spill_pool.rs:
##########
@@ -1343,6 +1368,88 @@ mod tests {
         Ok(())
     }
 
+    /// Regression test for data loss when multiple writer clones are used.
+    ///
+    /// `SpillPoolWriter` is `Clone`, and in non-preserve-order repartitioning
+    /// mode all input partition tasks share clones of the same writer. Before
+    /// the fix, `Drop` unconditionally set `writer_dropped = true` even when
+    /// other clones were still alive. This caused the `SpillPoolReader` to
+    /// return EOF prematurely, silently losing every batch written by the
+    /// remaining writers.

Review Comment:
   [ Nit, optional ] I personally try to avoid comments that are very specific 
to the previous (buggy) implementation, because that implementation won't exist 
for future readers, and the current / future state of the code might drift 
without the comments being updated. So I'd personally remove references to "the 
bug" or the exact previous behavior, and talk more about what a correct 
implementation ought to do and the properties of a correct implementation that 
we're checking for.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to