alamb commented on code in PR #12302:
URL: https://github.com/apache/datafusion/pull/12302#discussion_r1742637563


##########
datafusion/physical-plan/src/sorts/merge.rs:
##########
@@ -97,6 +100,10 @@ pub(crate) struct SortPreservingMergeStream<C: CursorValues> {
 
     /// number of rows produced
     produced: usize,
+
+    /// Unitiated partitions. They are stored in a vector to keep them in

Review Comment:
   Can you please document what an "uninitiated partition" means in this 
context? I think it means partitions whose streams have been polled but have 
not yet returned `Ready`.



##########
datafusion/physical-plan/src/sorts/merge.rs:
##########
@@ -156,12 +164,22 @@ impl<C: CursorValues> SortPreservingMergeStream<C> {
         }
         // try to initialize the loser tree
         if self.loser_tree.is_empty() {
-            // Ensure all non-exhausted streams have a cursor from which
-            // rows can be pulled
-            for i in 0..self.streams.partitions() {
-                if let Err(e) = ready!(self.maybe_poll_stream(cx, i)) {

Review Comment:
   It looks to me like the issue is that the `ready!` macro `return`s if the 
poll isn't ready -- and thus the other streams aren't polled. 
   
   I wonder if simply changing this to not use `ready!` would work, and be a 
smaller change?
   
   Like could you change this to be the following instead? I think that would 
ensure each stream is polled
   
   ```rust
   if let Poll::Ready(Err(e)) = self.maybe_poll_stream(cx, i) {
   ```
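
To illustrate the difference, here is a minimal, self-contained sketch. It is not the DataFusion code: `Source`, `poll_with_early_return`, and `poll_all` are hypothetical names, and the sketch ignores the `Context`/waker plumbing that a real stream needs. It only shows that `ready!` returns at the first `Pending`, while matching on the `Poll` value lets the loop reach every input.

```rust
use std::task::{ready, Poll};

/// Hypothetical stand-in for one input partition: it reports `Pending` for a
/// fixed number of polls, then `Ready(Ok(()))`, and counts how often it is polled.
struct Source {
    pending_polls_left: usize,
    times_polled: usize,
}

impl Source {
    fn new(pending_polls_left: usize) -> Self {
        Self { pending_polls_left, times_polled: 0 }
    }

    fn poll(&mut self) -> Poll<Result<(), String>> {
        self.times_polled += 1;
        if self.pending_polls_left > 0 {
            self.pending_polls_left -= 1;
            Poll::Pending
        } else {
            Poll::Ready(Ok(()))
        }
    }
}

/// Uses `ready!`: the function returns at the first `Pending`,
/// so the sources after it are not polled in this call.
fn poll_with_early_return(sources: &mut [Source]) -> Poll<Result<(), String>> {
    for source in sources.iter_mut() {
        if let Err(e) = ready!(source.poll()) {
            return Poll::Ready(Err(e));
        }
    }
    Poll::Ready(Ok(()))
}

/// Polls every source once and reports `Pending` only after the loop,
/// if at least one source was not ready. Errors still return immediately.
fn poll_all(sources: &mut [Source]) -> Poll<Result<(), String>> {
    let mut any_pending = false;
    for source in sources.iter_mut() {
        match source.poll() {
            Poll::Ready(Err(e)) => return Poll::Ready(Err(e)),
            Poll::Ready(Ok(())) => {}
            Poll::Pending => any_pending = true,
        }
    }
    if any_pending {
        Poll::Pending
    } else {
        Poll::Ready(Ok(()))
    }
}
```

With three sources where only the first is initially pending, `poll_with_early_return` never touches the second and third, whereas `poll_all` polls each of them once before returning `Pending`.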



##########
datafusion/physical-plan/src/sorts/merge.rs:
##########
@@ -156,12 +164,22 @@ impl<C: CursorValues> SortPreservingMergeStream<C> {
         }
         // try to initialize the loser tree
         if self.loser_tree.is_empty() {
-            // Ensure all non-exhausted streams have a cursor from which
-            // rows can be pulled
-            for i in 0..self.streams.partitions() {
-                if let Err(e) = ready!(self.maybe_poll_stream(cx, i)) {
-                    self.aborted = true;
-                    return Poll::Ready(Some(Err(e)));
+            // Ensure all non-exhausted streams have a cursor from which rows can be pulled

Review Comment:
   This comment implies to me that the code would / should `poll` *all* the 
streams. However, the code now seems to poll only the streams that had not 
previously returned `Ready`. 
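
For reference, the behavior I am describing looks roughly like the sketch below. This is my reading of the approach, not the code in the PR: `poll_uninitiated` and the `poll_partition` callback are hypothetical names, and using `Vec::remove` to preserve order is an assumption.

```rust
use std::task::Poll;

/// Polls only the partitions still listed in `uninitiated`.
/// A partition is dropped from the list once its poll returns `Ready(Ok(()))`;
/// partitions that return `Pending` stay in the list for the next call.
fn poll_uninitiated<F>(
    uninitiated: &mut Vec<usize>,
    mut poll_partition: F,
) -> Poll<Result<(), String>>
where
    F: FnMut(usize) -> Poll<Result<(), String>>,
{
    let mut idx = 0;
    while idx < uninitiated.len() {
        match poll_partition(uninitiated[idx]) {
            // Propagate the first error immediately.
            Poll::Ready(Err(e)) => return Poll::Ready(Err(e)),
            // This partition is initialized: remove it and keep `idx` in place,
            // because the next element has shifted into this position.
            Poll::Ready(Ok(())) => {
                uninitiated.remove(idx);
            }
            // Not ready yet: leave it in the list and move on.
            Poll::Pending => idx += 1,
        }
    }
    if uninitiated.is_empty() {
        Poll::Ready(Ok(()))
    } else {
        Poll::Pending
    }
}
```

If that reading is right, the code comment should say that only the not-yet-ready partitions are polled, rather than all non-exhausted streams.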
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

