alamb commented on code in PR #12302:
URL: https://github.com/apache/datafusion/pull/12302#discussion_r1742637563
##########
datafusion/physical-plan/src/sorts/merge.rs:
##########
@@ -97,6 +100,10 @@ pub(crate) struct SortPreservingMergeStream<C: CursorValues> {
/// number of rows produced
produced: usize,
+
+ /// Uninitiated partitions. They are stored in a vector to keep them in
Review Comment:
Can you please document what an "uninitiated partition" means in this
context? I think it means partitions whose streams have been polled but
have not yet returned `Ready`.
##########
datafusion/physical-plan/src/sorts/merge.rs:
##########
@@ -156,12 +164,22 @@ impl<C: CursorValues> SortPreservingMergeStream<C> {
}
// try to initialize the loser tree
if self.loser_tree.is_empty() {
- // Ensure all non-exhausted streams have a cursor from which
- // rows can be pulled
- for i in 0..self.streams.partitions() {
- if let Err(e) = ready!(self.maybe_poll_stream(cx, i)) {
Review Comment:
It looks to me like the issue is that the `ready!` macro `return`s if the
poll isn't ready -- and thus the other streams aren't polled.
I wonder if simply changing this to not use `ready!` would work, and be a
smaller change?
Like could you change this to be the following instead? I think that would
ensure each stream is polled
```rust
if let Poll::Ready(Err(e)) = self.maybe_poll_stream(cx, i) {
```
##########
datafusion/physical-plan/src/sorts/merge.rs:
##########
@@ -156,12 +164,22 @@ impl<C: CursorValues> SortPreservingMergeStream<C> {
}
// try to initialize the loser tree
if self.loser_tree.is_empty() {
- // Ensure all non-exhausted streams have a cursor from which
- // rows can be pulled
- for i in 0..self.streams.partitions() {
- if let Err(e) = ready!(self.maybe_poll_stream(cx, i)) {
- self.aborted = true;
- return Poll::Ready(Some(Err(e)));
+ // Ensure all non-exhausted streams have a cursor from which rows can be pulled
Review Comment:
This comment implies to me that the code would / should `poll` *all* the
streams. However, the code seems to ensure now that only streams that had
previously not returned `Ready` for a poll are now polled.
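To make the difference concrete, here is a minimal sketch using only `std`. `FakePartition`, `poll_init`, `init_with_ready`, and `init_poll_all` are hypothetical names made up for illustration, not DataFusion APIs. The only assumption taken from the diff is that polling a partition yields a `Poll<Result<..>>`, as `maybe_poll_stream` appears to.

```rust
use std::sync::Arc;
use std::task::{ready, Context, Poll, Wake, Waker};

/// Hypothetical stand-in for one input partition: it reports `Pending`
/// for its first `pending_polls` polls and counts how often it is polled.
struct FakePartition {
    pending_polls: usize,
    polls: usize,
}

impl FakePartition {
    fn new(pending_polls: usize) -> Self {
        Self { pending_polls, polls: 0 }
    }

    fn poll_init(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), String>> {
        self.polls += 1;
        if self.polls <= self.pending_polls {
            Poll::Pending
        } else {
            Poll::Ready(Ok(()))
        }
    }
}

/// Early-return variant: `ready!` returns `Poll::Pending` from the whole
/// function at the first partition that is not ready, so later partitions
/// are not polled in that call.
fn init_with_ready(
    parts: &mut [FakePartition],
    cx: &mut Context<'_>,
) -> Poll<Result<(), String>> {
    for p in parts.iter_mut() {
        if let Err(e) = ready!(p.poll_init(cx)) {
            return Poll::Ready(Err(e));
        }
    }
    Poll::Ready(Ok(()))
}

/// Poll-everything variant: every partition is polled on each call, and
/// `Pending` is reported only after the loop has visited all of them.
fn init_poll_all(
    parts: &mut [FakePartition],
    cx: &mut Context<'_>,
) -> Poll<Result<(), String>> {
    let mut any_pending = false;
    for p in parts.iter_mut() {
        match p.poll_init(cx) {
            Poll::Ready(Err(e)) => return Poll::Ready(Err(e)),
            Poll::Ready(Ok(())) => {}
            Poll::Pending => any_pending = true,
        }
    }
    if any_pending {
        Poll::Pending
    } else {
        Poll::Ready(Ok(()))
    }
}

/// Number of times each partition has been polled so far.
fn poll_counts(parts: &[FakePartition]) -> Vec<usize> {
    parts.iter().map(|p| p.polls).collect()
}

/// Waker that does nothing; enough to build a `Context` for manual polling.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn main() {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    // Partitions 0 and 2 need one extra poll; partition 1 is ready at once.
    let mut a = vec![FakePartition::new(1), FakePartition::new(0), FakePartition::new(1)];
    let res = init_with_ready(&mut a, &mut cx);
    println!("ready!:   {:?}, polls = {:?}", res, poll_counts(&a));

    let mut b = vec![FakePartition::new(1), FakePartition::new(0), FakePartition::new(1)];
    let res = init_poll_all(&mut b, &mut cx);
    println!("poll all: {:?}, polls = {:?}", res, poll_counts(&b));
}
```

In this sketch the `ready!` variant stops after polling the first partition, while the other variant polls all three before reporting `Pending`. It re-polls every partition on each call, which is only reasonable if polling an already-initialized partition is cheap; tracking just the not-yet-ready partitions, as the PR seems to do, avoids that at the cost of extra state.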
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.