bert-beyondloops commented on code in PR #19785:
URL: https://github.com/apache/datafusion/pull/19785#discussion_r2690473559
##########
datafusion/physical-plan/src/filter.rs:
##########
@@ -767,10 +749,27 @@ impl Stream for FilterExecStream {
mut self: Pin<&mut Self>,
cx: &mut Context<'_>,
) -> Poll<Option<Self::Item>> {
- let poll;
let elapsed_compute =
self.metrics.baseline_metrics.elapsed_compute().clone();
loop {
+ // If there is a completed batch ready, return it
+ if let Some(batch) = self.batch_coalescer.next_completed_batch() {
+ self.metrics.selectivity.add_part(batch.num_rows());
+ let poll = Poll::Ready(Some(Ok(batch)));
+ return self.metrics.baseline_metrics.record_poll(poll);
+ }
+
+ if self.batch_coalescer.is_finished() {
+ // If input is done and no batches are ready, return None to signal end of stream.
+ let poll = Poll::Ready(None);
+ return self.metrics.baseline_metrics.record_poll(poll);
Review Comment:
Indeed, this wasn't consistent. But honestly, you only know this (that it
records information only when there is a batch) by looking at the internals
of `record_poll`.
In my opinion, `record_poll` should be called on every return path, but that
is a separate discussion :-)
I'll adapt.
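
To illustrate the "call it everywhere" idea, here is a minimal toy sketch. `ToyMetrics`, `ToyPoll`, and `poll_once` are hypothetical stand-ins invented for this example, not DataFusion types, and the toy `record_poll` only models the behavior described above (row counts change only when a batch is ready). The real `BaselineMetrics::record_poll` may do more, so check its source before relying on this.

```rust
use std::task::Poll;

// Hypothetical stand-in for Poll<Option<Result<RecordBatch>>>:
// a "batch" is just a Vec<i32>, an error is a String.
type ToyPoll = Poll<Option<Result<Vec<i32>, String>>>;

// Toy metrics recorder (an assumption for illustration only).
#[derive(Default)]
struct ToyMetrics {
    output_rows: usize,
    polls_seen: usize,
}

impl ToyMetrics {
    // Passes the poll through unchanged. Row counts are updated only
    // when a batch is ready, so calling this on the other return paths
    // does not distort the row metric.
    fn record_poll(&mut self, poll: ToyPoll) -> ToyPoll {
        self.polls_seen += 1;
        if let Poll::Ready(Some(Ok(batch))) = &poll {
            self.output_rows += batch.len();
        }
        poll
    }
}

// Every return path goes through record_poll, so a reader does not need
// to know the recorder's internals to see that the paths are consistent.
fn poll_once(metrics: &mut ToyMetrics, ready: Option<Vec<i32>>, finished: bool) -> ToyPoll {
    if let Some(batch) = ready {
        return metrics.record_poll(Poll::Ready(Some(Ok(batch))));
    }
    if finished {
        return metrics.record_poll(Poll::Ready(None));
    }
    metrics.record_poll(Poll::Pending)
}
```

With this shape, the caller never has to decide whether a given path "needs" recording; the recorder makes that decision in one place.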
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]