pepijnve opened a new issue, #16193:
URL: https://github.com/apache/datafusion/issues/16193

   ### Describe the bug
   
   Canceling queries is done by dropping the corresponding `RecordBatchStream`. 
This can be done using tokio's `select!` macro, as can be seen in the CLI 
application's handling of ctrl-c. In order for this to work, the 
`RecordBatchStream` does need to return a pending poll result every now and then 
so that control returns to the tokio runtime. Typically this works fine thanks 
to the `run_input` logic in `CoalescePartitionsExec`.
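
As a rough illustration of why the pending result matters, here is a std-only sketch of a driver that checks a cancel flag between polls and drops the future when the flag is set. It is a stand-in for what `select!` achieves, not tokio's implementation; `NoopWake`, `ChunkedWork` and `run_until_cancelled` are hypothetical names that do not exist in DataFusion or tokio.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; the driver below simply re-polls in a loop.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical stand-in for a query stream: each poll does one unit of
/// work and returns `Pending` until `total` units are done.
struct ChunkedWork {
    done: Arc<AtomicUsize>,
    total: usize,
}

impl Future for ChunkedWork {
    type Output = usize;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<usize> {
        let n = self.done.fetch_add(1, Ordering::SeqCst) + 1;
        if n >= self.total {
            Poll::Ready(n)
        } else {
            cx.waker().wake_by_ref(); // ask to be polled again
            Poll::Pending // control returns to the driver here
        }
    }
}

/// Polls `fut` until it completes or `cancel` is set. The flag is only
/// checked between polls, so a future that never returns `Pending`
/// cannot be interrupted by this driver.
fn run_until_cancelled<F: Future>(fut: F, cancel: &AtomicBool) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if cancel.load(Ordering::SeqCst) {
            return None; // `fut` is dropped here, which cancels the work
        }
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return Some(out);
        }
    }
}
```

The point of the sketch is that the driver only regains control when `poll` returns. If `ChunkedWork` did all of its work inside a single `poll` call, the flag would never be observed before completion.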
   
   When setting the number of target partitions to 1 though, 
`CoalescePartitionsExec` is not emitted by the query planner. A simple 
aggregate query results in the following plan:
   
   ```
   > explain verbose select sum(size) from t;
   +---------------+-------------------------------+
   | plan_type     | plan                          |
   +---------------+-------------------------------+
   | physical_plan | ┌───────────────────────────┐ |
   |               | │       AggregateExec       │ |
   |               | │    --------------------   │ |
   |               | │     aggr: sum(t.size)     │ |
   |               | │        mode: Single       │ |
   |               | └─────────────┬─────────────┘ |
   |               | ┌─────────────┴─────────────┐ |
   |               | │       DataSourceExec      │ |
   |               | │    --------------------   │ |
   |               | │          files: 1         │ |
   |               | │        format: bff        │ |
   |               | └───────────────────────────┘ |
   |               |                               |
   +---------------+-------------------------------+
   ```
   
   Because the streams created by `AggregateExec` consume their input in a loop 
directly in their poll implementations rather than using a 
`RecordBatchReceiverStream` or something similar, calls to poll have a tendency 
to block for an extended period of time. This prevents interrupting query 
execution.
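
To make the difference concrete, here is a std-only sketch of an aggregate that consumes its input inside `poll`, with an optional per-poll budget. This only illustrates the general idea of yielding cooperatively and is not a description of how DataFusion does or should handle it; `SumFuture`, `budget` and `sum_with_poll_count` are hypothetical names, and `budget` is assumed to be at least 1.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; the driver below simply re-polls in a loop.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical aggregate: sums an input iterator inside `poll`.
/// `budget` caps how many items one `poll` call may consume;
/// `usize::MAX` models the unbounded loop described above.
struct SumFuture<I> {
    input: I,
    acc: u64,
    budget: usize,
}

impl<I: Iterator<Item = u64> + Unpin> Future for SumFuture<I> {
    type Output = u64;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u64> {
        let this = &mut *self;
        for _ in 0..this.budget {
            match this.input.next() {
                Some(v) => this.acc += v,
                None => return Poll::Ready(this.acc),
            }
        }
        // Budget exhausted: request another poll, then hand control back.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Drives `fut` to completion and reports how many polls it took.
/// Each extra poll is a point where a caller could have dropped `fut`.
fn sum_with_poll_count<F: Future<Output = u64>>(fut: F) -> (u64, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(sum) = fut.as_mut().poll(&mut cx) {
            return (sum, polls);
        }
    }
}
```

With `budget: usize::MAX` the whole input is consumed in a single `poll` call, so there is no opportunity to interrupt it. With a small budget the same sum is spread over several polls, at the cost of some extra scheduling overhead.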
   
   ### To Reproduce
   
   _No response_
   
   ### Expected behavior
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

