crepererum opened a new issue #1103:
URL: https://github.com/apache/arrow-datafusion/issues/1103


   **Describe the bug**
   Canceling futures in Rust is usually quite easy: just `drop` them. However, 
DataFusion has a few places where it uses `tokio::spawn` to create new, 
independent execution flows but does not wire up the `JoinHandle`s so that 
dropping the owning future/stream also drops the sub-flow.
   
   The sub-flow is usually connected using a 
`futures::channel::oneshot::channel`, so dropping the receiver side (the 
original future/stream) leads to an error on the sender side (the sub-flow). 
Furthermore, the sender side can test whether the channel was cancelled. 
However, this only works as long as the control flow on the sender side 
actually tries to send something. For long-running external IO (e.g. pulling 
data from an object store) or UDFs, this cancellation can come late, or never 
if the sub-flow never reaches a send.
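
   The missing wiring could look roughly like the guard below. This is only a 
sketch of the idea, not DataFusion code: `CancelOnDrop`, `Input` and 
`run_and_drop` are hypothetical names, and plain `std` threads with a 
cancellation flag stand in for tokio tasks so that the example has no 
dependencies. With tokio, the guard's `Drop` would presumably call 
`JoinHandle::abort` instead of setting a flag and joining.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical guard: owns the handle of a sub-flow and stops it on drop.
struct CancelOnDrop {
    cancelled: Arc<AtomicBool>,
    handle: Option<JoinHandle<()>>,
}

impl CancelOnDrop {
    /// Spawns `work` on its own thread and passes it a cancellation flag.
    fn spawn<F>(work: F) -> Self
    where
        F: FnOnce(Arc<AtomicBool>) + Send + 'static,
    {
        let cancelled = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&cancelled);
        let handle = thread::spawn(move || work(flag));
        CancelOnDrop { cancelled, handle: Some(handle) }
    }
}

impl Drop for CancelOnDrop {
    fn drop(&mut self) {
        // Signal the sub-flow, then wait until it has released its resources.
        self.cancelled.store(true, Ordering::SeqCst);
        if let Some(handle) = self.handle.take() {
            let _ = handle.join();
        }
    }
}

/// Stand-in for an input that never yields data; records when it is dropped.
struct Input {
    dropped: Arc<AtomicBool>,
}

impl Drop for Input {
    fn drop(&mut self) {
        self.dropped.store(true, Ordering::SeqCst);
    }
}

/// Starts a sub-flow that owns an `Input`, drops the guard, and reports
/// whether the input was dropped as a consequence.
fn run_and_drop() -> bool {
    let dropped = Arc::new(AtomicBool::new(false));
    let input = Input { dropped: Arc::clone(&dropped) };
    let guard = CancelOnDrop::spawn(move |cancelled| {
        let _input = input; // the sub-flow owns the input
        while !cancelled.load(Ordering::SeqCst) {
            thread::sleep(Duration::from_millis(1)); // "waiting for IO"
        }
    });
    drop(guard); // dropping the owner stops the sub-flow
    dropped.load(Ordering::SeqCst)
}
```

   Note that the flag is still cooperative: the sub-flow has to check it. 
That is the limitation described above, and it is why aborting the task from 
the outside is the more robust option in an async runtime.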
   
   **To Reproduce**
   1. Create an `ExecutionPlan` that emits streams that are `Pending` forever
   2. Use this plan as an input for `SortExec` (as an example; other operators 
also have this issue)
   3. Start collecting the output of `SortExec`; it will be `Pending`
   4. Drop the entire `SortExec`
   5. The input (the plan created in step 1) will never be dropped
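
   The steps above need an input that never completes and a way to observe 
whether it gets dropped. A minimal, dependency-free sketch of such a check 
follows. All names (`PendingForever`, `Consumer`, `poll_then_drop`) are 
hypothetical, and a plain `Future` stands in for a DataFusion stream. Here the 
consumer owns its input directly, so dropping the consumer drops the input. 
Based on the description above, the check would fail for operators that move 
the input into a task created by `tokio::spawn`.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical input that is `Pending` forever and records its own drop.
struct PendingForever {
    dropped: Arc<AtomicBool>,
}

impl Future for PendingForever {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        Poll::Pending
    }
}

impl Drop for PendingForever {
    fn drop(&mut self) {
        self.dropped.store(true, Ordering::SeqCst);
    }
}

/// Hypothetical consumer that owns its input and forwards `poll` to it.
struct Consumer<F> {
    input: F,
}

impl<F: Future + Unpin> Future for Consumer<F> {
    type Output = F::Output;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<F::Output> {
        Pin::new(&mut self.input).poll(cx)
    }
}

/// Waker that does nothing; enough to poll a future once by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls the consumer once, drops it, and returns
/// (was pending, input dropped before the drop, input dropped after it).
fn poll_then_drop() -> (bool, bool, bool) {
    let dropped = Arc::new(AtomicBool::new(false));
    let input = PendingForever { dropped: Arc::clone(&dropped) };
    let mut consumer = Consumer { input };

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let was_pending = Pin::new(&mut consumer).poll(&mut cx).is_pending();

    let before = dropped.load(Ordering::SeqCst);
    drop(consumer); // corresponds to step 4
    let after = dropped.load(Ordering::SeqCst);
    (was_pending, before, after)
}
```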
   
   **Expected behavior**
   The input should be dropped as well, at least after a short while, even 
when it is pending forever.
   
   **Additional context**
   This affects master commit `90a4c8435152841117c957a74ea4d046815d6c6a`.
   


