JerAguilon commented on code in PR #37839:
URL: https://github.com/apache/arrow/pull/37839#discussion_r1342077527
##########
cpp/src/arrow/acero/asof_join_node.cc:
##########
@@ -1679,6 +1704,15 @@ class AsofJoinNode : public ExecNode {
const Ordering& ordering() const override { return ordering_; }
Status InputReceived(ExecNode* input, ExecBatch batch) override {
+ // InputReceived may be called after execution was finished. Pushing it to
the
+ // InputState may cause the BackPressureController to pause the input,
causing a
+ // deadlock
Review Comment:
Yes, the `forceShutdown` is still necessary. there's nothing stopping this
order of events:
1. We receive enough data to finish the as of join.
2. Right before we finish processing and shut down the worker thread, lots
of batches come in from input A. Input A pauses
3. We shut down the thread, and input A can't be unpaused
Put another way, `forceShutdown` keeps us from deadlocking when we ingest
unneeded data before the worker thread exits. And this block keeps us from
deadlocking when we ingest unneeded data after the worker thread exits.
But your comment change suggestions sound good to me
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]