> 4) control is not returned to the processing thread

Yes, it looks like the current implementation does not return control
to the processing thread, but I think this is correct, or at least "as
designed".  The thread will be used to continue iterating the source.

> control is not returned to the processing thread, and instead
> blocks when
> marking the backpressure_future_ as finished.

As Sasha said, the call to "MarkFinished" will then run callbacks.
One of those callbacks (the only one in this case) then continues to
iterate from the source, doing the work that was originally started by
the call to StartProducing.

Generally, the code will only go so far before it creates a new thread
task and then control will eventually return.  However, if you are
running without an executor, then there are no thread tasks, all
callbacks run immediately in the thread calling mark finished, and it
can be rather hard to understand the logic.  I'll try and draw up some
sequence diagrams as an example for this and Li Jin's earlier question
regarding ordering.

> Anyway, I wouldn’t currently rely on PauseProducing/ResumeProducing 
> unfortunately. It is currently not tested anywhere as far as I can tell and 
> ignored by a lot of nodes (such as HashJoinNode).

We do test, and rely, on the PauseProducing/ResumeProducing mechanics
to implement back-pressure for the datasets API.  This limits plans to
scan->filter->project->sink and all of these nodes have been tested to
accurately work with backpressure.  I think you're free to experiment
with it.  I agree however, that backpressure could maybe be a more
minor concern until some of the scheduler improvements are available.


On Wed, Jul 20, 2022 at 3:13 PM Sasha Krassovsky
<krassovskysa...@gmail.com> wrote:
>
> Hi,
> Futures run callbacks on the thread that marks then as finished. It seems 
> that inside of the Source node’s generator loop does add a callback 
> (https://github.com/iChauster/arrow/blob/asof_join2/cpp/src/arrow/compute/exec/source_node.cc#L130
>  
> <https://github.com/iChauster/arrow/blob/asof_join2/cpp/src/arrow/compute/exec/source_node.cc#L130>)
>  which continues the loop. I’m not entirely sure myself how this code works 
> (this generator + control flow thing is very opaque), but my guess is that’s 
> what’s causing it.
>
> One further note, copying a Future<> actually maintains a reference to the 
> same underlying future, which may also be unexpected at first. Specifically 
> in your code, doing Future<> to_finish = backpressure_future_; 
> to_finish.MarkFinished(); is equivalent to just 
> backpressure_future_.MarkFinished().
>
> Anyway, I wouldn’t currently rely on PauseProducing/ResumeProducing 
> unfortunately. It is currently not tested anywhere as far as I can tell and 
> ignored by a lot of nodes (such as HashJoinNode). Michal and I have some work 
> in progress involving a new scheduler with first-class support for back 
> pressure.
>
> Sasha
>
> > On Jul 20, 2022, at 1:49 PM, Ivan Chau <ivan.m.c...@gmail.com> wrote:
> >
> > backpressure_future_
>

Reply via email to