westonpace commented on a change in pull request #9941:
URL: https://github.com/apache/arrow/pull/9941#discussion_r609956429
##########
File path: cpp/src/arrow/util/async_generator.h
##########
@@ -1247,10 +1286,23 @@ class BackgroundGenerator {
waiting_future.MarkFinished(next);
}
}
+ {
+ auto guard = state->mutex.Lock();
+ state->running_at_all = false;
+ if (state->finished) {
+ state->task_finished.MarkFinished();
+ }
+ }
}
};
std::shared_ptr<State> state_;
+ // state_ is held by both the generator and the background thread so it
won't be cleaned
+ // up when all consumer references are relinquished. cleanup_ is only held
by the
+ // generator so it will be destructed when the last consumer reference is
gone. We use
+ // this to cleanup / stop the background generator in case the consuming end
stops
+ // listening (e.g. due to a downstream error)
+ std::shared_ptr<Cleanup> cleanup_;
Review comment:
I started down this route and ran into a bit of difficulty. The
challenge is that I'd want to release the weak pointer while I do slow
operations (otherwise I'm back at the original problem where the background
thread keeps the state alive forever). So right now the code look something
like...
```
std::shared_ptr<State> current_state = state_.lock();
auto next = current_state->it.Next();
```
I could do something like...
```
std::shared_ptr<State> current_state = state_.lock();
auto& it = current_state->it;
current_state.reset();
auto next = it.Next();
```
...and it would actually be valid. The state destructor would be waiting on
task_finished before it returns so even though state might be in the middle of
destruction it would not complete destruction until later.
In other words, when I realize I need to abandon I can't immediately delete
state. I have to signal the background worker to terminate immediately, wait
for it to terminate, and then delete state.
So it works but I feel like the resulting code would be even harder to
reason about because it would contain things that look invalid (accessing
references obtained off a pointer after I give up the lifecycle tracking for
that pointer) even though they just so happen to be valid. That would
complicate things for future readers I feel.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]