mapleFU commented on code in PR #40722:
URL: https://github.com/apache/arrow/pull/40722#discussion_r1546488445
##########
cpp/src/arrow/util/async_util.h:
##########
@@ -277,6 +278,8 @@ class ARROW_EXPORT ThrottledAsyncTaskScheduler : public AsyncTaskScheduler {
/// Allows task to be submitted again. If there is a max_concurrent_cost limit then
/// it will still apply.
virtual void Resume() = 0;
+ /// Return the number of tasks queued but not yet submitted
+ virtual std::size_t QueueSize() = 0;
Review Comment:
Would this be better as `std::size_t QueueSize() const`?
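To illustrate the suggestion, here is a minimal sketch of a const-qualified size accessor on an abstract interface. `ThrottleLike`, `SimpleThrottle`, `Enqueue`, and `PendingTasks` are hypothetical names invented for this example and are not Arrow's classes. The sketch also assumes the queue is guarded by a mutex, in which case the mutex has to be declared `mutable` so that the const method can lock it.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical stand-in for the scheduler interface (not Arrow's real class).
class ThrottleLike {
 public:
  virtual ~ThrottleLike() = default;
  virtual void Enqueue(int task_cost) = 0;
  // const-qualified: callable through a const reference or pointer.
  virtual std::size_t QueueSize() const = 0;
};

// Hypothetical implementation that stores queued task costs in a deque.
class SimpleThrottle : public ThrottleLike {
 public:
  void Enqueue(int task_cost) override {
    std::lock_guard<std::mutex> lock(mutex_);
    queue_.push_back(task_cost);
  }

  std::size_t QueueSize() const override {
    // Locking inside a const method only compiles because mutex_ is mutable.
    std::lock_guard<std::mutex> lock(mutex_);
    return queue_.size();
  }

 private:
  mutable std::mutex mutex_;
  std::deque<int> queue_;
};

// Takes a const reference; this compiles only if QueueSize() is const.
inline std::size_t PendingTasks(const ThrottleLike& throttle) {
  return throttle.QueueSize();
}
```

Without the `const` qualifier, `PendingTasks` would have to take a non-const reference even though it only reads state.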
##########
cpp/src/arrow/dataset/dataset_writer.cc:
##########
@@ -549,11 +566,14 @@ class DatasetWriter::DatasetWriterImpl {
WriteAndCheckBackpressure(std::move(batch), directory, prefix);
if (!has_room.is_finished()) {
// We don't have to worry about sequencing backpressure here since
- // task_group_ serves as our sequencer. If batches continue to arrive after
- // we pause they will queue up in task_group_ until we free up and call
- // Resume
+ // task_group_ serves as our sequencer. If batches continue to arrive
+ // after we pause they will queue up in task_group_ until we free up and
+ // call Resume
pause_callback_();
- return has_room.Then([this] { resume_callback_(); });
+ paused_ = true;
+ return has_room.Then([this] { ResumeIfNeeded(); });
+ } else {
+ ResumeIfNeeded();
}
Review Comment:
So is this because, if `has_room` is already finished while `paused_` is set,
`resume_callback_` would otherwise never be called?
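For reference, one possible reading of the pattern in this hunk, sketched as a standalone model. `BackpressureGate`, `AfterWrite`, and `OnRoomAvailable` are hypothetical names, the `bool has_room` parameter stands in for `has_room.is_finished()`, and the body of `ResumeIfNeeded` is an assumption (a guarded, idempotent resume), since the hunk does not show it.

```cpp
#include <functional>
#include <utility>

// Hypothetical model of the pause/resume bookkeeping; not the PR's code.
class BackpressureGate {
 public:
  BackpressureGate(std::function<void()> pause_cb, std::function<void()> resume_cb)
      : pause_callback_(std::move(pause_cb)),
        resume_callback_(std::move(resume_cb)) {}

  // Called after each write. `has_room` models has_room.is_finished().
  void AfterWrite(bool has_room) {
    if (!has_room) {
      pause_callback_();
      paused_ = true;
    } else {
      // Room is already available: undo an earlier pause, if there was one.
      ResumeIfNeeded();
    }
  }

  // Models the continuation attached with has_room.Then(...).
  void OnRoomAvailable() { ResumeIfNeeded(); }

  bool paused() const { return paused_; }

 private:
  // Assumed behavior: resume at most once per pause.
  void ResumeIfNeeded() {
    if (!paused_) return;
    paused_ = false;
    resume_callback_();
  }

  std::function<void()> pause_callback_;
  std::function<void()> resume_callback_;
  bool paused_ = false;
};
```

Under this reading, the `else` branch covers the case where a later write finds room while `paused_` is still set, and the guard keeps `resume_callback_` from firing twice if the continuation runs afterwards.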