mapleFU commented on code in PR #40722:
URL: https://github.com/apache/arrow/pull/40722#discussion_r1547138479
##########
cpp/src/arrow/dataset/dataset_writer.cc:
##########
@@ -549,11 +566,14 @@ class DatasetWriter::DatasetWriterImpl {
WriteAndCheckBackpressure(std::move(batch), directory, prefix);
if (!has_room.is_finished()) {
// We don't have to worry about sequencing backpressure here since
- // task_group_ serves as our sequencer. If batches continue to arrive after
- // we pause they will queue up in task_group_ until we free up and call
- // Resume
+ // task_group_ serves as our sequencer. If batches continue to arrive
+ // after we pause they will queue up in task_group_ until we free up and
+ // call Resume
pause_callback_();
- return has_room.Then([this] { resume_callback_(); });
+ paused_ = true;
+ return has_room.Then([this] { ResumeIfNeeded(); });
+ } else {
+ ResumeIfNeeded();
}
Review Comment:
Usually a mutex member is declared `mutable`. But this LGTM either way.
##########
cpp/src/arrow/util/async_util.h:
##########
@@ -277,6 +278,8 @@ class ARROW_EXPORT ThrottledAsyncTaskScheduler : public AsyncTaskScheduler {
/// Allows task to be submitted again. If there is a max_concurrent_cost limit then
/// it will still apply.
virtual void Resume() = 0;
+ /// Return the number of tasks queued but not yet submitted
+ virtual std::size_t QueueSize() = 0;
Review Comment:
Usually a mutex member is declared `mutable`. But this LGTM either way; I don't
have a strong preference.
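
For illustration, a minimal sketch of the `mutable` convention mentioned above. `TaskQueue`, `Push`, and the member names are hypothetical and are not the code in this PR; the sketch only shows why a mutex is often declared `mutable`: it lets a read-only accessor such as a queue-size getter be marked `const` while still taking the lock.

```cpp
#include <cstddef>
#include <mutex>
#include <queue>

// Hypothetical stand-in for a scheduler's internal queue; not Arrow's
// ThrottledAsyncTaskScheduler.
class TaskQueue {
 public:
  // Mutating operation: takes the lock and enqueues a task id.
  void Push(int task) {
    std::lock_guard<std::mutex> lock(mutex_);
    tasks_.push(task);
  }

  // Read-only accessor. It can be const because mutex_ is mutable, so
  // locking it does not count as modifying the object's logical state.
  std::size_t QueueSize() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return tasks_.size();
  }

 private:
  mutable std::mutex mutex_;  // mutable: lockable from const methods
  std::queue<int> tasks_;
};
```

Without `mutable`, `QueueSize()` would have to be non-const (as in the diff above) or the lock would not compile inside a const method. Both forms work, which fits the "no strong preference" note.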