github-actions[bot] commented on code in PR #64779:
URL: https://github.com/apache/doris/pull/64779#discussion_r3465629486
##########
be/src/io/fs/s3_file_writer.cpp:
##########
@@ -152,13 +152,35 @@ Status S3FileWriter::close(bool non_block) {
_async_close_pack = std::make_unique<AsyncCloseStatusPack>();
_async_close_pack->future = _async_close_pack->promise.get_future();
s3_file_writer_async_close_queuing << 1;
- return
ExecEnv::GetInstance()->non_block_close_thread_pool()->submit_func([&]() {
+ Status submit_status = Status::OK();
+ DBUG_EXECUTE_IF("S3FileWriter.close.submit_async_close.inject_error", {
+ submit_status =
Status::IOError("S3FileWriter.close.submit_async_close.inject_error");
+ });
+ if (submit_status.ok()) {
+ submit_status =
+
ExecEnv::GetInstance()->non_block_close_thread_pool()->submit_func([&]() {
+ s3_file_writer_async_close_queuing << -1;
+ s3_file_writer_async_close_processing << 1;
+ _st = _close_impl();
+ _async_close_pack->promise.set_value(_st);
+ s3_file_writer_async_close_processing << -1;
+ });
+ }
+ if (!submit_status.ok()) {
Review Comment:
Blocking: this fallback assumes that a non-OK result from `submit_func()`
means the close lambda was not retained. That is not guaranteed by
`ThreadPool::do_submit`: it moves the runnable into `token->_entries`, may
enqueue the token, and increments `_total_queued_tasks` before calling
`create_thread()`. If that lazy thread creation fails while the pool has no
workers, `do_submit` returns the error without removing the queued task. This
path is reachable because thread-pool initialization deliberately ignores
min-thread creation failures.
Here the fallback clears `_async_close_pack`, restores the state, and runs
`_close_impl()` synchronously. If a later submit creates a worker, the
previously queued lambda can still run, capture this writer, and dereference
the now-null `_async_close_pack` or race with/destruct after the synchronous
close. The new `_submit_upload_buffer()` helper has the same assumption when it
calls `_complete_part_task_callback(st)` after a failed submit even though a
retained `UploadFileBuffer` can still later execute callbacks that capture this
writer.
Please either make the submit path used here guarantee that non-OK means the
runnable was not retained, for example by rolling back the queued task on
post-enqueue worker-creation failure, or use a different API/contract before
manually completing the task and falling back to synchronous close.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]