pitrou commented on issue #36523:
URL: https://github.com/apache/arrow/issues/36523#issuecomment-1624892343

   > Unfortunately, I think there is actually a potential deadlock here, just 
not the one that TSAN is reporting. As part of HandleUploadOutcome it seems we 
are first locking the upload state and then locking the client holder (M2 then 
M0). As part of UploadPart we are first locking the client holder and then 
locking the upload state (M0 then M2). 
   
   A reasonable fix is to lock the client holder only as long as necessary. I 
should have done that from the first start:
   ```diff
   diff --git a/cpp/src/arrow/filesystem/s3fs.cc 
b/cpp/src/arrow/filesystem/s3fs.cc
   index f05d89963c..79b8716017 100644
   --- a/cpp/src/arrow/filesystem/s3fs.cc
   +++ b/cpp/src/arrow/filesystem/s3fs.cc
   @@ -1527,8 +1527,6 @@ class ObjectOutputStream final : public 
io::OutputStream {
    
      Status UploadPart(const void* data, int64_t nbytes,
                        std::shared_ptr<Buffer> owned_buffer = nullptr) {
   -    ARROW_ASSIGN_OR_RAISE(auto client_lock, holder_->Lock());
   -
        S3Model::UploadPartRequest req;
        req.SetBucket(ToAwsString(path_.bucket));
        req.SetKey(ToAwsString(path_.key));
   @@ -1538,7 +1536,8 @@ class ObjectOutputStream final : public 
io::OutputStream {
    
        if (!background_writes_) {
          req.SetBody(std::make_shared<StringViewStream>(data, nbytes));
   -      auto outcome = client_lock->UploadPart(req);
   +      ARROW_ASSIGN_OR_RAISE(auto client_lock, holder_->Lock());
   +      auto outcome = std::move(client_lock)->UploadPart(req);
          if (!outcome.IsSuccess()) {
            return UploadPartError(req, outcome);
          } else {
   @@ -1562,12 +1561,13 @@ class ObjectOutputStream final : public 
io::OutputStream {
              upload_state_->pending_parts_completed = Future<>::Make();
            }
          }
   +      ARROW_ASSIGN_OR_RAISE(auto client_lock, holder_->Lock());
          // XXX This callback returns Aws::Utils::Outcome, it cannot easily 
call
          // `holder->Lock()` which returns arrow::Result.
          ARROW_ASSIGN_OR_RAISE(
              auto fut,
              SubmitIO(io_context_, [client_lock = std::move(client_lock), 
req]() mutable {
   -            return client_lock->UploadPart(req);
   +            return std::move(client_lock)->UploadPart(req);
              }));
          // The closure keeps the buffer and the upload state alive
          auto state = upload_state_;
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to