felipecrv commented on PR #41232:
URL: https://github.com/apache/arrow/pull/41232#issuecomment-2062514844

   > Oh, another problem here is bool closed() is weird after this patch
   
   @mapleFU that weirdness is related to what I was referring to in the thread 
above. What `closed_` means after you called `CloseAsync`? It's not `false` 
(immediately) and even if the I/O thread sets it to `true` in the background, 
there is no guarantee the thread that called `CloseAsync` **can see** that 
mutation made by the other thread that called `Close`.
   
   I took a look at the history of `CloseAsync`. This implementation was added 
to fix hanging of the dataset loads.
   
   
https://github.com/apache/arrow/blob/73386cb806a429875f1846b69e21beccfd339b21/cpp/src/arrow/io/interfaces.cc#L70-L73
   
   It's called from `Future<> FileWriter::Finish`
   
   
https://github.com/apache/arrow/blob/73386cb806a429875f1846b69e21beccfd339b21/cpp/src/arrow/dataset/file_base.cc#L355-L360
   
   Note that `this` is captured by pointer in `CloseAsync` and relies on the 
caller keeping the shared_ptr alive until the `Future<>` completes.
   
   To fix this kind of problem, you added `enabled_shared_from_this` on 
`FileInterface`
   
   
https://github.com/apache/arrow/blob/73386cb806a429875f1846b69e21beccfd339b21/cpp/src/arrow/io/interfaces.h#L99
   
   A complement to that change to avoid potential user-after-free (UAF) in 
`FileInterface::CloseAsync`
   
   ```diff
   -   return DeferNotOk( 
   -       default_io_context().executor()->Submit([this]() { return Close(); 
})); 
   +   return DeferNotOk(default_io_context().executor()->Submit(
   +       [self = shared_from_this()]() { return self->Close(); }));
   ```
   
   Back to the data-race on `this->closed_` described above. I don't think it's 
common that after calling `CloseAsync`, the handle is used, but this would give 
us at least a runtime hint that we're doing something wrong (could be made 
debug-only).
   
   ```diff
     Future<> FileInterface::CloseAsync() {
   +   // To avoid data races, after CloseAsync is called, this instance should 
not
   +   // be used anymore. File handle implementations are not thread-safe -- 
they
   +   // are supposed to be used from a single thread. To catch *potential* 
misuse,
   +   // we log a WARNING if the caller still holds more than this reference 
to the file.
   +   auto use_count = weak_from_this().use_count();
   +   if (use_count > 1) {
   +     ARROW_LOG(WARNING) << "All shared_ptr references to the file should be 
"
   +                           "released before calling CloseAsync. At least "
   +                        << use_count - 1 << " references are still alive.";
   +   }
       return DeferNotOk(default_io_context().executor()->Submit(
           [self = shared_from_this()]() { return self->Close(); }));
     }
   ```
   
   @bkietz, what you think?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to