felipecrv commented on PR #41232: URL: https://github.com/apache/arrow/pull/41232#issuecomment-2062514844
> Oh, another problem here is bool closed() is weird after this patch @mapleFU that weirdness is related to what I was referring to in the thread above. What `closed_` means after you called `CloseAsync`? It's not `false` (immediately) and even if the I/O thread sets it to `true` in the background, there is no guarantee the thread that called `CloseAsync` **can see** that mutation made by the other thread that called `Close`. I took a look at the history of `CloseAsync`. This implementation was added to fix hanging of the dataset loads. https://github.com/apache/arrow/blob/73386cb806a429875f1846b69e21beccfd339b21/cpp/src/arrow/io/interfaces.cc#L70-L73 It's called from `Future<> FileWriter::Finish` https://github.com/apache/arrow/blob/73386cb806a429875f1846b69e21beccfd339b21/cpp/src/arrow/dataset/file_base.cc#L355-L360 Note that `this` is captured by pointer in `CloseAsync` and relies on the caller keeping the shared_ptr alive until the `Future<>` completes. To fix this kind of problem, you added `enabled_shared_from_this` on `FileInterface` https://github.com/apache/arrow/blob/73386cb806a429875f1846b69e21beccfd339b21/cpp/src/arrow/io/interfaces.h#L99 A complement to that change to avoid potential user-after-free (UAF) in `FileInterface::CloseAsync` ```diff - return DeferNotOk( - default_io_context().executor()->Submit([this]() { return Close(); })); + return DeferNotOk(default_io_context().executor()->Submit( + [self = shared_from_this()]() { return self->Close(); })); ``` Back to the data-race on `this->closed_` described above. I don't think it's common that after calling `CloseAsync`, the handle is used, but this would give us at least a runtime hint that we're doing something wrong (could be made debug-only). ```diff Future<> FileInterface::CloseAsync() { + // To avoid data races, after CloseAsync is called, this instance should not + // be used anymore. File handle implementations are not thread-safe -- they + // are supposed to be used from a single thread. To catch *potential* misuse, + // we log a WARNING if the caller still holds more than this reference to the file. + auto use_count = weak_from_this().use_count(); + if (use_count > 1) { + ARROW_LOG(WARNING) << "All shared_ptr references to the file should be " + "released before calling CloseAsync. At least " + << use_count - 1 << " references are still alive."; + } return DeferNotOk(default_io_context().executor()->Submit( [self = shared_from_this()]() { return self->Close(); })); } ``` @bkietz, what you think? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
