felipecrv commented on code in PR #39904:
URL: https://github.com/apache/arrow/pull/39904#discussion_r1478618063
##########
cpp/src/arrow/filesystem/azurefs.cc:
##########
@@ -942,6 +972,183 @@ FileInfo FileInfoFromBlob(std::string_view container,
   return info;
 }
 
+/// \brief RAII-style guard for releasing a lease on a blob or container.
+///
+/// The guard should be constructed right after a successful BlobLeaseClient::Acquire()
+/// call. Use std::optional<LeaseGuard> to declare a guard in outer scope and construct it
+/// later with std::optional::emplace(...).
+///
+/// Leases expire automatically, but explicit release means concurrent clients or
+/// ourselves when trying new operations on the same blob or container don't have
+/// to wait for the lease to expire by itself.
+///
+/// Learn more about leases at
+/// https://learn.microsoft.com/en-us/rest/api/storageservices/lease-blob
+class LeaseGuard {
+ public:
+  using SteadyClock = std::chrono::steady_clock;
+
+ private:
+  /// \brief The time when the lease expires or is broken.
+  ///
+  /// The lease is not guaranteed to be valid until this time, but it is guaranteed to
+  /// be expired after this time. In other words, this is an overestimation of
+  /// the true time_point.
+  SteadyClock::time_point break_or_expires_at_;
+  const std::unique_ptr<Blobs::BlobLeaseClient> lease_client_;
+  bool release_attempt_pending_ = true;
+
+  /// \brief The latest known expiry time of a lease guarded by this class
+  /// that failed to be released or was forgotten by calling Forget().
+  static std::atomic<SteadyClock::time_point> latest_known_expiry_time_;
+
+  /// \brief The maximum lease duration supported by Azure Storage.
+  static constexpr std::chrono::seconds kMaxLeaseDuration{60};
+
+ public:
+  LeaseGuard(std::unique_ptr<Blobs::BlobLeaseClient> lease_client,

Review Comment:
What do you mean by "consumer" and "more internally" here? The Arrow implementation is internal compared to the software using Arrow.
Leases are a common distributed-systems pattern [1], and the multi-step operations performed here would have nearly unpredictable outcomes in the presence of concurrent clients. Without concurrent mutators, leases are very cheap (they cause no delays at all); with concurrent mutators, they lead to outcomes that we and users can reason about. Note that I often use the lease acquisition as an existence check that I would have to do anyway.

[1] https://martinfowler.com/articles/patterns-of-distributed-systems/lease.html
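
To make the pattern concrete, here is a minimal, self-contained sketch of an RAII lease guard used around a multi-step operation. It is not the code in this PR: `FakeLeaseClient` and `MutateUnderLease` are hypothetical names invented for illustration, the fake client only flips an in-memory flag instead of talking to Azure Storage, and expiry-time tracking is left out entirely.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for a lease client. This is NOT the Azure SDK's
// BlobLeaseClient; it only toggles an in-memory "leased" flag.
struct FakeLeaseClient {
  bool* leased;  // the pretend server-side lease state of one blob

  // Succeeds only if nobody else currently holds the lease.
  bool Acquire() {
    if (*leased) return false;
    *leased = true;
    return true;
  }
  void Release() { *leased = false; }
};

// RAII guard: releases the lease on scope exit unless Forget() was called.
class LeaseGuard {
 public:
  explicit LeaseGuard(FakeLeaseClient client) : client_(client) {
    // The guard is meant to be constructed right after a successful Acquire().
    assert(*client_.leased);
  }
  LeaseGuard(const LeaseGuard&) = delete;
  LeaseGuard& operator=(const LeaseGuard&) = delete;
  ~LeaseGuard() {
    if (release_pending_) client_.Release();
  }

  // Stop tracking the lease; in a real system it would expire by itself.
  void Forget() { release_pending_ = false; }

 private:
  FakeLeaseClient client_;
  bool release_pending_ = true;
};

// Hypothetical multi-step operation guarded by a lease.
// Returns false if a concurrent client already holds the lease.
bool MutateUnderLease(bool& leased) {
  std::optional<LeaseGuard> guard;  // declared in the outer scope, still empty
  FakeLeaseClient client{&leased};
  if (!client.Acquire()) return false;  // contention detected up front
  guard.emplace(client);                // constructed right after Acquire()
  // ... the multi-step mutation would run here, protected by the lease ...
  return true;  // the guard's destructor releases the lease on every exit path
}
```

Because the release happens in the destructor, every early return after `emplace` releases the lease, so later operations on the same blob do not have to wait for it to expire.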
