westonpace commented on a change in pull request #11977:
URL: https://github.com/apache/arrow/pull/11977#discussion_r770943221
##########
File path: cpp/src/arrow/filesystem/s3fs.cc
##########
@@ -1732,44 +1752,43 @@ class S3FileSystem::Impl : public
std::enable_shared_from_this<S3FileSystem::Imp
auto outcome = client_->HeadObject(req);
if (outcome.IsSuccess()) {
- *out = true;
- return Status::OK();
+ return true;
}
if (!backend_) {
SaveBackend(outcome.GetError());
Review comment:
Is `SaveBackend` an expensive operation? If we move this check up above
`if (backend_ && *backend_ == S3Backend::Minio)` it seems we could avoid
potentially recursing by always getting it right the first time. But I might
be missing something.
##########
File path: cpp/src/arrow/filesystem/s3fs.cc
##########
@@ -1720,7 +1720,27 @@ class S3FileSystem::Impl : public
std::enable_shared_from_this<S3FileSystem::Imp
// a non-empty "directory". This is a Minio-specific quirk, but we need
// to handle it for unit testing.
- Status IsEmptyDirectory(const std::string& bucket, const std::string& key,
bool* out) {
+ // If this method is called after HEAD on "bucket/key" already returned a
404,
+ // can pass the given outcome to spare a spurious HEAD call.
+ Result<bool> IsEmptyDirectory(
+ const std::string& bucket, const std::string& key,
+ const S3Model::HeadObjectOutcome* previous_outcome = nullptr) {
+ if (previous_outcome) {
+ // Fetch the backend from the previous error
+ DCHECK(!previous_outcome->IsSuccess());
+ if (!backend_) {
+ SaveBackend(previous_outcome->GetError());
+ DCHECK(backend_);
+ }
+ if (backend_ != S3Backend::Minio) {
+ // HEAD already returned a 404, nothing more to do
+ return false;
+ }
+ }
+
+ // We come here in one of two situations:
+ // - we don't know the backend and there is no previous outcome
Review comment:
```suggestion
// - there is no previous outcome
```
##########
File path: cpp/src/arrow/filesystem/s3fs.cc
##########
@@ -1720,7 +1720,27 @@ class S3FileSystem::Impl : public
std::enable_shared_from_this<S3FileSystem::Imp
// a non-empty "directory". This is a Minio-specific quirk, but we need
// to handle it for unit testing.
- Status IsEmptyDirectory(const std::string& bucket, const std::string& key,
bool* out) {
+ // If this method is called after HEAD on "bucket/key" already returned a
404,
+ // can pass the given outcome to spare a spurious HEAD call.
+ Result<bool> IsEmptyDirectory(
+ const std::string& bucket, const std::string& key,
+ const S3Model::HeadObjectOutcome* previous_outcome = nullptr) {
Review comment:
I don't know if this is one we enforce or not but the style guide
prefers `util::optional<S3Model::HeadObjectOutcome>` (well, technically
std::optional) instead of `const *`
https://google.github.io/styleguide/cppguide.html#Inputs_and_Outputs
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]