felipecrv commented on code in PR #43286:
URL: https://github.com/apache/arrow/pull/43286#discussion_r1681273394
##########
cpp/src/arrow/filesystem/s3fs.cc:
##########
@@ -3072,8 +3078,11 @@ Status S3FileSystem::Move(const std::string& src, const
std::string& dest) {
}
RETURN_NOT_OK(impl_->CopyObject(src_path, dest_path));
RETURN_NOT_OK(impl_->DeleteObject(src_path.bucket, src_path.key));
- // Source parent may be implicitly deleted if it became empty, recreate it
- return impl_->EnsureParentExists(src_path);
+ if (options().create_missing_dirs_on_delete) {
+ // Source parent may be implicitly deleted if it became empty, recreate it
+ return impl_->EnsureParentExists(src_path);
+ }
Review Comment:
Other parts of the filesystem implementation have an operation that "checks
if a directory exists". That operation usually considers that keys with a
prefix imply the existence of the directory even if a marker doesn't exist.
For instance, if the object store has only this object:
`dir0/myfile.parquet`
`DirExists?("dir0")` returns `true` because there is at least one object key
with `dir0` as the prefix.
The problem now is that `DeleteFile("dir0/myfile.parquet")` (w/ this new
option == true) will cause `DirExists?("dir0")` to return `false` after the
last file is deleted. Which breaks the expectation of a filesystem.
But if you modify `DirExists?("dir0")` to return `true` only if a marker
exists (w/ this new option == true) it will increase the chances that a marker
is always created *before* files are added to the directory.
`CreateDir("dir0")` creates the `dir0/` marker. Creating
`"dir0/myfile.parquet"` checks the directory exists (by checking the existence
of the marker). Taking the object store to this list of keys:
```
dir0/
dir0/myfile.parquet
```
Now it's totally safe to delete `myfile.parquet` without caring about `dir0`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]