wombatu-kun commented on code in PR #16656:
URL: https://github.com/apache/iceberg/pull/16656#discussion_r3353441023
##########
aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java:
##########
@@ -331,10 +331,10 @@ public Iterable<FileInfo> listPrefix(String prefix) {
PrefixedS3Client client = clientForStoragePath(prefix);
S3URI uri = new S3URI(prefix, client.s3FileIOProperties().bucketToAccessPointMapping());
- if (uri.useS3DirectoryBucket()
- && client.s3FileIOProperties().isS3DirectoryBucketListPrefixAsDirectory()) {
- uri = uri.toDirectoryPath();
- }
+ // Always normalize the prefix to end with "/" to prevent matching sibling prefixes.
Review Comment:
`SupportsPrefixOperations.listPrefix` documents that object stores "may
allow for arbitrary prefixes". This change forces directory-only semantics on
general-purpose buckets with no opt-out, while `GCSFileIO.listPrefix`
(GCSFileIO.java:298) and `ADLSFileIO.listPrefix` (ADLSFileIO.java:219) still
pass the raw prefix, so the sibling-match behavior persists there and S3 now
diverges. Iceberg's own orphan-file listing already appends "/" before calling
`listPrefix` (FileSystemWalker.java:68), so the only remaining benefit is for
external/STS callers. Consider normalizing consistently across FileIOs or
documenting why only S3 changes.
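
To make the trade-off concrete, here is a minimal sketch that models listing as a plain string-prefix match over object keys. It does not call S3 or any Iceberg API; the class `PrefixMatchSketch`, the helpers `toDirectoryPrefix` and `listByPrefix`, and the sample keys are all hypothetical and exist only for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

public class PrefixMatchSketch {

  // Hypothetical helper: appends "/" unless the prefix already ends with one.
  static String toDirectoryPrefix(String prefix) {
    return prefix.endsWith("/") ? prefix : prefix + "/";
  }

  // Models an object-store listing as a simple startsWith filter over keys.
  static List<String> listByPrefix(List<String> keys, String prefix) {
    return keys.stream().filter(key -> key.startsWith(prefix)).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Made-up keys: one under the directory, one sibling directory, one sibling object.
    List<String> keys =
        List.of(
            "warehouse/table/data/a.parquet",
            "warehouse/table_backup/data/b.parquet",
            "warehouse/table-2024.json");

    // Raw prefix: also returns the sibling directory and the sibling object.
    System.out.println(listByPrefix(keys, "warehouse/table"));

    // Normalized prefix: returns only keys under "warehouse/table/".
    System.out.println(listByPrefix(keys, toDirectoryPrefix("warehouse/table")));
  }
}
```

Under this model the raw prefix matches all three keys, while the normalized prefix matches only the first. That is the sibling-match fix, but it also means a caller who deliberately lists by an arbitrary prefix such as `warehouse/table-` can no longer do so once normalization is unconditional.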
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]