bolma-lila commented on PR #16656: URL: https://github.com/apache/iceberg/pull/16656#issuecomment-4622770610
Thanks @wombatu-kun for the careful review — all three points are fair, and digging into them led me to conclude this change isn't the right fix, so I'm going to close it. **Context on what I was actually chasing:** the concrete failure was a `403` from an STS-scoped session policy (Lakekeeper vends credentials scoped to `s3:prefix = "<table>/*"`). The runtime stack trace was `S3FileIO.deletePrefix` ← Airbyte's `IcebergTableCleaner.clearTable`, which passes a table location *without* a trailing slash. Because `deletePrefix(p)` forwards `p` straight into `listPrefix(p)`, the unslashed prefix reached `ListObjectsV2` and got denied. **Why this PR isn't needed:** that's an Airbyte-caller problem, and it's fixed at the caller in [airbytehq/airbyte#78624](https://github.com/airbytehq/airbyte/pull/78624) by normalizing the location before `deletePrefix`. Tracing the rest of the paths: - Iceberg-core's only internal `listPrefix` caller during writes is via `ResolvingFileIO` (pass-through) — and nothing in the commit/append path lists or deletes prefixes. - `FileSystemWalker` (orphan-file removal) already appends `/` before calling `listPrefix`, so it's unaffected either way. So as you noted, the only remaining beneficiary would be external/STS callers — and for those, the right place is the caller (as Iceberg already does in `FileSystemWalker`), not an unconditional change in `S3FileIO` that: - leaves `s3.directory-bucket.list-prefix-as-directory` and `S3URI.useS3DirectoryBucket()` dead (your point #1), and - diverges S3 from `GCSFileIO`/`ADLSFileIO`, which still pass raw prefixes (your point #2). Closing this in favor of the caller-side fix. If there's appetite for sibling-prefix hardening as defense-in-depth, I'd be happy to revisit it as a properly flag-gated, cross-FileIO change in a separate PR. Thanks again. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
