bolma-lila commented on PR #16656:
URL: https://github.com/apache/iceberg/pull/16656#issuecomment-4622770610

   Thanks @wombatu-kun for the careful review — all three points are fair, and 
digging into them led me to conclude this change isn't the right fix, so I'm 
going to close it.
   
   **Context on what I was actually chasing:** the concrete failure was a `403` 
from an STS-scoped session policy (Lakekeeper vends credentials scoped to 
`s3:prefix = "<table>/*"`). The runtime stack trace was `S3FileIO.deletePrefix` 
← Airbyte's `IcebergTableCleaner.clearTable`, which passes a table location 
*without* a trailing slash. Because `deletePrefix(p)` forwards `p` straight 
into `listPrefix(p)`, the unslashed prefix reached `ListObjectsV2` and got 
denied.
   
   **Why this PR isn't needed:** that's an Airbyte-caller problem, and it's 
fixed at the caller in 
[airbytehq/airbyte#78624](https://github.com/airbytehq/airbyte/pull/78624) by 
normalizing the location before `deletePrefix`. Tracing the rest of the paths:
   - Iceberg-core's only internal `listPrefix` caller during writes is via 
`ResolvingFileIO` (pass-through) — and nothing in the commit/append path lists 
or deletes prefixes.
   - `FileSystemWalker` (orphan-file removal) already appends `/` before 
calling `listPrefix`, so it's unaffected either way.
   
   So as you noted, the only remaining beneficiary would be external/STS 
callers — and for those, the right place is the caller (as Iceberg already does 
in `FileSystemWalker`), not an unconditional change in `S3FileIO` that:
   - leaves `s3.directory-bucket.list-prefix-as-directory` and 
`S3URI.useS3DirectoryBucket()` dead (your point #1), and
   - diverges S3 from `GCSFileIO`/`ADLSFileIO`, which still pass raw prefixes 
(your point #2).
   
   Closing this in favor of the caller-side fix. If there's appetite for 
sibling-prefix hardening as defense-in-depth, I'd be happy to revisit it as a 
properly flag-gated, cross-FileIO change in a separate PR. Thanks again.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to