cassio-paesleme opened a new pull request, #1078:
URL: https://github.com/apache/iceberg-go/pull/1078

   ## Summary
   
   The blob `FileIO` implementation assumes all file operations target a single 
S3 bucket (the warehouse bucket opened at init time). When Iceberg's 
`write.metadata.path` table property points to a different bucket (e.g. a 
dedicated versioned metadata bucket), `defaultKeyExtractor` strips the wrong 
bucket prefix and the file lands in the warehouse bucket under a mangled key. 
Readers following the absolute S3 URI in `metadata.json` get a 404.
   
   **Concrete failure mode**: Setting `write.metadata.path = 
s3://metadata-bucket/db/table/` causes manifest lists (`snap-*.avro`) and 
manifest files (`*-m*.avro`) to be written to 
`s3://warehouse-bucket/metadata-bucket/db/table/snap-*.avro` instead of 
`s3://metadata-bucket/db/table/snap-*.avro`. The `metadata.json` records the 
correct URI, but the bytes are in the wrong place.
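
To make the failure mode concrete, here is a minimal sketch of how a single-bucket key extractor can produce the mangled key. `legacyKey` is a hypothetical stand-in written for illustration, not the actual `defaultKeyExtractor`, and the file names are made up; it only assumes the extractor knows about one bucket prefix.

```go
package main

import (
	"fmt"
	"strings"
)

// legacyKey is a hypothetical single-bucket key extractor (an assumption,
// not the real defaultKeyExtractor). It drops the URI scheme, then strips
// only the prefix of the one bucket it knows about.
func legacyKey(uri, warehouseBucket string) string {
	_, rest, found := strings.Cut(uri, "://")
	if !found {
		rest = uri // no scheme: treat the input as already relative
	}
	// If the URI names a different bucket, nothing is stripped here and the
	// foreign bucket name stays at the front of the key.
	return strings.TrimPrefix(rest, warehouseBucket+"/")
}

func main() {
	// URI in the warehouse bucket: the prefix is stripped as intended.
	fmt.Println(legacyKey("s3://warehouse-bucket/db/table/data/f.parquet", "warehouse-bucket"))
	// prints "db/table/data/f.parquet"

	// URI in another bucket: the bucket name leaks into the key.
	fmt.Println(legacyKey("s3://metadata-bucket/db/table/snap-1.avro", "warehouse-bucket"))
	// prints "metadata-bucket/db/table/snap-1.avro"
}
```

Writing that second key into the warehouse bucket yields the `s3://warehouse-bucket/metadata-bucket/...` location described above.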
   
   ## Changes
   
   - Add `resolveBucket()` to `blobFileIO` which parses the full S3 URI and 
routes to the correct bucket
   - Primary bucket (warehouse) uses the fast path with no map lookup
   - Secondary buckets are opened lazily via a `BucketOpener` callback and 
cached with `sync.RWMutex`
   - Update `Open`, `NewWriter`, `WriteFile`, `Remove`, and `DeleteFiles` to 
use `resolveBucket`
   - Wire S3 scheme registration to pass a `BucketOpener` that reuses the same 
AWS config
   - Backward compatible: callers without a `BucketOpener` keep the existing 
single-bucket behavior
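
The routing described in the list above could look roughly like the sketch below. It is a simplified illustration, not the code in this PR: the `bucket` and `multiBucketIO` types, the field names, the `BucketOpener` signature, and the exact fallback key are all assumptions, and a plain struct stands in for a real blob bucket handle.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
	"sync"
)

// bucket is a placeholder for a real bucket handle (hypothetical).
type bucket struct{ name string }

// BucketOpener opens a bucket by name; this signature is an assumption.
type BucketOpener func(name string) (*bucket, error)

// multiBucketIO is a hypothetical stand-in for blobFileIO.
type multiBucketIO struct {
	primaryName string
	primary     *bucket
	opener      BucketOpener // nil keeps single-bucket behavior

	mu        sync.RWMutex
	secondary map[string]*bucket // lazily opened buckets, keyed by name
}

// resolveBucket returns the bucket and object key for an absolute URI.
func (f *multiBucketIO) resolveBucket(uri string) (*bucket, string, error) {
	u, err := url.Parse(uri)
	if err != nil {
		return nil, "", err
	}
	key := strings.TrimPrefix(u.Path, "/")

	// Fast path: the primary bucket needs no lock and no map lookup.
	if u.Host == f.primaryName {
		return f.primary, key, nil
	}
	// No opener: fall back to the primary bucket with the bucket name left
	// in the key (assumed to match the pre-existing behavior).
	if f.opener == nil {
		return f.primary, u.Host + "/" + key, nil
	}

	f.mu.RLock()
	b, ok := f.secondary[u.Host]
	f.mu.RUnlock()
	if ok {
		return b, key, nil
	}

	f.mu.Lock()
	defer f.mu.Unlock()
	// Re-check under the write lock in case another goroutine opened it.
	if b, ok := f.secondary[u.Host]; ok {
		return b, key, nil
	}
	b, err = f.opener(u.Host)
	if err != nil {
		return nil, "", fmt.Errorf("open bucket %q: %w", u.Host, err)
	}
	if f.secondary == nil {
		f.secondary = make(map[string]*bucket)
	}
	f.secondary[u.Host] = b
	return b, key, nil
}

func main() {
	calls := 0
	f := &multiBucketIO{
		primaryName: "warehouse-bucket",
		primary:     &bucket{name: "warehouse-bucket"},
		opener: func(name string) (*bucket, error) {
			calls++
			return &bucket{name: name}, nil
		},
	}
	for _, uri := range []string{
		"s3://warehouse-bucket/db/table/data/f.parquet",
		"s3://metadata-bucket/db/table/snap-1.avro",
		"s3://metadata-bucket/db/table/snap-2.avro",
	} {
		b, key, err := f.resolveBucket(uri)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> bucket=%s key=%s\n", uri, b.name, key)
	}
	fmt.Println("opener calls:", calls)
}
```

The read lock keeps lookups of already-opened buckets cheap, while the re-check under the write lock ensures the opener runs at most once per bucket name even when several goroutines miss the cache at the same time.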
   
   ## Test plan
   
   - [x] `TestMultiBucketWriteAndRead` - writes to two memblob buckets via 
different S3 URIs, verifies files land in the correct bucket and can be read 
back
   - [x] `TestMultiBucketDelete` - verifies `Remove` and `DeleteFiles` route to 
the correct bucket
   - [x] `TestMultiBucketOpenerCaching` - verifies the opener is called once 
per bucket name
   - [x] `TestMultiBucketFallbackWithoutOpener` - verifies backward 
compatibility when no opener is set
   - [x] All existing tests pass unchanged
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

