wombatu-kun opened a new pull request, #16499: URL: https://github.com/apache/iceberg/pull/16499
## Summary

Closes #16480

`GCSFileIO.internalDeleteFiles` partitioned `BlobId`s into fixed-size batches and then selected the GCS `Storage` client once per batch from only the first object's path. When a single `GCSFileIO` is configured with multiple per-prefix `StorageCredential`s (vended-credentials flow), a batch that crossed prefix boundaries was issued in full through whichever client matched the first object, sending the rest of the batch through the wrong credentials.

The fix groups `BlobId`s by their `PrefixedStorage` client (via the existing longest-prefix-match `clientForStoragePath` helper) before partitioning into batches, and uses each client's own `deleteBatchSize`. No public API or exception contract changes.

This mirrors how `S3FileIO.deleteFiles` already groups by bucket before batching.

## Tests

Two new unit tests in `TestGCSFileIO`:

- `deleteFilesRoutesToCorrectClientPerPrefix`: interleaves objects across two credential-prefixed buckets and asserts that each per-prefix `Storage` client receives only its own `BlobId`s.
- `deleteFilesBatchesPerClient`: sets a small `gcs.delete.batch-size` and asserts that batches stay per-client and never mix `BlobId`s from two prefixes.

Confirmed locally with `./gradlew :iceberg-gcp:test --tests "org.apache.iceberg.gcp.gcs.TestGCSFileIO"` (26 tests, 0 failures). The new tests also fail when run against the previous code, confirming they catch the bug.
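For reviewers who want the group-then-batch idea from the Summary in isolation, here is a minimal, dependency-free sketch. It is not the code in this PR: `GroupThenBatchSketch`, `PrefixedClient`, `clientFor`, and `plan` are hypothetical names, plain strings stand in for `BlobId`s, and throwing on an unmatched path is a choice made for the sketch only, not a statement about how `GCSFileIO` behaves.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupThenBatchSketch {

  // Hypothetical stand-in for a per-prefix client that carries its own batch size.
  record PrefixedClient(String prefix, int deleteBatchSize) {
    PrefixedClient {
      if (deleteBatchSize <= 0) {
        throw new IllegalArgumentException("deleteBatchSize must be positive");
      }
    }
  }

  // Longest-prefix match: pick the most specific configured prefix the path starts with.
  static PrefixedClient clientFor(List<PrefixedClient> clients, String path) {
    PrefixedClient best = null;
    for (PrefixedClient candidate : clients) {
      boolean matches = path.startsWith(candidate.prefix());
      if (matches && (best == null || candidate.prefix().length() > best.prefix().length())) {
        best = candidate;
      }
    }
    if (best == null) {
      // Sketch-only behavior for paths that match no configured prefix.
      throw new IllegalArgumentException("No client configured for path: " + path);
    }
    return best;
  }

  // Group paths by client first, then cut each group into batches of that client's size.
  static Map<PrefixedClient, List<List<String>>> plan(
      List<PrefixedClient> clients, List<String> paths) {
    Map<PrefixedClient, List<String>> byClient = new LinkedHashMap<>();
    for (String path : paths) {
      byClient.computeIfAbsent(clientFor(clients, path), k -> new ArrayList<>()).add(path);
    }

    Map<PrefixedClient, List<List<String>>> batches = new LinkedHashMap<>();
    for (Map.Entry<PrefixedClient, List<String>> entry : byClient.entrySet()) {
      int size = entry.getKey().deleteBatchSize();
      List<String> group = entry.getValue();
      List<List<String>> parts = new ArrayList<>();
      for (int start = 0; start < group.size(); start += size) {
        int end = Math.min(start + size, group.size());
        parts.add(new ArrayList<>(group.subList(start, end)));
      }
      batches.put(entry.getKey(), parts);
    }
    return batches;
  }

  public static void main(String[] args) {
    List<PrefixedClient> clients =
        List.of(new PrefixedClient("gs://bucket-a/", 2), new PrefixedClient("gs://bucket-b/", 3));
    // Interleaved input: a fixed-size split of this list would mix the two prefixes.
    List<String> paths =
        List.of(
            "gs://bucket-a/f1", "gs://bucket-b/f1", "gs://bucket-a/f2",
            "gs://bucket-b/f2", "gs://bucket-a/f3");
    plan(clients, paths)
        .forEach((client, parts) -> System.out.println(client.prefix() + " -> " + parts));
  }
}
```

Because grouping happens before partitioning, no batch can contain paths that resolve to two different clients, which is the property the two new tests assert.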
