difin commented on code in PR #6474:
URL: https://github.com/apache/hive/pull/6474#discussion_r3495783220
##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveTableUtil.java:
##########
@@ -244,6 +244,7 @@ public static Table deserializeTable(Configuration config,
String name) {
table = readTableObjectFromFile(location, config);
}
checkAndSetIoConfig(config, table);
+ IcebergVendedCredentialUtil.applyFromJobConf(table, config);
Review Comment:
I tested in the Gravitino q-test and you are correct: vended credentials in
`table.io().properties()` survive ser/de to executors. We still need to
extract those credentials and stage them in Hive job configuration for two
reasons:
1. **Hadoop S3A.** Iceberg FileIO and Hadoop S3A do not share state; S3A
only reads Configuration, and Tez/LLAP use S3A for `s3://` paths. At compile
time (`configureInputJobProperties` / `configureInputJobCredentials`) we have
`TableDesc`, not the final task `JobConf`, so vended credentials are staged in
`jobProperties` and `jobSecrets` per HIVE-20651. At job submit, those maps are
merged into `JobConf` / Hadoop Credentials and sent to executors before tasks
run.
2. **Iceberg FileIO endpoint override.** `applyFromJobConf` in
`HiveTableUtil.deserializeTable` is still needed when REST catalogs vend S3
connection settings for their own network while Hive runs elsewhere. In the
Gravitino q-test, Gravitino (in Docker) vends
`s3.endpoint`=`http://minio:9000`, which tasks on the host cannot reach;
session config sets a reachable endpoint (`http://<host>:9000`).
`applyFromJobConf` takes vended credentials (from job conf or
`FileIO.properties()`), replaces only endpoint and path-style with session
values, leaves access keys unchanged, and installs the result on the FileIO via
`setCredentials()`. When I debugged this, the deserialized
`table.io().properties()` still had `http://minio:9000`, so this step is
required even when vended keys survive ser/de.
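To make the override rule in point 2 concrete, here is a minimal sketch of the merge as described above. It is not the code in this PR: the class name `EndpointOverrideSketch`, the `merge` helper, and the sample values are hypothetical, and it works on plain maps rather than on `FileIO` or `JobConf`. The property names follow Iceberg's `s3.*` naming, and the choice of `s3.path-style-access` as the path-style key is an assumption here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the IcebergVendedCredentialUtil implementation.
class EndpointOverrideSketch {

  // Assumption: only these connection settings may be replaced by session values.
  private static final Set<String> OVERRIDABLE_KEYS =
      Set.of("s3.endpoint", "s3.path-style-access");

  // Returns a copy of the vended properties in which only the overridable
  // connection settings are replaced by session values, when the session has them.
  static Map<String, String> merge(Map<String, String> vended, Map<String, String> session) {
    Map<String, String> result = new HashMap<>(vended);
    for (String key : OVERRIDABLE_KEYS) {
      String sessionValue = session.get(key);
      if (sessionValue != null) {
        result.put(key, sessionValue);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Values as the catalog might vend them (placeholder keys, Docker-internal endpoint).
    Map<String, String> vended = new HashMap<>();
    vended.put("s3.endpoint", "http://minio:9000");
    vended.put("s3.access-key-id", "vended-key-id");
    vended.put("s3.secret-access-key", "vended-secret");

    // Session config with an endpoint that tasks can reach (stand-in host name).
    Map<String, String> session = new HashMap<>();
    session.put("s3.endpoint", "http://localhost:9000");
    session.put("s3.path-style-access", "true");
    session.put("s3.access-key-id", "session-key-id"); // ignored: not overridable

    Map<String, String> merged = merge(vended, session);
    System.out.println("endpoint   = " + merged.get("s3.endpoint"));
    System.out.println("path-style = " + merged.get("s3.path-style-access"));
    System.out.println("access key = " + merged.get("s3.access-key-id"));
  }
}
```

The allow-list keeps the vended access keys authoritative: a session value for any key outside `OVERRIDABLE_KEYS` is ignored, so a stale or unrelated session credential cannot replace what the catalog vended.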
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]