[ 
https://issues.apache.org/jira/browse/HIVE-29640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-29640:
----------------------------------
    Labels: pull-request-available  (was: )

> ADD JAR from a non-default fs fails on Tez due to missing delegation token
> --------------------------------------------------------------------------
>
>                 Key: HIVE-29640
>                 URL: https://issues.apache.org/jira/browse/HIVE-29640
>             Project: Hive
>          Issue Type: Bug
>            Reporter: KWON BYUNGCHANG
>            Priority: Major
>              Labels: pull-request-available
>
> h2. Problem
> In a Kerberized cluster, a query that pulls in jars from an HDFS
> namenode other than `fs.defaultFS` fails when the execution engine is
> Tez:
> ```
> SET hive.execution.engine=tez;
> ADD JAR hdfs://other-nn/libs/my-udf.jar;
> SELECT my_udf(...) FROM t;
> ```
> The jar is distributed to Tez containers as an AM-local resource via
> the distributed cache. The container tries to localize it from
> `hdfs://other-nn/...`, finds no HDFS delegation token for `other-nn`
> in its `Credentials`, and fails resource localization. The query
> aborts before any task runs.
> Jars on `fs.defaultFS` work because the standard Tez/Hadoop code path
> already obtains a token for the default namenode.
> h2. Root cause
> `TezClientUtils.setupAMLocalResources` does not fetch HDFS delegation
> tokens for AM-local resources; it expects the caller to provide them
> via `AMCredentials`. HS2 (`TezSessionState`) currently passes only the
> LLAP credentials and never enumerates the non-defaultFS namenodes
> referenced by `ADD JAR` / `ADD FILE` resources, so the AM ends up
> without a token for those namenodes.
> h2. Fix
> Before handing local resources to TezClient, walk the common local
> resource map, collect every distinct non-`fs.defaultFS` HDFS namenode
> referenced, fetch delegation tokens for those namenodes via
> `TokenCache.obtainTokensForNamenodes`, and merge them into the
> credentials passed to TezClient (alongside any existing LLAP
> credentials).
> The implementation lives in a new helper,
> `TezSessionState#createLocalResourceCredentialsExcludingDefaultFS`,
> which skips resources on `fs.defaultFS`: Tez/Hadoop already issues
> that token, and issuing it a second time adds latency and NameNode
> heap pressure.
> h2. Repro
> 1. Kerberized HS2 with `hive.execution.engine=tez`.
> 2. From beeline:
>    ```
>    ADD JAR hdfs://other-nn/path/to/udf.jar;
>    CREATE TEMPORARY FUNCTION my_udf AS '…';
>    SELECT my_udf(col) FROM tbl;
>    ```
>    where `other-nn` is a federated namenode distinct from
>    `fs.defaultFS`.
> 3. Expected: query runs.
>    Actual: localization fails on the AM/container with a missing
>    delegation token error for `other-nn`.
> h2. Compatibility
> - Behaviour is unchanged when all `ADD JAR` resources live on
>   `fs.defaultFS` or on the local filesystem.
> - Non-Kerberized clusters are unaffected (token issuance is a no-op).
> - No new configuration. No new dependencies.
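
As a rough illustration of the collection step described under Fix, here is a minimal, hypothetical Java sketch. It is not the code from the pull request. It uses only `java.net.URI`; the class name `NonDefaultNamenodes`, the method `collect`, and the host `default-nn` are made up for illustration. In the actual patch the resulting namenodes would be handed to `TokenCache.obtainTokensForNamenodes`, which is not called here. The sketch compares authorities literally, so `hdfs://nn` and `hdfs://nn:8020` count as different filesystems.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, for illustration only (not part of Hive or Tez).
public class NonDefaultNamenodes {

    /**
     * Returns the distinct hdfs:// namenode URIs referenced by the given
     * resources, excluding the namenode that backs defaultFs.
     */
    public static Set<URI> collect(URI defaultFs, List<URI> resources) {
        Set<URI> namenodes = new LinkedHashSet<>();
        for (URI res : resources) {
            // Skip local files and anything that is not an HDFS URI with a host.
            if (!"hdfs".equalsIgnoreCase(res.getScheme()) || res.getAuthority() == null) {
                continue;
            }
            // Reduce the resource path to its filesystem root, e.g. hdfs://other-nn.
            URI nn = URI.create("hdfs://" + res.getAuthority().toLowerCase(Locale.ROOT));
            // Skip the default filesystem; per the description, its token is
            // already issued by the standard Tez/Hadoop code path.
            if (sameFileSystem(nn, defaultFs)) {
                continue;
            }
            namenodes.add(nn); // the set de-duplicates repeated namenodes
        }
        return namenodes;
    }

    // Simplified comparison: same scheme and same authority, ignoring case.
    private static boolean sameFileSystem(URI a, URI b) {
        return a.getScheme().equalsIgnoreCase(b.getScheme())
            && a.getAuthority().equalsIgnoreCase(b.getAuthority());
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://default-nn"); // assumed fs.defaultFS value
        List<URI> resources = List.of(
            URI.create("hdfs://other-nn/libs/my-udf.jar"),
            URI.create("hdfs://other-nn/libs/helper.jar"),
            URI.create("hdfs://default-nn/libs/core.jar"),
            URI.create("file:///tmp/local.jar"));
        // Only other-nn remains; tokens would be requested for this set.
        System.out.println(collect(defaultFs, resources));
    }
}
```

With the sample inputs in `main`, only `hdfs://other-nn` is left in the set, once, even though two jars live there. Collecting into a set before requesting tokens keeps the number of token requests at one per namenode rather than one per resource.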



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
