KWON BYUNGCHANG created HIVE-29640:
--------------------------------------

             Summary: ADD JAR from a non-default fs fails on Tez due to missing 
delegation token
                 Key: HIVE-29640
                 URL: https://issues.apache.org/jira/browse/HIVE-29640
             Project: Hive
          Issue Type: Bug
            Reporter: KWON BYUNGCHANG


h2. Problem
In a Kerberized cluster, a query that pulls in jars from an HDFS
namenode other than `fs.defaultFS` fails when the execution engine is
Tez:

```
SET hive.execution.engine=tez;
ADD JAR hdfs://other-nn/libs/my-udf.jar;
SELECT my_udf(...) FROM t;
```

The jar is distributed to Tez containers as an AM-local resource via
the distributed cache. The container tries to localize it from
`hdfs://other-nn/...`, finds no HDFS delegation token for `other-nn`
in its `Credentials`, and fails resource localization. The query
aborts before any task runs.

`fs.defaultFS` jars work fine because Tez/Hadoop's standard code path
issues a token for the default namenode on its own.

h2. Root cause
`TezClientUtils.setupAMLocalResources` does not fetch HDFS delegation
tokens for AM-local resources; it expects the caller to provide them
via `AMCredentials`. HS2 (`TezSessionState`) currently passes only the
LLAP credentials and never enumerates the non-defaultFS namenodes
referenced by `ADD JAR` / `ADD FILE` resources, so the AM ends up
without a token for those namenodes.

h2. Fix
Before handing local resources to TezClient, walk the common local
resource map, collect every distinct non-`fs.defaultFS` HDFS namenode
referenced, fetch delegation tokens for those namenodes via
`TokenCache.obtainTokensForNamenodes`, and merge them into the
credentials passed to TezClient (alongside any existing LLAP
credentials).

The implementation lives in a new helper,
`TezSessionState#createLocalResourceCredentialsExcludingDefaultFS`,
which filters out:

- Resources on `fs.defaultFS` (Tez/Hadoop issues that token already;
  duplicate issuance adds latency and NameNode heap pressure).
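
As a rough illustration of the namenode-collection step only, here is a minimal sketch in plain JDK Java. It is not the patch itself: the class and method names (`NonDefaultFsNamenodes`, `collect`) are hypothetical, it works on `java.net.URI` rather than Hadoop `Path`/`LocalResource` objects, and it compares scheme and authority textually, so `hdfs://nn1` and `hdfs://nn1:8020` would be treated as different namenodes. The actual token fetch via `TokenCache.obtainTokensForNamenodes` and the merge into the TezClient credentials are left out.

```java
import java.net.URI;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration class; not part of Hive or Tez.
class NonDefaultFsNamenodes {

    /**
     * Returns the distinct hdfs:// namenode URIs (scheme + authority only)
     * referenced by the given resources, skipping anything on defaultFs.
     */
    static Set<URI> collect(URI defaultFs, Collection<URI> resources) {
        Set<URI> namenodes = new LinkedHashSet<>();
        for (URI resource : resources) {
            String scheme = resource.getScheme();
            String authority = resource.getAuthority();
            // Skip local files and scheme-less paths; only hdfs:// URIs
            // with an explicit namenode are candidates.
            if (scheme == null || authority == null
                    || !scheme.equalsIgnoreCase("hdfs")) {
                continue;
            }
            // Assumption: same scheme + same authority means "on fs.defaultFS".
            boolean onDefaultFs = scheme.equalsIgnoreCase(defaultFs.getScheme())
                    && authority.equalsIgnoreCase(defaultFs.getAuthority());
            if (onDefaultFs) {
                continue;
            }
            // Keep one entry per namenode, regardless of how many jars it hosts.
            namenodes.add(URI.create("hdfs://" + authority.toLowerCase(Locale.ROOT)));
        }
        return namenodes;
    }

    public static void main(String[] args) {
        Set<URI> result = collect(
                URI.create("hdfs://nn1:8020"),
                List.of(
                        URI.create("hdfs://other-nn/libs/my-udf.jar"),
                        URI.create("hdfs://nn1:8020/libs/a.jar"),
                        URI.create("file:///tmp/b.jar")));
        System.out.println(result);
    }
}
```

In the real helper, the resulting set would be the input to the delegation-token request, one token per distinct namenode.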

h2. Repro

1. Kerberized HS2 with `hive.execution.engine=tez`.
2. From beeline:
   ```
   ADD JAR hdfs://other-nn/path/to/udf.jar;
   CREATE TEMPORARY FUNCTION my_udf AS '…';
   SELECT my_udf(col) FROM tbl;
   ```
   where `other-nn` is a federated namenode distinct from
   `fs.defaultFS`.
3. Expected: query runs.
   Actual: localization fails on the AM/container with a missing
   delegation token error for `other-nn`.

h2. Compatibility
- Behaviour is unchanged when all `ADD JAR` resources live on
  `fs.defaultFS` or on the local filesystem.
- Non-Kerberized clusters are unaffected (token issuance is a no-op).
- No new configuration. No new dependencies.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)