Hi,

I'm trying to deploy a kerberized Hadoop cluster with Hive on Tez. The
Hadoop version is 3.3.4.
Recently I've run into an issue: whenever the Hadoop YARN WebProxy receives
a request, remoteUser is always null. Because of that, the app behind the
proxy (Tez in my case) also sees remoteUser as null, which results in a
401 Access Denied.
Before sending the request I make sure I have a valid Kerberos ticket from
kinit.

In the service logs it shows up as:

```
2023-09-04 18:02:44,441 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: null is accessing unchecked http://host:50000/ui/ws/v2/tez/dagInfo?counters=*&dagID=1&_=1693843360458 which is the app master GUI of application_1693843078038_0002 owned by hive
```

This message comes from the following line in the proxy code:
https://github.com/apache/hadoop/blob/rel/release-3.3.4/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxyServlet.java#L514
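
To spot every affected request in a larger log, a small script can pull the user, URL, application ID and owner out of each such message. This is only a minimal sketch in Python (standard library): the pattern is modeled on the single log message quoted above and may not match other Hadoop versions, and the names LOG_PATTERN and parse_proxy_access are hypothetical helpers, not part of Hadoop.

```python
import re

# Pattern modeled on the one log message quoted above (an assumption:
# other Hadoop versions may word this message differently).
LOG_PATTERN = re.compile(
    r"(?P<user>\S+) is accessing unchecked (?P<url>\S+) "
    r"which is the app master GUI of (?P<app_id>application_\d+_\d+) "
    r"owned by (?P<owner>\S+)"
)


def parse_proxy_access(line):
    """Return the fields of a proxy access message, or None if no match."""
    # Mail clients wrap long lines, so collapse all whitespace runs first.
    normalized = " ".join(line.split())
    match = LOG_PATTERN.search(normalized)
    return match.groupdict() if match else None


# The log message from this email, wrapped the way it was sent.
sample = """2023-09-04 18:02:44,441 INFO
org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: null is
accessing unchecked
http://host:50000/ui/ws/v2/tez/dagInfo?counters=*&dagID=1&_=1693843360458
which is the app master GUI of application_1693843078038_0002 owned by hive"""

entry = parse_proxy_access(sample)
if entry is not None and entry["user"] == "null":
    # The literal string "null" is what the proxy logs when remoteUser is unset.
    print("unauthenticated request for", entry["app_id"], "owned by", entry["owner"])
```

Filtering on the user field being the literal string null separates the failing requests from any that did carry an authenticated user.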

I've also run the proxy with a debugger attached and the issue still occurs.

The issue is present whether WebProxy runs inside the Resource Manager or
as a standalone daemon.

In my core-site.xml I have hadoop.security.authentication set to kerberos
and hadoop.security.authorization set to yes. When running WebProxy as a
standalone daemon (not as part of the RM) I had the yarn.web-proxy.*
properties set in yarn-site.xml.
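
As a sanity check on the two core-site.xml settings, here is a minimal Python sketch (standard library only). The XML in it is a hypothetical fragment that mirrors the values described above, not a copy of the real file, and the helper names are illustrative. The check assumes Hadoop accepts only true or false for boolean properties, so a value such as yes may fall back to the default; that assumption is worth verifying against the Hadoop version in use.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment mirroring the values described in this email.
SAMPLE_CORE_SITE = """
<configuration>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>yes</value>
  </property>
</configuration>
"""


def read_hadoop_properties(xml_text):
    """Parse a Hadoop-style configuration document into a name -> value dict."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }


def check_security_settings(props):
    """Return human-readable problems found in the two security settings."""
    problems = []
    authn = props.get("hadoop.security.authentication", "")
    if authn.lower() != "kerberos":
        problems.append(
            f"hadoop.security.authentication is {authn!r}, expected 'kerberos'"
        )
    authz = props.get("hadoop.security.authorization", "")
    # Assumption: boolean properties are only recognized as true/false.
    if authz.lower() not in ("true", "false"):
        problems.append(
            f"hadoop.security.authorization is {authz!r}, expected 'true' or 'false'"
        )
    return problems


for problem in check_security_settings(read_hadoop_properties(SAMPLE_CORE_SITE)):
    print(problem)
```

This only validates the two properties mentioned here; it does not inspect the yarn.web-proxy.* settings or any HTTP authentication configuration.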

Could anybody help with this? The WebProxy documentation is very thin, and
searching Jira I couldn't find anybody with a similar issue.
