[ https://issues.apache.org/jira/browse/TEZ-4638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18001375#comment-18001375 ]
Dong0829 commented on TEZ-4638:
-------------------------------

[~zhangbutao] thanks for the info. I think this is a different issue. In the issue you mentioned, the Tez AM itself has a problem communicating with HDFS. In this issue, submitDAG talks to HDFS using the RPC user's UGI instead of the Tez AM's UGI, and that is what fails. The fix uses the Tez AM's UGI instead.

> Client authenticate failure when using Kerberos if there is big DAG plan needed HDFS
> ------------------------------------------------------------------------------------
>
>                 Key: TEZ-4638
>                 URL: https://issues.apache.org/jira/browse/TEZ-4638
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.10.2
>            Reporter: Dong0829
>            Priority: Major
>         Attachments: TEZ-4638.patch
>
>
> Whenever the DAG plan is large and exceeds the limit, it is uploaded to HDFS.
> When the Tez AM receives the submit request, it has to read the plan back from
> HDFS, but in a Kerberos cluster this fails with the error below:
> {quote}{{10.239.88.12:0. Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
> at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> ....
> org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.submitDAG(DAGClientAMProtocolBlockingPBServerImpl.java:172)
> at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:8519)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.processCall(ProtobufRpcEngine.java:484)
> at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:595)
> at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1226)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1145)
> at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
> at java.base/javax.security.auth.Subject.doAs(Subject.java:439)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3388)}}
> {quote}
> Root cause: the submitDAG request is handled by the RPC server, and the Hadoop
> server runs the handler with the remote RPC client user as the current UGI,
> via doAs (see the stack trace above).
> The remote user's UGI has none of the Tez AM's context, which holds the tokens
> (KMS, HDFS, and so on), so the call to HDFS fails.
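
To make the root cause easier to follow, here is a toy model in plain Java. It does not use any Hadoop or Tez classes and it is not the attached patch. Everything in it (`UgiSketch`, `Identity`, `readPlanFromHdfs`, `submitDagBroken`, `submitDagFixed`, the identity names) is hypothetical and only mirrors the idea: the RPC handler thread runs as the remote caller, who holds no HDFS token, so the read has to be wrapped in a `doAs` for an identity that does hold the tokens.

```java
import java.util.Set;
import java.util.function.Supplier;

// Toy model only: none of these types are Hadoop or Tez classes.
class UgiSketch {

    // Hypothetical stand-in for a UGI: a user name plus the token kinds it carries.
    static final class Identity {
        final String name;
        final Set<String> tokens;

        Identity(String name, Set<String> tokens) {
            this.name = name;
            this.tokens = tokens;
        }
    }

    // The identity the current thread is running as.
    private static final ThreadLocal<Identity> CURRENT = new ThreadLocal<>();

    // Run the action as the given identity, then restore the previous one.
    static <T> T doAs(Identity id, Supplier<T> action) {
        Identity previous = CURRENT.get();
        CURRENT.set(id);
        try {
            return action.get();
        } finally {
            CURRENT.set(previous);
        }
    }

    // Stand-in for the HDFS read: succeeds only if the caller holds an HDFS token.
    static String readPlanFromHdfs() {
        Identity id = CURRENT.get();
        if (id == null || !id.tokens.contains("HDFS")) {
            throw new SecurityException("Client cannot authenticate via:[TOKEN, KERBEROS]");
        }
        return "dag-plan read as " + id.name;
    }

    // Reads with whatever identity the RPC layer installed (the remote user).
    static String submitDagBroken() {
        return readPlanFromHdfs();
    }

    // Switches to the AM identity just for the read.
    static String submitDagFixed(Identity amIdentity) {
        return doAs(amIdentity, UgiSketch::readPlanFromHdfs);
    }

    public static void main(String[] args) {
        Identity rpcUser = new Identity("rpc-client", Set.of());
        Identity am = new Identity("tez-am", Set.of("HDFS", "KMS"));

        // The RPC server wraps every handler call in doAs(remote user).
        try {
            doAs(rpcUser, UgiSketch::submitDagBroken);
        } catch (SecurityException e) {
            System.out.println("broken: " + e.getMessage());
        }
        System.out.println("fixed: " + doAs(rpcUser, () -> submitDagFixed(am)));
    }
}
```

In real Hadoop code the corresponding pieces are `UserGroupInformation.getCurrentUser()` and `UserGroupInformation.doAs(...)`. How the patch obtains the AM's UGI is not shown in this message, so that part is left out here.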