[
https://issues.apache.org/jira/browse/HIVE-29477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18061555#comment-18061555
]
László Bodor commented on HIVE-29477:
-------------------------------------
Mainly, I'm about to apply the old patch
https://issues.apache.org/jira/secure/attachment/12939601/HIVE-20547.01.patch
to current master and resolve the conflicts; there are going to be many.
> Introduce codepath for Tez external sessions discovered by Zookeeper
> --------------------------------------------------------------------
>
> Key: HIVE-29477
> URL: https://issues.apache.org/jira/browse/HIVE-29477
> Project: Hive
> Issue Type: Bug
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
>
> Given the exception described in TEZ-4686:
> [^hs2_stacktrace.txt]
> {code:java}
> Caused by: java.lang.NullPointerException: Cannot invoke
> "org.apache.tez.client.registry.AMRecord.getApplicationId()" because
> "this.amRecord" is null
> at
> org.apache.tez.client.registry.zookeeper.ZkFrameworkClient.createApplication(ZkFrameworkClient.java:114)
> at
> org.apache.tez.client.TezClient.createApplication(TezClient.java:1103)
> at org.apache.tez.client.TezClient.start(TezClient.java:399)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezSessionState.java:488)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternalUnsafe(TezSessionState.java:406)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:297)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.openInternal(TezSessionPoolSession.java:122)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:250)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezTask.ensureSessionHasResources(TezTask.java:481)
> at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:232)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
> at
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
> {code}
> This is related to the assumption made in TEZ-4007:
> [https://github.com/apache/tez/blob/17546aa680e6f9a52411fe6a66c7a26de76e53a6/tez-api/src/main/java/org/apache/tez/client/registry/zookeeper/ZkFrameworkClient.java#L91]
> So the point of this issue is: {*}how to acquire an application id{*}, and
> this is closely related to the standalone zookeeper mode in Tez.
> What actually happens in the Tez Yarn world is clearly shown in the above
> exception (just replace ZkFrameworkClient with TezYarnClient):
> {code}
> TezSessionState.openInternal -> TezClient.start ->
> FrameworkClient.createApplication
> {code}
> This is true in case of Yarn, where createApplication actually goes to the
> ResourceManager, which then starts an application and returns an application
> id.
> In the case of the Zookeeper-based Tez AM registry, an application id (an
> artificial one that looks like a Yarn application id for backward
> compatibility) should instead be discovered from a registry client and then
> passed to the TezClient, so that the actual framework client (a
> ZkFrameworkClient) is aware of it and able to get its status from Zookeeper.
> To implement this, the whole external session abstraction should make its way
> up to the TezSessionState, which can utilize the TezClient accordingly.
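
To illustrate the difference described in the quoted issue, here is a minimal sketch of the two ways an application id can be obtained. Everything in it is a hypothetical stand-in written for this illustration: AmRecord, AmRegistryClient, FrameworkClient, YarnLikeClient, ZkLikeClient and openExternalSession are not the real Tez or Hive classes, and the ids are made up. It only shows the shape of the idea: in the Yarn-like path the id is created on request, while in the Zookeeper-like path it has to be discovered first and handed to the client, so a missing record is reported early instead of surfacing later as a NullPointerException.

```java
import java.util.Objects;
import java.util.Optional;

public class ExternalSessionSketch {

    /** Hypothetical registry entry for an already running AM. */
    static final class AmRecord {
        private final String applicationId;
        AmRecord(String applicationId) { this.applicationId = applicationId; }
        String getApplicationId() { return applicationId; }
    }

    /** Hypothetical registry client; a real one would read from Zookeeper. */
    interface AmRegistryClient {
        Optional<AmRecord> findAvailableAm();
    }

    /** Hypothetical framework client: the caller only asks for an application id. */
    interface FrameworkClient {
        String createApplication();
    }

    /** Yarn-like path: the id is generated when the application is created. */
    static final class YarnLikeClient implements FrameworkClient {
        private static final long CLUSTER_TIMESTAMP = 1700000000000L; // made-up value
        private int counter = 0;

        @Override
        public String createApplication() {
            counter++;
            return String.format("application_%d_%04d", CLUSTER_TIMESTAMP, counter);
        }
    }

    /** Zookeeper-like path: the id comes from a record discovered beforehand. */
    static final class ZkLikeClient implements FrameworkClient {
        private final AmRecord amRecord;

        ZkLikeClient(AmRecord amRecord) {
            // Reject a missing record at construction time.
            this.amRecord = Objects.requireNonNull(amRecord, "amRecord must be discovered first");
        }

        @Override
        public String createApplication() {
            return amRecord.getApplicationId();
        }
    }

    /** Session-level step: discover the AM, then build a client that knows it. */
    static FrameworkClient openExternalSession(AmRegistryClient registry) {
        AmRecord record = registry.findAvailableAm()
            .orElseThrow(() -> new IllegalStateException("no AM found in the registry"));
        return new ZkLikeClient(record);
    }

    public static void main(String[] args) {
        FrameworkClient yarn = new YarnLikeClient();
        System.out.println(yarn.createApplication()); // prints application_1700000000000_0001

        AmRegistryClient registry =
            () -> Optional.of(new AmRecord("application_1700000000000_0042"));
        FrameworkClient zk = openExternalSession(registry);
        System.out.println(zk.createApplication()); // prints application_1700000000000_0042
    }
}
```

In this sketch the discovery step lives in openExternalSession rather than inside the framework client, which mirrors the suggestion in the issue that the external session abstraction should be visible at the session level.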