[ https://issues.apache.org/jira/browse/HIVE-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15126293#comment-15126293 ]
Xuefu Zhang commented on HIVE-12650:
------------------------------------

Hi [~lirui], since the application master in the context of Hive on Spark takes a container from YARN, in a busy cluster spark-submit may wait up to spark.yarn.am.waitTime to launch the master. On the other hand, Hive waits for hive.spark.client.server.connect.timeout before declaring that the remote driver is not connecting back. If the latter is less than the former, it's possible that Hive disconnects prematurely, causing an unstable condition. [~joyoungzh...@gmail.com] described the problem on the user list. I think we need at least to make hive.spark.client.server.connect.timeout greater than spark.yarn.am.waitTime by default. To further guard against the problem, Hive could also increase hive.spark.client.server.connect.timeout automatically based on the value of spark.yarn.am.waitTime. [~vanzin], please share your thoughts as well.

> Increase default value of hive.spark.client.server.connect.timeout to exceed
> spark.yarn.am.waitTime
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-12650
>                 URL: https://issues.apache.org/jira/browse/HIVE-12650
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.1.1, 1.2.1
>            Reporter: JoneZhang
>            Assignee: Xuefu Zhang
>
> I think hive.spark.client.server.connect.timeout should be set greater than
> spark.yarn.am.waitTime. The default value for spark.yarn.am.waitTime is 100s,
> and the default value for hive.spark.client.server.connect.timeout is 90s,
> which is not good. We can increase it to a larger value such as 120s.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
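
Until the default changes, a per-deployment workaround consistent with the comment above is to raise the Hive-side timeout past spark.yarn.am.waitTime in hive-site.xml. This is a sketch, not the patch for this issue; the 120000ms value is the illustrative "120s" suggested in the description, and it assumes the property accepts a value with a time-unit suffix:

```xml
<!-- hive-site.xml (illustrative values, assuming a millisecond time suffix is accepted) -->
<property>
  <name>hive.spark.client.server.connect.timeout</name>
  <!-- must exceed spark.yarn.am.waitTime (100s by default) -->
  <value>120000ms</value>
</property>
```

The same override can be applied per session with a SET command before launching a Spark job, which is handy for testing the effect on a busy cluster without a cluster-wide restart.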