Gopal V created TEZ-2728:
----------------------------
Summary: Wrap IPC connection Exception as SessionNotRunning - RM crash
Key: TEZ-2728
URL: https://issues.apache.org/jira/browse/TEZ-2728
Project: Apache Tez
Issue Type: Bug
Affects Versions: 0.6.2, 0.5.4, 0.7.0, 0.8.0
Reporter: Gopal V
Assignee: Hitesh Shah
If the RM crashes and is restarted while a query session is open, the Hive
session is not left in a recoverable state.
{code}
2015-08-17T22:34:21,981 INFO [main]: ipc.Client (Client.java:handleConnectionFailure(885)) - Retrying connect to server: cn042-10.l42scl.hortonworks.com/172.19.128.42:10200. Already tried 48 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-08-17T22:34:22,982 INFO [main]: ipc.Client (Client.java:handleConnectionFailure(885)) - Retrying connect to server: cn042-10.l42scl.hortonworks.com/172.19.128.42:10200. Already tried 49 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-08-17T22:34:22,987 ERROR [main]: exec.Task (TezTask.java:execute(195)) - Failed to execute tez graph.
java.net.ConnectException: Call From cn041-10.l42scl.hortonworks.com/172.19.128.41 to cn042-10.l42scl.hortonworks.com:10200 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_51]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_51]
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_51]
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422) ~[?:1.8.0_51]
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792) ~[hadoop-common-2.8.0-20150722.003145-873.jar:?]
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732) ~[hadoop-common-2.8.0-20150722.003145-873.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1444) ~[hadoop-common-2.8.0-20150722.003145-873.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1371) ~[hadoop-common-2.8.0-20150722.003145-873.jar:?]
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) ~[hadoop-common-2.8.0-20150722.003145-873.jar:?]
    at com.sun.proxy.$Proxy41.getApplicationReport(Unknown Source) ~[?:?]
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationHistoryProtocolPBClientImpl.getApplicationReport(ApplicationHistoryProtocolPBClientImpl.java:108) ~[hadoop-yarn-common-2.8.0-20150721.221214-843.jar:?]
    at org.apache.hadoop.yarn.client.api.impl.AHSClientImpl.getApplicationReport(AHSClientImpl.java:101) ~[hadoop-yarn-client-2.8.0-20150721.221233-841.jar:?]
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:442) ~[hadoop-yarn-client-2.8.0-20150721.221233-841.jar:?]
    at org.apache.tez.client.TezYarnClient.getApplicationReport(TezYarnClient.java:89) ~[tez-api-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
    at org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:835) ~[tez-api-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
    at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:713) ~[tez-api-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
    at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:723) ~[tez-api-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
    at org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:453) ~[tez-api-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
    at org.apache.tez.client.TezClient.submitDAG(TezClient.java:391) ~[tez-api-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.submit(TezTask.java:409) ~[hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
{code}
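
A rough sketch of the wrapping that the summary asks for, not the actual Tez change. It assumes the raw java.net.ConnectException seen in the trace above can be caught at the client boundary and rethrown as a session-not-running error, so that a caller such as Hive could react by opening a new session. Every name below (SessionWrapSketch, the local SessionNotRunning stand-in, RpcCall, callOrSessionNotRunning) is hypothetical and exists only for illustration; the real Tez exception class and call sites are not reproduced here.

```java
import java.io.IOException;
import java.net.ConnectException;

public class SessionWrapSketch {

    /** Hypothetical local stand-in for a "session is not running" error. */
    public static class SessionNotRunning extends Exception {
        public SessionNotRunning(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical shape of an RPC call that can fail with an IOException. */
    public interface RpcCall<T> {
        T call() throws IOException;
    }

    /** Returns true if the throwable or any of its causes is a ConnectException. */
    public static boolean isConnectionFailure(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof ConnectException) {
                return true;
            }
        }
        return false;
    }

    /**
     * Runs the call. A connection failure is rethrown as SessionNotRunning
     * with the original exception kept as the cause; any other IOException
     * is propagated unchanged.
     */
    public static <T> T callOrSessionNotRunning(RpcCall<T> rpc)
            throws SessionNotRunning, IOException {
        try {
            return rpc.call();
        } catch (IOException e) {
            if (isConnectionFailure(e)) {
                throw new SessionNotRunning(
                    "Session appears to be unreachable: " + e.getMessage(), e);
            }
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        try {
            // Simulate the failure from the log: the remote end refuses the connection.
            callOrSessionNotRunning(() -> {
                throw new ConnectException("Connection refused");
            });
        } catch (SessionNotRunning e) {
            // A caller could discard the dead session and start a new one here.
            System.out.println("Caught: " + e.getMessage());
        }
    }
}
```

Keeping the ConnectException as the cause preserves the original stack trace for debugging, while the caller only has to handle one exception type to decide whether the session should be reopened.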