Gopal V created TEZ-2728:
----------------------------

             Summary: Wrap IPC connection Exception as SessionNotRunning - RM crash
                 Key: TEZ-2728
                 URL: https://issues.apache.org/jira/browse/TEZ-2728
             Project: Apache Tez
          Issue Type: Bug
    Affects Versions: 0.6.2, 0.5.4, 0.7.0, 0.8.0
            Reporter: Gopal V
            Assignee: Hitesh Shah


If the RM crashes while a query session is open, restarting the RM does not 
leave the Hive session in a recoverable state.

{code}
2015-08-17T22:34:21,981 INFO  [main]: ipc.Client (Client.java:handleConnectionFailure(885)) - Retrying connect to server: cn042-10.l42scl.hortonworks.com/172.19.128.42:10200. Already tried 48 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-08-17T22:34:22,982 INFO  [main]: ipc.Client (Client.java:handleConnectionFailure(885)) - Retrying connect to server: cn042-10.l42scl.hortonworks.com/172.19.128.42:10200. Already tried 49 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2015-08-17T22:34:22,987 ERROR [main]: exec.Task (TezTask.java:execute(195)) - Failed to execute tez graph.
java.net.ConnectException: Call From cn041-10.l42scl.hortonworks.com/172.19.128.41 to cn042-10.l42scl.hortonworks.com:10200 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_51]
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_51]
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_51]
        at java.lang.reflect.Constructor.newInstance(Constructor.java:422) ~[?:1.8.0_51]
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792) ~[hadoop-common-2.8.0-20150722.003145-873.jar:?]
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732) ~[hadoop-common-2.8.0-20150722.003145-873.jar:?]
        at org.apache.hadoop.ipc.Client.call(Client.java:1444) ~[hadoop-common-2.8.0-20150722.003145-873.jar:?]
        at org.apache.hadoop.ipc.Client.call(Client.java:1371) ~[hadoop-common-2.8.0-20150722.003145-873.jar:?]
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) ~[hadoop-common-2.8.0-20150722.003145-873.jar:?]
        at com.sun.proxy.$Proxy41.getApplicationReport(Unknown Source) ~[?:?]
        at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationHistoryProtocolPBClientImpl.getApplicationReport(ApplicationHistoryProtocolPBClientImpl.java:108) ~[hadoop-yarn-common-2.8.0-20150721.221214-843.jar:?]
        at org.apache.hadoop.yarn.client.api.impl.AHSClientImpl.getApplicationReport(AHSClientImpl.java:101) ~[hadoop-yarn-client-2.8.0-20150721.221233-841.jar:?]
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:442) ~[hadoop-yarn-client-2.8.0-20150721.221233-841.jar:?]
        at org.apache.tez.client.TezYarnClient.getApplicationReport(TezYarnClient.java:89) ~[tez-api-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
        at org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:835) ~[tez-api-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
        at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:713) ~[tez-api-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
        at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:723) ~[tez-api-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
        at org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:453) ~[tez-api-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
        at org.apache.tez.client.TezClient.submitDAG(TezClient.java:391) ~[tez-api-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.exec.tez.TezTask.submit(TezTask.java:409) ~[hive-exec-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
{code}
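
A rough sketch of the wrapping the summary asks for, not the actual patch. It assumes the fix amounts to catching the java.net.ConnectException seen in the trace above and rethrowing it as a session-level exception, so the caller can decide to reopen the session. The SessionNotRunning class below is a local stand-in for org.apache.tez.dag.api.SessionNotRunning so that the example compiles on its own, and callOrSessionNotRunning is a hypothetical helper name, not an existing Tez method.

```java
import java.net.ConnectException;
import java.util.concurrent.Callable;

public class SessionWrapSketch {

    /** Local stand-in for org.apache.tez.dag.api.SessionNotRunning (assumption). */
    static class SessionNotRunning extends Exception {
        SessionNotRunning(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Hypothetical helper: runs a remote lookup and translates a refused
     * connection into SessionNotRunning, keeping the original exception as
     * the cause. Any other exception propagates unchanged.
     */
    static <T> T callOrSessionNotRunning(Callable<T> lookup) throws Exception {
        try {
            return lookup.call();
        } catch (ConnectException e) {
            throw new SessionNotRunning(
                "Could not reach the cluster while looking up the session", e);
        }
    }

    public static void main(String[] args) throws Exception {
        try {
            // Simulate the failure from the trace: the lookup cannot connect.
            callOrSessionNotRunning(() -> {
                throw new ConnectException("Connection refused");
            });
        } catch (SessionNotRunning e) {
            // A client would treat this as "session is gone" and reopen it.
            System.out.println("Session not running, cause: " + e.getCause());
        }
    }
}
```

With this shape the client sees one session-level exception type instead of a raw IPC failure, which is the behaviour the summary describes.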





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
