Jonathan Eagles created TEZ-2064:
------------------------------------

             Summary: SessionNotRunning Exception not thrown in all cases
                 Key: TEZ-2064
                 URL: https://issues.apache.org/jira/browse/TEZ-2064
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Jonathan Eagles
            Priority: Critical


Hive handles SessionNotRunning during submitDAG() and restarts the Tez session
if it receives one. In YHIVE-15, we did not receive that exception and the
query failed. In some scenarios the application falls out of the RM's knowledge
and an ApplicationNotFound exception is received instead.
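
To illustrate the gap, here is a minimal, self-contained sketch of the
restart-on-SessionNotRunning pattern. It does not use the real Tez or Hive
classes: Session, SessionFactory, submitWithRestart() and both exception types
are hypothetical stand-ins named after the wording in this issue. It only shows
that a handler written for SessionNotRunning never sees an ApplicationNotFound
failure, so no restart is attempted.

```java
class SessionRestartSketch {
    // Hypothetical stand-ins for the exception types named in this issue.
    static class SessionNotRunning extends Exception {
        SessionNotRunning(String msg) { super(msg); }
    }
    static class ApplicationNotFound extends Exception {
        ApplicationNotFound(String msg) { super(msg); }
    }

    // Hypothetical minimal session interface; this is not the real TezClient API.
    interface Session {
        String submitDAG(String dag) throws SessionNotRunning, ApplicationNotFound;
    }

    // Hypothetical factory that opens a replacement session.
    interface SessionFactory {
        Session open();
    }

    // Restart the session and retry once, but only when SessionNotRunning is seen.
    static String submitWithRestart(Session current, SessionFactory factory, String dag)
            throws SessionNotRunning, ApplicationNotFound {
        try {
            return current.submitDAG(dag);
        } catch (SessionNotRunning e) {
            Session fresh = factory.open();
            return fresh.submitDAG(dag);
        }
        // ApplicationNotFound is not caught here, so it propagates to the caller.
    }

    public static void main(String[] args) throws Exception {
        Session healthy = dag -> "submitted " + dag;
        Session dead = dag -> { throw new SessionNotRunning("session shut down"); };
        Session expired = dag -> { throw new ApplicationNotFound("application unknown to RM"); };

        // Recovered: the handler opens a fresh session and retries.
        System.out.println(submitWithRestart(dead, () -> healthy, "q1"));

        // Not recovered: the failure bypasses the restart logic.
        try {
            submitWithRestart(expired, () -> healthy, "q2");
        } catch (ApplicationNotFound e) {
            System.out.println("query failed: " + e.getMessage());
        }
    }
}
```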

Here are my asks.

1. TezClient.submitDAG()/stop() should throw a SessionNotRunning exception if
the application has expired. Basically, any API that currently throws
SessionNotRunning should also handle the app-not-found scenario.

2. It would help if TezClient.getAppMasterStatus() could return
TezAppMasterStatus.SHUTDOWN if the Tez session application does not exist in
the RM. That way, as a precaution, applications could check before submitting
DAGs.

3. I think it might be better if verifySessionStateForSubmission() checked the
app status every time instead of checking sessionStarted. I am not sure about
side effects, so I will leave that to your decision.
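
The three asks could look roughly like the sketch below. This is only an
illustration under stated assumptions, not the actual TezClient code: Backend,
Client, backend() and the exception and enum types are hypothetical stand-ins,
and the real status lookup and submission paths are reduced to two methods.

```java
class AppNotFoundMappingSketch {
    // Hypothetical stand-ins for the types named in this issue.
    static class SessionNotRunning extends Exception {
        SessionNotRunning(String msg, Throwable cause) { super(msg, cause); }
    }
    static class ApplicationNotFound extends Exception {
        ApplicationNotFound(String msg) { super(msg); }
    }
    enum AppMasterStatus { INITIALIZING, READY, RUNNING, SHUTDOWN }

    // Hypothetical view of the calls the client makes to the RM and the AM.
    interface Backend {
        AppMasterStatus queryStatus() throws ApplicationNotFound;
        String submit(String dag) throws ApplicationNotFound;
    }

    // Test helper: each flag says whether that call still finds the application.
    static Backend backend(boolean statusKnown, boolean submitKnown) {
        return new Backend() {
            public AppMasterStatus queryStatus() throws ApplicationNotFound {
                if (!statusKnown) throw new ApplicationNotFound("unknown to RM");
                return AppMasterStatus.READY;
            }
            public String submit(String dag) throws ApplicationNotFound {
                if (!submitKnown) throw new ApplicationNotFound("unknown to RM");
                return "submitted " + dag;
            }
        };
    }

    static class Client {
        private final Backend backend;
        Client(Backend backend) { this.backend = backend; }

        // Ask 2: report SHUTDOWN when the application no longer exists in the RM.
        AppMasterStatus getAppMasterStatus() {
            try {
                return backend.queryStatus();
            } catch (ApplicationNotFound e) {
                return AppMasterStatus.SHUTDOWN;
            }
        }

        String submitDAG(String dag) throws SessionNotRunning {
            // Ask 3: check the live status on every submission, not a cached flag.
            if (getAppMasterStatus() == AppMasterStatus.SHUTDOWN) {
                throw new SessionNotRunning("session application is shut down", null);
            }
            try {
                return backend.submit(dag);
            } catch (ApplicationNotFound e) {
                // Ask 1: surface app-not-found as SessionNotRunning.
                throw new SessionNotRunning("application not found in RM", e);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(new Client(backend(true, true)).submitDAG("q1"));
        System.out.println(new Client(backend(false, false)).getAppMasterStatus());
    }
}
```

With this shape, a caller that already restarts on SessionNotRunning would need
no changes to recover from an expired application, including the case where the
application disappears between the status check and the submission.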


If 3 takes time, we can pursue it later. It would really help to get 1 & 2 into
the next Tez release, especially for busy grids.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
