Jonathan Eagles created TEZ-2064:
------------------------------------
Summary: SessionNotRunning Exception not thrown in all cases
Key: TEZ-2064
URL: https://issues.apache.org/jira/browse/TEZ-2064
Project: Apache Tez
Issue Type: Bug
Reporter: Jonathan Eagles
Priority: Critical
Hive handles SessionNotRunning during submitDAG() and restarts the tez-session
if it receives one. In YHIVE-15, we did not receive that exception and the
query failed. In some scenarios the Application will fall out of the RM's
knowledge and an ApplicationNotFound exception is received instead.
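
The failure mode above can be illustrated with a minimal sketch. It is not
Hive or Tez code: SessionNotRunning and ApplicationNotFound are local stand-ins
for the real exception types, and Session, SessionFactory and
submitWithRestart() are hypothetical names introduced only for illustration.
The caller restarts the session when it sees SessionNotRunning, so an
ApplicationNotFound passes straight through the handler and fails the query.

```java
class SessionRetrySketch {
    // Local stand-in for Tez's SessionNotRunning (assumption, not the real class).
    static class SessionNotRunning extends Exception {
        SessionNotRunning(String msg) { super(msg); }
    }

    // Local stand-in for the RM's "application not found" error (assumption).
    static class ApplicationNotFound extends Exception {
        ApplicationNotFound(String msg) { super(msg); }
    }

    // Hypothetical minimal view of a session: submit a DAG by name, get back an id.
    interface Session {
        String submitDAG(String dagName) throws SessionNotRunning, ApplicationNotFound;
    }

    // Hypothetical factory used to start a replacement session.
    interface SessionFactory {
        Session open();
    }

    // Caller-side pattern described in the report: restart the session only
    // when SessionNotRunning is received, then retry the submission once.
    static String submitWithRestart(Session current, SessionFactory factory, String dag)
            throws SessionNotRunning, ApplicationNotFound {
        try {
            return current.submitDAG(dag);
        } catch (SessionNotRunning e) {
            Session fresh = factory.open();
            return fresh.submitDAG(dag);
        }
        // ApplicationNotFound is deliberately not caught here, so it propagates
        // to the caller and the query fails.
    }

    public static void main(String[] args) throws Exception {
        Session expired = dag -> { throw new SessionNotRunning("session expired"); };
        Session healthy = dag -> "submitted:" + dag;
        System.out.println(submitWithRestart(expired, () -> healthy, "query-1"));
    }
}
```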
Here are my asks.
1. TezClient.submitDAG()/stop() should throw a SessionNotRunning exception if
the application has expired. Basically, any API which currently throws
SessionNotRunning should also handle the app-not-found scenario.
2. It would help if TezClient.getAppMasterStatus() could return
TezAppMasterStatus.SHUTDOWN if the tez-session application does not exist in
the RM. That way, as a precaution, applications could check before submitting
DAGs.
3. I think it might be better if verifySessionStateForSubmission() checks the
app status every time instead of checking sessionStarted. I am not sure about
side effects, so I will leave that to your decision.
If 3 takes time, we can pursue that later. It would really help to get 1 & 2 in
the next tez release, especially for busy grids.
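
Asks 1 and 2 could look roughly like the sketch below. It is only an
illustration under stated assumptions, not the TezClient implementation:
ResourceManager, ForgetfulRm and AppMasterStatus are hypothetical stand-ins,
and the two exception classes are local substitutes for the real ones. The
point is the translation step: an app-not-found error becomes
SessionNotRunning on submission and SHUTDOWN on a status query.

```java
class AppNotFoundTranslationSketch {
    // Local stand-ins for the real exception types (assumptions).
    static class SessionNotRunning extends Exception {
        SessionNotRunning(String msg, Throwable cause) { super(msg, cause); }
    }

    static class ApplicationNotFound extends Exception {
        ApplicationNotFound(String msg) { super(msg); }
    }

    // Stand-in for the status enum referenced in ask 2.
    enum AppMasterStatus { INITIALIZING, READY, RUNNING, SHUTDOWN }

    // Hypothetical view of the RM calls the client depends on.
    interface ResourceManager {
        AppMasterStatus queryStatus(String appId) throws ApplicationNotFound;
        String submit(String appId, String dag) throws ApplicationNotFound;
    }

    // Hypothetical RM that has already forgotten every application.
    static class ForgetfulRm implements ResourceManager {
        public AppMasterStatus queryStatus(String appId) throws ApplicationNotFound {
            throw new ApplicationNotFound("unknown application " + appId);
        }
        public String submit(String appId, String dag) throws ApplicationNotFound {
            throw new ApplicationNotFound("unknown application " + appId);
        }
    }

    // Ask 1: surface app-not-found as SessionNotRunning, keeping the cause.
    static String submitDAG(ResourceManager rm, String appId, String dag)
            throws SessionNotRunning {
        try {
            return rm.submit(appId, dag);
        } catch (ApplicationNotFound e) {
            throw new SessionNotRunning("Application " + appId + " is not known to the RM", e);
        }
    }

    // Ask 2: report SHUTDOWN when the RM no longer knows the application.
    static AppMasterStatus getAppMasterStatus(ResourceManager rm, String appId) {
        try {
            return rm.queryStatus(appId);
        } catch (ApplicationNotFound e) {
            return AppMasterStatus.SHUTDOWN;
        }
    }

    public static void main(String[] args) {
        ResourceManager rm = new ForgetfulRm();
        System.out.println(getAppMasterStatus(rm, "app_1"));
    }
}
```

With this shape, a caller that already restarts on SessionNotRunning needs no
new handling, and a precautionary status check before submission sees
SHUTDOWN instead of an unexpected exception.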