[ https://issues.apache.org/jira/browse/TEZ-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373330#comment-15373330 ]
Hitesh Shah edited comment on TEZ-3335 at 7/12/16 5:58 PM:
-----------------------------------------------------------

Seems like a bug in YARN that should be fixed too? Where if the RM does not know about it, it means app has completed with final state/status unknown and therefore either the RM or AHS should inject some state denoting completion?

was (Author: hitesh):
Seems like a bug in YARN that should be fixed too? Where if the RM does not know about it, it means app has completed with final state/status unknown?

> DAG client thinks app is still running when app status is null
> --------------------------------------------------------------
>
>                 Key: TEZ-3335
>                 URL: https://issues.apache.org/jira/browse/TEZ-3335
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.1
>            Reporter: Jason Lowe
>
> When an RM restarts without recovering apps (i.e.: either work-preserving is
> not enabled or state store was removed) and the YARN application history is
> enabled then YarnClient can return an application report with the app status
> as null. The RM doesn't know about the application, so the client redirects
> to the AHS. The AHS knows the app started at some point but will never
> receive a finished event, hence the null app status.
> The DAG client fails to detect this scenario and believes the app is still
> running, so for example Hive clients will continue to hammer for status on an
> app that doesn't exist.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
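
For illustration only, the check the description implies could look roughly like the sketch below. It is not the Tez DAG client code and does not use the real YARN classes: AppState, AppReport, ClientView and interpret() are hypothetical stand-ins, and treating a null state as "completed, final status unknown" is an assumption taken from the comment above, not a confirmed fix.

```java
public class NullAppStateCheck {
    // Hypothetical, reduced stand-in for the YARN application state.
    public enum AppState { RUNNING, FINISHED, FAILED, KILLED }

    // Hypothetical stand-in for an application report. The state may be
    // null when the report comes from the AHS and no finish event was
    // ever recorded for the app.
    public static final class AppReport {
        private final String appId;
        private final AppState state;

        public AppReport(String appId, AppState state) {
            this.appId = appId;
            this.state = state;
        }

        public String getAppId() { return appId; }
        public AppState getState() { return state; }
    }

    // What the client concludes from a report (hypothetical).
    public enum ClientView { RUNNING, COMPLETED, COMPLETED_UNKNOWN }

    // Treat a missing report or a null state as "completed, final status
    // unknown" instead of falling through to "still running".
    public static ClientView interpret(AppReport report) {
        if (report == null || report.getState() == null) {
            return ClientView.COMPLETED_UNKNOWN;
        }
        if (report.getState() == AppState.RUNNING) {
            return ClientView.RUNNING;
        }
        return ClientView.COMPLETED;
    }

    public static void main(String[] args) {
        // Report as the AHS might return it after an RM restart without recovery.
        AppReport lost = new AppReport("app_1", null);
        System.out.println(lost.getAppId() + " -> " + interpret(lost));
    }
}
```

With a check of this shape, a polling client (Hive, for example) would have a terminal condition to stop on instead of continuing to ask for status on an app the RM no longer knows about.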