[ https://issues.apache.org/jira/browse/HIVE-24068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Prasanth Jayachandran resolved HIVE-24068.
------------------------------------------
    Resolution: Fixed

PR merged to master. Thanks [~kgyrtkirk] for the review!

> Add re-execution plugin for handling DAG submission and unmanaged AM failures
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-24068
>                 URL: https://issues.apache.org/jira/browse/HIVE-24068
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 4.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> DAG submission failures can also happen in environments where the AM
> container died, causing DNS issues. DAG submissions are safe to retry because
> the DAG hasn't started execution yet. There are retries at the getSession and
> submitDAG levels individually, but some submitDAG failures have to retry
> getSession as well, since the AM could be unreachable. This can be handled in
> a re-execution plugin.
> There is already an AM loss retry execution plugin, but it only handles
> managed AMs. It can be extended to handle unmanaged AMs as well.
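
The retry flow in the description (on a submitDAG failure, go back and retry getSession too, because the old AM may be unreachable) could be sketched roughly as below. This is only an illustration under stated assumptions, not the code merged for this issue: SessionSource, Session, DagSubmissionException, and submitWithReexecution are hypothetical names invented for the example and are not Hive or Tez APIs.

```java
// Minimal sketch of "retry the whole getSession + submitDAG pair".
// All type and method names here are hypothetical, not Hive/Tez APIs.
final class DagSubmitRetrySketch {

    /** Hypothetical failure raised before the DAG has started executing. */
    public static class DagSubmissionException extends Exception {
        public DagSubmissionException(String message) {
            super(message);
        }
    }

    /** Hypothetical handle to an AM session. */
    public interface Session {
        String submitDag(String dag) throws DagSubmissionException;
    }

    /** Hypothetical source of sessions; stands in for the getSession step. */
    public interface SessionSource {
        Session getSession() throws DagSubmissionException;
    }

    /**
     * Acquires a fresh session on every attempt, so a submit failure caused by
     * an unreachable AM is not retried against the same dead session.
     * Retrying is assumed safe only because the DAG has not started running.
     */
    public static String submitWithReexecution(SessionSource source, String dag, int maxAttempts)
            throws DagSubmissionException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        DagSubmissionException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                Session session = source.getSession(); // re-acquired each attempt
                return session.submitDag(dag);
            } catch (DagSubmissionException e) {
                last = e; // remember the failure and try again with a new session
            }
        }
        throw last; // every attempt failed; surface the most recent failure
    }

    public static void main(String[] args) throws Exception {
        int[] sessionsHandedOut = {0};
        SessionSource source = () -> {
            int n = ++sessionsHandedOut[0];
            // The first session simulates an unreachable AM; later ones succeed.
            return dag -> {
                if (n == 1) {
                    throw new DagSubmissionException("AM unreachable");
                }
                return "submitted:" + dag;
            };
        };
        System.out.println(submitWithReexecution(source, "demo-dag", 3));
    }
}
```

The design point is that the retry boundary wraps both steps together: independent retries inside getSession or submitDAG alone would keep reusing a session whose AM is gone.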