[ https://issues.apache.org/jira/browse/OOZIE-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shwetha G S updated OOZIE-1722:
-------------------------------
Fix Version/s: 4.2 (was: 4.1.0)
> When an ApplicationMaster restarts, it restarts the launcher job
> ----------------------------------------------------------------
>
> Key: OOZIE-1722
> URL: https://issues.apache.org/jira/browse/OOZIE-1722
> Project: Oozie
> Issue Type: Improvement
> Affects Versions: trunk
> Reporter: Robert Kanter
> Assignee: Robert Kanter
> Fix For: 4.2
>
> Attachments: OOZIE-1722.patch, OOZIE-1722.patch, OOZIE-1722.patch,
> OOZIE-1722.patch, OOZIE-1722.patch
>
>
> When using Yarn, there are some situations in which the ApplicationMaster can
> be restarted (e.g. RM failover, the AM dies and another attempt is made,
> etc).
> When this happens, it starts the launcher job again, which will start over.
> So, if that launcher has already launched a job, we'll end up with two
> instances of the same job, which can be problematic. For example, if you
> have a Pig action, the Pig client might run a job, but then the launcher gets
> restarted by an AM restart and launches that same job again.
> We don't have a way of "re-attaching" to previously launched jobs; however,
> with YARN-1461 and MAPREDUCE-5699, we can use yarn tags to find anything the
> launcher previously launched that's still running and kill it. We still have to
> start over, but at least we're not running two instances of a job at the same
> time.
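
The find-and-kill idea in the paragraph above can be sketched roughly as follows. This is only an illustration under stated assumptions: `App`, `killTaggedChildren`, and the `launcher-42` tag format are hypothetical names, and the in-memory list stands in for a real query to the ResourceManager. It does not use the YARN client API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TagBasedCleanup {

    /** Hypothetical stand-in for a YARN application record; not a Hadoop class. */
    static final class App {
        final String id;
        final Set<String> tags;
        boolean running;

        App(String id, boolean running, String... tags) {
            this.id = id;
            this.running = running;
            this.tags = new HashSet<>(Arrays.asList(tags));
        }
    }

    /**
     * Marks every running application that carries launcherTag as killed
     * and returns the ids of the applications that were stopped.
     */
    static List<String> killTaggedChildren(List<App> cluster, String launcherTag) {
        List<String> killed = new ArrayList<>();
        for (App app : cluster) {
            if (app.running && app.tags.contains(launcherTag)) {
                app.running = false; // stands in for a real kill request
                killed.add(app.id);
            }
        }
        return killed;
    }

    public static void main(String[] args) {
        List<App> cluster = new ArrayList<>();
        cluster.add(new App("job_1", true, "launcher-42"));  // child of our launcher
        cluster.add(new App("job_2", false, "launcher-42")); // already finished
        cluster.add(new App("job_3", true, "launcher-99"));  // another launcher's child
        System.out.println(killTaggedChildren(cluster, "launcher-42"));
    }
}
```

Only running applications with the launcher's own tag are touched, so finished jobs and other launchers' children are left alone. The restarted launcher would call something like this once, before starting over.
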
> Here's what we can do for each action type:
> - Pig, Sqoop, Hive
> -- Kill previously launched jobs and start over
> - MapReduce (different because of the optimization)
> -- Exit launcher if a previously launched job already exists
> - Java, Shell
> -- No out-of-the-box support for this
> -- Like with other things, the Java action can take advantage of this like
> Pig, Sqoop, and Hive if the user adds some code
> - DistCp
> -- Not supported
> - SSH, Email
> -- N/A
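
The per-action behavior listed above could be captured as a simple lookup, sketched below. The `Recovery` enum, the `forAction` helper, and the lowercase action-type keys are assumptions made for illustration; they are not taken from the patch.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class RestartPolicy {

    /** What a restarted launcher does about jobs it launched before the restart. */
    enum Recovery {
        KILL_CHILDREN_AND_START_OVER, // Pig, Sqoop, Hive
        EXIT_IF_CHILD_EXISTS,         // MapReduce
        NO_BUILT_IN_SUPPORT,          // Java, Shell (Java can opt in with user code)
        NOT_SUPPORTED,                // DistCp
        NOT_APPLICABLE                // SSH, Email
    }

    private static final Map<String, Recovery> POLICY = new HashMap<>();
    static {
        POLICY.put("pig", Recovery.KILL_CHILDREN_AND_START_OVER);
        POLICY.put("sqoop", Recovery.KILL_CHILDREN_AND_START_OVER);
        POLICY.put("hive", Recovery.KILL_CHILDREN_AND_START_OVER);
        POLICY.put("map-reduce", Recovery.EXIT_IF_CHILD_EXISTS);
        POLICY.put("java", Recovery.NO_BUILT_IN_SUPPORT);
        POLICY.put("shell", Recovery.NO_BUILT_IN_SUPPORT);
        POLICY.put("distcp", Recovery.NOT_SUPPORTED);
        POLICY.put("ssh", Recovery.NOT_APPLICABLE);
        POLICY.put("email", Recovery.NOT_APPLICABLE);
    }

    /** Looks up the recovery behavior for an action type, ignoring case. */
    static Recovery forAction(String actionType) {
        Recovery recovery = POLICY.get(actionType.toLowerCase(Locale.ROOT));
        if (recovery == null) {
            throw new IllegalArgumentException("Unknown action type: " + actionType);
        }
        return recovery;
    }

    public static void main(String[] args) {
        System.out.println(forAction("pig"));
        System.out.println(forAction("map-reduce"));
    }
}
```
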
> The yarn tags won't be available until Hadoop 2.4.0, but they are in the nightly
> (i.e. Hadoop 3.0.0-SNAPSHOT), and they're obviously not in Hadoop 1.x. To be
> able to use the Yarn methods and the new methods for tagging, we can add a
> new type of Hadooplib called "Hadoop Utils" where we can put classes that are
> specific to a particular version of Hadoop; the other implementations can have
> dummy versions. For example, in the Hadoop-2 Hadoop Utils, we can put a
> method foo() that calls some yarn stuff but in the Hadoop-1 Hadoop Utils, the
> foo() method would either do the equivalent in MR1 or a no-op. So for now, I
> put some methods in the Hadoop-3 Hadoop Utils that use the tags and the
> Hadoop-1, Hadoop-2, and Hadoop-23 Hadoop Utils all have dummy implementations
> that don't do anything (so the existing behavior is preserved). The Hadoop
> Utils modules will allow us to take advantage of Hadoop 2-only features in
> the future, while still being able to compile against Hadoop 1; so it's not
> just limited to this feature.
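
The "Hadoop Utils" arrangement described above might look something like the sketch below. It assumes hypothetical names throughout (`HadoopUtils`, `NoOpHadoopUtils`, `TagAwareHadoopUtils`, `killChildJobs`, `recover`). It also models the version-specific variants as two implementations of one interface in a single file so that it runs on its own, whereas the description above has each Hadoop version's module supply its own copy of the class.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class HadoopUtilsSketch {

    /** Hypothetical common surface that every per-version module would expose. */
    interface HadoopUtils {
        /** Kills jobs previously started by this launcher; returns their ids. */
        List<String> killChildJobs(String launcherTag);
    }

    /** Dummy version for Hadoop lines without tags: existing behavior is preserved. */
    static final class NoOpHadoopUtils implements HadoopUtils {
        @Override
        public List<String> killChildJobs(String launcherTag) {
            return Collections.emptyList();
        }
    }

    /** Tag-aware version; the job list stands in for a real cluster query. */
    static final class TagAwareHadoopUtils implements HadoopUtils {
        private final List<String[]> runningJobs; // each entry: {jobId, tag}

        TagAwareHadoopUtils(List<String[]> runningJobs) {
            this.runningJobs = new ArrayList<>(runningJobs);
        }

        @Override
        public List<String> killChildJobs(String launcherTag) {
            List<String> killed = new ArrayList<>();
            Iterator<String[]> it = runningJobs.iterator();
            while (it.hasNext()) {
                String[] job = it.next();
                if (job[1].equals(launcherTag)) {
                    killed.add(job[0]);
                    it.remove(); // stands in for a real kill request
                }
            }
            return killed;
        }
    }

    /** Launcher code only ever talks to the interface, never to a Hadoop version. */
    static List<String> recover(HadoopUtils utils, String launcherTag) {
        return utils.killChildJobs(launcherTag);
    }

    public static void main(String[] args) {
        List<String[]> jobs = new ArrayList<>();
        jobs.add(new String[] {"job_1", "launcher-42"});
        System.out.println(recover(new NoOpHadoopUtils(), "launcher-42"));
        System.out.println(recover(new TagAwareHadoopUtils(jobs), "launcher-42"));
    }
}
```

The point of the pattern is that the calling code stays identical across Hadoop versions; only the module that is linked in changes what the call actually does.
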