I was planning on talking about this at the Oozie BoF session yesterday,
but I got stuck at the YARN BoF session.  Anyway, I've attached a few
slides I had prepared, though it's mostly the same info as in my previous
email and on OOZIE-1770.

Please let me know what you think.

- Robert

On Fri, May 1, 2015 at 1:26 PM, Robert Kanter <[email protected]> wrote:

> Hi all,
>
> This is more of a longer-term idea, but it could always be started sooner in a
> feature branch.  I think a great flagship feature for Oozie 5 would be
> running Oozie on YARN instead of MapReduce.  As you all know, the launcher
> MR job is essentially a big hack and adds all kinds of overhead and
> complications.  If we used AMs to run jobs directly in YARN containers
> instead, that would give us so many advantages; plus, YARN's purpose is
> exactly what we've been hijacking MR for.
>
> This is obviously a large feature, and will require a major revamping of
> most of the action types, all the checking code, etc.  It would be great if
> we could all work on this.  We're still discussing internally how much time
> we (Cloudera) can devote to this, but I wanted to gauge what others thought
> of this idea and if you'd be interested in working on it.
>
> Karthik and I already posted a hacky proof of concept, called OYA (Oozie on
> YARN), to OOZIE-1770
> <https://issues.apache.org/jira/browse/OOZIE-1770>.  We worked on it during
> a Hackathon, so I imagine the patch needs some tweaking to apply cleanly at
> this point.  I think it can serve as a basis for how this would work
> (Karthik is working on getting the AM pool into YARN itself, so that part
> of the patch wouldn't be needed anymore).  The JIRA also has a list of
> things still to do or improve.
>
> Once we have Oozie on YARN, we'd be able to finally fix some of the
> long-standing pain points in Oozie, which I'm pretty excited about:
> - Displaying the logs from the launcher inside Oozie!  YARN has an API
> call for this.
> - Full control over the classpath of the launcher: we can make the
> sharelib optional if the user has the necessary jars installed on all nodes
> - We can run actions more similarly to how users run them (by calling
> their wrapper scripts instead of their Java main classes directly), which should
> cut down on the "It works from the CLI but not from Oozie" problems
>
> Please let me know what you think.
>
> thanks
> - Robert
>
