Apologies if this has been asked before, but I can't figure out how to
search the archives of this mailing list and 20 minutes of googling yielded
no useful results.

I'm on a team that uses Cascading to do our MapReduce flows.  However, we
are investigating using Oozie to run additional types of actions (Hive,
shell, etc.) and to use its scheduler.  For this to work, we'll need to be
able to run a Cascading job as an Oozie action, which is what I can't
figure out how to do.

Typically to run a Cascading job, we'll do this:

hadoop jar mycascading_uberjar.jar com.company.MyCascadingFlow arg1 arg2 arg3 argN

My first thought was to use an Oozie map-reduce action, since I run this
with "hadoop jar" and Cascading creates MapReduce jobs under the hood, but
the Oozie map-reduce action wants things like mapred.mapper.class
and mapred.reducer.class.  MyCascadingFlow, however, runs two dozen
different mappers and a few different reducers!

What is the best way to do this?  The java action seems wrong since it
won't run it with "hadoop jar".  That leaves me with just a shell action:
putting the "hadoop jar ...." line in a shell script and invoking it.
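
To make the shell-action idea concrete, here is a minimal sketch of what
that wrapper script might contain.  The jar and class names are the ones
from the command above; the function name run_cascading_flow and the
HADOOP_CMD override are hypothetical, added only so the command can be
swapped out for a dry run.  It assumes the hadoop client is on the PATH of
whichever node runs the action and that the uberjar has been shipped into
the action's working directory; I haven't confirmed either for our setup.

```shell
#!/bin/sh
# Hypothetical wrapper for an Oozie shell action (sketch, not tested on a cluster).

run_cascading_flow() {
    # HADOOP_CMD is an assumed override for dry runs; it defaults to "hadoop".
    # "$@" forwards arg1 ... argN unchanged, keeping arguments with spaces intact.
    "${HADOOP_CMD:-hadoop}" jar mycascading_uberjar.jar \
        com.company.MyCascadingFlow "$@"
}

# In the real script, the last line would be:
#   run_cascading_flow "$@"
```

Because the function's exit status is that of the hadoop command, a failed
flow would surface as a non-zero exit from the script.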

Other ideas?

-Michael
