Re: Tools to manage workflows on Spark

2015-03-01 Thread Qiang Cao
>> rkContext or RDD, or leverage caching or temp table, going into another
>> Oozie action. You could either save output to file or put all Spark
>> processing into one Oozie action.
>>
>> --- Original Message ---
>>
>> From: "Mayur Rustagi"

Re: Tools to manage workflows on Spark

2015-02-28 Thread Qiang Cao
>> Not sure if it is being actively maintained.
>>
>> On Sat, Feb 28, 2015 at 6:26 PM, Qiang Cao wrote:
>>
>>> Thanks for the pointer, Ashish! I was also looking at Spork
>>> (https://github.com/sigmoidanalytics/spork Pig-on-Spark), but wasn't
>>> sure i

Re: Tools to manage workflows on Spark

2015-02-28 Thread Qiang Cao
> to get the idea for my implementation -
>
> http://mail-archives.apache.org/mod_mbox/oozie-user/201404.mbox/%3CCAHCsPn-0Grq1rSXrAZu35yy_i4T=fvovdox2ugpcuhkwmjp...@mail.gmail.com%3E
>
> On Feb 28, 2015, at 3:25 PM, Qiang Cao wrote:
>
> > Thanks, Ashish! Is Oozie integ

Re: Tools to manage workflows on Spark

2015-02-28 Thread Qiang Cao
Thanks, Ashish! Is Oozie integrated with Spark? I know it can accommodate some Hadoop jobs.

On Sat, Feb 28, 2015 at 6:07 PM, Ashish Nigam wrote:
> Qiang,
> Did you look at Oozie?
> We use Oozie to run Spark jobs in production.
>
> On Feb 28, 2015, at 2:45 PM, Qiang

Tools to manage workflows on Spark

2015-02-28 Thread Qiang Cao
Hi Everyone,

We need to deal with workflows on Spark. In our scenario, each workflow consists of multiple processing steps, and there may be dependencies between steps. I'm wondering if there are tools available that can help us schedule and manage workflows on Spark. I'm looking for somet
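
For readers skimming the thread, the core of the question (steps that may depend on other steps) can be illustrated with a small sketch. This is not any tool named in the thread. It uses only Python's standard `graphlib` module (Python 3.9+) to run steps in dependency order. The step names and the `run_step` callback are hypothetical; in practice the callback might submit a Spark job, but here it does nothing.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
workflow = {
    "load": set(),
    "clean": {"load"},
    "aggregate": {"clean"},
    "report": {"aggregate", "clean"},
}


def run_workflow(deps, run_step):
    """Call run_step on each step after all of its dependencies.

    Returns the list of steps in the order they were run.
    TopologicalSorter raises graphlib.CycleError if the
    dependencies contain a cycle.
    """
    order = []
    for step in TopologicalSorter(deps).static_order():
        run_step(step)  # assumption: a real callback would launch the job
        order.append(step)
    return order


# Placeholder callback that does nothing; swap in real job submission.
executed = run_workflow(workflow, lambda step: None)
```

A workflow manager such as Oozie adds scheduling, retries, and monitoring on top of this kind of ordering; the sketch only shows the dependency resolution itself.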