Hi Steve,

Why would you ever do that? You are suggesting the use of a CI tool as a workflow and orchestration engine.

Regards,
Gourav Sengupta

On Fri, Apr 7, 2017 at 4:07 PM, Steve Loughran <ste...@hortonworks.com> wrote:

> If you have Jenkins set up for some CI workflow, that can do scheduled
> builds and tests. Works well if you can do some build test before even
> submitting it to a remote cluster.
>
> On 7 Apr 2017, at 10:15, Sam Elamin <hussam.ela...@gmail.com> wrote:
>
> Hi Shyla
>
> You have multiple options really, some of which have already been listed,
> but let me try and clarify.
>
> Assuming you have a Spark application in a jar, you have a variety of
> options.
>
> You have to have an existing Spark cluster that is either running on EMR
> or somewhere else.
>
> *Super simple / hacky*
> Cron job on EC2 that calls a simple shell script that does a spark-submit
> to a Spark cluster, OR creates or adds a step to an EMR cluster
>
> *More Elegant*
> Airflow/Luigi/AWS Data Pipeline (which is just cron in the UI) that will
> do the above step but have scheduling and potential backfilling and error
> handling (retries, alerts, etc.)
>
> AWS are coming out with Glue <https://aws.amazon.com/glue/> soon that
> does some Spark jobs, but I do not think it's available worldwide just yet.
>
> Hope I cleared things up
>
> Regards
> Sam
>
>
> On Fri, Apr 7, 2017 at 6:05 AM, Gourav Sengupta <gourav.sengu...@gmail.com>
> wrote:
>
>> Hi Shyla,
>>
>> Why would you want to schedule a Spark job in EC2 instead of EMR?
>>
>> Regards,
>> Gourav
>>
>> On Fri, Apr 7, 2017 at 1:04 AM, shyla deshpande <deshpandesh...@gmail.com>
>> wrote:
>>
>>> I want to run a Spark batch job maybe hourly on AWS EC2. What is the
>>> easiest way to do this? Thanks
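
For reference, the "super simple" cron option mentioned in the quoted thread could look roughly like the sketch below. It only builds the spark-submit command line and the matching hourly crontab entry. The jar path, main class, master URL, script path and log path are all hypothetical placeholders, not values from this thread, and a real setup would need to match your own cluster.

```python
import shlex
import subprocess
from typing import List


def build_spark_submit_cmd(jar_path: str, main_class: str, master_url: str) -> List[str]:
    """Build the argument list for a spark-submit call.

    All three values are assumptions supplied by the caller; nothing here
    is specific to the cluster discussed in the thread.
    """
    return [
        "spark-submit",
        "--class", main_class,
        "--master", master_url,
        "--deploy-mode", "client",
        jar_path,
    ]


def hourly_cron_line(script_path: str, log_path: str) -> str:
    """Return a crontab entry that runs the wrapper script at minute 0 of every hour."""
    return f"0 * * * * {shlex.quote(script_path)} >> {shlex.quote(log_path)} 2>&1"


def run_job(cmd: List[str]) -> int:
    """Run the command and return its exit code (requires spark-submit on PATH)."""
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    cmd = build_spark_submit_cmd(
        "/opt/jobs/app.jar", "com.example.Main", "spark://master:7077"
    )
    # Print rather than execute, so the sketch is safe to run anywhere.
    print(" ".join(shlex.quote(part) for part in cmd))
    print(hourly_cron_line("/opt/jobs/run_hourly.sh", "/var/log/spark_hourly.log"))
```

The printed crontab line would go into the EC2 user's crontab (`crontab -e`). Plain cron gives no retries, alerting or backfilling, which is the gap the scheduler tools named in the thread are meant to fill.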