Hi Shyla,

You have several options, some of which have already been listed, but let me try to clarify.
Assuming your Spark application is packaged in a jar, you have a variety of options. You need an existing Spark cluster, running either on EMR or somewhere else.

*Super simple / hacky*
A cron job on EC2 that calls a simple shell script, which either does a spark-submit to a Spark cluster or adds a step to an EMR cluster.

*More Elegant*
Airflow/Luigi/AWS Data Pipeline (which is just cron in a UI) will do the step above, but with scheduling, potential backfilling, and error handling (retries, alerts, etc.).

AWS is coming out with Glue <https://aws.amazon.com/glue/> soon, which runs some Spark jobs, but I do not think it's available worldwide just yet.

Hope I cleared things up.

Regards,
Sam

On Fri, Apr 7, 2017 at 6:05 AM, Gourav Sengupta <gourav.sengu...@gmail.com> wrote:

> Hi Shyla,
>
> why would you want to schedule a spark job in EC2 instead of EMR?
>
> Regards,
> Gourav
>
> On Fri, Apr 7, 2017 at 1:04 AM, shyla deshpande <deshpandesh...@gmail.com>
> wrote:
>
>> I want to run a spark batch job maybe hourly on AWS EC2. What is the
>> easiest way to do this. Thanks
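
P.S. To make the cron option above a bit more concrete, here is a minimal sketch of a launcher script that cron could call every hour. It is only an illustration: the master URL, main class, jar path, and script location are all placeholders I made up, and it assumes spark-submit is on the PATH of the EC2 instance and that the cluster is reachable from it.

```python
#!/usr/bin/env python3
"""Hourly Spark batch launcher, meant to be called from cron (sketch).

Example crontab entry (paths are hypothetical):
    0 * * * * /usr/bin/python3 /home/ec2-user/run_spark_job.py --submit
"""
import shlex
import subprocess
import sys

# All values below are placeholders, not taken from the thread.
MASTER_URL = "spark://spark-master.example.internal:7077"
MAIN_CLASS = "com.example.HourlyBatchJob"
APP_JAR = "/opt/jobs/hourly-batch-job.jar"


def build_spark_submit_cmd(master, main_class, jar, app_args=()):
    """Return the spark-submit invocation as an argument list."""
    return [
        "spark-submit",
        "--master", master,
        "--deploy-mode", "client",
        "--class", main_class,
        jar,
        *app_args,
    ]


if __name__ == "__main__":
    cmd = build_spark_submit_cmd(MASTER_URL, MAIN_CLASS, APP_JAR)
    if "--submit" in sys.argv[1:]:
        # Propagate spark-submit's exit code so cron sees failures.
        sys.exit(subprocess.run(cmd).returncode)
    # Without --submit, only print the command (dry run).
    print(shlex.join(cmd))
```

Running the script without --submit just prints the command, which is handy for checking it before adding the crontab entry. Note that plain cron gives you no retries or backfilling, which is where Airflow/Luigi come in.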