Hi Sam, Thank you for the reply.
What do you mean by "I doubt people run spark in a single EC2 instance, certainly not in production I don't think"?

What is wrong with having a data pipeline on EC2 that reads data from Kafka, processes it using Spark, and outputs to Cassandra? Please explain.

Thanks
-Anna

On Wed, Apr 26, 2017 at 2:22 PM, Sam Elamin <hussam.ela...@gmail.com> wrote:

> Hi Anna
>
> There are a variety of options for launching spark clusters. I doubt
> people run spark in a single EC2 instance, certainly not in production I
> don't think
>
> I don't have enough information of what you are trying to do but if you
> are just trying to set things up from scratch then I think you can just use
> EMR which will create a cluster for you and attach a zeppelin instance as
> well
>
> You can also use databricks for ease of use and very little management but
> you will pay a premium for that abstraction
>
> Regards
> Sam
>
> On Wed, 26 Apr 2017 at 22:02, anna stax <annasta...@gmail.com> wrote:
>
>> I need to set up a spark cluster for Spark streaming and scheduled batch
>> jobs and adhoc queries.
>> Please give me some suggestions. Can this be done in standalone mode?
>>
>> Right now we have a spark cluster in standalone mode on AWS EC2 running
>> a spark streaming application. Can we run spark batch jobs and zeppelin on
>> the same cluster? Do we need a better resource manager like Mesos?
>>
>> Are there any companies or individuals that can help in setting this up?
>>
>> Thank you.
>>
>> -Anna
>>
>