You can just cap the cores used per job.

http://spark.apache.org/docs/latest/spark-standalone.html
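
As a minimal sketch of what capping cores could look like, assuming the jobs are launched with spark-submit: the master URL, script names, and core counts below are hypothetical placeholders, not values from this thread. `spark.cores.max` is the standalone-mode property that limits the total cores one application may take; without it, a standalone application by default acquires every available core, which is why a second job would otherwise sit waiting.

```python
# Sketch: build spark-submit commands that cap cores per application on a
# standalone cluster. The master URL, script names, and core counts are
# hypothetical placeholders chosen for illustration.

MASTER_URL = "spark://master-host:7077"  # hypothetical standalone master


def submit_command(app, max_cores, master=MASTER_URL):
    """Return a spark-submit argument list limiting the app to max_cores."""
    if max_cores < 1:
        raise ValueError("max_cores must be at least 1")
    return [
        "spark-submit",
        "--master", master,
        # spark.cores.max caps the total cores this one application may
        # take across the whole standalone cluster.
        "--conf", f"spark.cores.max={max_cores}",
        app,
    ]


# Hypothetical split of a 16-core cluster: 8 cores for streaming, 4 for the
# hourly batch job, and the remaining 4 left free for Zeppelin.
caps = {"streaming_job.py": 8, "hourly_batch.py": 4}

for app, cores in caps.items():
    print(" ".join(submit_command(app, cores)))
```

Zeppelin's Spark interpreter should accept the same `spark.cores.max` property in its interpreter settings, so the ad hoc queries can be capped the same way.
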
http://spark.apache.org/docs/latest/spark-standalone.html#resource-scheduling

On Thu, Apr 27, 2017 at 1:07 AM, vincent gromakowski
<vincent.gromakow...@gmail.com> wrote:
> Spark standalone is not multi-tenant; you need one cluster per job. Maybe
> you can try fair scheduling and use one cluster, but I doubt it will be
> prod ready...
>
> On Apr 27, 2017 5:28 AM, "anna stax" <annasta...@gmail.com> wrote:
>>
>> Thanks Cody,
>>
>> As I already mentioned, I am running Spark Streaming on an EC2 cluster in
>> standalone mode. Now, in addition to streaming, I want to be able to run
>> an hourly Spark batch job and ad hoc queries using Zeppelin.
>>
>> Can you please confirm that a standalone cluster is OK for this? Please
>> provide me some links to help me get started.
>>
>> Thanks
>> -Anna
>>
>> On Wed, Apr 26, 2017 at 7:46 PM, Cody Koeninger <c...@koeninger.org>
>> wrote:
>>>
>>> The standalone cluster manager is fine for production. Don't use YARN
>>> or Mesos unless you already have another need for it.
>>>
>>> On Wed, Apr 26, 2017 at 4:53 PM, anna stax <annasta...@gmail.com> wrote:
>>> > Hi Sam,
>>> >
>>> > Thank you for the reply.
>>> >
>>> > What do you mean by
>>> > "I doubt people run Spark in a single EC2 instance, certainly not in
>>> > production I don't think"?
>>> >
>>> > What is wrong with having a data pipeline on EC2 that reads data from
>>> > Kafka, processes it using Spark, and outputs to Cassandra? Please
>>> > explain.
>>> >
>>> > Thanks
>>> > -Anna
>>> >
>>> > On Wed, Apr 26, 2017 at 2:22 PM, Sam Elamin <hussam.ela...@gmail.com>
>>> > wrote:
>>> >>
>>> >> Hi Anna
>>> >>
>>> >> There are a variety of options for launching Spark clusters. I doubt
>>> >> people run Spark in a single EC2 instance, certainly not in
>>> >> production, I don't think.
>>> >>
>>> >> I don't have enough information about what you are trying to do, but
>>> >> if you are just trying to set things up from scratch then I think you
>>> >> can just use EMR, which will create a cluster for you and attach a
>>> >> Zeppelin instance as well.
>>> >>
>>> >> You can also use Databricks for ease of use and very little
>>> >> management, but you will pay a premium for that abstraction.
>>> >>
>>> >> Regards
>>> >> Sam
>>> >>
>>> >> On Wed, 26 Apr 2017 at 22:02, anna stax <annasta...@gmail.com> wrote:
>>> >>>
>>> >>> I need to set up a Spark cluster for Spark Streaming, scheduled
>>> >>> batch jobs, and ad hoc queries.
>>> >>> Please give me some suggestions. Can this be done in standalone
>>> >>> mode?
>>> >>>
>>> >>> Right now we have a Spark cluster in standalone mode on AWS EC2
>>> >>> running a Spark Streaming application. Can we run Spark batch jobs
>>> >>> and Zeppelin on the same cluster? Do we need a better resource
>>> >>> manager like Mesos?
>>> >>>
>>> >>> Are there any companies or individuals that can help in setting
>>> >>> this up?
>>> >>>
>>> >>> Thank you.
>>> >>>
>>> >>> -Anna

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org