Re: help/suggestions to setup spark cluster

2017-04-27 Thread Cody Koeninger
You can just cap the cores used per job.

http://spark.apache.org/docs/latest/spark-standalone.html

http://spark.apache.org/docs/latest/spark-standalone.html#resource-scheduling
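
As a rough illustration of what capping cores could look like: on a standalone cluster the cap described in the resource-scheduling docs linked above is the `spark.cores.max` property, which `spark-submit` also exposes as `--total-executor-cores`. The Python sketch below only builds the command lines for the workloads discussed in this thread. The master URL, script names, core counts, and helper function names are all assumptions made up for the example, not anything from the actual cluster.

```python
# Sketch only: the master URL, script names, and core counts are made up
# for illustration. --total-executor-cores is the spark-submit flag that
# sets spark.cores.max for an application on a standalone cluster.


def remaining_cores(total_cores, caps):
    """Return cores left after all caps; raise if the caps exceed the cluster."""
    requested = sum(caps.values())
    if requested > total_cores:
        raise ValueError(
            f"caps request {requested} cores but the cluster has {total_cores}"
        )
    return total_cores - requested


def submit_command(master_url, app_file, max_cores, executor_memory="2g"):
    """Build spark-submit arguments that cap one application's total cores."""
    if max_cores < 1:
        raise ValueError("max_cores must be at least 1")
    return [
        "spark-submit",
        "--master", master_url,
        "--total-executor-cores", str(max_cores),
        "--executor-memory", executor_memory,
        app_file,
    ]


if __name__ == "__main__":
    master = "spark://master-host:7077"  # hypothetical master URL
    caps = {"streaming.py": 8, "hourly_batch.py": 4}  # hypothetical caps
    # Whatever is left over could be the cap for the Zeppelin interpreter.
    print("cores left for Zeppelin:", remaining_cores(16, caps))
    for app_file, cores in caps.items():
        print(" ".join(submit_command(master, app_file, cores)))
```

Without a cap, an application on a standalone cluster takes every available core by default, so a long-running streaming application would starve the hourly batch job and the Zeppelin interpreter. Zeppelin's Spark interpreter settings accept Spark properties, so the same `spark.cores.max` cap can presumably be set there as well.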

On Thu, Apr 27, 2017 at 1:07 AM, vincent gromakowski wrote:
> Spark standalone is not multi-tenant; you need one cluster per job. Maybe
> you can try fair scheduling and use one cluster, but I doubt it will be
> production-ready...

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: help/suggestions to setup spark cluster

2017-04-27 Thread vincent gromakowski
Spark standalone is not multi-tenant; you need one cluster per job. Maybe
you can try fair scheduling and use one cluster, but I doubt it will be
production-ready...
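
A hedged sketch of the fair-scheduling option, for reference. Per the Spark job-scheduling docs, `spark.scheduler.mode=FAIR` shares resources between jobs inside a single application (one SparkContext), while across applications the standalone master schedules FIFO. So this would only fit a setup where one long-running application serves several workloads. The pool names, weights, minimum shares, and file path below are hypothetical.

```python
# Sketch only: pool names, weights, minShare values, and the file path are
# assumptions for illustration. The XML layout follows the fair scheduler
# allocation file format (allocations / pool / schedulingMode, weight, minShare).
import xml.etree.ElementTree as ET


def build_fair_pools(pools):
    """Return fair scheduler allocation XML for {pool_name: (weight, min_share)}."""
    root = ET.Element("allocations")
    for name, (weight, min_share) in pools.items():
        pool = ET.SubElement(root, "pool", name=name)
        ET.SubElement(pool, "schedulingMode").text = "FAIR"
        ET.SubElement(pool, "weight").text = str(weight)
        ET.SubElement(pool, "minShare").text = str(min_share)
    return ET.tostring(root, encoding="unicode")


def fair_scheduler_conf(allocation_file):
    """Spark properties that switch one application to fair scheduling."""
    return {
        "spark.scheduler.mode": "FAIR",
        "spark.scheduler.allocation.file": allocation_file,
    }


if __name__ == "__main__":
    # Hypothetical pools: batch work weighted higher than ad hoc queries.
    print(build_fair_pools({"batch": (2, 4), "adhoc": (1, 2)}))
    print(fair_scheduler_conf("/path/to/fairscheduler.xml"))
```

Inside the application, a thread would pick its pool with `sc.setLocalProperty("spark.scheduler.pool", "adhoc")` before submitting jobs.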

On 27 Apr 2017 at 5:28 AM, "anna stax" wrote:

> Thanks Cody,
>
> As I already mentioned, I am running Spark Streaming on an EC2 cluster in
> standalone mode. Now, in addition to streaming, I want to be able to run
> a Spark batch job hourly and ad hoc queries using Zeppelin.
>
> Can you please confirm that a standalone cluster is OK for this? Please
> provide me with some links to help me get started.
>
> Thanks
> -Anna


Re: help/suggestions to setup spark cluster

2017-04-26 Thread anna stax
Thanks Cody,

As I already mentioned, I am running Spark Streaming on an EC2 cluster in
standalone mode. Now, in addition to streaming, I want to be able to run
a Spark batch job hourly and ad hoc queries using Zeppelin.

Can you please confirm that a standalone cluster is OK for this? Please
provide me with some links to help me get started.

Thanks
-Anna

On Wed, Apr 26, 2017 at 7:46 PM, Cody Koeninger  wrote:

> The standalone cluster manager is fine for production.  Don't use YARN
> or Mesos unless you already have another need for it.


Re: help/suggestions to setup spark cluster

2017-04-26 Thread Cody Koeninger
The standalone cluster manager is fine for production.  Don't use YARN
or Mesos unless you already have another need for it.

On Wed, Apr 26, 2017 at 4:53 PM, anna stax  wrote:
> Hi Sam,
>
> Thank you for the reply.
>
> What do you mean by
> "I doubt people run Spark in a single EC2 instance, certainly not in
> production I don't think"?
>
> What is wrong with having a data pipeline on EC2 that reads data from Kafka,
> processes it using Spark, and outputs to Cassandra? Please explain.
>
> Thanks
> -Anna




Re: help/suggestions to setup spark cluster

2017-04-26 Thread anna stax
Hi Sam,

Thank you for the reply.

What do you mean by
"I doubt people run Spark in a single EC2 instance, certainly not in
production I don't think"?

What is wrong with having a data pipeline on EC2 that reads data from Kafka,
processes it using Spark, and outputs to Cassandra? Please explain.

Thanks
-Anna

On Wed, Apr 26, 2017 at 2:22 PM, Sam Elamin  wrote:

> Hi Anna
>
> There are a variety of options for launching Spark clusters. I doubt
> people run Spark in a single EC2 instance, certainly not in production, I
> don't think.
>
> I don't have enough information about what you are trying to do, but if you
> are just trying to set things up from scratch, then I think you can just use
> EMR, which will create a cluster for you and attach a Zeppelin instance as
> well.
>
>
> You can also use Databricks for ease of use and very little management, but
> you will pay a premium for that abstraction.
>
>
> Regards
> Sam


Re: help/suggestions to setup spark cluster

2017-04-26 Thread Sam Elamin
Hi Anna

There are a variety of options for launching Spark clusters. I doubt people
run Spark in a single EC2 instance, certainly not in production, I don't
think.

I don't have enough information about what you are trying to do, but if you
are just trying to set things up from scratch, then I think you can just use
EMR, which will create a cluster for you and attach a Zeppelin instance as
well.


You can also use Databricks for ease of use and very little management, but
you will pay a premium for that abstraction.


Regards
Sam
On Wed, 26 Apr 2017 at 22:02, anna stax  wrote:

> I need to set up a Spark cluster for Spark Streaming, scheduled batch
> jobs, and ad hoc queries.
> Please give me some suggestions. Can this be done in standalone mode?
>
> Right now we have a Spark cluster in standalone mode on AWS EC2 running a
> Spark Streaming application. Can we run Spark batch jobs and Zeppelin on
> the same cluster? Do we need a better resource manager like Mesos?
>
> Are there any companies or individuals that can help in setting this up?
>
> Thank you.
>
> -Anna
>


help/suggestions to setup spark cluster

2017-04-26 Thread anna stax
I need to set up a Spark cluster for Spark Streaming, scheduled batch
jobs, and ad hoc queries.
Please give me some suggestions. Can this be done in standalone mode?

Right now we have a Spark cluster in standalone mode on AWS EC2 running a
Spark Streaming application. Can we run Spark batch jobs and Zeppelin on
the same cluster? Do we need a better resource manager like Mesos?

Are there any companies or individuals that can help in setting this up?

Thank you.
-Anna