Re: Driver vs master

2019-10-07 Thread Andrew Melo
On Mon, Oct 7, 2019 at 20:49 ayan guha  wrote:

> Hi,
>
> I think you are mixing terminologies here. Loosely speaking, the master
> manages worker machines. Each worker machine can run one or more processes.
> A process can be a driver or an executor. You submit applications to the
> master. Each application has a driver and executors, and the master decides
> where to put each of them. In cluster mode, the master distributes the
> drivers across the cluster. In client mode, the driver runs inside the
> client process that submitted the application, on whichever machine that
> is. You can launch multiple master processes as well and use them for a set
> of applications - this happens when you use YARN. I am not sure how Mesos
> or K8s works in that regard, though.
>

Right, that's why I initially had the caveat: "This depends on what
master/deploy mode you're using: if it's "local" master and "client mode",
then yes, tasks execute in the same JVM as the driver".

The answer depends on the exact setup Amit has and how the application is
configured.


-- 
It's dark in this basement.


Re: Driver vs master

2019-10-07 Thread ayan guha
Hi,

I think you are mixing terminologies here. Loosely speaking, the master
manages worker machines. Each worker machine can run one or more processes.
A process can be a driver or an executor. You submit applications to the
master. Each application has a driver and executors, and the master decides
where to put each of them. In cluster mode, the master distributes the
drivers across the cluster. In client mode, the driver runs inside the
client process that submitted the application, on whichever machine that is.
You can launch multiple master processes as well and use them for a set of
applications - this happens when you use YARN. I am not sure how Mesos or
K8s works in that regard, though.

HTH...

Ayan





-- 
Best Regards,
Ayan Guha


Re: Driver vs master

2019-10-07 Thread Andrew Melo
Hi

On Mon, Oct 7, 2019 at 19:20 Amit Sharma  wrote:

> Thanks Andrew, but I am asking specifically about driver memory, not
> executor memory. We have just one master, and if each job has
> driver.memory=4g and the master node's total memory is 16 GB, then we
> cannot execute more than 4 jobs at a time.


I understand that. I think there's a misunderstanding with the terminology,
though. Are you running multiple separate Spark instances on a single
machine, or one instance with multiple jobs inside?
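
To make the distinction concrete, here is a rough sketch in plain Python (not
Spark API). The function name `driver_heap_gb` is made up for illustration,
and the sketch assumes every driver lands on the same host and counts only the
configured driver heap, ignoring JVM overhead and anything else running there:

```python
def driver_heap_gb(driver_memory_gb: int, applications: int) -> int:
    """Total driver heap reserved on one host (illustrative model only).

    Each application has exactly one driver JVM, so the reserved heap scales
    with the number of applications, not with the number of jobs run inside
    each application.
    """
    if driver_memory_gb < 0 or applications < 0:
        raise ValueError("values must be non-negative")
    return driver_memory_gb * applications


# Case 1: four separate applications, each with its own 4 GB driver.
separate = driver_heap_gb(driver_memory_gb=4, applications=4)

# Case 2: one application that runs four jobs inside a single 4 GB driver.
shared = driver_heap_gb(driver_memory_gb=4, applications=1)

print(separate, shared)
```

With separate applications the reserved heap grows with each one; with a
single application it stays at 4 GB however many jobs run inside it.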


-- 
It's dark in this basement.


Re: Driver vs master

2019-10-07 Thread Amit Sharma
Thanks Andrew, but I am asking specifically about driver memory, not
executor memory. We have just one master, and if each job has
driver.memory=4g and the master node's total memory is 16 GB, then we cannot
execute more than 4 jobs at a time.
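
As a rough check of that arithmetic, a minimal sketch in plain Python (the
helper `max_concurrent_drivers` is hypothetical, not a Spark function). It
assumes each job is a separate application whose driver runs on the master
node, and the result is only an upper bound, since the OS and the master
daemon also need memory:

```python
def max_concurrent_drivers(host_memory_gb: int, driver_memory_gb: int) -> int:
    """Upper bound on driver JVMs that fit on one host (illustrative only)."""
    if driver_memory_gb <= 0:
        raise ValueError("driver_memory_gb must be positive")
    # Whole drivers only: a partially fitting driver cannot start.
    return host_memory_gb // driver_memory_gb


# 16 GB host, driver.memory=4g per job, as in the message above.
limit = max_concurrent_drivers(host_memory_gb=16, driver_memory_gb=4)
print(limit)
```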



Re: Driver vs master

2019-10-07 Thread Andrew Melo
Hi Amit

On Mon, Oct 7, 2019 at 18:33 Amit Sharma  wrote:

> Can you please help me understand this? I believe the driver program runs
> on the master node.
>
> If we are running 4 Spark jobs and the driver memory config is 4g, then a
> total of 16 GB of the master node's memory would be used.


This depends on what master/deploy mode you're using: if it's "local"
master and "client mode", then yes, tasks execute in the same JVM as the
driver. In this case though, the driver JVM uses however much memory is
allocated to the driver, regardless of how many threads you have.
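
A small sketch of the "local" case, in plain Python rather than Spark code.
The helper `local_task_threads` is hypothetical; it only recognises the
`local`, `local[N]` and `local[*]` master URL forms and returns None for
anything else, such as a `spark://` URL:

```python
import re
from typing import Optional


def local_task_threads(master: str, cores_on_host: int) -> Optional[int]:
    """Number of task threads sharing the driver JVM for a local master URL.

    Returns None when the URL is not one of the local forms handled here.
    """
    if master == "local":
        return 1  # a single worker thread
    match = re.fullmatch(r"local\[(\*|\d+)\]", master)
    if match is None:
        return None
    value = match.group(1)
    # local[*] means one thread per logical core on the machine.
    return cores_on_host if value == "*" else int(value)


for url in ("local", "local[4]", "local[*]", "spark://host:7077"):
    print(url, local_task_threads(url, cores_on_host=8))
```

Whatever the thread count, the threads share the one driver heap, so the
configured driver memory is the figure that matters in this mode.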


> So if we run more jobs, then we need more memory on the master. Please
> correct me if I am wrong.

This depends on your application, but in general more threads will require
more memory.



>
> Thanks
> Amit
>
-- 
It's dark in this basement.