Re: newbie: unable to use all my cores and memory

2015-11-20 Thread Andy Davidson
Hi Igor

Thanks. The reason I am using cluster mode is that the streaming app must run
forever. I am using client mode for my pyspark work.

Andy

From:  Igor Berman 
Date:  Friday, November 20, 2015 at 6:22 AM
To:  Andrew Davidson 
Cc:  "user @spark" 
Subject:  Re: newbie: unable to use all my cores and memory

> You've asked for 2 executor cores in total, plus 1 for the driver (since you
> are running in cluster mode, the driver runs on one of the slaves).
> Change total cores to 3*2 = 6.
> Change the submit mode to client and you'll get full utilization.
> (BTW, it's not advisable to use all the cores of a slave, since the OS and
> other processes need some as well.)
> 
> On 20 November 2015 at 02:02, Andy Davidson 
> wrote:
>> I am having a heck of a time figuring out how to utilize my cluster
>> effectively. I am using the stand alone cluster manager. I have a master
>> and 3 slaves. Each machine has 2 cores.
>> 
>> I am trying to run a streaming app in cluster mode and pyspark at the same
>> time.
>> 
>> t1) On my console I see
>> 
>> * Alive Workers: 3
>> * Cores in use: 6 Total, 0 Used
>> * Memory in use: 18.8 GB Total, 0.0 B Used
>> * Applications: 0 Running, 15 Completed
>> * Drivers: 0 Running, 2 Completed
>> * Status: ALIVE
>> 
>> t2) I start my streaming app
>> 
>> $SPARK_ROOT/bin/spark-submit \
>> --class "com.pws.spark.streaming.IngestDriver" \
>> --master $MASTER_URL \
>> --total-executor-cores 2 \
>> --deploy-mode cluster \
>> $jarPath --clusterMode  $*
>> 
>> t3) on my console I see
>> 
>> * Alive Workers: 3
>> * Cores in use: 6 Total, 3 Used
>> * Memory in use: 18.8 GB Total, 13.0 GB Used
>> * Applications: 1 Running, 15 Completed
>> * Drivers: 1 Running, 2 Completed
>> * Status: ALIVE
>> 
>> Looks like pyspark should be able to use the 3 remaining cores and 5.8 GB
>> of memory
>> 
>> t4) I start pyspark
>> 
>> export PYSPARK_PYTHON=python3.4
>> export PYSPARK_DRIVER_PYTHON=python3.4
>> export IPYTHON_OPTS="notebook --no-browser --port=7000
>> --log-level=WARN"
>> 
>> $SPARK_ROOT/bin/pyspark --master $MASTER_URL --total-executor-cores 3
>> --executor-memory 2g
>> 
>> t5) on my console I see
>> 
>> * Alive Workers: 3
>> * Cores in use: 6 Total, 4 Used
>> * Memory in use: 18.8 GB Total, 15.0 GB Used
>> * Applications: 2 Running, 18 Completed
>> * Drivers: 1 Running, 2 Completed
>> * Status: ALIVE
>> 
>> 
>> I have 2 unused cores and a lot of memory left over. My pyspark
>> application is only getting 1 core. If the streaming app is not running,
>> pyspark is assigned 2 cores, each on a different worker. I have tried
>> using various combinations of --executor-cores and --total-executor-cores.
>> Any idea how to get pyspark to use more cores and memory?
>> 
>> 
>> Kind regards
>> 
>> Andy
>> 
>> P.S. Using different values I have wound up with pyspark status ==
>> "waiting". I think this is because there are not enough cores available?
>> 
>> 
>> 
>> 
>> 
> 




Re: newbie: unable to use all my cores and memory

2015-11-20 Thread Igor Berman
You've asked for 2 executor cores in total, plus 1 for the driver (since you
are running in cluster mode, the driver runs on one of the slaves).
Change total cores to 3*2 = 6.
Change the submit mode to client and you'll get full utilization.
(BTW, it's not advisable to use all the cores of a slave, since the OS and
other processes need some as well.)
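
In concrete terms, something like the sketch below, reusing the $SPARK_ROOT,
$MASTER_URL and $jarPath variables from the commands quoted further down; the
2-core / 4-core split is just an example for a 3-worker, 2-core-per-worker
cluster:

# Streaming app in client mode: the driver stays on the submitting machine,
# so it no longer occupies a core on one of the workers.
$SPARK_ROOT/bin/spark-submit \
--class "com.pws.spark.streaming.IngestDriver" \
--master $MASTER_URL \
--deploy-mode client \
--total-executor-cores 2 \
$jarPath --clusterMode $*

# pyspark runs in client mode by default; ask for the remaining 4 of the
# 3*2 = 6 worker cores.
$SPARK_ROOT/bin/pyspark --master $MASTER_URL \
--total-executor-cores 4 \
--executor-memory 2g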

On 20 November 2015 at 02:02, Andy Davidson 
wrote:

> I am having a heck of a time figuring out how to utilize my cluster
> effectively. I am using the stand alone cluster manager. I have a master
> and 3 slaves. Each machine has 2 cores.
>
> I am trying to run a streaming app in cluster mode and pyspark at the same
> time.
>
> t1) On my console I see
>
> * Alive Workers: 3
> * Cores in use: 6 Total, 0 Used
> * Memory in use: 18.8 GB Total, 0.0 B Used
> * Applications: 0 Running, 15 Completed
> * Drivers: 0 Running, 2 Completed
> * Status: ALIVE
>
> t2) I start my streaming app
>
> $SPARK_ROOT/bin/spark-submit \
> --class "com.pws.spark.streaming.IngestDriver" \
> --master $MASTER_URL \
> --total-executor-cores 2 \
> --deploy-mode cluster \
> $jarPath --clusterMode  $*
>
> t3) on my console I see
>
> * Alive Workers: 3
> * Cores in use: 6 Total, 3 Used
> * Memory in use: 18.8 GB Total, 13.0 GB Used
> * Applications: 1 Running, 15 Completed
> * Drivers: 1 Running, 2 Completed
> * Status: ALIVE
>
> Looks like pyspark should be able to use the 3 remaining cores and 5.8 GB
> of memory
>
> t4) I start pyspark
>
> export PYSPARK_PYTHON=python3.4
> export PYSPARK_DRIVER_PYTHON=python3.4
> export IPYTHON_OPTS="notebook --no-browser --port=7000
> --log-level=WARN"
>
> $SPARK_ROOT/bin/pyspark --master $MASTER_URL
> --total-executor-cores 3
> --executor-memory 2g
>
> t5) on my console I see
>
> * Alive Workers: 3
> * Cores in use: 6 Total, 4 Used
> * Memory in use: 18.8 GB Total, 15.0 GB Used
> * Applications: 2 Running, 18 Completed
> * Drivers: 1 Running, 2 Completed
> * Status: ALIVE
>
>
> I have 2 unused cores and a lot of memory left over. My pyspark
> application is only getting 1 core. If the streaming app is not running,
> pyspark is assigned 2 cores, each on a different worker. I have tried
> using various combinations of --executor-cores and --total-executor-cores.
> Any idea how to get pyspark to use more cores and memory?
>
>
> Kind regards
>
> Andy
>
> P.S. Using different values I have wound up with pyspark status ==
> "waiting". I think this is because there are not enough cores available?
>
>
>
>
>
>


newbie: unable to use all my cores and memory

2015-11-19 Thread Andy Davidson
I am having a heck of a time figuring out how to utilize my cluster
effectively. I am using the stand alone cluster manager. I have a master
and 3 slaves. Each machine has 2 cores.

I am trying to run a streaming app in cluster mode and pyspark at the same
time.

t1) On my console I see

* Alive Workers: 3
* Cores in use: 6 Total, 0 Used
* Memory in use: 18.8 GB Total, 0.0 B Used
* Applications: 0 Running, 15 Completed
* Drivers: 0 Running, 2 Completed
* Status: ALIVE

t2) I start my streaming app

$SPARK_ROOT/bin/spark-submit \
--class "com.pws.spark.streaming.IngestDriver" \
--master $MASTER_URL \
--total-executor-cores 2 \
--deploy-mode cluster \
$jarPath --clusterMode  $*

t3) on my console I see

* Alive Workers: 3
* Cores in use: 6 Total, 3 Used
* Memory in use: 18.8 GB Total, 13.0 GB Used
* Applications: 1 Running, 15 Completed
* Drivers: 1 Running, 2 Completed
* Status: ALIVE

Looks like pyspark should be able to use the 3 remaining cores and 5.8 GB
of memory

t4) I start pyspark

export PYSPARK_PYTHON=python3.4
export PYSPARK_DRIVER_PYTHON=python3.4
export IPYTHON_OPTS="notebook --no-browser --port=7000 --log-level=WARN"

$SPARK_ROOT/bin/pyspark --master $MASTER_URL --total-executor-cores 3
--executor-memory 2g

t5) on my console I see

* Alive Workers: 3
* Cores in use: 6 Total, 4 Used
* Memory in use: 18.8 GB Total, 15.0 GB Used
* Applications: 2 Running, 18 Completed
* Drivers: 1 Running, 2 Completed
* Status: ALIVE


I have 2 unused cores and a lot of memory left over. My pyspark
application is only getting 1 core. If the streaming app is not running,
pyspark is assigned 2 cores, each on a different worker. I have tried
using various combinations of --executor-cores and --total-executor-cores.
Any idea how to get pyspark to use more cores and memory?
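
For concreteness, a sketch of one such combination; the values below are just
an illustration of what the flags control, not a run I am reporting results
from. --total-executor-cores caps the cores granted to the whole application,
--executor-cores asks for that many cores per executor, and a worker only
launches an executor if it has both the cores and the --executor-memory free,
so the smaller memory value here is chosen to fit in the ~5.8 GB left over:

# Ask for three 1-core executors (ideally one per worker) instead of letting
# a single executor try to claim a whole worker, and keep the per-executor
# memory small enough that each worker still has room for one.
$SPARK_ROOT/bin/pyspark --master $MASTER_URL \
--total-executor-cores 3 \
--executor-cores 1 \
--executor-memory 1g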


Kind regards

Andy

P.S. Using different values I have wound up with pyspark status ==
"waiting". I think this is because there are not enough cores available?




-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org