Re: newbie: unable to use all my cores and memory
Hi Igor,

Thanks. The reason I am using cluster mode is that the streaming app must run forever. I am using client mode for my pyspark work.

Andy

From: Igor Berman
Date: Friday, November 20, 2015 at 6:22 AM
To: Andrew Davidson
Cc: "user @spark"
Subject: Re: newbie: unable to use all my cores and memory

> You've asked for 2 total cores, plus 1 for the driver (since you are
> running in cluster mode, the driver runs on one of the slaves).
> Change total cores to 3*2, and change submit mode to client -- then
> you'll have full utilization.
> (Btw, it's not advisable to use all the cores of a slave, since the OS
> and other processes need some too.)
>
> On 20 November 2015 at 02:02, Andy Davidson wrote:
>> I am having a heck of a time figuring out how to utilize my cluster
>> effectively. I am using the standalone cluster manager. I have a master
>> and 3 slaves. Each machine has 2 cores.
>>
>> I am trying to run a streaming app in cluster mode and pyspark at the
>> same time.
>>
>> t1) On my console I see:
>>
>> * Alive Workers: 3
>> * Cores in use: 6 Total, 0 Used
>> * Memory in use: 18.8 GB Total, 0.0 B Used
>> * Applications: 0 Running, 15 Completed
>> * Drivers: 0 Running, 2 Completed
>> * Status: ALIVE
>>
>> t2) I start my streaming app:
>>
>> $SPARK_ROOT/bin/spark-submit \
>>   --class "com.pws.spark.streaming.IngestDriver" \
>>   --master $MASTER_URL \
>>   --total-executor-cores 2 \
>>   --deploy-mode cluster \
>>   $jarPath --clusterMode $*
>>
>> t3) On my console I see:
>>
>> * Alive Workers: 3
>> * Cores in use: 6 Total, 3 Used
>> * Memory in use: 18.8 GB Total, 13.0 GB Used
>> * Applications: 1 Running, 15 Completed
>> * Drivers: 1 Running, 2 Completed
>> * Status: ALIVE
>>
>> It looks like pyspark should be able to use the 3 remaining cores and
>> 5.8 GB of memory.
>>
>> t4) I start pyspark:
>>
>> export PYSPARK_PYTHON=python3.4
>> export PYSPARK_DRIVER_PYTHON=python3.4
>> export IPYTHON_OPTS="notebook --no-browser --port=7000 --log-level=WARN"
>>
>> $SPARK_ROOT/bin/pyspark --master $MASTER_URL \
>>   --total-executor-cores 3 --executor-memory 2g
>>
>> t5) On my console I see:
>>
>> * Alive Workers: 3
>> * Cores in use: 6 Total, 4 Used
>> * Memory in use: 18.8 GB Total, 15.0 GB Used
>> * Applications: 2 Running, 18 Completed
>> * Drivers: 1 Running, 2 Completed
>> * Status: ALIVE
>>
>> I have 2 unused cores and a lot of memory left over. My pyspark
>> application is only getting 1 core. If the streaming app is not running,
>> pyspark would be assigned 2 cores, each on a different worker. I have
>> tried various combinations of --executor-cores and
>> --total-executor-cores. Any idea how to get pyspark to use more cores
>> and memory?
>>
>> Kind regards
>>
>> Andy
>>
>> P.S. Using different values I have wound up with pyspark status ==
>> "waiting". I think this is because there are not enough cores available?
>>
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
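Igor's core accounting can be sanity-checked with a quick sketch (plain arithmetic mirroring the numbers in the thread; nothing here calls any Spark API):

```python
# Back-of-the-envelope core accounting for the standalone cluster in the
# thread above (a sketch of Igor's explanation; the numbers come from the
# emails, not from Spark itself).
workers = 3
cores_per_worker = 2
total_cores = workers * cores_per_worker  # 6, matching "Cores in use: 6 Total"

# Streaming app: --total-executor-cores 2 with --deploy-mode cluster.
# In cluster mode the driver itself is scheduled onto a slave and
# occupies a core there.
streaming_executor_cores = 2
streaming_driver_cores = 1
streaming_used = streaming_executor_cores + streaming_driver_cores  # 3

remaining_for_pyspark = total_cores - streaming_used
print(remaining_for_pyspark)  # 3, matching the "6 Total, 3 Used" console line
```

In client mode the driver core moves off the cluster onto the submitting machine, which is why Igor's suggested total of 3*2 = 6 executor cores would fully utilize the workers.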
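For reference, the two submit flags discussed in the thread map to standard Spark configuration properties, so they can also be pinned in conf/spark-defaults.conf instead of passed on every command line (a config sketch; the master URL is a placeholder for the thread's $MASTER_URL, and the values mirror Igor's 3*2 suggestion):

```
# conf/spark-defaults.conf fragment (sketch)
spark.master            spark://master-host:7077
spark.cores.max         6      # equivalent of --total-executor-cores 6
spark.executor.memory   2g     # equivalent of --executor-memory 2g
```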