Instead of SPARK_WORKER_INSTANCES, you can also set SPARK_WORKER_CORES to have one worker that thinks it has more cores than the machine actually does.
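
For example, on each worker's spark-env.sh you could advertise more task slots than there are physical cores (a minimal sketch; the value 8 is purely illustrative, and the conf still has to be copied to the workers and the cluster restarted as in the steps quoted below):

    # Advertise 8 cores to the master even if the machine has fewer,
    # so the scheduler runs more of these I/O-bound tasks in parallel.
    export SPARK_WORKER_CORES=8
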
Matei

> On Nov 26, 2014, at 5:01 PM, Yotto Koga <yotto.k...@autodesk.com> wrote:
>
> Thanks Sean. That worked out well.
>
> For anyone who happens onto this post and wants to do the same, these are
> the steps I took to do as Sean suggested...
>
> (Note: this is for a standalone cluster.)
>
> Log in to the master:
>
> ~/spark/sbin/stop-all.sh
>
> Edit ~/spark/conf/spark-env.sh and change the line
>
> export SPARK_WORKER_INSTANCES=1
>
> to the multiple you want to set (e.g. 2).
>
> I also added
>
> export SPARK_WORKER_MEMORY=some reasonable value (e.g. 2g)
>
> so that the total memory of all workers on a node stays within the memory
> available on the node.
>
> ~/spark-ec2/copy-dir /root/spark/conf
>
> ~/spark/sbin/start-all.sh
>
>
> ________________________________________
> From: Sean Owen [so...@cloudera.com]
> Sent: Wednesday, November 26, 2014 12:14 AM
> To: Yotto Koga
> Cc: user@spark.apache.org
> Subject: Re: configure to run multiple tasks on a core
>
> What about running, say, 2 executors per machine, each of which thinks
> it should use all cores?
>
> You can also multi-thread your map function manually, directly, within
> your code, with careful use of a java.util.concurrent.Executor.
>
> On Wed, Nov 26, 2014 at 6:57 AM, yotto <yotto.k...@autodesk.com> wrote:
>> I'm running a spark-ec2 cluster.
>>
>> I have a map task that calls a specialized C++ external app. The app
>> doesn't fully utilize the core as it needs to download/upload data as
>> part of the task. Looking at the worker nodes, it appears that there is
>> one task with my app running per core.
>>
>> I'd like to better utilize the CPU resources, with the hope of increasing
>> throughput by running multiple tasks (with my app) per core in parallel.
>>
>> I see there is a spark.task.cpus config setting with a default value of 1.
>> It appears, though, that this is used to go the other way from what I am
>> looking for.
>>
>> Is there a way I can specify multiple tasks per core rather than
>> multiple cores per task?
>>
>> Thanks for any help.
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/configure-to-run-multiple-tasks-on-a-core-tp19834.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
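
For completeness, here is a minimal sketch of the other suggestion in the quoted thread, multi-threading the map function with java.util.concurrent inside mapPartitions. The external C++ app call is stubbed out as runExternalApp, and threadsPerTask = 4 is purely illustrative; both are placeholders, not part of the original thread.

    import java.util.concurrent.{Callable, Executors}

    import org.apache.spark.{SparkConf, SparkContext}

    object MultiThreadedMap {

      // Hypothetical stand-in for the external C++ app; in practice this
      // would be a ProcessBuilder / sys.process invocation.
      def runExternalApp(input: String): String = input.reverse

      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("multi-threaded-map"))
        val rdd = sc.parallelize(1 to 100).map(_.toString)

        // How many external-app calls to run concurrently inside one Spark
        // task (illustrative; tune to how I/O-bound the app is).
        val threadsPerTask = 4

        val results = rdd.mapPartitions { iter =>
          // One thread pool per partition/task, not per element.
          val pool = Executors.newFixedThreadPool(threadsPerTask)
          try {
            // toList forces every element to be submitted before we start
            // blocking on results; note it materializes the partition.
            val futures = iter.map { elem =>
              pool.submit(new Callable[String] {
                override def call(): String = runExternalApp(elem)
              })
            }.toList
            futures.map(_.get()).iterator
          } finally {
            pool.shutdown()
          }
        }

        results.take(5).foreach(println)
        sc.stop()
      }
    }

Using mapPartitions keeps one pool per task rather than one per element; the trade-off is that forcing the partition to a list only works when a partition comfortably fits in memory.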