for yarn-client mode: SPARK_EXECUTOR_CORES * SPARK_EXECUTOR_INSTANCES = 2 (or 3) * TotalCoresOnYourCluster
for standalone mode: SPARK_WORKER_INSTANCES * SPARK_WORKER_CORES = 2 (or 3) * TotalCoresOnYourCluster
(a minimal sketch of these settings appears after the quoted message below)

Best Regards,

Yi Tian
tianyi.asiai...@gmail.com

On Sep 28, 2014, at 17:59, myasuka <myas...@live.com> wrote:

> Hi, everyone
>
>     I have run into a problem with increasing concurrency. In my program, after the shuffle write, each node should fetch 16 pairs of matrices to do matrix multiplication, for example:
>
>     import breeze.linalg.{DenseMatrix => BDM}
>
>     pairs.map(t => {
>       val b1 = t._2._1.asInstanceOf[BDM[Double]]
>       val b2 = t._2._2.asInstanceOf[BDM[Double]]
>
>       val c = (b1 * b2).asInstanceOf[BDM[Double]]
>
>       (new BlockID(t._1.row, t._1.column), c)
>     })
>
>     Each node has 16 cores. However, no matter whether I set 16 tasks or more per node, CPU utilization never rises above 60%, which means that not every core on the node is computing. When I check the running log on the Web UI and look at the amount of shuffle read and write in each task, I see that some tasks do one matrix multiplication, some do two, and some do none.
>
>     So I thought of using Java multi-threading to increase concurrency. I wrote a Scala program that calls Java multi-threading without Spark on a single node; watching the 'top' monitor, I found that this program can drive CPU usage up to 1500% (meaning nearly every core is computing). But I have no idea how to use Java multi-threading in an RDD transformation.
>
>     Can anyone provide some example code for using Java multi-threading in an RDD transformation, or give any other idea for increasing the concurrency?
>
>     Thanks to all
>
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/How-to-use-multi-thread-in-RDD-map-function-tp8583.html
> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
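A minimal sketch of the settings above, assuming they are placed in conf/spark-env.sh (the numbers are placeholders only; size the product toward 2-3x the total cores in your cluster as suggested):

    # conf/spark-env.sh

    # yarn-client mode
    export SPARK_EXECUTOR_INSTANCES=6
    export SPARK_EXECUTOR_CORES=8

    # standalone mode (per worker machine)
    export SPARK_WORKER_INSTANCES=2
    export SPARK_WORKER_CORES=16

In yarn-client mode the same values can also be passed to spark-submit as --num-executors and --executor-cores.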
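On the in-task threading the question asks about, one pattern sometimes used (not shown in this thread, so treat it as a sketch) is to process a whole partition per task with mapPartitions and fan the per-block multiplications out over a Scala parallel collection. BlockID and pairs are the names from the quoted snippet; the thread-pool size of 4 is an arbitrary placeholder.

    import breeze.linalg.{DenseMatrix => BDM}
    import scala.collection.parallel.ForkJoinTaskSupport
    import scala.concurrent.forkjoin.ForkJoinPool

    val results = pairs.mapPartitions { iter =>
      // materialize the partition and turn it into a parallel collection
      val blocks = iter.toVector.par
      // cap the per-task thread count so concurrent tasks on the same
      // node do not oversubscribe its 16 cores
      blocks.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(4))
      blocks.map { t =>
        val b1 = t._2._1.asInstanceOf[BDM[Double]]
        val b2 = t._2._2.asInstanceOf[BDM[Double]]
        (new BlockID(t._1.row, t._1.column), b1 * b2)
      }.seq.iterator
    }

Note that per-task threads and task-level parallelism compete for the same cores, so the pool size should be balanced against the number of tasks already running on each node.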