Our cluster is a standalone cluster with 16 computing nodes, each with 16
cores. When I set SPARK_WORKER_INSTANCES to 1 and SPARK_WORKER_CORES to 32,
with 512 tasks submitted in total, this configuration helps increase the
concurrency. But if I set SPARK_WORKER_INSTANCES to 2 and SPARK_WORKER_CORES
to 16, it does not work as well.
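
For reference, this is the spark-env.sh setting I mean on each of the 16
workers; it matches the 2x rule you suggest below (16 nodes * 16 cores * 2 =
512 advertised cores, i.e. 32 per node), and the comments are only my notes:

    # spark-env.sh on every node (illustrative values for the setup above)
    SPARK_WORKER_INSTANCES=1   # one worker JVM per node
    SPARK_WORKER_CORES=32      # 2 x 16 physical cores; 16 workers advertise 512 in total
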
Thank you for your reply.
Yi Tian wrote:
> for yarn-client mode:
>
> SPARK_EXECUTOR_CORES * SPARK_EXECUTOR_INSTANCES = 2(or 3) *
> TotalCoresOnYourCluster
>
> for standalone mode:
>
> SPARK_WORKER_INSTANCES * SPARK_WORKER_CORES = 2(or 3) *
> TotalCoresOnYourCluster
>
>
>
> Best Regards,
>
> Yi Tian
> tianyi.asiainfo@
>
>
>
>
> On Sep 28, 2014, at 17:59, myasuka <myasuka@> wrote:
>
>> Hi, everyone
>>    I have come across a problem with increasing concurrency. In my
>> program, after the shuffle write, each node should fetch 16 pairs of
>> matrices to do matrix multiplication, such as:
>>
>> import breeze.linalg.{DenseMatrix => BDM}
>>
>> pairs.map { t =>
>>   val b1 = t._2._1.asInstanceOf[BDM[Double]]
>>   val b2 = t._2._2.asInstanceOf[BDM[Double]]
>>   val c = b1 * b2  // Breeze dense-matrix multiplication
>>   (new BlockID(t._1.row, t._1.column), c)
>> }
>>
>>    Each node has 16 cores. However, no matter whether I set 16 tasks or
>> more per node, CPU utilization never gets above 60%, which means not
>> every core on the node is computing. When I check the running log on the
>> WebUI, judging by the amount of shuffle read and write in each task, I
>> see that some tasks do one matrix multiplication, some do two, and some
>> do none.
>>
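>>    (To make the imbalance concrete, this is the kind of per-task count I
>> mean; it is only a sketch, and perTask and taskId are names I made up:)
>>
>> // Sketch: count how many matrix pairs each task actually receives.
>> val perTask = pairs.mapPartitionsWithIndex { (taskId, iter) =>
>>   Iterator((taskId, iter.size))
>> }.collect()
>> perTask.foreach { case (id, n) =>
>>   println(s"task $id did $n matrix multiplications")
>> }
>>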
>>    Thus, I thought of using Java multi-threading to increase the
>> concurrency. I wrote a Scala program that uses Java threads without
>> Spark on a single node; watching the 'top' monitor, I found that this
>> program can drive CPU usage up to 1500% (meaning nearly every core is
>> computing). But I have no idea how to use Java multi-threading inside an
>> RDD transformation.
>>
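>>    What I imagine is something like the sketch below, but I do not know
>> whether it is safe or correct; mapPartitions, the pool size of 4, and the
>> overall structure are only my guesses:
>>
>> import java.util.concurrent.{Callable, Executors}
>> import breeze.linalg.{DenseMatrix => BDM}
>>
>> pairs.mapPartitions { iter =>
>>   // Run this partition's multiplications on a local thread pool.
>>   val pool = Executors.newFixedThreadPool(4)
>>   try {
>>     val futures = iter.map { t =>
>>       pool.submit(new Callable[(BlockID, BDM[Double])] {
>>         override def call(): (BlockID, BDM[Double]) = {
>>           val b1 = t._2._1.asInstanceOf[BDM[Double]]
>>           val b2 = t._2._2.asInstanceOf[BDM[Double]]
>>           (new BlockID(t._1.row, t._1.column), b1 * b2)
>>         }
>>       })
>>     }.toList                      // submit every multiplication first
>>     futures.map(_.get()).iterator // then block until all results are ready
>>   } finally {
>>     pool.shutdown()
>>   }
>> }
>>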
>>    Can anyone provide some example code for using Java multi-threading
>> in an RDD transformation, or give any other idea to increase the
>> concurrency?
>>
>> Thanks for all