[jira] [Commented] (SPARK-16574) Distribute computing to each node based on certain hints

Sean Owen (JIRA) Fri, 15 Jul 2016 12:05:11 -0700

    [ https://issues.apache.org/jira/browse/SPARK-16574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15379923#comment-15379923 ]


Sean Owen commented on SPARK-16574:
-----------------------------------

spark.yarn.executor.nodeLabelExpression is described at 
http://spark.apache.org/docs/latest/running-on-yarn.html 
It restricts the set of nodes that executors will be scheduled on. You have to 
consider data locality too: it's not just a question of where the compute is, 
but where the data is. Generally, tasks go to the data, not to the compute.
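
A rough sketch of how the two ideas could be combined, not a tested recipe. 
It assumes the cluster admin has defined a YARN node label named "gpu", and 
the host names are hypothetical placeholders. SparkContext.makeRDD accepts 
(element, preferred hosts) pairs and creates one partition per element with 
those hosts as location preferences. The preferences are scheduling hints 
only, so a task can still run elsewhere, and Spark itself knows nothing about 
GPUs: the task code would have to pick the device from the gpu index.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object GpuPlacementSketch {
  // Pair every (node index, gpu index) with the single host it should prefer.
  def tasksWithHosts(hosts: Seq[String], gpuCount: Int): Seq[((Int, Int), Seq[String])] =
    for {
      (host, x) <- hosts.zipWithIndex
      y <- 0 until gpuCount
    } yield ((x, y), Seq(host))

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("gpu-placement-sketch")
      // Assumption: a YARN node label called "gpu" exists on the cluster.
      .set("spark.yarn.executor.nodeLabelExpression", "gpu")
    val sc = new SparkContext(conf)

    // Hypothetical host names; replace with the real GPU nodes.
    val hosts = Seq("gpu-node-0.example.com", "gpu-node-1.example.com")

    // One partition per element, each with a preferred host (a hint, not a guarantee).
    val gpuWorkers: RDD[(Int, Int)] = sc.makeRDD(tasksWithHosts(hosts, gpuCount = 2))

    gpuWorkers.foreach { case (node, gpu) =>
      // Placeholder for the real work: select the GPU by its index here.
      println(s"node=$node gpu=$gpu")
    }

    sc.stop()
  }
}
```

How strictly the scheduler honours the preferred hosts depends on settings 
such as spark.locality.wait, so this only nudges placement.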

> Distribute computing to each node based on certain hints
> --------------------------------------------------------
>
>                 Key: SPARK-16574
>                 URL: https://issues.apache.org/jira/browse/SPARK-16574
>             Project: Spark
>          Issue Type: Wish
>            Reporter: Norman He
>
> 1) I have a gpuWorkers RDD like this (each node has 2 GPUs):
>     val nodes = 10
>     val gpuCount = 2
>     val cross: Array[(Int, Int)] =
>       for (x <- Array.range(0, nodes); y <- Array.range(0, gpuCount)) yield (x, y)
>     var gpuWorkers: RDD[(Int, Int)] = sc.parallelize(cross, nodes * gpuCount)
> 2) When the executors run, I would like to distribute the code to each node 
> based on the GPU index (y) in cross, so that both GPUs on each machine are used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
