[ https://issues.apache.org/jira/browse/SPARK-16574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15379923#comment-15379923 ]
Sean Owen commented on SPARK-16574:
-----------------------------------
spark.yarn.executor.nodeLabelExpression is described at
http://spark.apache.org/docs/latest/running-on-yarn.html
You have to consider data locality too; it's not just a question of where the
compute is, but also where the data is. Generally, tasks are scheduled where
the data is, not where the compute is.
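
As a rough sketch of how that setting might be applied, assuming the GPU
machines carry a YARN node label called "gpu" (a hypothetical label name, as
are the object and application names below; none of them come from this
issue):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver program; submit it with spark-submit --master yarn.
object GpuLabelSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("gpu-label-sketch") // hypothetical application name
      // Restrict executors to YARN nodes carrying the (assumed) "gpu" label.
      .set("spark.yarn.executor.nodeLabelExpression", "gpu")

    val sc = new SparkContext(conf)
    try {
      // Job code goes here. Executors are only requested on labelled nodes,
      // but task placement within them still follows data locality.
    } finally {
      sc.stop()
    }
  }
}
```

This only limits which nodes executors are scheduled on; it does not by
itself assign a task to a particular GPU on a machine.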
> Distribute computing to each node based on certain hints
> --------------------------------------------------------
>
> Key: SPARK-16574
> URL: https://issues.apache.org/jira/browse/SPARK-16574
> Project: Spark
> Issue Type: Wish
> Reporter: Norman He
>
> 1) I have a gpuWorkers RDD like this (each node has 2 GPUs):
> import org.apache.spark.rdd.RDD
> val nodes = 10
> val gpuCount = 2
> // One (node index, GPU index) pair per GPU in the cluster
> val cross: Array[(Int, Int)] =
>   for (x <- Array.range(0, nodes); y <- Array.range(0, gpuCount)) yield (x, y)
> val gpuWorkers: RDD[(Int, Int)] = sc.parallelize(cross, nodes * gpuCount)
> 2) When the executors run, I would like to somehow distribute the code to each
> node based on the GPU index (y) in cross, so that both GPUs on each machine
> are used.