[ https://issues.apache.org/jira/browse/SPARK-32120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Enrico Minack updated SPARK-32120:
----------------------------------
    Attachment: screenshot-3.png

> Single GPU is allocated multiple times
> --------------------------------------
>
>                 Key: SPARK-32120
>                 URL: https://issues.apache.org/jira/browse/SPARK-32120
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 3.0.0
>            Reporter: Enrico Minack
>            Priority: Major
>         Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png
>
>
> Running Spark in a {{local-cluster[2,1,1024]}} with one GPU per worker, task and executor, and two GPUs provided through a GPU discovery script, the same GPU is allocated to both executors.
>
> Discovery script output:
> {code}
> {"name": "gpu", "addresses": ["0", "1"]}
> {code}
>
> Spark local cluster setup through {{spark-shell}}:
> {code}
> ./spark-3.0.0-bin-hadoop2.7/bin/spark-shell --master "local-cluster[2,1,1024]" \
>   --conf spark.worker.resource.gpu.discoveryScript=/tmp/gpu.json \
>   --conf spark.worker.resource.gpu.amount=1 \
>   --conf spark.task.resource.gpu.amount=1 \
>   --conf spark.executor.resource.gpu.amount=1
> {code}
>
> Executors of this cluster:
>
> Code run in the Spark shell:
> {code}
> scala> import org.apache.spark.TaskContext
> import org.apache.spark.TaskContext
>
> scala> def fn(it: Iterator[java.lang.Long]): Iterator[(String, (String, Array[String]))] = { TaskContext.get().resources().mapValues(v => (v.name, v.addresses)).iterator }
> fn: (it: Iterator[Long])Iterator[(String, (String, Array[String]))]
>
> scala> spark.range(0,2,1,2).mapPartitions(fn).collect
> res0: Array[(String, (String, Array[String]))] = Array((gpu,(gpu,Array(1))), (gpu,(gpu,Array(1))))
> {code}
>
> The result shows that each task got GPU {{1}}. The executor page shows that the tasks ran on different executors:
>
> The expected behaviour would have been GPU {{0}} assigned to one executor and GPU {{1}} to the other executor. Consequently, each partition / task should then see a different GPU.
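> For reference, {{spark.worker.resource.gpu.discoveryScript}} points at an executable whose stdout is the resource JSON shown above. A minimal sketch of such a script (assuming the static two-GPU layout from this report; a real script would typically query {{nvidia-smi}} instead) could be:

```shell
#!/bin/bash
# Static GPU discovery script: always reports the same two GPU addresses,
# matching the JSON output shown in this report.
echo '{"name": "gpu", "addresses": ["0", "1"]}'
```
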
> With Spark 3.0.0-preview2 the allocation was as expected (identical code and Spark shell setup):
> {code}
> res0: Array[(String, (String, Array[String]))] = Array((gpu,(gpu,Array(0))), (gpu,(gpu,Array(1))))
> {code}
>
> Happy to contribute a patch if this is an accepted bug.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)