[ https://issues.apache.org/jira/browse/SPARK-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339798#comment-14339798 ]
Mridul Muralidharan commented on SPARK-6050:
--------------------------------------------
With more verbose debug logging added, the problem surfaces.
At least with Hadoop 2.5, the returned response always has vCores == 1 (and
the RM treats it as vCores == 1 too ... sigh, unimplemented?).
So in effect, we must not set executorCores when creating the "resource" in
YarnAllocator.
See the log snippet below:
15/02/27 06:37:33 INFO YarnAllocator: Will request 1 executor containers, each
with 2 cores and 32870 MB memory including 2150 MB overhead
15/02/27 06:37:33 DEBUG AMRMClientImpl: Added priority=1
15/02/27 06:37:33 DEBUG AMRMClientImpl: addResourceRequest: applicationId=
priority=1 resourceName=* numContainers=1 #asks=1
15/02/27 06:37:33 INFO YarnAllocator: Container request (host: Any, capability:
<memory:32870, vCores:2>)
15/02/27 06:37:33 INFO YarnAllocator: missing = 0, targetNumExecutors = 1,
numPendingAllocate = 1, numExecutorsRunning = 0
15/02/27 06:37:33 INFO AMRMClientImpl: Received new token for : <host>:8041
15/02/27 06:37:33 DEBUG YarnAllocator: Allocated containers: 1. Current
executor count: 0. Cluster resources: <memory:81907200, vCores:-1>.
15/02/27 06:37:33 INFO YarnAllocator: allocatedContainer = Container:
[ContainerId: <contained_id>, NodeId: <host>:8041, NodeHttpAddress:
<host>:8042, Resource: <memory:33280, vCores:1>, Priority: 1, Token: Token {
kind: ContainerToken, service: <host>:8041 }, ], location = <host>
15/02/27 06:37:33 INFO YarnAllocator: allocatedContainer = Container:
[ContainerId: <contained_id>, NodeId: <host>:8041, NodeHttpAddress:
<host>:8042, Resource: <memory:33280, vCores:1>, Priority: 1, Token: Token {
kind: ContainerToken, service: <host>:8041 }, ], location = /<IP>
15/02/27 06:37:33 INFO YarnAllocator: allocatedContainer = Container:
[ContainerId: <contained_id>, NodeId: <host>:8041, NodeHttpAddress:
<host>:8042, Resource: <memory:33280, vCores:1>, Priority: 1, Token: Token {
kind: ContainerToken, service: <host>:8041 }, ], location = *
15/02/27 06:37:33 DEBUG YarnAllocator: Releasing 1 unneeded containers that
were allocated to us
15/02/27 06:37:33 INFO YarnAllocator: Received 1 containers from YARN,
launching executors on 0 of them.
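
To make the mismatch in the log concrete, here is a minimal sketch in Python. It is a toy model, not Spark's or YARN's actual code: the `Resource` class and the `fits` rule are hypothetical and only assume that a pending request is matched when the allocated container covers it in both memory and vCores. The numbers come from the log above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical stand-in for a YARN resource capability."""
    memory_mb: int
    vcores: int


def fits(requested: Resource, allocated: Resource) -> bool:
    # Assumed matching rule: the allocated container must cover the
    # request in every dimension, otherwise no pending request matches.
    return (requested.memory_mb <= allocated.memory_mb
            and requested.vcores <= allocated.vcores)


# Values from the log: we asked for 2 vCores, the RM handed back 1.
requested = Resource(memory_mb=32870, vcores=2)
allocated = Resource(memory_mb=33280, vcores=1)

# Sketch of the proposed workaround: do not put executorCores into the
# requested resource, so the request carries a single vCore.
requested_without_cores = Resource(memory_mb=32870, vcores=1)

print(fits(requested, allocated))                # False
print(fits(requested_without_cores, allocated))  # True
```

Under that assumed rule, the container with vCores:1 never satisfies the vCores:2 request at any of the three locations tried (host, rack, *), which would explain why it is released as unneeded and no executor is launched.
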
> Spark on YARN does not work --executor-cores is specified
> ---------------------------------------------------------
>
> Key: SPARK-6050
> URL: https://issues.apache.org/jira/browse/SPARK-6050
> Project: Spark
> Issue Type: Bug
> Components: YARN
> Affects Versions: 1.3.0
> Environment: 2.5 based YARN cluster.
> Reporter: Mridul Muralidharan
> Priority: Blocker
>
> There are multiple issues here (which I will detail as comments), but to
> reproduce: running the following ALWAYS hangs in our cluster with the 1.3 RC:
> ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master
> yarn-cluster --executor-cores 8 --num-executors 15 --driver-memory 4g
> --executor-memory 2g --queue webmap lib/spark-examples*.jar
> 10