[ 
https://issues.apache.org/jira/browse/SPARK-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339798#comment-14339798
 ] 

Mridul Muralidharan commented on SPARK-6050:
--------------------------------------------

With more verbose debug logging added, the problem surfaces.
At least with Hadoop 2.5, the returned response always has vCores == 1 (and at 
the RM, it is treated as vCores == 1 too ... sigh, unimplemented?)


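So in effect, we must not set executorCores when creating the "resource" in 
YarnAllocator: the request asks for 2 vCores, the allocated container reports 
1, and the container is released as unneeded.

See the log snippet below: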


15/02/27 06:37:33 INFO YarnAllocator: Will request 1 executor containers, each 
with 2 cores and 32870 MB memory including 2150 MB overhead
15/02/27 06:37:33 DEBUG AMRMClientImpl: Added priority=1
15/02/27 06:37:33 DEBUG AMRMClientImpl: addResourceRequest: applicationId= 
priority=1 resourceName=* numContainers=1 #asks=1
15/02/27 06:37:33 INFO YarnAllocator: Container request (host: Any, capability: 
<memory:32870, vCores:2>)
15/02/27 06:37:33 INFO YarnAllocator: missing = 0, targetNumExecutors = 1, 
numPendingAllocate = 1, numExecutorsRunning = 0
15/02/27 06:37:33 INFO AMRMClientImpl: Received new token for : <host>:8041
15/02/27 06:37:33 DEBUG YarnAllocator: Allocated containers: 1. Current 
executor count: 0. Cluster resources: <memory:81907200, vCores:-1>.
15/02/27 06:37:33 INFO YarnAllocator: allocatedContainer = Container: 
[ContainerId: <contained_id>, NodeId: <host>:8041, NodeHttpAddress: 
<host>:8042, Resource: <memory:33280, vCores:1>, Priority: 1, Token: Token { 
kind: ContainerToken, service: <host>:8041 }, ], location = <host>
15/02/27 06:37:33 INFO YarnAllocator: allocatedContainer = Container: 
[ContainerId: <contained_id>, NodeId: <host>:8041, NodeHttpAddress: 
<host>:8042, Resource: <memory:33280, vCores:1>, Priority: 1, Token: Token { 
kind: ContainerToken, service: <host>:8041 }, ], location = /<IP>
15/02/27 06:37:33 INFO YarnAllocator: allocatedContainer = Container: 
[ContainerId: <contained_id>, NodeId: <host>:8041, NodeHttpAddress: 
<host>:8042, Resource: <memory:33280, vCores:1>, Priority: 1, Token: Token { 
kind: ContainerToken, service: <host>:8041 }, ], location = *
15/02/27 06:37:33 DEBUG YarnAllocator: Releasing 1 unneeded containers that 
were allocated to us
15/02/27 06:37:33 INFO YarnAllocator: Received 1 containers from YARN, 
launching executors on 0 of them.
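
The mismatch in the log above can be modelled in a few lines. This is only a sketch, not Spark's or YARN's actual matching code: the `Resource` class and both `matches_*` functions are hypothetical names, and it assumes a container is accepted only when it covers the request in every dimension that is compared. The numbers are taken from the log (request <memory:32870, vCores:2>, allocation <memory:33280, vCores:1>).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical stand-in for a YARN resource capability."""
    memory_mb: int
    vcores: int


def matches_strict(requested: Resource, allocated: Resource) -> bool:
    # Assumption: the allocated container must cover the request in
    # both memory and vCores to count as a match.
    return (allocated.memory_mb >= requested.memory_mb
            and allocated.vcores >= requested.vcores)


def matches_memory_only(requested: Resource, allocated: Resource) -> bool:
    # Alternative: ignore vCores, since the RM reports vCores == 1 anyway.
    return allocated.memory_mb >= requested.memory_mb


# Values from the log snippet above.
requested = Resource(memory_mb=32870, vcores=2)
allocated = Resource(memory_mb=33280, vcores=1)

print(matches_strict(requested, allocated))       # False: 1 vCore < 2 requested
print(matches_memory_only(requested, allocated))  # True: 33280 MB >= 32870 MB

# Assumption: a request that does not set executor cores asks for 1 vCore.
print(matches_strict(Resource(32870, 1), allocated))  # True
```

Under these assumptions, a request carrying vCores:2 never matches the vCores:1 container, which is consistent with "launching executors on 0 of them" in the log. Requesting a single vCore, or comparing memory only, lets the same container match.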


> Spark on YARN does not work --executor-cores is specified
> ---------------------------------------------------------
>
>                 Key: SPARK-6050
>                 URL: https://issues.apache.org/jira/browse/SPARK-6050
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 1.3.0
>         Environment: 2.5 based YARN cluster.
>            Reporter: Mridul Muralidharan
>            Priority: Blocker
>
> There are multiple issues here (which I will detail in comments), but to 
> reproduce: running the following ALWAYS hangs in our cluster with the 1.3 RC:
> ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master 
> yarn-cluster --executor-cores 8 --num-executors 15 --driver-memory 4g 
> --executor-memory 2g --queue webmap lib/spark-examples*.jar 10



