[ 
https://issues.apache.org/jira/browse/SPARK-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068648#comment-14068648
 ] 

Twinkle Sachdeva commented on SPARK-2604:
-----------------------------------------

For executors, verifyClusterResources does not take the memory overhead into 
account, whereas YarnAllocationHandler.scala provides the following def:

isResourceConstraintSatisfied(): true if the container memory is >= 
executor memory + memory overhead.

When a container is not allocated enough memory to satisfy this condition, the 
container is released. Because the executor has not been launched, this is not 
counted as a failure. Please see the code below:

for (container <- allocatedContainers) {
  if (isResourceConstraintSatisfied(container)) {
    // Add the accepted `container` to the host's list of already accepted,
    // allocated containers
    val host = container.getNodeId.getHost
    val containersForHost = hostToContainers.getOrElseUpdate(host,
      new ArrayBuffer[Container]())
    containersForHost += container
  } else {
    // Release container, since it doesn't satisfy resource constraints.
    releaseContainer(container)
  }
}

So the allocation happens, the container is then returned and not counted as a 
failure, and as a result only the application master is launched.
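
To make the mismatch concrete, here is a minimal sketch of the two checks side 
by side. It is not the actual Spark code: the object name, the method 
signatures, and the helper wouldHang are hypothetical, the 384 MB overhead is 
taken from the figure in the issue description, and it assumes YARN never 
grants a container larger than the maximum allocatable memory.

```scala
// Hypothetical, simplified model of the two checks described above.
// Names and signatures are illustrative only, not the real Spark API.
object OverheadMismatch {
  // Assumed overhead in MB, based on the 384m figure in the issue description.
  val MemoryOverheadMb = 384

  // Up-front validation as described: the overhead is ignored.
  def passesClusterValidation(executorMemoryMb: Int, maxAllocatableMb: Int): Boolean =
    executorMemoryMb <= maxAllocatableMb

  // Per-container check as described: the overhead is included.
  def isResourceConstraintSatisfied(containerMemoryMb: Int, executorMemoryMb: Int): Boolean =
    containerMemoryMb >= executorMemoryMb + MemoryOverheadMb

  // The request is stuck when validation passes but even the largest
  // possible container (assumed to be maxAllocatableMb) is rejected.
  def wouldHang(executorMemoryMb: Int, maxAllocatableMb: Int): Boolean =
    passesClusterValidation(executorMemoryMb, maxAllocatableMb) &&
      !isResourceConstraintSatisfied(maxAllocatableMb, executorMemoryMb)
}
```

Under these assumptions, asking for executor memory equal to the maximum 
(for example 8192 MB with an 8192 MB limit) passes the first check but can 
never pass the second, while a 4096 MB request passes both.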

> Spark Application hangs on yarn in edge case scenario of executor memory 
> requirement
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-2604
>                 URL: https://issues.apache.org/jira/browse/SPARK-2604
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.0.0
>            Reporter: Twinkle Sachdeva
>
> In a YARN environment, let's say:
> MaxAM = maximum allocatable memory
> ExecMem = executor's memory
> if (MaxAM >= ExecMem && (MaxAM - ExecMem) < 384m)
>   then the maximum-resource validation of executor memory passes and the 
> application master gets launched, but when resources are allocated and 
> validated again, they are returned and the application appears to hang.
> A typical use case is to ask for executor memory = maximum allowed memory as 
> per the YARN config.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
