Martin Weindel created MESOS-1688:
-------------------------------------

             Summary: No offers if no memory is allocatable
                 Key: MESOS-1688
                 URL: https://issues.apache.org/jira/browse/MESOS-1688
             Project: Mesos
          Issue Type: Bug
          Components: master
    Affects Versions: 0.19.1, 0.19.0, 0.18.2, 0.18.1
            Reporter: Martin Weindel
            Priority: Critical


The [Spark scheduler|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala] allocates memory only for the executor and CPU only for its tasks.
It can therefore happen that nearly all memory is allocated to Spark executors while all CPU resources are idle.
In this case Mesos no longer makes any offers, because less than MIN_MEM (= 32MB) of memory is allocatable.
This effectively deadlocks the Spark job, as it is never offered the CPU resources needed to launch new tasks.

See {{HierarchicalAllocatorProcess::allocatable(const Resources&)}}, called from 
{{HierarchicalAllocatorProcess::allocate(const hashset<SlaveID>&)}}:
{code}
template <class RoleSorter, class FrameworkSorter>
bool
HierarchicalAllocatorProcess<RoleSorter, FrameworkSorter>::allocatable(
    const Resources& resources)
{
...
  Option<double> cpus = resources.cpus();
  Option<Bytes> mem = resources.mem();

  if (cpus.isSome() && mem.isSome()) {
    return cpus.get() >= MIN_CPUS && mem.get() > MIN_MEM;
  }

  return false;
}
{code}
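
For illustration, consider a slave whose memory has been almost entirely claimed by a Spark executor while all of its CPUs sit idle. The standalone snippet below is not Mesos code; it simply mirrors the check above with hypothetical resource values (and assumes MIN_CPUS is 0.01, as in the master's constants):
{code}
// Standalone illustration of the allocatable() condition, not Mesos code.
#include <iostream>

int main()
{
  const double MIN_CPUS = 0.01;   // assumed allocator minimum for CPU
  const double MIN_MEM_MB = 32.0; // allocator minimum for memory (32MB)

  // Hypothetical slave after a Spark executor has claimed almost all memory:
  double idleCpus = 8.0;   // all CPUs are free
  double idleMemMB = 16.0; // less than MIN_MEM remains

  // Same condition as in allocatable(): both minimums must be met.
  bool allocatable = idleCpus >= MIN_CPUS && idleMemMB > MIN_MEM_MB;

  // Prints "allocatable: false" -- the 8 idle CPUs are never offered.
  std::cout << "allocatable: " << std::boolalpha << allocatable << std::endl;
  return 0;
}
{code}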

A possible solution may be to drop the condition on allocatable memory completely, so that a slave with enough idle CPU is still offered even when its remaining memory is below MIN_MEM.
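
A minimal sketch of that change against the quoted method (assuming no other behavior of {{allocatable()}} needs to be preserved) might look like this:
{code}
template <class RoleSorter, class FrameworkSorter>
bool
HierarchicalAllocatorProcess<RoleSorter, FrameworkSorter>::allocatable(
    const Resources& resources)
{
  // Sketch only: offer a slave as soon as it has a usable amount of CPU,
  // regardless of how little memory is left, so idle CPUs are not starved.
  Option<double> cpus = resources.cpus();

  return cpus.isSome() && cpus.get() >= MIN_CPUS;
}
{code}
Alternatively, the check could be relaxed to require only one of the two minimums (e.g. {{cpus.get() >= MIN_CPUS || mem.get() >= MIN_MEM}}), which would still filter out slaves that have neither usable CPU nor usable memory.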




