In the Spark scheduler, memory is always allocated to the executor, so the few MB assigned to the tasks are not relevant. In fact, each executor has about 3 GB of memory in my settings. While searching for the cause, I added minimal memory resources to the tasks. Sorry if this caused some confusion.

On 07.08.2014 17:32, Adam Bordelon wrote:
Seems strange that you only have 2MB of allocatable memory on your slave ("total allocatable: cpus(*):2; mem(*):2;"). Try bumping that up to something like 2GB ("mem(*):2048") and I bet you'll see more tasks able to run. Even the default executor (no task) needs 32MB, so you won't be able to do much with a Mesos slave that has <64MB of memory. Are you explicitly setting a --resources flag on your slave? If not, do you only have tiny VMs available for the slaves?
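
To make the numbers in this suggestion concrete, here is a minimal Python sketch that parses a resource string in the form printed by the master log and compares the memory value against the 32MB figure quoted above. The function name `parse_resources` and the constant `EXECUTOR_MIN_MEM_MB` are illustrative assumptions, not part of Mesos or Spark.

```python
import re

# 32 MB default-executor figure quoted in the message above (illustrative constant).
EXECUTOR_MIN_MEM_MB = 32


def parse_resources(text):
    """Parse 'cpus(*):2; mem(*):2; ports(*):[31000-32000]' into a dict.

    Scalar values become floats; anything else (e.g. port ranges) stays a string.
    """
    resources = {}
    for item in text.split(";"):
        item = item.strip()
        if not item:
            continue
        match = re.match(r"^(\w+)\(([^)]*)\):(.+)$", item)
        if match is None:
            continue  # skip fragments that do not look like name(role):value
        name, _role, value = match.groups()
        try:
            resources[name] = float(value)
        except ValueError:
            resources[name] = value.strip()
    return resources


# Resource string copied from the "total allocatable" log entry in this thread.
allocatable = parse_resources(
    "cpus(*):2; mem(*):2; disk(*):12526; ports(*):[31000-32000]"
)
enough_for_executor = allocatable["mem"] >= EXECUTOR_MIN_MEM_MB
```

With the values from the log, `enough_for_executor` is false, which is the point being raised here; whether that matters depends on how the framework accounts for executor memory.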


On Thu, Aug 7, 2014 at 7:03 AM, Martin Weindel <[email protected]> wrote:

    I'm using Apache Mesos 0.19.0 together with Apache Spark 1.0.2 on
    a three node cluster.

    When using the fine-grained task scheduling mode of Spark, I
    reproducibly see some kind of deadlock under high load.
    If multiple jobs are running, after some time the jobs stop
    submitting tasks.

    I have added some more log output to the scheduler implementation
    of Spark, and it looks as if Mesos stops making offers,
    although there are allocatable resources.

    Below is the log from Mesos. The last task finishes normally,
    the resources are recovered, and the filters are removed, but the log
    shows no "sending ... offers to framework" entries after this
    point.
    I have tried to wake up the offers with a reviveOffers call that I
    added to the Spark code, but with no effect.
    The "Resources" section on the Mesos web UI shows all CPUs as
    idle, none is used or offered.

    If I kill all jobs but one, this last job continues and finishes
    normally.

    Is this a bug?

    Thanks,
    Martin

    I0807 15:17:54.605695 15727 master.cpp:2933] Sending 1 offers to framework 20140717-090825-308511242-5050-15711-0044
    I0807 15:17:54.615705 15732 master.cpp:1889] Processing reply for offers: [ 20140717-090825-308511242-5050-15711-2132 ] on slave 20140717-090821-325288458-5050-2360-1 at slave(1)@10.130.99.20:5051 (ustst020-cep-node3.usu.usu.grp) for framework 20140717-090825-308511242-5050-15711-0044
    I0807 15:17:54.615897 15732 master.hpp:655] Adding task 1 with resources cpus(*):1; mem(*):1 on slave 20140717-090821-325288458-5050-2360-1 (ustst020-cep-node3.usu.usu.grp)
    I0807 15:17:54.616029 15732 master.cpp:3111] Launching task 1 of framework 20140717-090825-308511242-5050-15711-0044 with resources cpus(*):1; mem(*):1 on slave 20140717-090821-325288458-5050-2360-1 at slave(1)@10.130.99.20:5051 (ustst020-cep-node3.usu.usu.grp)
    I0807 15:17:54.616325 15732 hierarchical_allocator_process.hpp:589] Framework 20140717-090825-308511242-5050-15711-0044 filtered slave 20140717-090821-325288458-5050-2360-1 for 8secs
    I0807 15:17:58.324476 15728 master.cpp:2628] Status update TASK_RUNNING (UUID: ec5ecf90-7313-4bf1-af9e-b5f6e35189f7) for task 1 of framework 20140717-090825-308511242-5050-15711-0044 from slave 20140717-090821-325288458-5050-2360-1 at slave(1)@10.130.99.20:5051 (ustst020-cep-node3.usu.usu.grp)
    I0807 15:17:58.326279 15726 master.cpp:1988] Reviving offers for framework 20140717-090825-308511242-5050-15711-0044
    I0807 15:17:58.326406 15732 hierarchical_allocator_process.hpp:660] Removed filters for framework 20140717-090825-308511242-5050-15711-0044
    I0807 15:18:00.993798 15726 master.cpp:2628] Status update TASK_FINISHED (UUID: ef7a4dfd-c403-483a-a6a7-c2cd995aa64e) for task 1 of framework 20140717-090825-308511242-5050-15711-0044 from slave 20140717-090821-325288458-5050-2360-1 at slave(1)@10.130.99.20:5051 (ustst020-cep-node3.usu.usu.grp)
    I0807 15:18:00.994935 15726 master.hpp:673] Removing task 1 with resources cpus(*):1; mem(*):1 on slave 20140717-090821-325288458-5050-2360-1 (ustst020-cep-node3.usu.usu.grp)
    I0807 15:18:00.995511 15726 master.cpp:1988] Reviving offers for framework 20140717-090825-308511242-5050-15711-0044
    I0807 15:18:00.995599 15725 hierarchical_allocator_process.hpp:636] Recovered cpus(*):1; mem(*):1 (total allocatable: cpus(*):2; mem(*):2; disk(*):12526; ports(*):[31000-32000]) on slave 20140717-090821-325288458-5050-2360-1 from framework 20140717-090825-308511242-5050-15711-0044
    I0807 15:18:00.995846 15725 hierarchical_allocator_process.hpp:660] Removed filters for framework 20140717-090825-308511242-5050-15711-0044
    I0807 15:18:01.055794 15730 master.cpp:1988] Reviving offers for framework 20140717-090825-308511242-5050-15711-0044
    I0807 15:18:01.055982 15730 hierarchical_allocator_process.hpp:660] Removed filters for framework 20140717-090825-308511242-5050-15711-0044
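
The symptom described above (offers are revived, but no further "Sending ... offers to framework" entry follows) can be checked mechanically against a master log. The following is a minimal Python sketch under that assumption; the helper name `offers_after_last_revive` is hypothetical, and the regular expressions are based only on the message wording visible in the log excerpt above.

```python
import re

# Patterns derived from the message text in the log excerpt above.
OFFER_RE = re.compile(r"Sending (\d+) offers? to framework (\S+)")
REVIVE_RE = re.compile(r"Reviving offers for framework (\S+)")


def offers_after_last_revive(log_text, framework_id):
    """Count offers sent to framework_id after its last 'Reviving offers' entry.

    Returns None if the framework never revived offers in this log.
    """
    # Collapse whitespace so entries wrapped across lines are matched as one.
    flat = " ".join(log_text.split())
    last_revive_end = None
    for match in REVIVE_RE.finditer(flat):
        if match.group(1) == framework_id:
            last_revive_end = match.end()
    if last_revive_end is None:
        return None
    return sum(
        int(match.group(1))
        for match in OFFER_RE.finditer(flat, last_revive_end)
        if match.group(2) == framework_id
    )


# Two entries taken from the log above, deliberately left line-wrapped.
sample = """
I0807 15:17:54.605695 15727 master.cpp:2933] Sending 1 offers to framework
20140717-090825-308511242-5050-15711-0044
I0807 15:18:01.055794 15730 master.cpp:1988] Reviving offers for framework
20140717-090825-308511242-5050-15711-0044
"""
framework = "20140717-090825-308511242-5050-15711-0044"
stalled = offers_after_last_revive(sample, framework) == 0
```

A result of 0 for a framework that still has pending work matches the stall reported here; it does not by itself say whether the allocator or the framework is at fault.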


