Have you done the math on number of tasks * size of task?

We didn't wipe the .data field in 0.19.1:
https://issues.apache.org/jira/browse/MESOS-1746
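
For reference, here is a rough sketch of that math using the figures quoted below (around 2500 tasks at about 80K of TaskInfo.data each). The constant and function names are only illustrative, and it assumes "80K" means 80 KiB, "10GB" means 10 GiB, and the master keeps one copy of the data per task:

```python
# Back-of-the-envelope estimate; figures come from the thread, names are illustrative.
TASKS_LAUNCHED = 2500            # "around 2500 tasks" reported for this master
DATA_BYTES_PER_TASK = 80 * 1024  # "about 80K per task", assumed to mean 80 KiB
MASTER_RSS_BYTES = 10 * 1024**3  # "10GB (resident)", assumed to mean 10 GiB


def retained_data_bytes(tasks: int, bytes_per_task: int) -> int:
    """Total bytes held if each task's TaskInfo.data is kept exactly once."""
    return tasks * bytes_per_task


total = retained_data_bytes(TASKS_LAUNCHED, DATA_BYTES_PER_TASK)
print(f"TaskInfo.data total: {total / 1024**2:.1f} MiB")
print(f"Share of master RSS: {total / MASTER_RSS_BYTES:.1%}")
```

Under those assumptions the retained data comes to roughly 195 MiB. If the master holds more than one copy per task, or more tasks than this, the figure scales accordingly.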

On Thu, Nov 20, 2014 at 2:51 PM, Tom Arnfeld <t...@duedil.com> wrote:

> That's what I thought. There are around 2500 tasks launched with this master,
> most of which will be by our Hadoop JT. The Hadoop framework ships the
> configuration for the TT using the TaskInfo.data property, and that looks
> to be about 80K per task.
>
> Any debugging suggestions?
>
> --
>
> Tom Arnfeld
> Developer // DueDil
>
> (+44) 7525940046
> 25 Christopher Street, London, EC2A 2BS
>
>
> On Thu, Nov 20, 2014 at 10:33 PM, Benjamin Mahler <
> benjamin.mah...@gmail.com> wrote:
>
>> It shouldn't be that high, especially with the size of the cluster I see
>> in your stats.
>>
>> Which scheduler(s) are you running, and do they create large TaskInfo
>> objects? Just a hunch, as I do not recall any leaks in 0.19.1.
>>
>> On Tue, Nov 18, 2014 at 1:00 AM, Tom Arnfeld <t...@duedil.com> wrote:
>>
>>>  I've noticed some strange memory usage behaviour of the Mesos master
>>> in a small cluster of ours. We have three master nodes in a quorum and are
>>> using ZK.
>>>
>>> The master in question has 12GB of RAM available, of which the
>>> mesos-master process is using 10GB (resident), which seems quite a lot.
>>> That being said I'm not sure what the memory profile of the master should
>>> look like...
>>>
>>> Here's a snapshot of our /stats.json endpoint.
>>>
>>> This cluster is running 0.19.1 so perhaps there are some memory leak
>>> fixes in a newer release that we need to take advantage of.
>>>
>>> Any help would be appreciated!
>>>
>>> ---------------------------------------------
>>>
{"activated_slaves":19,"active_schedulers":1,"active_tasks_gauge":1,"cpus_percent":0.116618075801749,"cpus_total":171.5,"cpus_used":20,"deactivated_slaves":0,"disk_percent":0.0273684210526316,"disk_total":972800,"disk_used":26624,"elected":1,"failed_tasks":11,"finished_tasks":2658,"invalid_status_updates":2638,"killed_tasks":1,"lost_tasks":4,"master/cpus_percent":0.116618075801749,"master/cpus_total":171.5,"master/cpus_used":20,"master/disk_percent":0.0273684210526316,"master/disk_total":972800,"master/disk_used":26624,"master/dropped_messages":16,"master/elected":1,"master/event_queue_size":0,"master/frameworks_active":1,"master/frameworks_inactive":0,"master/invalid_framework_to_executor_messages":0,"master/invalid_status_update_acknowledgements":0,"master/invalid_status_updates":2638,"master/mem_percent":0.279896013864818,"master/mem_total":1181696,"master/mem_used":330752,"master/messages_authenticate":0,"master/messages_deactivate_framework":0,"master/messages_exited_executor":2667,"master/messages_framework_to_executor":0,"master/messages_kill_task":4397,"master/messages_launch_tasks":838024,"master/messages_reconcile_tasks":0,"master/messages_register_framework":27,"master/messages_register_slave":1,"master/messages_reregister_framework":326788,"master/messages_reregister_slave":31,"master/messages_resource_request":0,"master/messages_revive_offers":0,"master/messages_status_update":8009,"master/messages_status_update_acknowledgement":0,"master/messages_unregister_framework":26,"master/messages_unregister_slave":0,"master/outstanding_offers":0,"master/recovery_slave_removals":0,"master/slave_registrations":1,"master/slave_removals":0,"master/slave_reregistrations":18,"master/slaves_active":19,"master/slaves_inactive":0,"master/tasks_failed":11,"master/tasks_finished":2658,"master/tasks_killed":1,"master/tasks_lost":4,"master/tasks_running":1,"master/tasks_staging":0,"master/tasks_starting":0,"master/uptime_secs":1411611.70786125,"master/valid_framework_to_executor_messages":0,"master/valid_status_update_acknowledgements":0,"master/valid_status_updates":5371,"mem_percent":0.279896013864818,"mem_total":1181696,"mem_used":330752,"outstanding_offers":0,"registrar/queued_operations":0,"registrar/registry_size_bytes":4348,"registrar/state_fetch_ms":95.591936,"registrar/state_store_ms":48.622848,"staged_tasks":2675,"started_tasks":26,"system/cpus_total":2,"system/load_15min":0.05,"system/load_1min":0.03,"system/load_5min":0.04,"system/mem_free_bytes":152408064,"system/mem_total_bytes":12631490560,"total_schedulers":1,"uptime":1411611.27369318,"valid_status_updates":5371}
>>>
>>>
>>> --
>>>
>>> Tom Arnfeld
>>> Developer // DueDil
>>>
>>> (+44) 7525940046
>>> 25 Christopher Street, London, EC2A 2BS
>>>
>>
>>
>
