That's what I thought. There are around 2500 tasks launched with this master,
most of which will have been launched by our Hadoop JT. The Hadoop framework
ships the configuration for the TT using the TaskInfo.data property, and that
looks to be about 80K per task.
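
As a rough sanity check, here is a back-of-envelope sketch of how much memory those payloads could account for. It assumes the master retained the full data payload of every task ever launched (an upper bound I have not verified), that "80K" means 80 KiB, and it uses the ~10GB resident figure from my original message quoted below. The names are just illustrative.

```python
# Back-of-envelope estimate; all figures are the approximate ones from this thread.
TASKS_LAUNCHED = 2500             # approximate number of tasks launched
TASK_DATA_BYTES = 80 * 1024       # ~80K of TaskInfo.data per task (assumed KiB)
MASTER_RSS_BYTES = 10 * 1024**3   # ~10GB resident for mesos-master


def retained_bytes(tasks, bytes_per_task):
    """Upper bound on memory if every task's data payload were kept."""
    return tasks * bytes_per_task


total = retained_bytes(TASKS_LAUNCHED, TASK_DATA_BYTES)
share = total / MASTER_RSS_BYTES
print(f"{total / 1024**2:.0f} MiB retained, {share:.1%} of resident memory")
```

Even in that worst case the payloads come to roughly 200MB, so on their own they would not explain 10GB.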




Any debugging suggestions?


--

Tom Arnfeld
Developer // DueDil

(+44) 7525940046
25 Christopher Street, London, EC2A 2BS

On Thu, Nov 20, 2014 at 10:33 PM, Benjamin Mahler
<benjamin.mah...@gmail.com> wrote:

> It shouldn't be that high, especially with the size of the cluster I see in
> your stats.
> Which scheduler(s) are you running, and do they create large TaskInfo
> objects? Just a hunch, as I do not recall any leaks in 0.19.1.
> On Tue, Nov 18, 2014 at 1:00 AM, Tom Arnfeld <t...@duedil.com> wrote:
>>  I've noticed some strange memory usage behaviour of the Mesos master in
>> a small cluster of ours. We have three master nodes in a quorum and are
>> using ZK.
>>
>> The master in question has 12GB of RAM available, of which the mesos-master
>> process is using 10GB (resident), which seems quite a lot. That being
>> said, I'm not sure what the memory profile of the master should look like...
>>
>> Here's a snapshot of our /stats.json endpoint.
>>
>> This cluster is running 0.19.1 so perhaps there are some memory leak fixes
>> in a newer release that we need to take advantage of.
>>
>> Any help would be appreciated!
>>
>> ---------------------------------------------
>>
{"activated_slaves":19,"active_schedulers":1,"active_tasks_gauge":1,"cpus_percent":0.116618075801749,"cpus_total":171.5,"cpus_used":20,"deactivated_slaves":0,"disk_percent":0.0273684210526316,"disk_total":972800,"disk_used":26624,"elected":1,"failed_tasks":11,"finished_tasks":2658,"invalid_status_updates":2638,"killed_tasks":1,"lost_tasks":4,"master/cpus_percent":0.116618075801749,"master/cpus_total":171.5,"master/cpus_used":20,"master/disk_percent":0.0273684210526316,"master/disk_total":972800,"master/disk_used":26624,"master/dropped_messages":16,"master/elected":1,"master/event_queue_size":0,"master/frameworks_active":1,"master/frameworks_inactive":0,"master/invalid_framework_to_executor_messages":0,"master/invalid_status_update_acknowledgements":0,"master/invalid_status_updates":2638,"master/mem_percent":0.279896013864818,"master/mem_total":1181696,"master/mem_used":330752,"master/messages_authenticate":0,"master/messages_deactivate_framework":0,"master/messages_exited_executor":2667,"master/messages_framework_to_executor":0,"master/messages_kill_task":4397,"master/messages_launch_tasks":838024,"master/messages_reconcile_tasks":0,"master/messages_register_framework":27,"master/messages_register_slave":1,"master/messages_reregister_framework":326788,"master/messages_reregister_slave":31,"master/messages_resource_request":0,"master/messages_revive_offers":0,"master/messages_status_update":8009,"master/messages_status_update_acknowledgement":0,"master/messages_unregister_framework":26,"master/messages_unregister_slave":0,"master/outstanding_offers":0,"master/recovery_slave_removals":0,"master/slave_registrations":1,"master/slave_removals":0,"master/slave_reregistrations":18,"master/slaves_active":19,"master/slaves_inactive":0,"master/tasks_failed":11,"master/tasks_finished":2658,"master/tasks_killed":1,"master/tasks_lost":4,"master/tasks_running":1,"master/tasks_staging":0,"master/tasks_starting":0,"master/uptime_secs":1411611.70786125,"master/valid_framework_to_executor_messages":0,"master/valid_status_update_acknowledgements":0,"master/valid_status_updates":5371,"mem_percent":0.279896013864818,"mem_total":1181696,"mem_used":330752,"outstanding_offers":0,"registrar/queued_operations":0,"registrar/registry_size_bytes":4348,"registrar/state_fetch_ms":95.591936,"registrar/state_store_ms":48.622848,"staged_tasks":2675,"started_tasks":26,"system/cpus_total":2,"system/load_15min":0.05,"system/load_1min":0.03,"system/load_5min":0.04,"system/mem_free_bytes":152408064,"system/mem_total_bytes":12631490560,"total_schedulers":1,"uptime":1411611.27369318,"valid_status_updates":5371}
>>
>>
>> --
>>
>> Tom Arnfeld
>> Developer // DueDil
>>
>> (+44) 7525940046
>> 25 Christopher Street, London, EC2A 2BS
>>
