[
https://issues.apache.org/jira/browse/MESOS-265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Mahler closed MESOS-265.
---------------------------------
Resolution: Cannot Reproduce
Thanks for replying!
My best guess is that the particular version of Spark on Mesos you were
running was causing a DoS against the Master.
Since this ticket is pretty old, I'll close it, and we can debug further if
others notice the issue. :)
> Master hogs CPU
> ---------------
>
> Key: MESOS-265
> URL: https://issues.apache.org/jira/browse/MESOS-265
> Project: Mesos
> Issue Type: Bug
> Components: master
> Environment: * Cluster of 6 CentOS machines; 24 cores / 256 GB RAM each.
> * Large data file (1 billion records, 100 GB) retrieved via HDFS.
> * Simple jobs (e.g., count records by key, where key takes 20 possible values)
> using Spark with local modifications.
> Reporter: James Zhao
> Attachments: mesos-log.txt.zip
>
>
> CPU usage of the master slowly grows until eventually it stops working.
> Upon launch, CPU usage is about 1%. After running just 1 job, it climbs to
> 20% (even after the job has stopped), and upon running more jobs, quickly
> grows to 50%. It then continues to grow at a slower rate until eventually it
> stops responding (after perhaps several hundred jobs).
> Screenshot: http://math.stanford.edu/~jyzhao/mesos-cpu.png (there are no
> active jobs, but lt-mesos-master is using 75% CPU)