We have half a dozen clusters of varying sizes, and every one of them shows high 
memory usage on the mons every 1-3 months. I've thought about opening a ticket 
with Ceph Enterprise support or bringing it up here, but we have no practical 
way to capture logs for it: we can't run with high logging for months at a 
time, and we can't tell which of our clusters will have the issue next. We've 
seen this on 0.94.5 and 0.94.7.

We've noticed that the memory usage is high either on the primary mon or on all 
of the secondary mons. I've never seen high memory usage on the primary and the 
secondary mons at the same time. Our fix has been to monitor memory usage on the 
servers and restart the mon processes for the entire cluster when one of them 
spikes.

________________________________

David Turner | Cloud Operations Engineer | StorageCraft Technology Corporation
<https://storagecraft.com>
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2760 | Mobile: 385.224.2943

________________________________________
From: ceph-users [[email protected]] on behalf of mj [[email protected]]
Sent: Friday, November 04, 2016 7:06 AM
To: [email protected]
Subject: [ceph-users] suddenly high memory usage for ceph-mon process

Hi,

Running Ceph 0.94.9 on jessie (Proxmox), three hosts, 4 OSDs per host,
SSD journal, 10G cluster network. Hosts have 65G RAM. The cluster is
generally not very busy.

Suddenly we were getting HEALTH_WARN today, with two OSDs (both on the
same server) being slow. Looking into this, we noticed very high memory
usage on that host: 75% memory for ceph-mon!

(normally here ceph-mon uses around 1% - 2%)

I restarted ceph-mon on that host, and that seems to have brought things
back to normal immediately.

I don't see anything out of the ordinary in /var/log/syslog on that
server, and the cluster is generally HEALTH_OK. There have been no
config changes lately (not for many weeks), and the last time I applied
updates and rebooted was 30 days ago.

No idea what could have caused this. Any ideas what to check, where to
look? What would typically cause such high memory usage for the ceph-mon
process?

MJ

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com