Hello,

Our problems with the Ceph monitors continue in version 12.2.2:

Adding one specific monitor causes all monitors to hang and stop
responding to ceph -s and similar commands.

Interestingly, while this monitor (mon.server2) is running, the other two
monitors (mon.server3, mon.server5) randomly begin to consume 100% CPU
time until we restart them, after which the cycle repeats.

mon.server2 also has a different view of the cluster: while the other
two monitors are electing, it is in the synchronizing state.
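
For anyone who wants to compare what each monitor reports, here is a
minimal Python sketch. It assumes that the admin socket command
"ceph daemon mon.<id> mon_status" still answers on the monitor's own host
while ceph -s hangs, and that its JSON output carries the "name", "state"
and "quorum" fields. The helper names are made up for illustration.

```python
import json
import subprocess


def summarize_mon_status(raw_json):
    """Pick out the fields that matter when comparing monitor views."""
    status = json.loads(raw_json)
    return {
        "name": status.get("name"),
        "state": status.get("state"),
        "quorum": status.get("quorum", []),
    }


def fetch_mon_status(mon_id):
    """Query one monitor via its local admin socket (run on that monitor's host)."""
    raw = subprocess.check_output(
        ["ceph", "daemon", "mon." + mon_id, "mon_status"]
    )
    return summarize_mon_status(raw)


# Example, to be run on server2 (assumption: the mon id is "server2"):
# print(fetch_mon_status("server2"))
```

Running this on each of the three hosts at the same moment should show
whether they disagree about state and quorum membership.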

We recently noticed that the MTU of the bond0 device was set to 9200 and
that the VLAN-tagged device bond0.2, which we use for Ceph, also had an
MTU of 9200. We raised the MTU of the underlying devices and of bond0 to
9204 and restarted the monitors, but the problem persists.
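
The arithmetic behind that change, as a small Python sketch. It assumes
that the 4-byte 802.1Q tag has to fit inside the parent device's MTU
(which may not hold for every driver) and that the Ceph network uses IPv4
without IP options. The function names are hypothetical.

```python
VLAN_TAG_BYTES = 4      # size of one 802.1Q tag
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def parent_mtu_for(vlan_mtu, tags=1):
    """MTU the parent device needs if each VLAN tag counts against it."""
    return vlan_mtu + tags * VLAN_TAG_BYTES


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits into one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


print(parent_mtu_for(9200))    # MTU for bond0 and the underlying devices
print(max_ping_payload(9200))  # payload size for a don't-fragment ping test
```

The second value is the size to pass to ping -M do -s between two monitor
hosts over bond0.2, to check whether full-size frames actually get through
unfragmented.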

Does anyone have a hint on how to further debug this problem?

I have added the logs from the time when we tried to restart the monitor
on server2.

Best,

Nico

Attachment: ceph-mon.server2.log.bz2
Description: BZip2 compressed data

Attachment: ceph-mon.server5.log.bz2
Description: BZip2 compressed data



--
Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch