Re: [ceph-users] problem returning mon back to cluster

2019-11-14 Thread Nikola Ciprich
Hi, just wanted to add some info: 1) I was able to work around the problem (as advised by Harald) by increasing mon_lease to 50s, waiting for the monitor to join the cluster (it took hours!), and decreasing it again. 2) Since then we got hit by the same problem on a different cluster. Same symptoms,
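
For anyone wanting to script the workaround from point 1, here is a minimal Python sketch that only builds the two `ceph config set` invocations (raise mon_lease, then restore it). The restore value of 5 seconds is an assumption (check your own setting first), the helper name `mon_lease_commands` is made up for illustration, and the thread does not say whether the value was changed via `ceph config set`, injectargs, or ceph.conf, so treat the command form as one possible option rather than what was actually run.

```python
from typing import List


def mon_lease_commands(temporary_s: float, original_s: float) -> List[List[str]]:
    """Build the argv lists to raise mon_lease and later restore it.

    Nothing is executed here; the caller decides when to run each command.
    """
    def set_cmd(value: float) -> List[str]:
        # Assumed command form: centralized config, applied to all mons.
        return ["ceph", "config", "set", "mon", "mon_lease", f"{value:g}"]

    return [set_cmd(temporary_s), set_cmd(original_s)]


# 50 comes from the message above; 5 is an assumed original value.
raise_cmd, restore_cmd = mon_lease_commands(50, 5)
print(" ".join(raise_cmd))
print(" ".join(restore_cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)`: the first before starting the monitor, the second only once the monitor has rejoined quorum (which, per the message above, may take hours).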

Re: [ceph-users] problem returning mon back to cluster

2019-10-15 Thread Harald Staub
On 14.10.19 16:31, Nikola Ciprich wrote:
> On Mon, Oct 14, 2019 at 01:40:19PM +0200, Harald Staub wrote:
> > Probably same problem here. When I try to add another MON, "ceph health" becomes mostly unresponsive. One of the existing ceph-mon processes uses 100% CPU for several minutes. Tried it on 2

Re: [ceph-users] problem returning mon back to cluster

2019-10-14 Thread Nikola Ciprich
On Tue, Oct 15, 2019 at 06:50:31AM +0200, Nikola Ciprich wrote:
>
> On Mon, Oct 14, 2019 at 11:52:55PM +0200, Paul Emmerich wrote:
> > How big is the mon's DB? As in just the total size of the directory you
> > copied
> >
> > FWIW I recently had to perform mon surgery on a 14.2.4 (or was it

Re: [ceph-users] problem returning mon back to cluster

2019-10-14 Thread Nikola Ciprich
On Mon, Oct 14, 2019 at 11:52:55PM +0200, Paul Emmerich wrote:
> How big is the mon's DB? As in just the total size of the directory you
> copied
>
> FWIW I recently had to perform mon surgery on a 14.2.4 (or was it
> 14.2.2?) cluster with 8 GB mon size and I encountered no such problems
>

Re: [ceph-users] problem returning mon back to cluster

2019-10-14 Thread Paul Emmerich
How big is the mon's DB? As in just the total size of the directory you copied.

FWIW I recently had to perform mon surgery on a 14.2.4 (or was it 14.2.2?) cluster with 8 GB mon size and I encountered no such problems while syncing a new mon, which took 10 minutes or so.

Paul

--
Paul Emmerich
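
To answer the "total size of the directory" question, a small Python sketch along these lines should do (roughly what `du -s` reports in apparent bytes). The function name `directory_size_bytes` is made up for illustration, and the path in the usage note is an assumption based on the usual default mon data location; substitute whatever directory was actually copied.

```python
import os


def directory_size_bytes(path: str) -> int:
    """Sum the apparent sizes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so link targets are not counted.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total
```

Usage would be something like `directory_size_bytes("/var/lib/ceph/mon/ceph-a") / 2**30` to get GiB, where `ceph-a` stands in for your own cluster name and mon id.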

Re: [ceph-users] problem returning mon back to cluster

2019-10-14 Thread Nikola Ciprich
On Mon, Oct 14, 2019 at 04:31:22PM +0200, Nikola Ciprich wrote:
> On Mon, Oct 14, 2019 at 01:40:19PM +0200, Harald Staub wrote:
> > Probably same problem here. When I try to add another MON, "ceph
> > health" becomes mostly unresponsive. One of the existing ceph-mon
> > processes uses 100% CPU for

Re: [ceph-users] problem returning mon back to cluster

2019-10-14 Thread Nikola Ciprich
On Mon, Oct 14, 2019 at 01:40:19PM +0200, Harald Staub wrote:
> Probably same problem here. When I try to add another MON, "ceph
> health" becomes mostly unresponsive. One of the existing ceph-mon
> processes uses 100% CPU for several minutes. Tried it on 2 test
> clusters (14.2.4, 3 MONs, 5

Re: [ceph-users] problem returning mon back to cluster

2019-10-14 Thread Harald Staub
Probably same problem here. When I try to add another MON, "ceph health" becomes mostly unresponsive. One of the existing ceph-mon processes uses 100% CPU for several minutes. Tried it on 2 test clusters (14.2.4, 3 MONs, 5 storage nodes with around 2 hdd osds each). To avoid errors like "lease

[ceph-users] problem returning mon back to cluster

2019-10-13 Thread Nikola Ciprich
Dear ceph users and developers, on one of our production clusters, we got into a pretty unpleasant situation. After rebooting one of the nodes, when trying to start the monitor, the whole cluster seems to hang, including IO, ceph -s etc. When this mon is stopped again, everything seems to continue.