Hi,
just wanted to add some info:
1) I was able to work around the problem (as advised by Harald) by increasing
mon_lease to 50s, waiting for the monitor to join the cluster (it took hours!), and
decreasing it again.
2) Since then we got hit by the same problem on a different cluster. Same
symptoms,
s
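
For anyone wanting to script the workaround from point 1, here is a minimal sketch. It assumes the centralized config interface (`ceph config set`) is available and responsive on the cluster; the message does not say whether the change was made that way, via injectargs, or in ceph.conf. The helper names and the "normal" value of 5 seconds are assumptions for illustration, so check what your cluster actually uses before lowering the lease again. By default the script only prints the commands.

```python
import subprocess

RAISED_LEASE = 50  # seconds, the value mentioned in the message above
NORMAL_LEASE = 5   # seconds, ASSUMED previous value; verify on your cluster


def mon_lease_cmd(seconds):
    """Build the command that sets mon_lease for all monitors (hypothetical helper)."""
    return ["ceph", "config", "set", "mon", "mon_lease", str(seconds)]


def apply(cmd, dry_run=True):
    """Print the command; run it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Step 1: raise the lease before starting the new monitor.
    apply(mon_lease_cmd(RAISED_LEASE))
    # Step 2 (manual): start the monitor and wait until it has joined the cluster.
    # Step 3: lower the lease again.
    apply(mon_lease_cmd(NORMAL_LEASE))
```

Pass dry_run=False to apply() only after checking the printed commands against your own setup.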
On 14.10.19 16:31, Nikola Ciprich wrote:
On Mon, Oct 14, 2019 at 01:40:19PM +0200, Harald Staub wrote:
Probably same problem here. When I try to add another MON, "ceph
health" becomes mostly unresponsive. One of the existing ceph-mon
processes uses 100% CPU for several minutes. Tried it on 2 tes
On Tue, Oct 15, 2019 at 06:50:31AM +0200, Nikola Ciprich wrote:
>
>
> On Mon, Oct 14, 2019 at 11:52:55PM +0200, Paul Emmerich wrote:
> > How big is the mon's DB? As in just the total size of the directory you
> > copied
> >
> > FWIW I recently had to perform mon surgery on a 14.2.4 (or was it
On Mon, Oct 14, 2019 at 11:52:55PM +0200, Paul Emmerich wrote:
> How big is the mon's DB? As in just the total size of the directory you
> copied
>
> FWIW I recently had to perform mon surgery on a 14.2.4 (or was it
> 14.2.2?) cluster with 8 GB mon size and I encountered no such problems
> wh
How big is the mon's DB? As in just the total size of the directory you copied
FWIW I recently had to perform mon surgery on a 14.2.4 (or was it
14.2.2?) cluster with 8 GB mon size and I encountered no such problems
while syncing a new mon which took 10 minutes or so.
Paul
--
Paul Emmerich
Lo
On Mon, Oct 14, 2019 at 04:31:22PM +0200, Nikola Ciprich wrote:
> On Mon, Oct 14, 2019 at 01:40:19PM +0200, Harald Staub wrote:
> > Probably same problem here. When I try to add another MON, "ceph
> > health" becomes mostly unresponsive. One of the existing ceph-mon
> > processes uses 100% CPU for
On Mon, Oct 14, 2019 at 01:40:19PM +0200, Harald Staub wrote:
> Probably same problem here. When I try to add another MON, "ceph
> health" becomes mostly unresponsive. One of the existing ceph-mon
> processes uses 100% CPU for several minutes. Tried it on 2 test
> clusters (14.2.4, 3 MONs, 5 storag
Probably same problem here. When I try to add another MON, "ceph health"
becomes mostly unresponsive. One of the existing ceph-mon processes uses
100% CPU for several minutes. Tried it on 2 test clusters (14.2.4, 3
MONs, 5 storage nodes with around 2 hdd osds each). To avoid errors like
"lease
dear ceph users and developers,
on one of our production clusters, we got into a pretty unpleasant situation.
After rebooting one of the nodes, when trying to start the monitor, the whole cluster
seems to hang, including IO, ceph -s etc. When this mon is stopped again,
everything seems to continue. Trying