We had the same issue, and the cluster has been stable since upgrading from
14.2.11 to 14.2.15. Also, the DB on the mon that failed to join was not the
same size as on the others, since the amount of information it had to sync
was huge. Compact on reboot does the job, but the mon then takes a long time
to catch up. You can try to force the join with quorum enter, but that won't
help. The upgrade is what helped in our case.

On Mon, Dec 14, 2020, 4:42 PM Wesley Dillingham <w...@wesdillingham.com>
wrote:

> We had to rebuild our mons on a few occasions because of this. Only one mon
> was ever dropped from quorum at a time in our case. In other scenarios with
> the same error the mon was able to rejoin after thirty minutes or so. We
> believe we may have tracked it down (in our case) to the upgrade of an AV /
> packet inspection security technology being run on the servers. Perhaps
> you've made similar updates.
>
> Respectfully,
>
> *Wes Dillingham*
> w...@wesdillingham.com
> LinkedIn <http://www.linkedin.com/in/wesleydillingham>
>
>
> On Tue, Dec 8, 2020 at 7:46 PM Wesley Dillingham <w...@wesdillingham.com>
> wrote:
>
> > We have also had this issue multiple times in 14.2.11
> >
> > On Tue, Dec 8, 2020, 5:11 PM <hoann...@gmail.com> wrote:
> >
> >> I have the same issue. My cluster is running version 14.2.11. What
> >> version of Ceph are you running?
> >> _______________________________________________
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
> >>
> >
>