Having all of the daemons in your cluster able to restart themselves at
will sounds terrifying. What's preventing every OSD from restarting at the
same time? Also, Ceph dot releases have been known to break environments.
That's the nature of such widely used software. I would recommend pinning
the Ceph version instead.
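
If you do end up restarting daemons after an upgrade, something has to
serialize the restarts. Below is a minimal sketch of one way to do that,
not a tested procedure. It assumes systemd units named ceph-osd@<id>, that
the cluster reports HEALTH_OK when idle, and that a brief restart stays
well under the down-out interval. The function name rolling_restart and
its parameters are made up for illustration.

```python
import subprocess
import time


def run_command(cmd):
    """Run a command and return its stdout as text."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def rolling_restart(osd_ids, run=run_command, sleep=time.sleep,
                    poll_seconds=10, max_polls=60):
    """Restart OSDs one at a time, waiting for HEALTH_OK after each one.

    Hypothetical helper: 'run' and 'sleep' are injectable so the logic
    can be exercised without a real cluster.
    """
    for osd_id in osd_ids:
        run(["systemctl", "restart", f"ceph-osd@{osd_id}"])
        for _ in range(max_polls):
            # Give the monitors time to notice the restart before polling.
            sleep(poll_seconds)
            if run(["ceph", "health"]).startswith("HEALTH_OK"):
                break
        else:
            # Never recovered: stop here instead of touching more OSDs.
            raise RuntimeError(
                f"cluster not healthy after restarting osd.{osd_id}"
            )
```

Called as rolling_restart([0, 1, 2]) on a single host, this stops at the
first OSD after which the cluster fails to return to HEALTH_OK, so a bad
release takes out at most one daemon before you notice.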

On Wed, Feb 21, 2018, 6:09 PM Oliver Freyermuth <
freyerm...@physik.uni-bonn.de> wrote:

> Dear Cephalopodians,
>
> we had our cluster (still in testing phase) configured for automatic
> updates so we got 12.2.3 "automagically" when it was released.
>
> In /etc/sysconfig/ceph, we still have the default:
> CEPH_AUTO_RESTART_ON_UPGRADE=no
> so as expected, services were not restarted.
>
> However, as soon as scrubs started to run, we got many scrub errors and
> inconsistent PGs.
> Looking into the logs, I found that some ceph-osd processes (still running
> as 12.2.2) tried to load
> the compression library (libsnappy) dynamically, and refused to do so
> since it was already updated to 12.2.3 on disk.
> This appears to have caused the OSD to report read errors.
>
> The situation was reasonably easy to fix (i.e. just restart all ceph-osd
> processes, and re-run a deep scrub on some of the inconsistent PGs).
> Still, I wonder whether this could be prevented by loading the libraries
> at OSD startup (and never unloading them),
> or by shutting down the OSD in case of a library load failure.
> Did anybody else experience this as of yet?
>
> We will work around it either by version pinning or
> CEPH_AUTO_RESTART_ON_UPGRADE=yes (not decided yet).
>
> Cheers,
>         Oliver
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>