Re: [ceph-users] Upgrading inconvenience for Luminous

2018-02-21 Thread David Turner
Having all of the daemons in your cluster able to restart themselves at
will sounds terrifying. What's preventing every OSD from restarting at the
same time? Also, Ceph dot releases have been known to break environments.
That is the nature of such widely used software. I would recommend pinning
the Ceph version instead.
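
If restarts are automated at all, one way to keep every OSD from restarting
at the same time is to serialize them and wait for the cluster to settle in
between. The following is a minimal Python sketch, not a tested procedure. It
assumes systemd-managed OSDs (ceph-osd@<id> units) and that the "ceph health"
command is available on the host; the names rolling_restart and run_cmd, the
polling interval, and the timeout are hypothetical choices for illustration.

```python
import subprocess
import time
from typing import Callable, Iterable, List


def run(cmd: List[str]) -> str:
    """Run a command and return its stdout as text (raises on non-zero exit)."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def rolling_restart(
    osd_ids: Iterable[int],
    run_cmd: Callable[[List[str]], str] = run,
    sleep: Callable[[float], None] = time.sleep,
    poll_seconds: float = 10.0,
    max_polls: int = 60,
) -> List[int]:
    """Restart OSDs one at a time, waiting for HEALTH_OK after each restart."""
    restarted = []
    for osd_id in osd_ids:
        run_cmd(["systemctl", "restart", f"ceph-osd@{osd_id}"])
        for _ in range(max_polls):
            # Sleep before checking so the monitors have a chance to notice
            # the restart; otherwise a stale HEALTH_OK could be read.
            sleep(poll_seconds)
            if run_cmd(["ceph", "health"]).startswith("HEALTH_OK"):
                break
        else:
            # Assumption: stop and leave the remaining OSDs untouched.
            raise RuntimeError(f"cluster not healthy after restarting osd.{osd_id}")
        restarted.append(osd_id)
    return restarted
```

Calling rolling_restart([0, 1, 2]) on an OSD host would restart those three
OSDs in sequence. Waiting for HEALTH_OK is a deliberately strict condition; a
cluster that is already in HEALTH_WARN for unrelated reasons would need a
different check.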

On Wed, Feb 21, 2018, 6:09 PM Oliver Freyermuth <
freyerm...@physik.uni-bonn.de> wrote:

> Dear Cephalopodians,
>
> we had our cluster (still in testing phase) configured for automatic
> updates so we got 12.2.3 "automagically" when it was released.
>
> In /etc/sysconfig/ceph, we still have the default:
> CEPH_AUTO_RESTART_ON_UPGRADE=no
> so as expected, services were not restarted.
>
> However, as soon as scrubs started to run, we got many scrub errors and
> inconsistent PGs.
> Looking into the logs, I found that some ceph-osd processes (still running
> as 12.2.2) tried to load
> the compression library (libsnappy) dynamically, and refused to do so
> since it was already updated to 12.2.3 on disk.
> This appears to have caused the OSD to report read errors.
>
> The situation was reasonably easy to fix (i.e. just restart all ceph-osd
> processes, and re-run a deep scrub on some of the inconsistent PGs).
> Still, I wonder whether this could be prevented by loading the libraries
> at OSD startup (and never unloading them),
> or by shutting down the OSD in case of a library load failure.
> Has anybody else experienced this yet?
>
> We will work around it either by version pinning or
> CEPH_AUTO_RESTART_ON_UPGRADE=yes (not decided yet).
>
> Cheers,
> Oliver
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


[ceph-users] Upgrading inconvenience for Luminous

2018-02-21 Thread Oliver Freyermuth
Dear Cephalopodians, 

we had our cluster (still in testing phase) configured for automatic updates so 
we got 12.2.3 "automagically" when it was released. 

In /etc/sysconfig/ceph, we still have the default: 
CEPH_AUTO_RESTART_ON_UPGRADE=no
so as expected, services were not restarted. 

However, as soon as scrubs started to run, we got many scrub errors and 
inconsistent PGs. 
Looking into the logs, I found that some ceph-osd processes (still running as 
12.2.2) tried to load
the compression library (libsnappy) dynamically, and refused to do so since it 
was already updated to 12.2.3 on disk. 
This appears to have caused the OSD to report read errors. 
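
For anyone wanting to check whether a running daemon is in this state, a
rough Python sketch follows. It relies on the Linux behaviour that
/proc/<pid>/maps appends " (deleted)" to the path of a mapped file that has
been replaced or removed on disk. The function names and the default path
prefixes are hypothetical, and a plugin that has not been loaded yet will not
show up here; the check only reveals that the binary or already-loaded
libraries no longer match what is on disk.

```python
from pathlib import Path
from typing import Set, Tuple

DELETED_SUFFIX = " (deleted)"


def deleted_mappings(
    maps_text: str,
    prefixes: Tuple[str, ...] = ("/usr/", "/lib", "/opt/"),
) -> Set[str]:
    """Return mapped files under the given prefixes that are gone from disk."""
    stale = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        path = fields[5]
        if path.endswith(DELETED_SUFFIX) and path.startswith(prefixes):
            stale.add(path[: -len(DELETED_SUFFIX)])
    return stale


def stale_files_for_pid(pid: int) -> Set[str]:
    """Read /proc/<pid>/maps for one process (needs sufficient privileges)."""
    return deleted_mappings(Path(f"/proc/{pid}/maps").read_text())
```

A non-empty result for a ceph-osd PID would suggest that the process predates
the package update and should be restarted.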

The situation was reasonably easy to fix (i.e. just restart all ceph-osd 
processes, and re-run a deep scrub on some of the inconsistent PGs). 
Still, I wonder whether this could be prevented by loading the libraries at OSD 
startup (and never unloading them), 
or by shutting down the OSD in case of a library load failure. 
Has anybody else experienced this yet? 

We will work around it either by version pinning or 
CEPH_AUTO_RESTART_ON_UPGRADE=yes (not decided yet). 

Cheers,
Oliver


