RGW daemons running 12.2.2 apparently cannot sync data from RGW daemons
running any release other than Luminous.  This means that if you run
multi-site and don't upgrade all of your sites at the same time, replication
breaks.  A fix is scheduled for 12.2.3
(http://tracker.ceph.com/issues/22183).

Most people running multi-site probably do so with the intention of
routing traffic around a zone during maintenance such as upgrades.  For
them, the upgrade to Luminous cannot be done without leaving replication
in a degraded state.  The assumed process for an upgrade would be to have
2+ Jewel sites, route traffic away from one of them, and upgrade it to
Luminous.  The Luminous site then needs to catch up from the Jewel
site... which it can't.  The only existing options for upgrading
multi-site RGW from Jewel to Luminous are to do it simultaneously on all
sites, or to direct all traffic to the site you're upgrading and set the
Jewel site to read-only.  None of this is documented in the release notes.

On top of this, I ran into a problem where all of my index pools had dozens
of scrub errors after the upgrade from Jewel to Luminous.  This wasn't
limited to multi-site realms; a local-only realm had the same scrub errors.

We moved the upgrade up because of a memory leak in RGW, fixed in 12.2.2,
that caused our RGW daemons to OOM and restart about every 30 minutes
while in multi-site (http://tracker.ceph.com/issues/19446).  We also wanted
Bluestore, which fixes the problem of filestore subfolder splitting
crippling clusters with high object count pools.  Between the two, we were
doing constant maintenance to keep the clusters from running with
persistent blocked requests.

Does anyone have any suggestions for how to move forward now?  Option 1 is
to upgrade our remaining Jewel site to Luminous, but I don't trust that
because of the unexplained scrub errors.  (Since RGW multi-site is busted,
the daemons are at least no longer OOM restarting on the Jewel site.)
Option 2 would require us to stop all writes to our remaining active Jewel
site, manually sync its data to the Luminous site, and ultimately direct
traffic to Luminous while setting the Jewel site to read-only.  I suppose
option 3 is the one we'll have to go with: wait for 12.2.3, which will
hopefully fix multi-site sync, before moving forward with anything else.
Unfortunately we lose the redundancy of the second site in the meantime,
but at least we don't have to deal with the RGW daemons restarting 2x/hr.

There is also an error in the Red Hat docs and Ceph docs for RGW multi-site
disaster recovery and failover:

http://docs.ceph.com/docs/luminous/radosgw/multisite/#failover-and-disaster-recovery
https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/object_gateway_guide_for_red_hat_enterprise_linux/multi_site#failover_and_disaster_recovery

The --read-only=False argument to radosgw-admin shown there is parsed
without the =False and sets the target zone to read-only.  I tested this on
both Jewel and Luminous.  The only way I could find to fix it was to
download the zonegroup.json, edit it manually, and set it back to the
realm.
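
For anyone who needs the same workaround, the manual zonegroup edit
described above could be scripted roughly as follows.  This is only a
sketch under assumptions that are not in the original post: it assumes the
zonegroup JSON has a top-level "zones" list whose entries carry "name" and a
string-valued "read_only" field, and the helper name clear_read_only and the
zone names in the sample are hypothetical.  Check the field names against
your own zonegroup dump before relying on it.

```python
import json


def clear_read_only(zonegroup: dict, zone_name: str) -> dict:
    """Return a copy of the zonegroup with read_only cleared on one zone.

    Assumption: each entry in zonegroup["zones"] has a "name" key and a
    "read_only" key stored as the string "true" or "false".
    """
    updated = json.loads(json.dumps(zonegroup))  # deep copy via JSON round trip
    matched = False
    for zone in updated.get("zones", []):
        if zone.get("name") == zone_name:
            zone["read_only"] = "false"
            matched = True
    if not matched:
        raise KeyError(f"zone {zone_name!r} not found in zonegroup")
    return updated


# Hypothetical, heavily trimmed zonegroup; a real dump has many more fields.
sample = {
    "name": "example-zonegroup",
    "zones": [
        {"name": "site-a", "read_only": "true"},
        {"name": "site-b", "read_only": "false"},
    ],
}

fixed = clear_read_only(sample, "site-a")
print(json.dumps(fixed, indent=2))
```

In practice the input would come from a file written by "radosgw-admin
zonegroup get", and the edited JSON would be loaded back with "radosgw-admin
zonegroup set" followed by "radosgw-admin period update --commit".  Those
surrounding steps are my reading of the workaround, so test them on a
non-production realm first.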

Anyway, here's another cautionary tale of upgrading to Luminous without
regression testing your environment and the upgrade process.  You can't
assume anyone else tested it.
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com