Hi,
I've previously discussed some issues I've had with RGW lifecycle
processing. I've now tracked the problem down to the following sequence:
* I'm running a multisite configuration
* Lifecycle processing is done on the master site each night.
`radosgw-admin lc list` correctly returns all buckets with lc config.
* From my VM host, I simulate the master site being destroyed.
* I promote the secondary site to master following the instructions here:
https://docs.ceph.com/docs/master/radosgw/multisite/
* The new master site isn't doing any lifecycle processing.
`radosgw-admin lc list` returns empty.
* I recreate a cluster and pair it with the new master site to get back to
having multisite redundancy.
* Neither site is doing any lifecycle processing.
`radosgw-admin lc list` returns empty.
So in the process of failover/recovery I have gone from having two paired
clusters performing lifecycle processing, to two paired clusters NOT performing
lifecycle processing.
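For anyone wanting to reproduce the comparison, the check can be sketched in Python as below. This is only a minimal sketch, not something taken from the Ceph docs: it assumes `radosgw-admin lc list` prints a JSON array whose entries carry `bucket` and `status` fields, and the helper names (`lc_buckets`, `missing_lc_entries`, `run_lc_list`) are made up for illustration.

```python
import json
import subprocess


def lc_buckets(lc_list_json: str) -> dict:
    """Map each lc entry to its status.

    Assumes the output of `radosgw-admin lc list` is a JSON array of
    objects with "bucket" and "status" keys (an assumption, see above).
    Empty output is treated as an empty list.
    """
    text = lc_list_json.strip() or "[]"
    return {entry["bucket"]: entry.get("status", "") for entry in json.loads(text)}


def missing_lc_entries(before_json: str, after_json: str) -> list:
    """Return entries present before failover but absent afterwards."""
    before = lc_buckets(before_json)
    after = lc_buckets(after_json)
    return sorted(set(before) - set(after))


def run_lc_list() -> str:
    """Run `radosgw-admin lc list` against the local cluster, return stdout."""
    result = subprocess.run(
        ["radosgw-admin", "lc", "list"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

The idea would be to save the output of `run_lc_list()` on the original master before the failover, then call `missing_lc_entries(saved, run_lc_list())` on the promoted site. In the scenario above, every bucket with an lc config would show up as missing.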
Is this behaviour expected? I've found that running `radosgw-admin lc reshard fix`
on a cluster will "remind" it that it needs to do lifecycle processing.
However, I found no mention in the docs of having to use this, and the docs
for that command state it's only relevant on earlier Ceph versions. I'm running
Nautilus 14.2.9.
In addition, if I have two healthy clusters paired in a multisite system and
swap the master by promoting the non-master, the demoted cluster seems to
continue doing lifecycle processing, while the promoted cluster does not. If I
run `radosgw-admin lc reshard fix` on the promoted cluster, then both clusters
seem to claim they are doing the processing. Is this a happy state to be in?
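One way to spot that "both sites claim it" state would be to compare the `lc list` output captured on each site. Again this is just a sketch under the same assumption about the JSON output (an array of objects with a `bucket` field); `buckets_in_lc_list` and `listed_on_both` are hypothetical helper names.

```python
import json


def buckets_in_lc_list(lc_list_json: str) -> set:
    """Set of lc entries found in one site's `radosgw-admin lc list` output."""
    text = lc_list_json.strip() or "[]"
    return {entry["bucket"] for entry in json.loads(text)}


def listed_on_both(site_a_json: str, site_b_json: str) -> list:
    """Entries that appear in the lc list of both sites."""
    return sorted(buckets_in_lc_list(site_a_json) & buckets_in_lc_list(site_b_json))
```

A non-empty result from `listed_on_both` only shows that both sites have the bucket in their lc list; whether that is harmful is exactly what I'm asking.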
Does anyone have any experience with this?
Thanks,
Alex
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]