[ceph-users] Mimic 13.2.1 released date?
Hi there, Any plan for the release of 13.2.1? -- Regards Frank Yu ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] IMPORTANT: broken luminous 12.2.6 release in repo, do not upgrade
Hi everyone,

tl;dr: Please avoid the 12.2.6 packages that are currently present on download.ceph.com. We will have a 12.2.7 published ASAP (probably Monday). If you do not use bluestore or erasure-coded pools, none of the issues affect you.

Details: We built 12.2.6 and pushed it to the repos Wednesday, but as that was happening realized there was a potentially dangerous regression in 12.2.5[1] that an upgrade might exacerbate. While we sorted that issue out, several people noticed the updated version in the repo and upgraded. That turned up two other regressions[2][3]. We have fixes for those, but are working on an additional fix to make the damage from [3] be transparently repaired.

More details:

-- [1] http://tracker.ceph.com/issues/24597 --

This is actually a regression in 12.2.5 that affects erasure-coded pools. If there are (1) normal erasure code writes, and simultaneously (2) erasure code writes that result in rados returning an error (for example, a delete of a non-existent object, which commonly happens when rgw is doing garbage collection), and (3) OSDs that are somewhat heavily loaded and then restart, then the bug might incorrectly roll forward the in-progress EC operations. When the PG re-peers this results in an OSD crash like

src/os/filestore/FileStore.cc: 5524: FAILED assert(0 == "ERROR: source must exist")

It seems to affect filestore and busy clusters with this specific workload. The OSDs recover once restarted. However, it is also unclear whether it damages the objects in question. For this reason, please avoid unnecessary OSD restarts if you are running 12.2.5 or 12.2.6. When we release 12.2.7, we will have an upgrade procedure in the release notes that quiesces RADOS IO to minimize the probability that this bug will affect you. If you do not have erasure-coded pools, this bug does not affect you.
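For readers planning the eventual 12.2.7 upgrade: the release notes will carry the authoritative procedure. As a hedged sketch only (the flags below are standard Ceph OSD flags, but the exact steps are my assumption, not the published procedure), quiescing RADOS IO around OSD restarts might look like:

```shell
# Assumption: this approximates the eventual 12.2.7 guidance; wait for
# the official release notes before relying on it.
ceph osd set noout    # keep CRUSH from marking restarting OSDs out
ceph osd set pause    # pause all client RADOS IO (reads and writes)

# ... upgrade packages and restart OSDs here ...

ceph osd unset pause
ceph osd unset noout
```

Note that `pause` stops all client IO cluster-wide, so this implies a short, full outage window.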
-- [2] https://tracker.ceph.com/issues/24903 --

ceph-volume has had a bug for a while that leaves the /var/lib/ceph/osd/*/block.db or block.wal symlinks for bluestore OSDs owned by root:root. This didn't matter because bluestore was ignoring these symlinks and using an internally stored value instead. Both of these were fixed/changed in 12.2.6. However, after upgrading and restarting, the symlink is still present in the /var/lib/ceph/osd/*/ tmpfs and the OSD won't restart. Rerunning ceph-volume will fix it, as will manually running 'chown -h ceph:ceph /var/lib/ceph/osd/*/block*', or a reboot. 12.2.7 has a packaging fix to fix this up on upgrade so there is no disruption. If you do not run bluestore, this bug does not affect you.

-- [3] https://tracker.ceph.com/issues/23871 --

We modified the OSD recently to avoid storing full-object CRCs when bluestore is in use because those CRCs are redundant. There was a bug in this code that was later fixed in master. This code was backported to luminous, but the follow-on fix was missed. The result is that the sequence of

- running 12.2.5
- deep-scrub (updates stored whole-object crc)
- upgrade to 12.2.6
- writefull to existing object (on 12.2.6) fails to clear the whole-object crc
- read of full object -> crc mismatch

leads to an (incorrect) EIO error. We have fixed the original problem by backporting the missing fix. However, users who mistakenly installed 12.2.6 may have many objects with a mismatched whole-object crc. We are currently working on a fix to ignore the whole-object CRC if the same conditions are met that make us skip them entirely (i.e., running bluestore), and to clear/repair them on scrub. Once this is done, we'll push out 12.2.7. If you do not run bluestore, this bug does not affect you. We don't have an easy workaround for this one at the moment, unfortunately.

Exciting week! Thanks everyone,
sage
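The workaround for [2] above, spelled out as commands (paths exactly as described in the message; any of the three options should work):

```shell
# After upgrading a bluestore OSD to 12.2.6, fix the root-owned
# block.db/block.wal symlinks in the OSD tmpfs directories:
chown -h ceph:ceph /var/lib/ceph/osd/*/block*   # -h changes the symlink itself, not its target

# Alternatives mentioned above: re-run ceph-volume for the OSD,
# or simply reboot the host.
```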
Re: [ceph-users] osd prepare issue device-mapper mapping
Also, looking at your ceph-disk list output, the LVM is probably your root filesystem and cannot be wiped. If you'd like to send the output of the 'mount' and 'lvs' commands, you should be able to tell. -- jacob On 07/13/2018 03:42 PM, Jacob DeGlopper wrote: You have LVM data on /dev/sdb already; you will need to remove that before you can use ceph-disk on that device. Use the LVM commands 'lvs', 'vgs', and 'pvs' to list the logical volumes, volume groups, and physical volumes defined. Once you're sure you don't need the data, lvremove, vgremove, and pvremove them, then zero the disk using 'dd if=/dev/zero of=/dev/sdb bs=1M count=10'. Note that this command wipes the disk - you must be sure that you're wiping the right disk. -- jacob On 07/13/2018 03:26 PM, Satish Patel wrote: I am installing ceph in my lab box using ceph-ansible, i have two HDD for OSD and i am getting following error on one of OSD not sure what is the issue. [root@ceph-osd-01 ~]# ceph-disk prepare --cluster ceph --bluestore /dev/sdb ceph-disk: Error: Device /dev/sdb1 is in use by a device-mapper mapping (dm-crypt?): dm-0 [root@ceph-osd-01 ~]# ceph-disk list /dev/dm-0 other, xfs, mounted on / /dev/sda : /dev/sda1 other, xfs, mounted on /boot /dev/sda2 swap, swap /dev/sdb : /dev/sdb1 other, LVM2_member /dev/sdc : /dev/sdc1 ceph data, active, cluster ceph, osd.3, block /dev/sdc2 /dev/sdc2 ceph block, for /dev/sdc1 /dev/sr0 other, unknown
Re: [ceph-users] osd prepare issue device-mapper mapping
You have LVM data on /dev/sdb already; you will need to remove that before you can use ceph-disk on that device. Use the LVM commands 'lvs', 'vgs', and 'pvs' to list the logical volumes, volume groups, and physical volumes defined. Once you're sure you don't need the data, lvremove, vgremove, and pvremove them, then zero the disk using 'dd if=/dev/zero of=/dev/sdb bs=1M count=10'. Note that this command wipes the disk - you must be sure that you're wiping the right disk. -- jacob On 07/13/2018 03:26 PM, Satish Patel wrote: I am installing ceph in my lab box using ceph-ansible, i have two HDD for OSD and i am getting following error on one of OSD not sure what is the issue. [root@ceph-osd-01 ~]# ceph-disk prepare --cluster ceph --bluestore /dev/sdb ceph-disk: Error: Device /dev/sdb1 is in use by a device-mapper mapping (dm-crypt?): dm-0 [root@ceph-osd-01 ~]# ceph-disk list /dev/dm-0 other, xfs, mounted on / /dev/sda : /dev/sda1 other, xfs, mounted on /boot /dev/sda2 swap, swap /dev/sdb : /dev/sdb1 other, LVM2_member /dev/sdc : /dev/sdc1 ceph data, active, cluster ceph, osd.3, block /dev/sdc2 /dev/sdc2 ceph block, for /dev/sdc1 /dev/sr0 other, unknown
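Jacob's steps, written out as a sketch (the volume-group and logical-volume names are placeholders; double-check everything with the listing commands first, because the wipe is irreversible):

```shell
# 1) See what LVM currently has on the disk:
pvs; vgs; lvs

# 2) DESTRUCTIVE: remove the LVM metadata once you are certain nothing
#    on /dev/sdb is needed (substitute your real names from step 1):
lvremove <vgname>/<lvname>
vgremove <vgname>
pvremove /dev/sdb1

# 3) Zero the start of the disk so ceph-disk sees a clean device:
dd if=/dev/zero of=/dev/sdb bs=1M count=10
```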
[ceph-users] osd prepare issue device-mapper mapping
I am installing ceph in my lab box using ceph-ansible, i have two HDD for OSD and i am getting following error on one of OSD not sure what is the issue. [root@ceph-osd-01 ~]# ceph-disk prepare --cluster ceph --bluestore /dev/sdb ceph-disk: Error: Device /dev/sdb1 is in use by a device-mapper mapping (dm-crypt?): dm-0 [root@ceph-osd-01 ~]# ceph-disk list /dev/dm-0 other, xfs, mounted on / /dev/sda : /dev/sda1 other, xfs, mounted on /boot /dev/sda2 swap, swap /dev/sdb : /dev/sdb1 other, LVM2_member /dev/sdc : /dev/sdc1 ceph data, active, cluster ceph, osd.3, block /dev/sdc2 /dev/sdc2 ceph block, for /dev/sdc1 /dev/sr0 other, unknown
Re: [ceph-users] Approaches for migrating to a much newer cluster
Just a thought: have you considered rbd replication? On Fri, Jul 13, 2018 at 9:30 AM r...@cleansafecloud.com < r...@cleansafecloud.com> wrote: > > Hello folks, > > We have an old active Ceph cluster on Firefly (v0.80.9) which we use for > OpenStack and have multiple live clients. We have been put in a position > whereby we need to move to a brand new cluster under a new OpenStack > deployment. The new cluster is on Luminous (v.12.2.5). Now we obviously do > not want to migrate huge images across in one go if we can avoid it, so our > current plan is to transfer base images well in advance of the migration, > and use the rbd export-diff feature to apply incremental updates from that > point forwards. I wanted to reach out to you experts to see if we are going > down the right path here, what issues we might encounter, or if there might > be any better options. Or does this sound like the right approach? > > Many thanks, > Rob
Re: [ceph-users] MDS damaged
Hi Dan, you're right, I was following the mimic instructions (which indeed worked on my mimic testbed), but luminous is different and I missed the additional step. Works now, thanks! Alessandro Il 13/07/18 17:51, Dan van der Ster ha scritto: On Fri, Jul 13, 2018 at 4:07 PM Alessandro De Salvo wrote: However, I cannot reduce the number of mdses anymore, I used to do that with e.g.: ceph fs set cephfs max_mds 1 Trying this with 12.2.6 has apparently no effect, I am left with 2 active mdses. Is this another bug? Are you following this procedure? http://docs.ceph.com/docs/luminous/cephfs/multimds/#decreasing-the-number-of-ranks i.e. you need to deactivate after decreasing max_mds. (Mimic does this automatically, OTOH). -- dan
Re: [ceph-users] MDS damaged
On Fri, Jul 13, 2018 at 4:07 PM Alessandro De Salvo wrote: > However, I cannot reduce the number of mdses anymore, I used to do > that with e.g.: > > ceph fs set cephfs max_mds 1 > > Trying this with 12.2.6 has apparently no effect, I am left with 2 > active mdses. Is this another bug? Are you following this procedure? http://docs.ceph.com/docs/luminous/cephfs/multimds/#decreasing-the-number-of-ranks i.e. you need to deactivate after decreasing max_mds. (Mimic does this automatically, OTOH). -- dan
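On luminous, the two-step procedure Dan links to looks roughly like this (filesystem name `cephfs` and the rank number as in the thread; the linked docs are the authoritative reference):

```shell
ceph fs set cephfs max_mds 1       # lower the target number of ranks
ceph mds deactivate cephfs:1       # then explicitly stop the surplus rank
ceph status                        # watch rank 1 drain and disappear
# Mimic performs the deactivate step automatically when max_mds shrinks.
```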
[ceph-users] Approaches for migrating to a much newer cluster
Hello folks, We have an old active Ceph cluster on Firefly (v0.80.9) which we use for OpenStack and have multiple live clients. We have been put in a position whereby we need to move to a brand new cluster under a new OpenStack deployment. The new cluster is on Luminous (v.12.2.5). Now we obviously do not want to migrate huge images across in one go if we can avoid it, so our current plan is to transfer base images well in advance of the migration, and use the rbd export-diff feature to apply incremental updates from that point forwards. I wanted to reach out to you experts to see if we are going down the right path here, what issues we might encounter, or if there might be any better options. Or does this sound like the right approach? Many thanks, Rob
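A sketch of the seed-then-diff workflow described above (the pool/image names and `-c` config paths are hypothetical; the destination image is assumed to be pre-created with the same size):

```shell
# 1) Seed: snapshot the source image and ship a full copy to the new
#    cluster well before migration day.
rbd -c old-cluster.conf snap create volumes/vm-disk1@base
rbd -c old-cluster.conf export-diff volumes/vm-disk1@base - \
  | rbd -c new-cluster.conf import-diff - volumes/vm-disk1

# 2) Incrementals: repeat as often as needed until cutover; each diff
#    carries only the blocks changed since the previous snapshot.
rbd -c old-cluster.conf snap create volumes/vm-disk1@incr1
rbd -c old-cluster.conf export-diff --from-snap base volumes/vm-disk1@incr1 - \
  | rbd -c new-cluster.conf import-diff - volumes/vm-disk1
```

import-diff records the end snapshot on the destination, so the next `--from-snap` diff applies cleanly; the final diff should be taken with the source image quiesced.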
Re: [ceph-users] MDS damaged
Thanks all, 100..inode, mds_snaptable and 1..inode were not corrupted, so I left them as they were. I have re-injected all the bad objects, for all mdses (2 per filesystem) and all filesystems I had (2), and after setting the mdses as repaired my filesystems are back! However, I cannot reduce the number of mdses anymore, I used to do that with e.g.: ceph fs set cephfs max_mds 1 Trying this with 12.2.6 has apparently no effect, I am left with 2 active mdses. Is this another bug? Thanks, Alessandro Il 13/07/18 15:54, Yan, Zheng ha scritto: On Thu, Jul 12, 2018 at 11:39 PM Alessandro De Salvo wrote: Some progress, and more pain... I was able to recover the 200. using the ceph-objectstore-tool for one of the OSDs (all identical copies) but trying to re-inject it just with rados put was giving no error while the get was still giving the same I/O error. So the solution was to rm the object and then put it again; that worked. However, after restarting one of the MDSes and setting it to repaired, I've hit another, similar problem: 2018-07-12 17:04:41.999136 7f54c3f4e700 -1 log_channel(cluster) log [ERR] : error reading table object 'mds0_inotable' -5 ((5) Input/output error) Can I safely try to do the same as for object 200.? Should I check something before trying it? Again, checking the copies of the object, they have identical md5sums on all the replicas. Yes, it should be safe. You also need to do the same for several other objects. The full object list is: 200. mds0_inotable 100..inode mds_snaptable 1..inode The first three objects are per-mds-rank. If you have enabled multi-active mds, you also need to update objects of other ranks. For mds.1, object names are 201., mds1_inotable and 101..inode. Thanks, Alessandro Il 12/07/18 16:46, Alessandro De Salvo ha scritto: Unfortunately yes, all the OSDs were restarted a few times, but no change.
Thanks, Alessandro Il 12/07/18 15:55, Paul Emmerich ha scritto: This might seem like a stupid suggestion, but: have you tried to restart the OSDs? I've also encountered some random CRC errors that only showed up when trying to read an object, but not on scrubbing, that magically disappeared after restarting the OSD. However, in my case it was clearly related to https://tracker.ceph.com/issues/22464 which doesn't seem to be the issue here. Paul 2018-07-12 13:53 GMT+02:00 Alessandro De Salvo : Il 12/07/18 11:20, Alessandro De Salvo ha scritto: Il 12/07/18 10:58, Dan van der Ster ha scritto: On Wed, Jul 11, 2018 at 10:25 PM Gregory Farnum wrote: On Wed, Jul 11, 2018 at 9:23 AM Alessandro De Salvo wrote: OK, I found where the object is: ceph osd map cephfs_metadata 200. osdmap e632418 pool 'cephfs_metadata' (10) object '200.' -> pg 10.844f3494 (10.14) -> up ([23,35,18], p23) acting ([23,35,18], p23) So, looking at the osds 23, 35 and 18 logs in fact I see: osd.23: 2018-07-11 15:49:14.913771 7efbee672700 -1 log_channel(cluster) log [ERR] : 10.14 full-object read crc 0x976aefc5 != expected 0x9ef2b41b on 10:292cf221:::200.:head osd.35: 2018-07-11 18:01:19.989345 7f760291a700 -1 log_channel(cluster) log [ERR] : 10.14 full-object read crc 0x976aefc5 != expected 0x9ef2b41b on 10:292cf221:::200.:head osd.18: 2018-07-11 18:18:06.214933 7fabaf5c1700 -1 log_channel(cluster) log [ERR] : 10.14 full-object read crc 0x976aefc5 != expected 0x9ef2b41b on 10:292cf221:::200.:head So, basically the same error everywhere. I'm trying to issue a repair of the pg 10.14, but I'm not sure if it may help. No SMART errors (the fileservers are SANs, in RAID6 + LVM volumes), and no disk problems anywhere. No relevant errors in syslogs, the hosts are just fine. I cannot exclude an error on the RAID controllers, but 2 of the OSDs with 10.14 are on a SAN system and one on a different one, so I would tend to exclude they both had (silent) errors at the same time. That's fairly distressing. 
At this point I'd probably try extracting the object using ceph-objectstore-tool and seeing if it decodes properly as an mds journal. If it does, you might risk just putting it back in place to overwrite the crc. Wouldn't it be easier to scrub repair the PG to fix the crc? This is what I already instructed the cluster to do, a deep scrub, but I'm not sure it could repair in case all replicas are bad, as it seems to be the case. I finally managed (with the help of Dan) to perform the deep-scrub on pg 10.14, but the deep scrub did not detect anything wrong. Also trying to repair 10.14 has no effect. Still, trying to access the object I get in the OSDs: 2018-07-12 13:40:32.711732 7efbee672700 -1 log_channel(cluster) log [ERR] : 10.14 full-object read crc 0x976aefc5 != expected 0x9ef2b41b on 10:292cf221:::200.:head Was deep-scrub supposed to detect the wrong crc? If yes, then it sounds like
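The rm-then-put reinjection that worked in this thread, as a sketch (the exact object name is elided by the archive, so `<object>` is a placeholder; pool `cephfs_metadata` as in the `ceph osd map` output above):

```shell
# Extract a known-good copy from one OSD with ceph-objectstore-tool first,
# then replace the object through librados. Note a plain 'put' over the
# bad object was not enough here -- it had to be removed first:
rados -p cephfs_metadata rm <object>
rados -p cephfs_metadata put <object> ./recovered_object
rados -p cephfs_metadata get <object> /dev/null && echo "read OK"
```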
Re: [ceph-users] MDS damaged
On Thu, Jul 12, 2018 at 11:39 PM Alessandro De Salvo wrote: > > Some progress, and more pain... > > I was able to recover the 200. using the ceph-objectstore-tool for > one of the OSDs (all identical copies) but trying to re-inject it just with > rados put was giving no error while the get was still giving the same I/O > error. So the solution was to rm the object and then put it again; that worked. > > However, after restarting one of the MDSes and setting it to repaired, I've > hit another, similar problem: > > > 2018-07-12 17:04:41.999136 7f54c3f4e700 -1 log_channel(cluster) log [ERR] : > error reading table object 'mds0_inotable' -5 ((5) Input/output error) > > > Can I safely try to do the same as for object 200.? Should I check > something before trying it? Again, checking the copies of the object, they > have identical md5sums on all the replicas. > Yes, it should be safe. You also need to do the same for several other objects. The full object list is: 200. mds0_inotable 100..inode mds_snaptable 1..inode The first three objects are per-mds-rank. If you have enabled multi-active mds, you also need to update objects of other ranks. For mds.1, object names are 201., mds1_inotable and 101..inode. > Thanks, > > > Alessandro > > > Il 12/07/18 16:46, Alessandro De Salvo ha scritto: > > Unfortunately yes, all the OSDs were restarted a few times, but no change. > > Thanks, > > > Alessandro > > > Il 12/07/18 15:55, Paul Emmerich ha scritto: > > This might seem like a stupid suggestion, but: have you tried to restart the > OSDs? > > I've also encountered some random CRC errors that only showed up when trying > to read an object, > but not on scrubbing, that magically disappeared after restarting the OSD. > > However, in my case it was clearly related to > https://tracker.ceph.com/issues/22464 which doesn't > seem to be the issue here. 
> > Paul > > 2018-07-12 13:53 GMT+02:00 Alessandro De Salvo > : >> >> >> Il 12/07/18 11:20, Alessandro De Salvo ha scritto: >> >>> >>> >>> Il 12/07/18 10:58, Dan van der Ster ha scritto: On Wed, Jul 11, 2018 at 10:25 PM Gregory Farnum wrote: > > On Wed, Jul 11, 2018 at 9:23 AM Alessandro De Salvo > wrote: >> >> OK, I found where the object is: >> >> >> ceph osd map cephfs_metadata 200. >> osdmap e632418 pool 'cephfs_metadata' (10) object '200.' -> pg >> 10.844f3494 (10.14) -> up ([23,35,18], p23) acting ([23,35,18], p23) >> >> >> So, looking at the osds 23, 35 and 18 logs in fact I see: >> >> >> osd.23: >> >> 2018-07-11 15:49:14.913771 7efbee672700 -1 log_channel(cluster) log >> [ERR] : 10.14 full-object read crc 0x976aefc5 != expected 0x9ef2b41b on >> 10:292cf221:::200.:head >> >> >> osd.35: >> >> 2018-07-11 18:01:19.989345 7f760291a700 -1 log_channel(cluster) log >> [ERR] : 10.14 full-object read crc 0x976aefc5 != expected 0x9ef2b41b on >> 10:292cf221:::200.:head >> >> >> osd.18: >> >> 2018-07-11 18:18:06.214933 7fabaf5c1700 -1 log_channel(cluster) log >> [ERR] : 10.14 full-object read crc 0x976aefc5 != expected 0x9ef2b41b on >> 10:292cf221:::200.:head >> >> >> So, basically the same error everywhere. >> >> I'm trying to issue a repair of the pg 10.14, but I'm not sure if it may >> help. >> >> No SMART errors (the fileservers are SANs, in RAID6 + LVM volumes), and >> no disk problems anywhere. No relevant errors in syslogs, the hosts are >> just fine. I cannot exclude an error on the RAID controllers, but 2 of >> the OSDs with 10.14 are on a SAN system and one on a different one, so I >> would tend to exclude they both had (silent) errors at the same time. > > > That's fairly distressing. At this point I'd probably try extracting the > object using ceph-objectstore-tool and seeing if it decodes properly as > an mds journal. If it does, you might risk just putting it back in place > to overwrite the crc. 
> Wouldn't it be easier to scrub repair the PG to fix the crc? >>> >>> >>> this is what I already instructed the cluster to do, a deep scrub, but I'm >>> not sure it could repair in case all replicas are bad, as it seems to be >>> the case. >> >> >> I finally managed (with the help of Dan), to perform the deep-scrub on pg >> 10.14, but the deep scrub did not detect anything wrong. Also trying to >> repair 10.14 has no effect. >> Still, trying to access the object I get in the OSDs: >> >> 2018-07-12 13:40:32.711732 7efbee672700 -1 log_channel(cluster) log [ERR] : >> 10.14 full-object read crc 0x976aefc5 != expected 0x9ef2b41b on >> 10:292cf221:::200.:head >> >> Was deep-scrub supposed to detect the wrong crc? If yes, them it sounds like >> a bug. >> Can I force the repair someway? >> Thanks, >> >>Alessandro >> >>>
Re: [ceph-users] MDS damaged
Bluestore. On Fri, Jul 13, 2018, 05:56 Dan van der Ster wrote: > Hi Adam, > > Are your osds bluestore or filestore? > > -- dan > > > On Fri, Jul 13, 2018 at 7:38 AM Adam Tygart wrote: > > > > I've hit this today with an upgrade to 12.2.6 on my backup cluster. > > Unfortunately there were issues with the logs (in that the files > > weren't writable) until after the issue struck. > > > > 2018-07-13 00:16:54.437051 7f5a0a672700 -1 log_channel(cluster) log > > [ERR] : 5.255 full-object read crc 0x4e97b4e != expected 0x6cfe829d on > > 5:aa448500:::500.:head > > > > It is a backup cluster and I can keep it around or blow away the data > > (in this instance) as needed for testing purposes. > > > > -- > > Adam > > > > On Thu, Jul 12, 2018 at 10:39 AM, Alessandro De Salvo > > wrote: > > > Some progress, and more pain... > > > > > > I was able to recover the 200. using the ceph-objectstore-tool > for > > > one of the OSDs (all identical copies) but trying to re-inject it just > with > > > rados put was giving no error while the get was still giving the same > I/O > > > error. So the solution was to rm the object and the put it again, that > > > worked. > > > > > > However, after restarting one of the MDSes and seeting it to repaired, > I've > > > hit another, similar problem: > > > > > > > > > 2018-07-12 17:04:41.999136 7f54c3f4e700 -1 log_channel(cluster) log > [ERR] : > > > error reading table object 'mds0_inotable' -5 ((5) Input/output error) > > > > > > > > > Can I safely try to do the same as for object 200.? Should I > check > > > something before trying it? Again, checking the copies of the object, > they > > > have identical md5sums on all the replicas. > > > > > > Thanks, > > > > > > > > > Alessandro > > > > > > > > > Il 12/07/18 16:46, Alessandro De Salvo ha scritto: > > > > > > Unfortunately yes, all the OSDs were restarted a few times, but no > change. 
> > > > > > Thanks, > > > > > > > > > Alessandro > > > > > > > > > Il 12/07/18 15:55, Paul Emmerich ha scritto: > > > > > > This might seem like a stupid suggestion, but: have you tried to > restart the > > > OSDs? > > > > > > I've also encountered some random CRC errors that only showed up when > trying > > > to read an object, > > > but not on scrubbing, that magically disappeared after restarting the > OSD. > > > > > > However, in my case it was clearly related to > > > https://tracker.ceph.com/issues/22464 which doesn't > > > seem to be the issue here. > > > > > > Paul > > > > > > 2018-07-12 13:53 GMT+02:00 Alessandro De Salvo > > > : > > >> > > >> > > >> Il 12/07/18 11:20, Alessandro De Salvo ha scritto: > > >> > > >>> > > >>> > > >>> Il 12/07/18 10:58, Dan van der Ster ha scritto: > > > > On Wed, Jul 11, 2018 at 10:25 PM Gregory Farnum > > > wrote: > > > > > > On Wed, Jul 11, 2018 at 9:23 AM Alessandro De Salvo > > > wrote: > > >> > > >> OK, I found where the object is: > > >> > > >> > > >> ceph osd map cephfs_metadata 200. > > >> osdmap e632418 pool 'cephfs_metadata' (10) object '200.' 
> -> pg > > >> 10.844f3494 (10.14) -> up ([23,35,18], p23) acting ([23,35,18], > p23) > > >> > > >> > > >> So, looking at the osds 23, 35 and 18 logs in fact I see: > > >> > > >> > > >> osd.23: > > >> > > >> 2018-07-11 15:49:14.913771 7efbee672700 -1 log_channel(cluster) > log > > >> [ERR] : 10.14 full-object read crc 0x976aefc5 != expected > 0x9ef2b41b > > >> on > > >> 10:292cf221:::200.:head > > >> > > >> > > >> osd.35: > > >> > > >> 2018-07-11 18:01:19.989345 7f760291a700 -1 log_channel(cluster) > log > > >> [ERR] : 10.14 full-object read crc 0x976aefc5 != expected > 0x9ef2b41b > > >> on > > >> 10:292cf221:::200.:head > > >> > > >> > > >> osd.18: > > >> > > >> 2018-07-11 18:18:06.214933 7fabaf5c1700 -1 log_channel(cluster) > log > > >> [ERR] : 10.14 full-object read crc 0x976aefc5 != expected > 0x9ef2b41b > > >> on > > >> 10:292cf221:::200.:head > > >> > > >> > > >> So, basically the same error everywhere. > > >> > > >> I'm trying to issue a repair of the pg 10.14, but I'm not sure if > it > > >> may > > >> help. > > >> > > >> No SMART errors (the fileservers are SANs, in RAID6 + LVM > volumes), > > >> and > > >> no disk problems anywhere. No relevant errors in syslogs, the > hosts > > >> are > > >> just fine. I cannot exclude an error on the RAID controllers, but > 2 of > > >> the OSDs with 10.14 are on a SAN system and one on a different > one, so > > >> I > > >> would tend to exclude they both had (silent) errors at the same > time. > > > > > > > > > That's fairly distressing. At this point I'd probably try > extracting > > > the object using ceph-objectstore-tool and seeing if it decodes > properly as > > > an mds journal. If it does, you
[ceph-users] Ceph balancer module algorithm learning
Hi all, I am now looking at the mgr balancer module. How do the two algorithms in it compute their scores? I have only just started using Ceph and my code-reading ability is limited. Can anyone help explain how the data-balance score is calculated, especially `def calc_eval()` and `def calc_stats()`? Best
Re: [ceph-users] upgrading to 12.2.6 damages cephfs (crc errors)
The problem seems similar to https://tracker.ceph.com/issues/23871 which was fixed in mimic but not luminous: fe5038c7f9 osd/PrimaryLogPG: clear data digest on WRITEFULL if skip_data_digest .. dan On Fri, Jul 13, 2018 at 12:45 PM Dan van der Ster wrote: > > Hi, > > Following the reports on ceph-users about damaged cephfs after > updating to 12.2.6 I spun up a 1 node cluster to try the upgrade. > I started with two OSDs on 12.2.5, wrote some data. > Then I restarted the OSDs one by one while continuing to write to the > cephfs mountpoint. > Then I restarted the (single) MDS, and it is indeed damaged with a crc error: > > 2018-07-13 12:38:55.261379 osd.1 osd.1 137.138.62.86:6805/35320 2 : > cluster [ERR] 2.15 full-object read crc 0xed77af7c != expected > 0x1a1d319d on 2:aa448500:::500.:head > 2018-07-13 12:38:55.285994 osd.0 osd.0 137.138.62.86:6801/34755 2 : > cluster [ERR] 2.13 full-object read crc 0xa73a97ef != expected > 0x3e6fdb4a on 2:c91d4a1d:::mds0_inotable:head > > I think it goes without saying that nobody should upgrade a cephfs to > 12.2.6 until this is understood. > > -- Dan
Re: [ceph-users] mds daemon damaged
Hi Kevin, Are your OSDs bluestore or filestore? -- dan On Thu, Jul 12, 2018 at 11:30 PM Kevin wrote: > > Sorry for the long posting but trying to cover everything > > I woke up to find my cephfs filesystem down. This was in the logs > > 2018-07-11 05:54:10.398171 osd.1 [ERR] 2.4 full-object read crc > 0x6fc2f65a != expected 0x1c08241c on 2:292cf221:::200.:head > > I had one standby MDS, but as far as I can tell it did not fail over. > This was in the logs > > (insufficient standby MDS daemons available) > > Currently my ceph looks like this >cluster: > id: .. > health: HEALTH_ERR > 1 filesystem is degraded > 1 mds daemon damaged > >services: > mon: 6 daemons, quorum ds26,ds27,ds2b,ds2a,ds28,ds29 > mgr: ids27(active) > mds: test-cephfs-1-0/1/1 up , 3 up:standby, 1 damaged > osd: 5 osds: 5 up, 5 in > >data: > pools: 3 pools, 202 pgs > objects: 1013k objects, 4018 GB > usage: 12085 GB used, 6544 GB / 18630 GB avail > pgs: 201 active+clean > 1 active+clean+scrubbing+deep > >io: > client: 0 B/s rd, 0 op/s rd, 0 op/s wr > > I started trying to get the damaged MDS back online > > Based on this page > http://docs.ceph.com/docs/master/cephfs/disaster-recovery-experts/#disaster-recovery-experts > > # cephfs-journal-tool journal export backup.bin > 2018-07-12 13:35:15.675964 7f3e1389bf00 -1 Header 200. is > unreadable > 2018-07-12 13:35:15.675977 7f3e1389bf00 -1 journal_export: Journal not > readable, attempt object-by-object dump with `rados` > Error ((5) Input/output error) > > # cephfs-journal-tool event recover_dentries summary > Events by type: > 2018-07-12 13:36:03.000590 7fc398a18f00 -1 Header 200. 
is unreadable > Errors: 0 > > cephfs-journal-tool journal reset - (I think this command might have > worked) > > Next up, tried to reset the filesystem > > ceph fs reset test-cephfs-1 --yes-i-really-mean-it > > Each time same errors > > 2018-07-12 11:56:35.760449 mon.ds26 [INF] Health check cleared: > MDS_DAMAGE (was: 1 mds daemon damaged) > 2018-07-12 11:56:35.856737 mon.ds26 [INF] Standby daemon mds.ds27 > assigned to filesystem test-cephfs-1 as rank 0 > 2018-07-12 11:56:35.947801 mds.ds27 [ERR] Error recovering journal > 0x200: (5) Input/output error > 2018-07-12 11:56:36.900807 mon.ds26 [ERR] Health check failed: 1 mds > daemon damaged (MDS_DAMAGE) > 2018-07-12 11:56:35.945544 osd.0 [ERR] 2.4 full-object read crc > 0x6fc2f65a != expected 0x1c08241c on 2:292cf221:::200.:head > 2018-07-12 12:00:00.000142 mon.ds26 [ERR] overall HEALTH_ERR 1 > filesystem is degraded; 1 mds daemon damaged > > Tried to 'fail' mds.ds27 > # ceph mds fail ds27 > # failed mds gid 1929168 > > Command worked, but each time I run the reset command the same errors > above appear > > Online searches say the object read error has to be removed. But there's > no object listed. This web page is the closest to the issue > http://tracker.ceph.com/issues/20863 > > Recommends fixing error by hand. Tried running deep scrub on pg 2.4, it > completes but still have the same issue above > > Final option is to attempt removing mds.ds27. If mds.ds29 was a standby > and has data it should become live. If it was not > I assume we will lose the filesystem at this point > > Why didn't the standby MDS failover? > > Just looking for any way to recover the cephfs, thanks! > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
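On the "no object listed" point: the object name is in the crc error line itself (truncated by this archive). A hedged sketch for locating it and comparing the replicas, as done in the parallel thread (pool name, OSD id, and object name are placeholders of mine):

```shell
# Map the unreadable journal-header object to its PG and acting OSDs:
ceph osd map <metadata-pool> <journal-header-object>

# On each OSD host in the acting set, stop the OSD and extract the
# object offline for comparison (data path and id are examples):
systemctl stop ceph-osd@0
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
    --pgid 2.4 '<journal-header-object>' get-bytes /tmp/obj.osd0
systemctl start ceph-osd@0

# Identical checksums across replicas suggest the stored data is fine
# and only the recorded full-object crc is stale:
md5sum /tmp/obj.osd*
```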
Re: [ceph-users] MDS damaged
Hi Adam, Are your osds bluestore or filestore? -- dan On Fri, Jul 13, 2018 at 7:38 AM Adam Tygart wrote: > > I've hit this today with an upgrade to 12.2.6 on my backup cluster. > Unfortunately there were issues with the logs (in that the files > weren't writable) until after the issue struck. > > 2018-07-13 00:16:54.437051 7f5a0a672700 -1 log_channel(cluster) log > [ERR] : 5.255 full-object read crc 0x4e97b4e != expected 0x6cfe829d on > 5:aa448500:::500.:head > > It is a backup cluster and I can keep it around or blow away the data > (in this instance) as needed for testing purposes. > > -- > Adam > > On Thu, Jul 12, 2018 at 10:39 AM, Alessandro De Salvo > wrote: > > Some progress, and more pain... > > > > I was able to recover the 200. using the ceph-objectstore-tool for > > one of the OSDs (all identical copies) but trying to re-inject it just with > > rados put was giving no error while the get was still giving the same I/O > > error. So the solution was to rm the object and the put it again, that > > worked. > > > > However, after restarting one of the MDSes and seeting it to repaired, I've > > hit another, similar problem: > > > > > > 2018-07-12 17:04:41.999136 7f54c3f4e700 -1 log_channel(cluster) log [ERR] : > > error reading table object 'mds0_inotable' -5 ((5) Input/output error) > > > > > > Can I safely try to do the same as for object 200.? Should I check > > something before trying it? Again, checking the copies of the object, they > > have identical md5sums on all the replicas. > > > > Thanks, > > > > > > Alessandro > > > > > > Il 12/07/18 16:46, Alessandro De Salvo ha scritto: > > > > Unfortunately yes, all the OSDs were restarted a few times, but no change. > > > > Thanks, > > > > > > Alessandro > > > > > > Il 12/07/18 15:55, Paul Emmerich ha scritto: > > > > This might seem like a stupid suggestion, but: have you tried to restart the > > OSDs? 
> > I've also encountered some random CRC errors that only showed up when trying to read an object, but not on scrubbing, and that magically disappeared after restarting the OSD.
> >
> > However, in my case it was clearly related to https://tracker.ceph.com/issues/22464 which doesn't seem to be the issue here.
> >
> > Paul
> >
> > 2018-07-12 13:53 GMT+02:00 Alessandro De Salvo:
> >>
> >> On 12/07/18 11:20, Alessandro De Salvo wrote:
> >>
> >>> On 12/07/18 10:58, Dan van der Ster wrote:
> > On Wed, Jul 11, 2018 at 10:25 PM Gregory Farnum wrote:
> >
> > On Wed, Jul 11, 2018 at 9:23 AM Alessandro De Salvo wrote:
> >>
> >> OK, I found where the object is:
> >>
> >> ceph osd map cephfs_metadata 200.
> >> osdmap e632418 pool 'cephfs_metadata' (10) object '200.' -> pg 10.844f3494 (10.14) -> up ([23,35,18], p23) acting ([23,35,18], p23)
> >>
> >> So, looking at the logs of osds 23, 35 and 18, in fact I see:
> >>
> >> osd.23:
> >> 2018-07-11 15:49:14.913771 7efbee672700 -1 log_channel(cluster) log [ERR] : 10.14 full-object read crc 0x976aefc5 != expected 0x9ef2b41b on 10:292cf221:::200.:head
> >>
> >> osd.35:
> >> 2018-07-11 18:01:19.989345 7f760291a700 -1 log_channel(cluster) log [ERR] : 10.14 full-object read crc 0x976aefc5 != expected 0x9ef2b41b on 10:292cf221:::200.:head
> >>
> >> osd.18:
> >> 2018-07-11 18:18:06.214933 7fabaf5c1700 -1 log_channel(cluster) log [ERR] : 10.14 full-object read crc 0x976aefc5 != expected 0x9ef2b41b on 10:292cf221:::200.:head
> >>
> >> So, basically the same error everywhere.
> >>
> >> I'm trying to issue a repair of pg 10.14, but I'm not sure if it may help.
> >>
> >> No SMART errors (the fileservers are SANs, in RAID6 + LVM volumes), and no disk problems anywhere. No relevant errors in syslogs; the hosts are just fine.
> >> I cannot exclude an error on the RAID controllers, but 2 of the OSDs with 10.14 are on one SAN system and one on a different one, so I would tend to exclude that they both had (silent) errors at the same time.
> >
> > That's fairly distressing. At this point I'd probably try extracting the object using ceph-objectstore-tool and seeing if it decodes properly as an mds journal. If it does, you might risk just putting it back in place to overwrite the crc.
> >
> > Wouldn't it be easier to scrub repair the PG to fix the crc?
> >>>
> >>> this is what I already instructed the cluster to do, a deep scrub, but I'm not sure it could repair in case all replicas are bad, as seems to be the case.
> >>
> >> I finally managed (with the help of
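For reference, the rm-then-put re-injection that worked earlier in this thread can be sketched roughly as below. All paths, the pool name, and the object name are placeholders (the actual object names in the thread are truncated); treat this as an untested outline under those assumptions, not an established recovery procedure, and take a backup copy of anything you touch first.

```
# Illustrative sketch only -- OSD path, pool and object name are
# placeholders, not values from this thread.
POOL=cephfs_metadata
OBJ="<damaged metadata object>"

# 1. With the OSD stopped, export one replica of the object
#    (compare md5sums across replicas first, as done in the thread).
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-23 \
    "$OBJ" get-bytes /tmp/obj.bin

# 2. Remove the damaged object, then write the recovered copy back.
#    A plain "rados put" over the old object reportedly did NOT clear
#    the bad CRC; removing it first was what worked.
rados -p "$POOL" rm "$OBJ"
rados -p "$POOL" put "$OBJ" /tmp/obj.bin
```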
Re: [ceph-users] Bluestore and number of devices
You can keep the same layout as before. Most setups place the DB and WAL combined in one partition (similar to the journal on filestore).

Kevin

2018-07-13 12:37 GMT+02:00 Robert Stanford:
>
> I'm using filestore now, with 4 data devices per journal device.
>
> I'm confused by this: "BlueStore manages either one, two, or (in certain cases) three storage devices." (http://docs.ceph.com/docs/luminous/rados/configuration/bluestore-config-ref/)
>
> When I convert my journals to bluestore, will they still be four data devices (osds) per journal, or will they each require a dedicated journal drive now?
>
> Regards
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
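Concretely, the filestore-style "4 data devices per journal device" layout maps to BlueStore as one `--block.db` partition per OSD on the shared fast device. A rough ceph-volume sketch (device names are made up for illustration):

```
# Hypothetical layout: 4 BlueStore OSDs sharing one fast device for
# their DB/WAL, mirroring 4 filestore data disks per journal device.
# Partition the fast device into 4 first; names are illustrative.
ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1
ceph-volume lvm create --bluestore --data /dev/sdc --block.db /dev/nvme0n1p2
ceph-volume lvm create --bluestore --data /dev/sdd --block.db /dev/nvme0n1p3
ceph-volume lvm create --bluestore --data /dev/sde --block.db /dev/nvme0n1p4
```

With no separate `--block.wal`, the WAL lives inside the DB partition, which is the combined arrangement described above.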
[ceph-users] upgrading to 12.2.6 damages cephfs (crc errors)
Hi,

Following the reports on ceph-users about damaged cephfs after updating to 12.2.6, I spun up a 1-node cluster to try the upgrade. I started with two OSDs on 12.2.5 and wrote some data. Then I restarted the OSDs one by one while continuing to write to the cephfs mountpoint. Then I restarted the (single) MDS, and it is indeed damaged with a crc error:

2018-07-13 12:38:55.261379 osd.1 osd.1 137.138.62.86:6805/35320 2 : cluster [ERR] 2.15 full-object read crc 0xed77af7c != expected 0x1a1d319d on 2:aa448500:::500.:head
2018-07-13 12:38:55.285994 osd.0 osd.0 137.138.62.86:6801/34755 2 : cluster [ERR] 2.13 full-object read crc 0xa73a97ef != expected 0x3e6fdb4a on 2:c91d4a1d:::mds0_inotable:head

I think it goes without saying that nobody should upgrade a cephfs to 12.2.6 until this is understood.

-- Dan
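For readers unfamiliar with the error above: a "full-object read crc" failure means the checksum recomputed over the object's bytes at read time no longer matches the one stored when the object was written, i.e. the on-disk bytes changed without the checksum being updated. A minimal conceptual sketch (Ceph actually uses crc32c with its own seed handling; `zlib.crc32` stands in here purely for illustration):

```python
import zlib


def store(data: bytes) -> dict:
    # At write time, remember a checksum of the full object.
    return {"data": bytearray(data), "crc": zlib.crc32(data)}


def full_object_read(obj: dict) -> bytes:
    # At read time, recompute and compare -- a mismatch means the bytes
    # changed underneath us (bit rot, a write bug, etc.).
    crc = zlib.crc32(bytes(obj["data"]))
    if crc != obj["crc"]:
        raise OSError(
            f"full-object read crc {crc:#x} != expected {obj['crc']:#x}")
    return bytes(obj["data"])


obj = store(b"mds journal header")
assert full_object_read(obj) == b"mds journal header"

obj["data"][0] ^= 0xFF  # simulate silent corruption of one byte
try:
    full_object_read(obj)
except OSError as e:
    print("caught:", e)
```

In the 12.2.6 case all replicas carried the same wrong stored checksum, which is why scrub could not repair it and why rewriting the object was needed.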
[ceph-users] Bluestore and number of devices
I'm using filestore now, with 4 data devices per journal device. I'm confused by this: "BlueStore manages either one, two, or (in certain cases) three storage devices." ( http://docs.ceph.com/docs/luminous/rados/configuration/bluestore-config-ref/ ) When I convert my journals to bluestore, will they still be four data devices (osds) per journal, or will they each require a dedicated journal drive now? Regards ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] OSD tuning no longer required?
This is what leads me to believe it's other settings being referred to as well: https://ceph.com/community/new-luminous-rados-improvements/

"There are dozens of documents floating around with long lists of Ceph configurables that have been tuned for optimal performance on specific hardware or for specific workloads. In most cases these ceph.conf fragments tend to induce funny looks on developers' faces because the settings being adjusted seem counter-intuitive, unrelated to the performance of the system, and/or outright dangerous. Our goal is to make Ceph work as well as we can out of the box without requiring any tuning at all, so we are always striving to choose sane defaults. And generally, we discourage tuning by users."

To me it's not just bluestore settings / ssd vs. hdd they're talking about ("dozens of documents floating around"... "our goal... without any tuning at all"). Am I off base?

Regards

On Thu, Jul 12, 2018 at 9:12 PM, Konstantin Shalygin wrote:
> I saw this in the Luminous release notes:
>>
>> "Each OSD now adjusts its default configuration based on whether the backing device is an HDD or SSD. Manual tuning generally not required"
>>
>> Which tuning in particular? The ones in my configuration are osd_op_threads, osd_disk_threads, osd_recovery_max_active, osd_op_thread_suicide_timeout, and osd_crush_chooseleaf_type, among others. Can I rip these out when I upgrade to Luminous?
>
> This means that some "bluestore_*" settings are tuned for nvme/hdd separately.
>
> Also with Luminous we have:
>
> osd_op_num_shards_(ssd|hdd)
> osd_op_num_threads_per_shard_(ssd|hdd)
> osd_recovery_sleep_(ssd|hdd)
>
> k
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
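To make the per-device-class split Konstantin mentions concrete, an override in ceph.conf might look like the fragment below. The values are invented for illustration only; the whole point of the thread is that Luminous picks sane defaults per device class, so overriding them is usually unnecessary.

```
[osd]
# Luminous chooses per-device-class defaults for these automatically;
# override only with a measured reason. Values here are illustrative.
osd_op_num_shards_hdd = 5
osd_op_num_shards_ssd = 8
osd_op_num_threads_per_shard_hdd = 1
osd_op_num_threads_per_shard_ssd = 2
osd_recovery_sleep_hdd = 0.1
osd_recovery_sleep_ssd = 0.0
```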
[ceph-users] [Ceph Admin & Monitoring] Inkscope is back
Hi,

Inkscope, a ceph admin and monitoring GUI, is still alive. It can now be installed with an ansible playbook: https://github.com/inkscope/inkscope-ansible

Best regards

Ghislain Chevalier
ORANGE/IMT/OLS/DIESE/LCP/DDSD
Software-Defined Storage Architect
+33299124432 +33788624370
Re: [ceph-users] Increase queue_depth in KVM
Konstantin,

Thanks for the explanation. But unfortunately, upgrading qemu is nearly impossible in my case. So is there something else I can do, or do I have to accept that write IOPS will be 8x lower inside KVM than outside KVM? :|

On Fri, 13 Jul 2018 at 04:22, Konstantin Shalygin wrote:
> > I've seen some people using 'num_queues' but I don't have this parameter in my schemas (libvirt version = 1.3.1, qemu version = 2.5.0)
>
> num-queues is available from qemu 2.7 [1]
>
> [1] https://wiki.qemu.org/ChangeLog/2.7
>
> k
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
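For anyone on a newer stack, the libvirt side of enabling virtio-blk multiqueue is a `queues` attribute on the disk's `<driver>` element, roughly as below. This needs qemu >= 2.7 as noted above and a libvirt recent enough to know the attribute, so it will not help the 1.3.1/2.5.0 combination in this thread; host names and disk source are placeholders.

```
<disk type='network' device='disk'>
  <!-- queues='4' enables virtio-blk multiqueue; value is illustrative -->
  <driver name='qemu' type='raw' cache='writeback' queues='4'/>
  <source protocol='rbd' name='rbd/vm-disk'>
    <host name='mon1.example.com' port='6789'/>
  </source>
  <target dev='vda' bus='virtio'/>
</disk>
```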
Re: [ceph-users] mds daemon damaged
Hi Kevin,

On 13.07.2018 at 04:21, Kevin wrote:
> That thread looks exactly like what I'm experiencing. Not sure why my repeated googles didn't find it!

Maybe the thread was still too "fresh" for Google's indexing.

> I'm running 12.2.6 and CentOS 7
>
> And yes, I recently upgraded from jewel to luminous following the instructions of changing the repo and then updating. Everything has been working fine up until this point
>
> Given that previous thread I feel at a bit of a loss as to what to try now since that thread ended with no resolution I could see.

I hope the thread is still continuing, given that another affected person just commented on it. We also planned to upgrade our production cluster to 12.2.6 (also on CentOS 7) this weekend, since we are affected by two ceph-fuse bugs, fixed in 12.2.6, that have been causing inconsistent directory contents for months. But given this situation, we'll rather live with that a bit longer and hold off on the update...

> Thanks for pointing that out though, it seems like almost the exact same situation
>
> On 2018-07-12 18:23, Oliver Freyermuth wrote:
>> Hi,
>>
>> all this sounds an awful lot like:
>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-July/027992.html
>> In that case, things started with an update to 12.2.6. Which version are you running?
>>
>> Cheers,
>> Oliver
>>
>> On 12.07.2018 at 23:30, Kevin wrote:
>>> Sorry for the long posting but trying to cover everything
>>>
>>> I woke up to find my cephfs filesystem down. This was in the logs
>>>
>>> 2018-07-11 05:54:10.398171 osd.1 [ERR] 2.4 full-object read crc 0x6fc2f65a != expected 0x1c08241c on 2:292cf221:::200.:head
>>>
>>> I had one standby MDS, but as far as I can tell it did not fail over. This was in the logs
>>>
>>> (insufficient standby MDS daemons available)
>>>
>>> Currently my ceph looks like this
>>> cluster:
>>> id: ..
>>> health: HEALTH_ERR
>>> 1 filesystem is degraded
>>> 1 mds daemon damaged
>>>
>>> services:
>>> mon: 6 daemons, quorum ds26,ds27,ds2b,ds2a,ds28,ds29
>>> mgr: ids27(active)
>>> mds: test-cephfs-1-0/1/1 up , 3 up:standby, 1 damaged
>>> osd: 5 osds: 5 up, 5 in
>>>
>>> data:
>>> pools: 3 pools, 202 pgs
>>> objects: 1013k objects, 4018 GB
>>> usage: 12085 GB used, 6544 GB / 18630 GB avail
>>> pgs: 201 active+clean
>>> 1 active+clean+scrubbing+deep
>>>
>>> io:
>>> client: 0 B/s rd, 0 op/s rd, 0 op/s wr
>>>
>>> I started trying to get the damaged MDS back online
>>>
>>> Based on this page
>>> http://docs.ceph.com/docs/master/cephfs/disaster-recovery-experts/#disaster-recovery-experts
>>>
>>> # cephfs-journal-tool journal export backup.bin
>>> 2018-07-12 13:35:15.675964 7f3e1389bf00 -1 Header 200. is unreadable
>>> 2018-07-12 13:35:15.675977 7f3e1389bf00 -1 journal_export: Journal not readable, attempt object-by-object dump with `rados`
>>> Error ((5) Input/output error)
>>>
>>> # cephfs-journal-tool event recover_dentries summary
>>> Events by type:
>>> 2018-07-12 13:36:03.000590 7fc398a18f00 -1 Header 200.
>>> is unreadable
>>> Errors: 0
>>>
>>> cephfs-journal-tool journal reset - (I think this command might have worked)
>>>
>>> Next up, tried to reset the filesystem
>>>
>>> ceph fs reset test-cephfs-1 --yes-i-really-mean-it
>>>
>>> Each time, same errors:
>>>
>>> 2018-07-12 11:56:35.760449 mon.ds26 [INF] Health check cleared: MDS_DAMAGE (was: 1 mds daemon damaged)
>>> 2018-07-12 11:56:35.856737 mon.ds26 [INF] Standby daemon mds.ds27 assigned to filesystem test-cephfs-1 as rank 0
>>> 2018-07-12 11:56:35.947801 mds.ds27 [ERR] Error recovering journal 0x200: (5) Input/output error
>>> 2018-07-12 11:56:36.900807 mon.ds26 [ERR] Health check failed: 1 mds daemon damaged (MDS_DAMAGE)
>>> 2018-07-12 11:56:35.945544 osd.0 [ERR] 2.4 full-object read crc 0x6fc2f65a != expected 0x1c08241c on 2:292cf221:::200.:head
>>> 2018-07-12 12:00:00.000142 mon.ds26 [ERR] overall HEALTH_ERR 1 filesystem is degraded; 1 mds daemon damaged
>>>
>>> Tried to 'fail' mds.ds27
>>> # ceph mds fail ds27
>>> # failed mds gid 1929168
>>>
>>> The command worked, but each time I run the reset command the same errors above appear.
>>>
>>> Online searches say the object read error has to be removed, but there's no object listed. This web page is the closest to the issue: http://tracker.ceph.com/issues/20863
>>>
>>> It recommends fixing the error by hand. Tried running a deep scrub on pg 2.4; it completes, but I still have the same issue above.
>>>
>>> The final option is to attempt removing mds.ds27. If mds.ds29 was a standby and has data, it should become live. If it was not, I assume we will lose the filesystem at this point.
>>>
>>> Why didn't the standby MDS fail over?
>>>
>>> Just looking for any way to recover the cephfs, thanks!
>>>