Hi All,

We have a CephFS cluster running Octopus with three control nodes, each
running an MDS, Monitor, and Manager on Ubuntu 20.04. The OS drive on one of
these nodes failed recently and we had to do a fresh install, but we made the
mistake of installing Ubuntu 22.04, for which Octopus packages are not
available. We tried to force apt to use the Ubuntu 20.04 repos when installing
Ceph so that it would install Octopus, but for some reason Quincy was
installed anyway. We re-integrated the node and it seemed to work fine for
about a week, until the cluster reported damage to an MDS daemon and placed
our filesystem into a degraded state.
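
In case it is useful, the mixed versions can be double-checked with standard
tooling (a rough sketch; ceph-mds below is just an example package name):

$ ceph versions              # running version of every mon, mgr, osd and mds
$ apt-cache policy ceph-mds  # candidate version apt would install on the rebuilt node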

  cluster:
    id:     692905c0-f271-4cd8-9e43-1c32ef8abd13
    health: HEALTH_ERR
            mons are allowing insecure global_id reclaim
            1 filesystem is degraded
            1 filesystem is offline
            1 mds daemon damaged
            noout flag(s) set
            161 scrub errors
            Possible data damage: 24 pgs inconsistent
            8 pgs not deep-scrubbed in time
            4 pgs not scrubbed in time
            6 daemons have recently crashed

  services:
    mon: 3 daemons, quorum database-0,file-server,webhost (age 12d)
    mgr: database-0(active, since 4w), standbys: webhost, file-server
    mds: cephfs:0/1 3 up:standby, 1 damaged
    osd: 91 osds: 90 up (since 32h), 90 in (since 5M)
         flags noout

  task status:

  data:
    pools:   7 pools, 633 pgs
    objects: 169.18M objects, 640 TiB
    usage:   883 TiB used, 251 TiB / 1.1 PiB avail
    pgs:     605 active+clean
             23  active+clean+inconsistent
             4   active+clean+scrubbing+deep
             1   active+clean+scrubbing+deep+inconsistent
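
We have not touched the inconsistent PGs yet. If the details matter, they can
be enumerated with something like the following (a sketch; <pool> and <pgid>
are placeholders, and we are holding off on any repair until advised):

$ rados list-inconsistent-pg <pool>                        # PGs with inconsistencies in a pool
$ rados list-inconsistent-obj <pgid> --format=json-pretty  # per-object detail for one PG
$ ceph pg repair <pgid>                                    # deliberately not run yet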

We are not sure whether the Quincy/Octopus version mismatch is the cause, but
we are now downgrading this node so that every node runs Octopus. Before
starting the downgrade, we ran the following commands to try to recover:

$ cephfs-journal-tool --rank=cephfs:all journal export backup.bin

$ sudo cephfs-journal-tool --rank=cephfs:all event recover_dentries summary:

Events by type:
  OPEN: 29589
  PURGED: 1
  SESSION: 16
  SESSIONS: 4
  SUBTREEMAP: 127
  UPDATE: 70438
Errors: 0

$ cephfs-journal-tool --rank=cephfs:0 journal reset:

old journal was 170234219175~232148677
new journal start will be 170469097472 (2729620 bytes past old end)
writing journal head
writing EResetJournal entry
done

$ cephfs-table-tool all reset session
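
We stopped there. As far as we can tell from the CephFS disaster-recovery
documentation, the remaining steps would be roughly the following; we have NOT
run any of these (cephfs is our filesystem name, rank 0 is the damaged rank):

$ cephfs-table-tool all reset snap               # NOT run
$ cephfs-table-tool all reset inode              # NOT run
$ ceph fs reset cephfs --yes-i-really-mean-it    # NOT run
$ ceph mds repaired cephfs:0                     # NOT run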

All of our MDS daemons are now down; every MDS that starts crashes during
journal replay with the following errors:

-3> 2023-04-20T10:25:15.072-0700 7f0465069700 -1 log_channel(cluster) log [ERR] 
: journal replay alloc 0x1000053af79 not in free 
[0x1000053af7d~0x1e8,0x1000053b35c~0x1f7,0x1000053b555~0x2,0x1000053b559~0x2,0x1000053b55d~0x2,0x1000053b561~0x2,0x1000053b565~0x1de,0x1000053b938~0x1fd,0x1000053bd2a~0x4,0x1000053bf23~0x4,0x1000053c11c~0x4,0x1000053cd7b~0x158,0x1000053ced8~0xffffac3128]
    -2> 2023-04-20T10:25:15.072-0700 7f0465069700 -1 log_channel(cluster) log 
[ERR] : journal replay alloc 
[0x1000053af7a~0x1eb,0x1000053b35c~0x1f7,0x1000053b555~0x2,0x1000053b559~0x2,0x1000053b55d~0x2],
 only 
[0x1000053af7d~0x1e8,0x1000053b35c~0x1f7,0x1000053b555~0x2,0x1000053b559~0x2,0x1000053b55d~0x2]
 is in free 
[0x1000053af7d~0x1e8,0x1000053b35c~0x1f7,0x1000053b555~0x2,0x1000053b559~0x2,0x1000053b55d~0x2,0x1000053b561~0x2,0x1000053b565~0x1de,0x1000053b938~0x1fd,0x1000053bd2a~0x4,0x1000053bf23~0x4,0x1000053c11c~0x4,0x1000053cd7b~0x158,0x1000053ced8~0xffffac3128]
    -1> 2023-04-20T10:25:15.072-0700 7f0465069700 -1 
/build/ceph-15.2.15/src/mds/journal.cc: In function 'void 
EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)' thread 7f0465069700 
time 2023-04-20T10:25:15.076784-0700
/build/ceph-15.2.15/src/mds/journal.cc: 1513: FAILED ceph_assert(inotablev == 
mds->inotable->get_version())

 ceph version 15.2.15 (2dfb18841cfecc2f7eb7eb2afd65986ca4d95985) octopus 
(stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x155) [0x7f04717a3be1]
 2: (()+0x26ade9) [0x7f04717a3de9]
 3: (EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)+0x67e2) 
[0x560feaca36f2]
 4: (EUpdate::replay(MDSRank*)+0x42) [0x560feaca5bd2]
 5: (MDLog::_replay_thread()+0x90c) [0x560feac393ac]
 6: (MDLog::ReplayThread::entry()+0x11) [0x560fea920821]
 7: (()+0x8609) [0x7f0471318609]
 8: (clone()+0x43) [0x7f0470ee9163]

     0> 2023-04-20T10:25:15.076-0700 7f0465069700 -1 *** Caught signal 
(Aborted) **
 in thread 7f0465069700 thread_name:md_log_replay

 ceph version 15.2.15 (2dfb18841cfecc2f7eb7eb2afd65986ca4d95985) octopus 
(stable)
 1: (()+0x143c0) [0x7f04713243c0]
 2: (gsignal()+0xcb) [0x7f0470e0d03b]
 3: (abort()+0x12b) [0x7f0470dec859]
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x1b0) [0x7f04717a3c3c]
 5: (()+0x26ade9) [0x7f04717a3de9]
 6: (EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)+0x67e2) 
[0x560feaca36f2]
 7: (EUpdate::replay(MDSRank*)+0x42) [0x560feaca5bd2]
 8: (MDLog::_replay_thread()+0x90c) [0x560feac393ac]
 9: (MDLog::ReplayThread::entry()+0x11) [0x560fea920821]
 10: (()+0x8609) [0x7f0471318609]
 11: (clone()+0x43) [0x7f0470ee9163]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to 
interpret this.

At this point, we decided it was best to ask for guidance before issuing any
further recovery commands.

Can anyone advise what we should do?
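
We are happy to provide more output if it helps, for example:

$ ceph fs dump
$ ceph health detail
$ ceph crash ls      # for the six recent crash reports
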
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
