Hello,
I followed the standard upgrade procedure to go from 13.2.1 to 13.2.2.
After the upgrade the MDS cluster is down: mds rank 0 and the purge_queue
journal are damaged. Resetting the purge_queue does not seem to help, as
the journal still appears to be damaged afterwards.
Can anybody help?
MDS log:
-789> 2018-09-26 18:42:32.527 7f70f78b1700 1 mds.mds2 Updating MDS map to
version 586 from mon.2
-788> 2018-09-26 18:42:32.527 7f70f78b1700 1 mds.0.583 handle_mds_map i am
now mds.0.583
-787> 2018-09-26 18:42:32.527 7f70f78b1700 1 mds.0.583 handle_mds_map state
change up:rejoin --> up:active
-786> 2018-09-26 18:42:32.527 7f70f78b1700 1 mds.0.583 recovery_done --
successful recovery!
<skip>
-38> 2018-09-26 18:42:32.707 7f70f28a7700 -1 mds.0.purge_queue _consume:
Decode error at read_pos=0x322ec6636
-37> 2018-09-26 18:42:32.707 7f70f28a7700 5 mds.beacon.mds2 set_want_state:
up:active -> down:damaged
-36> 2018-09-26 18:42:32.707 7f70f28a7700 5 mds.beacon.mds2 _send
down:damaged seq 137
-35> 2018-09-26 18:42:32.707 7f70f28a7700 10 monclient: _send_mon_message to
mon.ceph3 at mon:6789/0
-34> 2018-09-26 18:42:32.707 7f70f28a7700 1 -- mds:6800/e4cc09cf -->
mon:6789/0 -- mdsbeacon(14c72/mds2 down:damaged seq 137 v24a) v7 --
0x563b321ad480 con 0
<skip>
-3> 2018-09-26 18:42:32.743 7f70f98b5700 5 -- mds:6800/3838577103 >>
mon:6789/0 conn(0x563b3213e000 :-1
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=8 cs=1 l=1). rx mon.2 seq 29
0x563b321ab880 mdsbeacon(85106/mds2 down:damaged seq 311 v587) v7
-2> 2018-09-26 18:42:32.743 7f70f98b5700 1 -- mds:6800/3838577103 <==
mon.2 mon:6789/0 29 ==== mdsbeacon(85106/mds2 down:damaged seq 311 v587) v7
==== 129+0+0 (3296573291 0 0) 0x563b321ab880 con 0x563b3213e000
-1> 2018-09-26 18:42:32.743 7f70f98b5700 5 mds.beacon.mds2
handle_mds_beacon down:damaged seq 311 rtt 0.038261
0> 2018-09-26 18:42:32.743 7f70f28a7700 1 mds.mds2 respawn!
# cephfs-journal-tool --journal=purge_queue journal inspect
Overall journal integrity: DAMAGED
Corrupt regions:
0x322ec65d9-ffffffffffffffff
# cephfs-journal-tool --journal=purge_queue journal reset
old journal was 13470819801~8463
new journal start will be 13472104448 (1276184 bytes past old end)
writing journal head
done
# cephfs-journal-tool --journal=purge_queue journal inspect
2018-09-26 19:00:52.848 7f3f9fa50bc0 -1 Missing object 500.00000c8c
Overall journal integrity: DAMAGED
Objects missing:
0xc8c
Corrupt regions:
0x323000000-ffffffffffffffff
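One observation, in case it helps anyone diagnose this: the missing object
id reported by the second inspect lines up exactly with the new journal
start printed by the reset. Assuming the purge_queue journal uses the
default 4 MiB object size (my assumption, not stated in the output above),
the RADOS object index is simply the byte offset shifted right by 22 bits:

```shell
# Map the "new journal start" offset from the reset output to a journal
# object index, assuming the default 4 MiB (2^22 byte) object size.
new_start=13472104448   # taken from the "journal reset" output above
printf 'object index: %x\n' $(( new_start >> 22 ))   # prints "object index: c8c"
```

So the head written by the reset points into object 500.00000c8c, which is
exactly the object inspect now reports as missing — presumably because
nothing has been written there yet.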
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com