Scrubs discovered the following inconsistency:

2018-08-23 17:21:07.933458 osd.62 osd.62 10.122.0.140:6805/77767 6 :
cluster [ERR] 9.3cd shard 113: soid
9:b3cd8d89:::.dir.default.153398310.112:head omap_digest 0xea4ba012 !=
omap_digest 0xc5acebfd from shard 62, omap_digest 0xea4ba012 != omap_digest
0xc5acebfd from auth oi
9:b3cd8d89:::.dir.default.153398310.112:head(138609'2009129
osd.250.0:64658209 dirty|omap|data_digest|omap_digest s 0 uv 1995230 dd
ffffffff od c5acebfd alloc_hint [0 0 0])

The omap_digest_mismatch appears on a non-primary OSD in a pool with 4
replicas. In this situation I decided to issue "pg repair", expecting
Ceph to repair the broken object. The command returned successfully, but
the repair on 9.3cd never started.
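For reference, a sketch of the repair attempt (standard Ceph commands; 9.3cd is the affected PG from the scrub error above, exact output varies by release):

```shell
# Show which objects the scrub flagged as inconsistent in PG 9.3cd
rados list-inconsistent-obj 9.3cd --format=json-pretty

# Ask the primary OSD to repair the PG; this only queues the repair,
# it does not run synchronously
ceph pg repair 9.3cd

# Watch the cluster log to see whether the repair actually starts
ceph -w
```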

Then I tried the procedure described here (setting a temporary omap key on
the object to force recalculation of the omap_digest):
https://www.mail-archive.com/[email protected]/msg47219.html
But the deep-scrub on 9.3cd didn't start either. The OSD marked 9.3cd for
scrubbing, but that's all that happened:

2018-08-27 14:36:22.703848 7faa7e860700 20 osd.62 713813 OSD::ms_dispatch:
scrub([9.3cd] deep) v2
2018-08-27 14:36:22.703869 7faa7e860700 20 osd.62 713813 _dispatch
0x55725b76d180 scrub([9.3cd] deep) v2
2018-08-27 14:36:22.703871 7faa7e860700 10 osd.62 713813 handle_scrub
scrub([9.3cd] deep) v2
2018-08-27 14:36:22.703878 7faa7e860700 10 osd.62 713813 marking pg[9.3cd(
v 713813'2359292 (713107'2357731,713813'2359292]
local-lis/les=711049/711050 n=41419 ec=178/178 lis/c 711049/711049 les/c/f
711050/711149/222921 711049/711049/710352) [62,53,163,113] r=0 lpr=711049
crt=713813'2359292 lcod 713813'2359291 mlcod 713813'2359291
active+clean+inconsistent MUST_DEEP_SCRUB MUST_SCRUB] for scrub
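For completeness, the linked workaround amounts to the following sketch. The object name is taken from the scrub error above; the pool name is an assumption (the object looks like an RGW bucket index object), and the key name is arbitrary:

```shell
# Set a temporary omap key on the broken object so the next deep-scrub
# recomputes the omap_digest. Pool name "default.rgw.buckets.index" is
# an ASSUMPTION -- substitute the pool that actually holds the object.
rados -p default.rgw.buckets.index setomapval \
    .dir.default.153398310.112 temporary-key anything

# Trigger a deep-scrub on the affected PG
ceph pg deep-scrub 9.3cd

# Once the PG is active+clean again, remove the temporary key
rados -p default.rgw.buckets.index rmomapkey \
    .dir.default.153398310.112 temporary-key
```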

Does anyone know how to recover from an inconsistency in such a case?
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com