Hi all, my cluster is stuck after an upgrade from hammer 0.94.5 to luminous.
It seems the OSDs are somehow stuck at the hammer version, despite:
$ ceph-osd --version
ceph version 12.0.1 (5456408827a1a31690514342624a4ff9b66be1d5)
All OSDs are down in the preboot state, and every OSD log says "osdmap
SORTBITWISE OSDMap flag is NOT set; please set it".
When I try 'ceph osd set sortbitwise' I get "Error EPERM: not all up OSDs
have OSD_BITWISE_HOBJ_SORT feature".
$ ceph osd dump -f json-pretty | grep features
"features": 37154696925806591,
"features": 37154696925806591,
"features": 37154696925806591,
"features": 37154696925806591,
"features": 37154696925806591,
"features": 37154696925806591,
"features": 37154696925806591,
"features": 0,
"features": 37154696925806591,
"features": 0,
$ ceph osd metadata | egrep "id|ceph_version"
"id": 0,
"ceph_version": "ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)",
"id": 1,
"ceph_version": "ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)",
"id": 2,
"ceph_version": "ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)",
"id": 3,
"ceph_version": "ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)",
"id": 4,
"ceph_version": "ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)",
"id": 5,
"ceph_version": "ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)",
"id": 6,
"ceph_version": "ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)",
"id": 7
"id": 8,
"ceph_version": "ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)",
"id": 9
On one of the stuck OSDs I guess the superblock has already been upgraded
(it carries incompat features 14 and 15), so I can't downgrade back to
hammer:
$ ceph-objectstore-tool --data-path=/var/lib/ceph/osd/ceph-1 --op dump-super
{
    "cluster_fsid": "630c11ff-ff8d-4bf0-8217-860eb684e78c",
    "osd_fsid": "69a974e1-fdfa-434e-8279-8411196a127f",
    "whoami": 1,
    "current_epoch": 9126,
    "oldest_map": 8591,
    "newest_map": 9126,
    "weight": 0.000000,
    "compat": {
        "compat": {},
        "ro_compat": {},
        "incompat": {
            "feature_1": "initial feature set(~v.18)",
            "feature_2": "pginfo object",
            "feature_3": "object locator",
            "feature_4": "last_epoch_clean",
            "feature_5": "categories",
            "feature_6": "hobjectpool",
            "feature_7": "biginfo",
            "feature_8": "leveldbinfo",
            "feature_9": "leveldblog",
            "feature_10": "snapmapper",
            "feature_12": "transaction hints",
            "feature_13": "pg meta object",
            "feature_14": "explicit missing set",
            "feature_15": "fastinfo pg attr"
        }
    },
    "clean_thru": 9126,
    "last_epoch_mounted": 0
}
root@ceph-node03:~# export ms="/home/ceph/monstore"; for osd in /var/lib/ceph/osd/ceph-*; do ceph-objectstore-tool --data-path $osd --op update-mon-db --mon-store-path "$ms"; done
mismatched full crc: 3120238035 != 1569055237
mismatched full crc: 3120238035 != 1569055237
mismatched full crc: 3120238035 != 1569055237
root@ceph-node02:~# export ms="/home/ceph/monstore"; for osd in /var/lib/ceph/osd/ceph-*; do ceph-objectstore-tool --data-path $osd --op update-mon-db --mon-store-path "$ms"; done
mismatched full crc: 2310723283 != 1012422761
mismatched full crc: 2310723283 != 1012422761
missing #-1:edbd1965:::inc_osdmap.8591:0#
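For context, the loops above are from the "recover the monitor store from
the OSDs" procedure in the docs; the step that would normally follow is
roughly this (the keyring path here is my assumption):
$ ceph-monstore-tool /home/ceph/monstore rebuild -- --keyring /etc/ceph/ceph.client.admin.keyring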
Some more info:
root@ceph-node01:/var/lib/ceph# ceph -s
    cluster 630c11ff-ff8d-4bf0-8217-860eb684e78c
     health HEALTH_ERR
            384 pgs are stuck inactive for more than 300 seconds
            236 pgs degraded
            384 pgs stale
            236 pgs stuck degraded
            384 pgs stuck stale
            277 pgs stuck unclean
            236 pgs stuck undersized
            236 pgs undersized
            recovery 761841/2108609 objects degraded (36.130%)
            recovery 17768/2108609 objects misplaced (0.843%)
     monmap e9: 1 mons at {ceph-node01=192.168.137.68:6789/0}
            election epoch 12729, quorum 0 ceph-node01
        mgr active: ceph-node01
     osdmap e9103: 10 osds: 0 up, 0 in
      pgmap v13292951: 384 pgs, 3 pools, 722 GB data, 1008 kobjects
            0 kB used, 0 kB / 0 kB avail
            761841/2108609 objects degraded (36.130%)
            17768/2108609 objects misplaced (0.843%)
                 236 stale+active+undersized+degraded
                 107 stale+active+clean
                  41 stale+active+remapped
root@ceph-node03:~# ceph daemon osd.5 status
{
    "cluster_fsid": "630c11ff-ff8d-4bf0-8217-860eb684e78c",
    "osd_fsid": "7fb02cdb-8e15-4d41-8c0f-5a59301933ec",
    "whoami": 5,
    "state": "preboot",
    "oldest_map": 8701,
    "newest_map": 9126,
    "num_pgs": 128
}
Can I somehow recover from this situation, and what could have happened
during the upgrade?
I performed the upgrade from hammer with 'ceph-deploy install --release luminous'.
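(The full invocation was along these lines; the node names below are
placeholders for my actual hosts:
$ ceph-deploy install --release luminous <node01> <node02> <node03>)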
Thank you, best regards.