I have tested the ussuri-proposed packages and they fix the issue.
I set up a Nautilus cluster with the following versions:
# ceph versions
{
    "mon": {
        "ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)": 1
    },
    "mgr": {
        "ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)": 1
    },
    "osd": {
        "ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)": 3
    },
    "mds": {},
    "overall": {
        "ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)": 5
    }
}
# dpkg -l | grep -i ceph
ii  ceph                   14.2.9-0ubuntu0.19.10.1~cloud0  amd64  distributed storage and file system
ii  ceph-base              14.2.9-0ubuntu0.19.10.1~cloud0  amd64  common ceph daemon libraries and management tools
ii  ceph-common            14.2.9-0ubuntu0.19.10.1~cloud0  amd64  common utilities to mount and interact with a ceph storage cluster
ii  ceph-mgr               14.2.9-0ubuntu0.19.10.1~cloud0  amd64  manager for the ceph distributed file system
ii  ceph-mon               14.2.9-0ubuntu0.19.10.1~cloud0  amd64  monitor server for the ceph storage system
ii  ceph-osd               14.2.9-0ubuntu0.19.10.1~cloud0  amd64  OSD server for the ceph storage system
ii  libcephfs2             14.2.9-0ubuntu0.19.10.1~cloud0  amd64  Ceph distributed file system client library
ii  python3-ceph-argparse  14.2.9-0ubuntu0.19.10.1~cloud0  amd64  Python 3 utility libraries for Ceph CLI
ii  python3-cephfs         14.2.9-0ubuntu0.19.10.1~cloud0  amd64  Python 3 libraries for the Ceph libcephfs library
ii  python3-rados          14.2.9-0ubuntu0.19.10.1~cloud0  amd64  Python 3 libraries for the Ceph librados library
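To upgrade, I enabled the ussuri-proposed pocket and upgraded the packages, restarting the daemons in the usual mon, mgr, osd order. Roughly the following (a sketch of the standard cloud-archive SRU verification workflow; the exact commands are my notes rather than a verbatim transcript):
# add-apt-repository cloud-archive:ussuri-proposed
# apt update && apt full-upgrade
# systemctl restart ceph-mon.target
# systemctl restart ceph-mgr.target
# systemctl restart ceph-osd.target
# ceph osd require-osd-release octopus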
After the upgrade the cluster reported:
# ceph versions
{
    "mon": {
        "ceph version 15.2.3 (d289bbdec69ed7c1f516e0a093594580a76b78d0) octopus (stable)": 1
    },
    "mgr": {
        "ceph version 15.2.3 (d289bbdec69ed7c1f516e0a093594580a76b78d0) octopus (stable)": 1
    },
    "osd": {
        "ceph version 15.2.3 (d289bbdec69ed7c1f516e0a093594580a76b78d0) octopus (stable)": 3
    },
    "mds": {},
    "overall": {
        "ceph version 15.2.3 (d289bbdec69ed7c1f516e0a093594580a76b78d0) octopus (stable)": 5
    }
}
# dpkg -l | grep -i ceph
ii  ceph                   15.2.3-0ubuntu0.20.04.2~cloud0  amd64  distributed storage and file system
ii  ceph-base              15.2.3-0ubuntu0.20.04.2~cloud0  amd64  common ceph daemon libraries and management tools
ii  ceph-common            15.2.3-0ubuntu0.20.04.2~cloud0  amd64  common utilities to mount and interact with a ceph storage cluster
ii  ceph-mds               15.2.3-0ubuntu0.20.04.2~cloud0  amd64  metadata server for the ceph distributed file system
ii  ceph-mgr               15.2.3-0ubuntu0.20.04.2~cloud0  amd64  manager for the ceph distributed file system
ii  ceph-mgr-modules-core  15.2.3-0ubuntu0.20.04.2~cloud0  all    ceph manager modules which are always enabled
ii  ceph-mon               15.2.3-0ubuntu0.20.04.2~cloud0  amd64  monitor server for the ceph storage system
ii  ceph-osd               15.2.3-0ubuntu0.20.04.2~cloud0  amd64  OSD server for the ceph storage system
ii  libcephfs2             15.2.3-0ubuntu0.20.04.2~cloud0  amd64  Ceph distributed file system client library
ii  python3-ceph-argparse  15.2.3-0ubuntu0.20.04.2~cloud0  amd64  Python 3 utility libraries for Ceph CLI
ii  python3-ceph-common    15.2.3-0ubuntu0.20.04.2~cloud0  all    Python 3 utility libraries for Ceph
ii  python3-cephfs         15.2.3-0ubuntu0.20.04.2~cloud0  amd64  Python 3 libraries for the Ceph libcephfs library
ii  python3-rados          15.2.3-0ubuntu0.20.04.2~cloud0  amd64  Python 3 libraries for the Ceph librados library
ii  python3-rbd            15.2.3-0ubuntu0.20.04.2~cloud0  amd64  Python 3 libraries for the Ceph librbd library
Then I tested the cluster as noted in the description (set
`osd_inject_bad_map_crc_probability` to 1 on one OSD and then restarted a
different OSD), and observed that no OSD crashed and the cluster stayed healthy.
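Concretely, the check amounts to something like the following (a sketch; the OSD ids are placeholders, and `injectargs` is just one way to set the option at runtime):
# ceph tell osd.0 injectargs '--osd_inject_bad_map_crc_probability 1'
# systemctl restart ceph-osd@1
# ceph status
Before the fix, an OSD would crash in _committed_osd_maps; with these packages all three OSDs stay up and `ceph status` reports HEALTH_OK.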
Bug: https://bugs.launchpad.net/bugs/1891567
Title: [SRU] ceph_osd crash in _committed_osd_maps when failed to encode first inc map