** Description changed:
[Impact]
Upstream tracker: issue#46443 [0].
The ceph-osd service can crash when processing OSD map updates.
When an OSD encounters a CRC error while processing an incremental map
update, it requests a full map update from its peers. A recent change
introduced an uninitialized variable in this code path, and
dereferencing it crashes the daemon.
The uninitialized variable was introduced in Nautilus 14.2.10 and
Octopus 15.2.1.
[Test Case]
# Force CRC failures by setting osd_inject_bad_map_crc_probability to 1
sudo ceph daemon osd.{id} config set osd_inject_bad_map_crc_probability 1
# Trigger some OSD map updates by restarting a different OSD
sudo systemctl restart ceph-osd@{diff-id}
+ [Regression Potential]
+ The code now leaves handle_osd_maps() early if a CRC error is encountered
while processing an incremental map update, which prevents the commit from
running. This will make the full map update take longer but should prevent
the crash reported in this bug.
Additionally, _committed_osd_maps() now aborts unless first <= last, but
that abort is not expected to be reached.
+
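The control flow described under [Regression Potential] can be sketched
as a toy model. This is not Ceph code: the Python functions below only
borrow the names handle_osd_maps() and _committed_osd_maps(), and the
arguments (start, count, bad_epochs) and the dict standing in for decoded
maps are hypothetical. The sketch assumes that the early exit applies when
the first incremental map in the message fails its CRC check, so that no
epoch is left to commit, and that the guard in the commit step rejects an
empty range (first > last).

```python
def committed_osd_maps(first, last, maps):
    """Toy stand-in for _committed_osd_maps(): needs a non-empty range."""
    # Guard: an empty range would leave `current` unset below.
    assert first <= last, "empty epoch range reached the commit step"
    current = None
    for epoch in range(first, last + 1):
        current = maps[epoch]
    return current["epoch"]


def handle_osd_maps(start, count, bad_epochs):
    """Toy stand-in for handle_osd_maps().

    Returns the newest committed epoch, or None if nothing was committed.
    `bad_epochs` lists the epochs whose CRC check is assumed to fail.
    """
    last = start + count - 1
    applied = {}
    for epoch in range(start, last + 1):
        if epoch in bad_epochs:
            # CRC mismatch: a full map would be requested from peers here.
            last = epoch - 1
            if last < start:
                # The first incremental map failed, so skip the commit.
                return None
            break
        applied[epoch] = {"epoch": epoch}
    return committed_osd_maps(start, last, applied)


if __name__ == "__main__":
    print(handle_osd_maps(10, 3, {10}))   # first map is bad: no commit
    print(handle_osd_maps(10, 3, {12}))   # commits epochs 10 and 11
    print(handle_osd_maps(10, 3, set()))  # commits epochs 10 through 12
```

In this model, a CRC failure on a later epoch still commits the epochs
that were applied before it; only a failure on the first epoch skips the
commit entirely.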
[Other Info]
Upstream has released a fix for this issue in Nautilus 14.2.11. The SRU for
this point release is being tracked by LP: #1891077.
Upstream has merged a fix for this issue in Octopus [1], but there is no
current release target. The ceph packages in focal, groovy, and the
ussuri cloud archive are exposed to this critical regression.
[0] https://tracker.ceph.com/issues/46443
[1] https://github.com/ceph/ceph/pull/36340
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1891567
Title:
[SRU] ceph_osd crash in _committed_osd_maps when failed to encode
first inc map