Hi Jan,

We encountered this issue after upgrading to Tentacle 20.2.1 and attempting to 
rebuild some of our OSDs that use an NVMe device for the WAL/DB.

I was unable to find any instance of this issue being reported upstream, either 
via a GitHub pull request or a bug report.

My colleague has filed a bug report that describes the code regression and 
includes a workaround:
https://tracker.ceph.com/issues/77127

Justin Mammarella

Storage Infrastructure Lead
Research Computing Services
The University of Melbourne
Victoria 3010, Australia


From: Jan Kasprzak via ceph-users <[email protected]>
Date: Monday, 30 March 2026 at 6:17 pm
To: [email protected] <[email protected]>
Subject: [EXT] [ceph-users] Crash on newly created NVMe+HDD OSD

        Hello, Ceph users,

I wanted to replace a HDD with bad blocks in an OSD which has block.db on NVMe.
Removing the OSD (ceph-volume lvm zap, etc.) went well, no VG/LV remained
for data or metadata. But after replacing the HDD, preparing a new
HDD+NVMe OSD, and starting it with "ceph-volume lvm activate --all",
the OSD daemon crashed with

terminate called after throwing an instance of 
'ceph::buffer::v15_2_0::malformed_input'

Retrying the process once more, I noticed that ceph-volume created a VG/PV/LV
on the HDD (/dev/sda), but did not create a VG/PV/LV on /dev/nvme1n1p4.
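
One way to double-check this observation is to ask LVM which devices carry a
physical volume. The following is a minimal sketch, not part of the original
report: device_has_pv is a hypothetical helper name, and it assumes the LVM2
"pvs" command is available and is run as root.

```shell
# Hypothetical helper (not from the original report): succeeds if the given
# device is listed by LVM as a physical volume.
# Assumes LVM2 "pvs" is on the PATH; "-o pv_name" prints one device per line.
device_has_pv() {
    dev=$1
    pvs --noheadings -o pv_name 2>/dev/null | awk '{print $1}' | grep -Fqx -- "$dev"
}

# Usage (device names as in this report):
# for d in /dev/sda /dev/nvme1n1p4; do
#     if device_has_pv "$d"; then echo "$d: PV present"; else echo "$d: no PV"; fi
# done
```

The helper only reports what LVM sees; it says nothing about why a PV is
missing.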

Zapping the result once more and creating an HDD-only OSD helped:
the OSD daemon started and joined the cluster correctly. I then stopped
it once more, created a db volume myself, and migrated the db to it:

systemctl stop ceph-osd@$OSD_ID
NEW_DB_VG=ceph-`uuidgen`
NEW_DB_LV=osd-db-`uuidgen`
vgcreate $NEW_DB_VG /dev/nvme1n1p4
lvcreate -a y -l 100%FREE -n $NEW_DB_VG/$NEW_DB_LV
ceph-volume lvm new-db --osd-id $OSD_ID --osd-fsid $OSD_FSID --target $NEW_DB_VG/$NEW_DB_LV
ceph-volume lvm migrate --osd-id $OSD_ID --osd-fsid $OSD_FSID --from data --target $NEW_DB_VG/$NEW_DB_LV
systemctl start ceph-osd@$OSD_ID

... and it started correctly and did not crash.

I used the same commands previously on Ceph 19 when replacing HDDs, and they
worked without problems. This is the first time I have replaced an HDD since
upgrading to Ceph 20.2.0, so it might be a regression in ceph-volume in 20.2.0.

Below is the output of "ceph-volume lvm prepare", and the crash report from
the journal. Did I do something incorrect here? Thanks!

-Yenya

# ceph-volume lvm prepare --bluestore --block.db /dev/nvme1n1p4 --data /dev/sda
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd 
--keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 
db38dbfb-e161-4680-93e3-ba4abaa36963
Running command: vgcreate --force --yes 
ceph-8805a195-57fd-4054-8282-a307d15955a5 /dev/sda
 stdout: Physical volume "/dev/sda" successfully created.
 stdout: Volume group "ceph-8805a195-57fd-4054-8282-a307d15955a5" successfully 
created
Running command: lvcreate --yes -l 2861055 -n 
osd-block-db38dbfb-e161-4680-93e3-ba4abaa36963 
ceph-8805a195-57fd-4054-8282-a307d15955a5
 stdout: Logical volume "osd-block-db38dbfb-e161-4680-93e3-ba4abaa36963" 
created.
Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-25
Running command: /usr/sbin/restorecon /var/lib/ceph/osd/ceph-25
Running command: /usr/bin/chown -h ceph:ceph 
/dev/ceph-8805a195-57fd-4054-8282-a307d15955a5/osd-block-db38dbfb-e161-4680-93e3-ba4abaa36963
Running command: /usr/bin/chown -R ceph:ceph /dev/dm-0
Running command: /usr/bin/ln -s 
/dev/ceph-8805a195-57fd-4054-8282-a307d15955a5/osd-block-db38dbfb-e161-4680-93e3-ba4abaa36963
 /var/lib/ceph/osd/ceph-25/block
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd 
--keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o 
/var/lib/ceph/osd/ceph-25/activate.monmap
 stderr: got monmap epoch 39
--> Creating keyring file for osd.25
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-25/keyring
Running command: /usr/bin/chown -R ceph:ceph /dev/nvme1n1p4
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-25/
Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore 
--mkfs -i 25 --monmap /var/lib/ceph/osd/ceph-25/activate.monmap --keyfile - 
--bluestore-block-db-path /dev/nvme1n1p4 --osd-data /var/lib/ceph/osd/ceph-25/ 
--osd-uuid db38dbfb-e161-4680-93e3-ba4abaa36963 --setuser ceph --setgroup ceph
 stderr: 2026-03-30T07:59:06.150+0200 7f2ae8102900 -1 
bluestore(/var/lib/ceph/osd/ceph-25//block) No valid bdev label found
 stderr: 2026-03-30T07:59:06.175+0200 7f2ae8102900 -1 
bluestore(/var/lib/ceph/osd/ceph-25/) _read_fsid unparsable uuid
--> ceph-volume lvm prepare successful for: /dev/sda
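
In the output above, ceph-osd --mkfs was given the raw partition
/dev/nvme1n1p4 as --bluestore-block-db-path. When comparing runs, a small
filter can pull that argument out of saved prepare output. This is only a
sketch: db_path_from_prepare_log is a hypothetical name, and it assumes the
flag and its value appear on the same line of the saved text.

```shell
# Hypothetical helper (not from the original report): reads ceph-volume
# prepare output on stdin and prints the value passed to
# --bluestore-block-db-path, one line per occurrence.
db_path_from_prepare_log() {
    sed -n 's/.*--bluestore-block-db-path \([^ ]*\).*/\1/p'
}

# Usage, with the output saved to a file of your choosing:
# db_path_from_prepare_log < prepare.log
```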

Mar 30 07:52:31 osdhost systemd[1]: Starting Ceph object storage daemon 
osd.25...
Mar 30 07:52:31 osdhost systemd[1]: Started Ceph object storage daemon osd.25.
Mar 30 07:52:31 osdhost ceph-osd[1140285]: 2026-03-30T07:52:31.738+0200 
7f2dc400c900 -1 Falling back to public interface
Mar 30 07:52:31 osdhost ceph-osd[1140285]: terminate called after throwing an 
instance of 'ceph::buffer::v15_2_0::malformed_input'
Mar 30 07:52:31 osdhost ceph-osd[1140285]:   what():  static void 
bluefs_fnode_t::_denc_finish(ceph::buffer::v15_2_0::ptr::const_iterator&, 
char**, uint32_t*): Malformed input [buffer:3]
Mar 30 07:52:31 osdhost ceph-osd[1140285]: *** Caught signal (Aborted) **
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  in thread 7f2dc400c900 
thread_name:ceph-osd
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  ceph version 20.2.0 
(69f84cc2651aa259a15bc192ddaabd3baba07489) tentacle (stable - RelWithDebInfo)
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  1: /lib64/libc.so.6(+0x3fc30) 
[0x7f2dc3e3fc30]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  2: /lib64/libc.so.6(+0x8d02c) 
[0x7f2dc3e8d02c]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  3: raise()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  4: abort()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  5: /lib64/libstdc++.so.6(+0xa1b21) 
[0x7f2dc4ca1b21]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  6: /lib64/libstdc++.so.6(+0xad53c) 
[0x7f2dc4cad53c]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  7: /lib64/libstdc++.so.6(+0xad5a7) 
[0x7f2dc4cad5a7]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  8: /lib64/libstdc++.so.6(+0xad809) 
[0x7f2dc4cad809]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  9: /usr/bin/ceph-osd(+0x45ceef) 
[0x558458ee6eef]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  10: 
(bluefs_super_t::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x1a0)
 [0x55845962f140]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  11: (BlueFS::_open_super()+0xe6) 
[0x5584596123a6]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  12: (BlueFS::mount()+0xc2) 
[0x5584595fbf42]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  13: (BlueStore::_open_bluefs(bool, 
bool)+0x294) [0x55845953ed34]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  14: 
(BlueStore::_prepare_db_environment(bool, bool, 
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> 
>*, std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >*)+0x5d0) [0x55845953f8b0]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  15: (BlueStore::_open_db(bool, 
bool, bool)+0x135) [0x5584595407b5]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  16: 
(BlueStore::_open_db_and_around(bool, bool)+0x31a) [0x55845954137a]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  17: (BlueStore::_mount()+0x281) 
[0x558459546a31]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  18: (OSD::init()+0x509) 
[0x5584590bea09]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  19: main()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  20: /lib64/libc.so.6(+0x2a610) 
[0x7f2dc3e2a610]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  21: __libc_start_main()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  22: _start()
Mar 30 07:52:31 osdhost ceph-osd[1140285]: 2026-03-30T07:52:31.931+0200 
7f2dc400c900 -1 *** Caught signal (Aborted) **
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  in thread 7f2dc400c900 
thread_name:ceph-osd
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  ceph version 20.2.0 
(69f84cc2651aa259a15bc192ddaabd3baba07489) tentacle (stable - RelWithDebInfo)
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  1: /lib64/libc.so.6(+0x3fc30) 
[0x7f2dc3e3fc30]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  2: /lib64/libc.so.6(+0x8d02c) 
[0x7f2dc3e8d02c]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  3: raise()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  4: abort()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  5: /lib64/libstdc++.so.6(+0xa1b21) 
[0x7f2dc4ca1b21]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  6: /lib64/libstdc++.so.6(+0xad53c) 
[0x7f2dc4cad53c]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  7: /lib64/libstdc++.so.6(+0xad5a7) 
[0x7f2dc4cad5a7]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  8: /lib64/libstdc++.so.6(+0xad809) 
[0x7f2dc4cad809]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  9: /usr/bin/ceph-osd(+0x45ceef) 
[0x558458ee6eef]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  10: 
(bluefs_super_t::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x1a0)
 [0x55845962f140]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  11: (BlueFS::_open_super()+0xe6) 
[0x5584596123a6]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  12: (BlueFS::mount()+0xc2) 
[0x5584595fbf42]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  13: (BlueStore::_open_bluefs(bool, 
bool)+0x294) [0x55845953ed34]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  14: 
(BlueStore::_prepare_db_environment(bool, bool, 
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> 
>*, std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >*)+0x5d0) [0x55845953f8b0]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  15: (BlueStore::_open_db(bool, 
bool, bool)+0x135) [0x5584595407b5]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  16: 
(BlueStore::_open_db_and_around(bool, bool)+0x31a) [0x55845954137a]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  17: (BlueStore::_mount()+0x281) 
[0x558459546a31]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  18: (OSD::init()+0x509) 
[0x5584590bea09]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  19: main()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  20: /lib64/libc.so.6(+0x2a610) 
[0x7f2dc3e2a610]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  21: __libc_start_main()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  22: _start()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  NOTE: a copy of the executable, or 
`objdump -rdS <executable>` is needed to interpret this.
Mar 30 07:52:31 osdhost ceph-osd[1140285]:    -51> 2026-03-30T07:52:31.738+0200 
7f2dc400c900 -1 Falling back to public interface
Mar 30 07:52:31 osdhost ceph-osd[1140285]:      0> 2026-03-30T07:52:31.931+0200 
7f2dc400c900 -1 *** Caught signal (Aborted) **
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  in thread 7f2dc400c900 
thread_name:ceph-osd
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  ceph version 20.2.0 
(69f84cc2651aa259a15bc192ddaabd3baba07489) tentacle (stable - RelWithDebInfo)
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  1: /lib64/libc.so.6(+0x3fc30) 
[0x7f2dc3e3fc30]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  2: /lib64/libc.so.6(+0x8d02c) 
[0x7f2dc3e8d02c]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  3: raise()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  4: abort()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  5: /lib64/libstdc++.so.6(+0xa1b21) 
[0x7f2dc4ca1b21]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  6: /lib64/libstdc++.so.6(+0xad53c) 
[0x7f2dc4cad53c]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  7: /lib64/libstdc++.so.6(+0xad5a7) 
[0x7f2dc4cad5a7]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  8: /lib64/libstdc++.so.6(+0xad809) 
[0x7f2dc4cad809]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  9: /usr/bin/ceph-osd(+0x45ceef) 
[0x558458ee6eef]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  10: 
(bluefs_super_t::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x1a0)
 [0x55845962f140]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  11: (BlueFS::_open_super()+0xe6) 
[0x5584596123a6]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  12: (BlueFS::mount()+0xc2) 
[0x5584595fbf42]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  13: (BlueStore::_open_bluefs(bool, 
bool)+0x294) [0x55845953ed34]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  14: 
(BlueStore::_prepare_db_environment(bool, bool, 
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> 
>*, std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >*)+0x5d0) [0x55845953f8b0]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  15: (BlueStore::_open_db(bool, 
bool, bool)+0x135) [0x5584595407b5]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  16: 
(BlueStore::_open_db_and_around(bool, bool)+0x31a) [0x55845954137a]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  17: (BlueStore::_mount()+0x281) 
[0x558459546a31]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  18: (OSD::init()+0x509) 
[0x5584590bea09]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  19: main()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  20: /lib64/libc.so.6(+0x2a610) 
[0x7f2dc3e2a610]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  21: __libc_start_main()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  22: _start()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  NOTE: a copy of the executable, or 
`objdump -rdS <executable>` is needed to interpret this.
Mar 30 07:52:31 osdhost ceph-osd[1140285]:    -51> 2026-03-30T07:52:31.738+0200 
7f2dc400c900 -1 Falling back to public interface
Mar 30 07:52:31 osdhost ceph-osd[1140285]:      0> 2026-03-30T07:52:31.931+0200 
7f2dc400c900 -1 *** Caught signal (Aborted) **
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  in thread 7f2dc400c900 
thread_name:ceph-osd
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  ceph version 20.2.0 
(69f84cc2651aa259a15bc192ddaabd3baba07489) tentacle (stable - RelWithDebInfo)
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  1: /lib64/libc.so.6(+0x3fc30) 
[0x7f2dc3e3fc30]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  2: /lib64/libc.so.6(+0x8d02c) 
[0x7f2dc3e8d02c]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  3: raise()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  4: abort()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  5: /lib64/libstdc++.so.6(+0xa1b21) 
[0x7f2dc4ca1b21]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  6: /lib64/libstdc++.so.6(+0xad53c) 
[0x7f2dc4cad53c]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  7: /lib64/libstdc++.so.6(+0xad5a7) 
[0x7f2dc4cad5a7]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  8: /lib64/libstdc++.so.6(+0xad809) 
[0x7f2dc4cad809]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  9: /usr/bin/ceph-osd(+0x45ceef) 
[0x558458ee6eef]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  10: 
(bluefs_super_t::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x1a0)
 [0x55845962f140]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  11: (BlueFS::_open_super()+0xe6) 
[0x5584596123a6]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  12: (BlueFS::mount()+0xc2) 
[0x5584595fbf42]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  13: (BlueStore::_open_bluefs(bool, 
bool)+0x294) [0x55845953ed34]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  14: 
(BlueStore::_prepare_db_environment(bool, bool, 
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> 
>*, std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >*)+0x5d0) [0x55845953f8b0]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  15: (BlueStore::_open_db(bool, 
bool, bool)+0x135) [0x5584595407b5]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  16: 
(BlueStore::_open_db_and_around(bool, bool)+0x31a) [0x55845954137a]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  17: (BlueStore::_mount()+0x281) 
[0x558459546a31]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  18: (OSD::init()+0x509) 
[0x5584590bea09]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  19: main()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  20: /lib64/libc.so.6(+0x2a610) 
[0x7f2dc3e2a610]
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  21: __libc_start_main()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  22: _start()
Mar 30 07:52:31 osdhost ceph-osd[1140285]:  NOTE: a copy of the executable, or 
`objdump -rdS <executable>` is needed to interpret this.
Mar 30 07:52:32 osdhost systemd[1]: ceph-osd@25.service: Main process exited, 
code=dumped, status=6/ABRT
Mar 30 07:52:32 osdhost systemd[1]: ceph-osd@25.service: Failed with result 
'core-dump'.
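
To check whether another OSD is hitting the same crash, the journal can be
searched for the two markers that appear in the trace above. This is a
minimal sketch, not part of the original report: the function name is
hypothetical, and the unit name in the usage line is assumed from the
systemctl commands earlier in this message.

```shell
# Hypothetical helper (not from the original report): reads journal text on
# stdin and succeeds only if both markers from the trace above are present.
has_bluefs_super_decode_crash() {
    log=$(cat)
    printf '%s\n' "$log" | grep -Fq 'malformed_input' &&
        printf '%s\n' "$log" | grep -Fq 'bluefs_super_t::decode'
}

# Usage (OSD id 25 as in this report):
# journalctl -u ceph-osd@25 --no-pager | has_bluefs_super_decode_crash \
#     && echo "same crash signature"
```

A match only means the log text looks similar; it does not confirm the same
root cause.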

--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| https://www.fi.muni.cz/~kas                        GPG: 4096R/A45477D5 |
    I don't like Python; its lack of inline, anonymous, multi-statement
    functions makes me sad.                                --Eric Wastl
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
