Hi everyone,

We have a Ceph cluster running version 18.2.7, deployed with four OSDs per NVMe device.
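
For context, this kind of layout is typically created with ceph-volume or a cephadm OSD service spec, roughly like the sketch below (the device path is just an example and our exact spec may differ):

  # split one NVMe device into four OSDs (example device path)
  ceph-volume lvm batch --osds-per-device 4 /dev/nvme0n1

or the equivalent "osds_per_device: 4" entry in a cephadm drive group spec.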

Two weeks ago we lost one NVMe drive (so 4 OSDs). Just after those OSDs crashed, another "healthy" OSD crashed as well, a few minutes after the initial drive failure.

This week we lost another NVMe drive (4 more OSDs), and again the same OSD crashed too, a few minutes later.

Below is the crash information; it is identical for both crashes:


 ceph crash info 2025-08-21T00:31:13.929426Z_576e6e42-7c6b-49b8-90f1-9d51730f8ac2
{
    "assert_condition": "r == 0",
    "assert_file": "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/18.2.7/rpm/el9/BUILD/ceph-18.2.7/src/os/bluestore/BlueStore.cc",     "assert_func": "void BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)",
    "assert_line": 12944,
    "assert_msg": "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/18.2.7/rpm/el9/BUILD/ceph-18.2.7/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::_txc_apply_kv(BlueStore::TransContext*, bool)' thread 7f5fba6a3640 time 2025-08-21T00:31:13.926894+0000\n/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/18.2.7/rpm/el9/BUILD/ceph-18.2.7/src/os/bluestore/BlueStore.cc: 12944: FAILED ceph_assert(r == 0)\n",
    "assert_thread_name": "bstore_kv_sync",
    "backtrace": [
        "/lib64/libc.so.6(+0x3ebf0) [0x7f5fce5a5bf0]",
        "/lib64/libc.so.6(+0x8bf5c) [0x7f5fce5f2f5c]",
        "raise()",
        "abort()",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x169) [0x560bc2d0e44d]",
        "/usr/bin/ceph-osd(+0x3cc5ae) [0x560bc2d0e5ae]",
        "/usr/bin/ceph-osd(+0x3b8864) [0x560bc2cfa864]",
        "(BlueStore::_kv_sync_thread()+0x1073) [0x560bc32bb623]",
        "/usr/bin/ceph-osd(+0x8fd3b1) [0x560bc323f3b1]",
        "/lib64/libc.so.6(+0x8a21a) [0x7f5fce5f121a]",
        "/lib64/libc.so.6(+0x10f290) [0x7f5fce676290]"
    ],
    "ceph_version": "18.2.7",
    "crash_id": "2025-08-21T00:31:13.929426Z_576e6e42-7c6b-49b8-90f1-9d51730f8ac2",
    "entity_name": "osd.108",
    "os_id": "centos",
    "os_name": "CentOS Stream",
    "os_version": "9",
    "os_version_id": "9",
    "process_name": "ceph-osd",
    "stack_sig": "0080731b49e5583e6d168903c1ea7df8bc2caded6b5e24ee4381077d54b045e2",
    "timestamp": "2025-08-21T00:31:13.929426Z",
    "utsname_hostname": "pub1-cephosd-9",
    "utsname_machine": "x86_64",
    "utsname_release": "6.1.0-31-amd64",
    "utsname_sysname": "Linux",
    "utsname_version": "#1 SMP PREEMPT_DYNAMIC Debian 6.1.128-1 (2025-02-07)"


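If more context is useful, the OSD log around the assert can be pulled on the affected host with something like the sketch below (the unit name assumes a classic package/systemd deployment; under cephadm the unit would be ceph-<fsid>@osd.108, and the time window is only an example):

  # grab the minutes around the crash and filter for kv/rocksdb errors
  journalctl -u ceph-osd@108 --since "2025-08-21 00:25" --until "2025-08-21 00:35" \
    | grep -iE 'rocksdb|submit_transaction|error'
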
Regards