Greetings — 

I have been seeing repeated crashes of my mgr. It seems to run for about 45 
to 50 seconds and then crashes. This is a cephadm setup. I did (try to) enable 
a couple of modules lately (iostat, stats, diskprediction_local), but have 
toggled them back off; unfortunately that hasn’t fixed the issue. 

I was seeing “no mgr” with ceph -s. To get around that, I’ve lowered the 
“StartLimitInterval” setting in the ceph-[CLUSTER-UID]@.service file to 1m, 
and have 4 mgrs set up so that I can get a couple of commands run before one 
mgr crashes and another starts. The 30m default there seems high, imo: I was 
having intervals with no mgr at all, which makes it tough to do much with the 
cluster (even “ceph -s” was hanging). Everything else in the cluster seems 
“normal” and is still serving data. 


One other note, which I think is generally unrelated: I did upgrade one of my 
cluster nodes from Ubuntu “Plucky Puffin” (25.04) to Ubuntu “Questing Quokka” 
(25.10). After the upgrade, cephadm-managed containers didn’t want to start. I 
tracked that down to the ceph user in /etc/passwd having UID 64045, while the 
container seems to want UID 167. Most things under 
/var/lib/ceph/[CLUSTER UID]/ … appear to be owned by user/group 167:167; I 
assume this is the default inside the container. The workaround was to manually 
change the UID/GID for ceph in /etc/passwd and /etc/group. I imagine this is 
some collision between cephadm-managed deployments and how Ubuntu / apt 
installs cephadm. 
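
For anyone wanting to check for the same mismatch, here is a minimal sketch 
that tallies which UID:GID pairs own the files under a directory and looks up 
the host's ceph user for comparison. The function names (owner_counts, 
host_ceph_ids) are made up for illustration, and the script only reports; it 
does not change ownership or edit /etc/passwd.

```python
import os
import pwd
from collections import Counter


def owner_counts(root):
    """Count (uid, gid) owners of root and everything below it.

    Uses lstat so symlinks are counted as themselves, not their targets.
    """
    counts = Counter()
    st = os.lstat(root)
    counts[(st.st_uid, st.st_gid)] += 1
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            st = os.lstat(os.path.join(dirpath, name))
            counts[(st.st_uid, st.st_gid)] += 1
    return counts


def host_ceph_ids(user="ceph"):
    """Return (uid, gid) of the host's user entry, or None if it is absent."""
    try:
        entry = pwd.getpwnam(user)
    except KeyError:
        return None
    return (entry.pw_uid, entry.pw_gid)
```

Running owner_counts("/var/lib/ceph") as root on the affected node and 
comparing the dominant pair against host_ceph_ids() should show whether the 
host's ceph user (64045 in my case) disagrees with what owns the data (167:167 
in my case).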



The aforementioned mgr crashes look like: 


{
    "assert_condition": "cursor != root",
    "assert_file": "/ceph/rpmbuild/BUILD/ceph-20.2.0/src/mgr/PyFormatter.h",
    "assert_func": "virtual void PyFormatter::close_section()",
    "assert_line": 84,
    "assert_msg": "/ceph/rpmbuild/BUILD/ceph-20.2.0/src/mgr/PyFormatter.h: In function 'virtual void PyFormatter::close_section()' thread ffff34e55700 time 2026-02-08T13:10:07.526894+0000\n/ceph/rpmbuild/BUILD/ceph-20.2.0/src/mgr/PyFormatter.h: 84: FAILED ceph_assert(cursor != root)\n",
    "assert_thread_name": "telemetry",
    "backtrace": [
        "__kernel_rt_sigreturn()",
        "/lib64/libc.so.6(+0x82a78) [0xffff83603a78]",
        "raise()",
        "abort()",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x190) [0xffff84039874]",
        "/usr/bin/ceph-mgr(+0xcf540) [0xaaaacdf2f540]",
        "(ActivePyModules::get_perf_schema_python(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xf6c) [0xaaaacdf46ec0]",
        "/usr/bin/ceph-mgr(+0x105528) [0xaaaacdf65528]",
        "/lib64/libpython3.9.so.1.0(+0xf7bfc) [0xffff84ae9bfc]",
        "_PyEval_EvalFrameDefault()",
        "/lib64/libpython3.9.so.1.0(+0xda3f0) [0xffff84acc3f0]",
        "_PyEval_EvalFrameDefault()",
        "/lib64/libpython3.9.so.1.0(+0xc47e8) [0xffff84ab67e8]",
        "_PyFunction_Vectorcall()",
        "_PyEval_EvalFrameDefault()",
        "/lib64/libpython3.9.so.1.0(+0xc47e8) [0xffff84ab67e8]",
        "_PyFunction_Vectorcall()",
        "_PyEval_EvalFrameDefault()",
        "/lib64/libpython3.9.so.1.0(+0xc47e8) [0xffff84ab67e8]",
        "_PyFunction_Vectorcall()",
        "_PyEval_EvalFrameDefault()",
        "/lib64/libpython3.9.so.1.0(+0xc47e8) [0xffff84ab67e8]",
        "_PyFunction_Vectorcall()",
        "_PyEval_EvalFrameDefault()",
        "/lib64/libpython3.9.so.1.0(+0xda3f0) [0xffff84acc3f0]",
        "/lib64/libpython3.9.so.1.0(+0xea93c) [0xffff84adc93c]",
        "/lib64/libpython3.9.so.1.0(+0xcf304) [0xffff84ac1304]",
        "/lib64/libpython3.9.so.1.0(+0x197d78) [0xffff84b89d78]",
        "_PyObject_CallMethod_SizeT()",
        "(PyModuleRunner::serve()+0x6c) [0xaaaacdfdf6cc]",
        "(PyModuleRunner::PyModuleRunnerThread::entry()+0x148) [0xaaaacdfdff08]"
    ],
    "ceph_version": "20.2.0",
    "crash_id": "2026-02-08T13:10:07.528739Z_f18e6b74-438b-47db-9438-5a3861fdef2d",
    "entity_name": "mgr.hc-945901a5cad1b6e3.mtijbv",
    "os_id": "centos",
    "os_name": "CentOS Stream",
    "os_version": "9",
    "os_version_id": "9",
    "process_name": "ceph-mgr",
    "stack_sig": "319d76a0d71a4644f9d65f592f6e621cca918d9205fc759ab7acf6944bc77fdd",
    "timestamp": "2026-02-08T13:10:07.528739Z",
    "utsname_hostname": "hc-945901a5cad1b6e3",
    "utsname_machine": "aarch64",
    "utsname_release": "6.14.0-1010-raspi",
    "utsname_sysname": "Linux",
    "utsname_version": "#10-Ubuntu SMP PREEMPT_DYNAMIC Tue Jul 15 19:09:05 UTC 2025"
}
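
Since the report is long, here is a rough sketch for pulling out the fields 
that seem most relevant: the asserting thread, the failed condition, and the 
backtrace frames that carry a function name. It assumes the JSON above has 
been saved as text; summarize_crash is a hypothetical helper name, not part of 
Ceph.

```python
import json


def summarize_crash(report_text):
    """Pull the headline fields out of a crash report given as JSON text."""
    report = json.loads(report_text)
    # Frames that start with "/" are raw "library(+offset) [address]" entries
    # with no function name, so only the remaining frames are kept.
    named = [f for f in report.get("backtrace", []) if not f.startswith("/")]
    return {
        "version": report.get("ceph_version"),
        "thread": report.get("assert_thread_name"),
        "assert": report.get("assert_condition"),
        "where": f'{report.get("assert_func")} (line {report.get("assert_line")})',
        "named_frames": named,
    }
```

On the report above this gives thread "telemetry", assert "cursor != root", 
and a named-frame list that includes ActivePyModules::get_perf_schema_python, 
which may help when comparing against other reports.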
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
