On 2025-12-03 15:57, Marek Szuba via ceph-users wrote:

> Well, things have got interesting now. The above appears to have worked
> for 4 out of 5 mgr daemons - but now most "ceph orch" commands give me
> the error "Error ENOENT: Module not found". I can also see many
> cephadm-related "module not found" errors in system logs.

Progress! After carefully searching through all the logs, I discovered
that the module was unable to load because of a spurious field in the
certificate JSON. A quick look at the relevant bit of Python code then
allowed me to identify the culprit: the field "editable", which was
present in all six cert/key config keys.

After carefully editing the offending JSON and setting it again, I was
able to run 'ceph orch upgrade start --ceph-version 20.2.0' (which
redid the mgr and mon upgrade - that was intentional, in case the manual
redeployment with the new image had missed something). The upgrade has
just completed successfully.
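
For illustration only, the JSON edit described above could be sketched in
Python roughly as follows. This is not the exact procedure used here: the
layout of the stored JSON is an assumption, the sample value is a made-up
placeholder, and strip_field / clean_json_text are hypothetical helper
names. The only thing taken from this thread is the name of the offending
field, "editable".

```python
import json


def strip_field(node, field="editable"):
    """Return a copy of `node` with every occurrence of `field` removed.

    Recurses through dicts and lists, because the nesting of the real
    cert/key JSON is not known here (assumption).
    """
    if isinstance(node, dict):
        return {k: strip_field(v, field) for k, v in node.items() if k != field}
    if isinstance(node, list):
        return [strip_field(v, field) for v in node]
    return node


def clean_json_text(raw, field="editable"):
    """Parse a JSON string, drop `field` everywhere, and serialise it again."""
    return json.dumps(strip_field(json.loads(raw), field))


# Hypothetical placeholder value, NOT real cephadm data.
sample = json.dumps(
    {"example_entry": {"cert": "PLACEHOLDER", "editable": False},
     "others": [{"key": "PLACEHOLDER", "editable": True}]}
)
cleaned = clean_json_text(sample)
print(cleaned)
```

The raw value of a key can be read with 'ceph config-key get <key>' and
written back with 'ceph config-key set <key> <value>'; the key names are
deliberately left as placeholders because they are not listed in this
message. Keeping a copy of the original value before writing anything
back seems prudent.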

@Michael: I did see that bit of the documentation, I just had no clue 
which config keys exactly to examine and what to look for in them. 
Fortunately it turned out that some of the operations resulting in 
"Module not found" also crashed the active mgr (interestingly enough, it 
wasn't all such operations - Ceph only noted two mgr crashes), and I was 
able to find the traceback in crash metadata.
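
As a rough sketch of the last step: 'ceph crash ls' lists recorded crashes
and 'ceph crash info <id>' prints the metadata of one of them as JSON.
Assuming the traceback is stored under a "backtrace" key (an assumption to
verify against real output), it could be pulled out like this. The sample
record is a made-up placeholder and extract_backtrace is a hypothetical
helper name.

```python
import json


def extract_backtrace(crash_json, key="backtrace"):
    """Return the traceback stored in crash metadata as one string.

    `key` defaults to "backtrace", which is an assumption about the
    metadata layout; an empty string is returned if the key is absent.
    """
    meta = json.loads(crash_json)
    backtrace = meta.get(key, [])
    if isinstance(backtrace, str):
        return backtrace
    return "\n".join(str(line) for line in backtrace)


# Hypothetical placeholder record, NOT real crash data.
sample_crash = json.dumps(
    {"entity_name": "mgr.example",
     "backtrace": ["Traceback (most recent call last):",
                   "  (frames elided)",
                   "ExampleError: placeholder"]}
)
print(extract_backtrace(sample_crash))
```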

-- 
MS
