Hi,
Our Ceph cluster is in an error state with the message:
# ceph status
  cluster:
    id:     58140ed2-4ed4-11ed-b4db-5c6f69756a60
    health: HEALTH_ERR
            Module 'cephadm' has failed: invalid literal for int() with base 10: '352.broken'
This happened after trying to re-add an OSD which had failed. Adopting it back
into Ceph failed because a directory, /var/lib/ceph/{cephid}/osd.352, was
causing problems. To re-add the OSD I renamed that directory to
osd.352.broken (rather than deleting it), re-ran the command, and everything
worked perfectly. Then, about 5 minutes later, the Ceph orchestrator went into
HEALTH_ERR.
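For anyone hitting the same traceback: the error message suggests cephadm
extracts the OSD id from the directory name suffix and feeds it to int(),
which chokes on the ".broken" suffix. A minimal sketch of what that parse
presumably looks like (the helper name here is hypothetical; only the int()
failure itself comes from the error message above):

```python
def parse_osd_id(dirname: str) -> int:
    # Take everything after the first "." and parse it as a base-10 integer.
    # "osd.352"        -> 352
    # "osd.352.broken" -> raises ValueError, since "352.broken" is not an int
    return int(dirname.split(".", 1)[1])

print(parse_osd_id("osd.352"))  # 352
try:
    parse_osd_id("osd.352.broken")
except ValueError as e:
    print(e)  # invalid literal for int() with base 10: '352.broken'
```

So any stray directory under /var/lib/ceph/{cephid}/ whose name doesn't parse
this way would make the module's inventory scan fail.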
I've since removed that directory, but cephadm isn't cleaning up after
itself. Does anyone know of a way to clear the cached inventory entry for
the directory it tried (and failed) to inventory?
Thanks,
Duncan
--
Dr Duncan Tooke | Research Cluster Administrator
Centre for Computational Biology, Weatherall Institute of Molecular Medicine,
University of Oxford, OX3 9DS
www.imm.ox.ac.uk
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]