Hi all,
I recently encountered a situation where some partially removed OSDs caused my cluster to enter a "stuck inactive" state. The eventual solution was to tell Ceph the OSDs were "lost". Because all the PGs were replicated elsewhere on the cluster, no data was lost.
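For reference, the commands involved were roughly the following (osd.12 is just a placeholder for the affected OSD id):

    ceph health detail                         # reported the PGs stuck inactive
    ceph pg dump_stuck inactive                # list exactly which PGs are stuck
    ceph osd lost 12 --yes-i-really-mean-it    # declare osd.12 permanently gone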

Would it make sense, or be possible, for Ceph to automatically detect this situation ("stuck inactive" PGs whose data is replicated elsewhere) and take action to un-stick the cluster? E.g. automatically mark the OSD as lost, or make marking the OSD down and out have the same effect?
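To make that concrete, here is a very rough sketch of the kind of check I'm imagining (hypothetical and untested; the grep depends on the exact wording "ceph health detail" uses on a given release):

    #!/bin/sh
    # Hypothetical sketch only -- not something to run blindly in production.
    # Idea: if PGs are stuck inactive and the blocking OSD is already down and
    # out, mark that OSD lost so recovery can proceed from surviving replicas.
    OSD_ID="$1"
    if ceph health detail | grep -q 'stuck inactive'; then
        # A real implementation would first verify that every affected PG
        # still has a complete copy on some other OSD before declaring
        # anything lost.
        ceph osd lost "$OSD_ID" --yes-i-really-mean-it
    fi

Obviously the hard part is the safety check in the middle, which is exactly what I'd hope Ceph itself could do reliably.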

Ideally anything that can be safely automated should be. :)

Thanks!
C.
