Hello, I have a cluster running Jewel 10.2.0 with 25 OSDs and 4 MONs. Today the cluster suddenly became unhealthy, with many stuck PGs due to unfound objects. There were no disk failures or node crashes; it just went bad.
I managed to bring the cluster back to a healthy state by marking the unfound objects as lost with "ceph pg <id> mark_unfound_lost delete". I still have no idea why the cluster went bad, but I noticed that restarting the OSD daemons to unblock stuck clients makes the cluster unhealthy again, with PGs getting stuck once more due to unfound objects. Has anyone else seen this issue?
--- Diego Castro / The CloudFather GetupCloud.com - Eliminamos a Gravidade
_______________________________________________ ceph-users mailing list [email protected] http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
