This time, at the end of the recovery procedure you described, most pgs were active+clean but 20 pgs were incomplete. After that, when trying to use the cluster, I got "request blocked more than" warnings and no VM can start. I know that something happened after the broken disk, probably a server reboot; I am investigating. But even if I find the origin of the problem, it will not help in finding a solution now. So I am using my time to repair the pool, only to save the production data, and I will throw away the rest. Now, after marking all pgs as complete with ceph-objectstore-tool, I see that:
1) ceph has put out three HDDs (I suppose due to scrub, but that is only my idea; I will check the logs) BAD
2) it is recovering the degraded and misplaced objects GOOD
3) VMs are not usable yet BAD
4) I see some pgs in state down+peering (I hope that is not BAD)

Regarding 1): how can I put those three HDDs back in the cluster? Should I remove them from CRUSH and start again? Can I tell ceph that they are not bad?

Mario

On Wed, 29 Jun 2016 at 15:34, Lionel Bouton <[email protected]> wrote:

> Hi,
>
> On 29/06/2016 12:00, Mario Giammarco wrote:
> > Now the problem is that ceph has put out two disks because scrub has
> > failed (I think it is not a disk fault but due to mark-complete)
>
> There is something odd going on. I've only seen deep-scrub failing (i.e.
> detecting one inconsistency and marking the pg so), so I'm not sure what
> happens in the case of a "simple" scrub failure, but what should not
> happen is the whole OSD going down on a scrub or deep-scrub failure,
> which you seem to imply did happen.
> Do you have logs for these two failures giving a hint at what happened
> (probably /var/log/ceph/ceph-osd.<n>.log)? Any kernel log pointing to
> hardware failure(s) around the time these events happened?
>
> Another point: you said that you had one disk "broken". Usually ceph
> handles this case in the following manner:
> - the OSD detects the problem and commits suicide (unless it's
> configured to ignore IO errors, which is not the default),
> - your cluster is then in degraded state with one OSD down/in,
> - after a timeout (several minutes), Ceph decides that the OSD won't
> come up again soon and marks the OSD "out" (so one OSD down/out),
> - as the OSD is out, CRUSH adapts pg positions based on the remaining
> available OSDs and brings all degraded pgs back to clean state by
> creating missing replicas while moving pgs around. You see a lot of IO,
> many pgs in wait_backfill/backfilling states at this point,
> - when all is done the cluster is back to HEALTH_OK.
>
> When your disk was broken and you waited 24 hours, how far along this
> process was your cluster?
>
> Best regards,
>
> Lionel
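[Editor's note: regarding the question above about putting the three out-marked HDDs back in the cluster — if the disks themselves test healthy, the usual approach is to restart the daemons and mark the OSDs "in" again, without removing them from CRUSH. A hedged sketch; the OSD id 3 is a placeholder, and the systemd unit name may differ on pre-systemd installs:]

```shell
# First check WHY the OSD went down before re-adding it
# (osd id 3 is a placeholder -- repeat for each of the three OSDs)
grep -iE 'error|abort|suicide|assert' /var/log/ceph/ceph-osd.3.log | tail

# If the disk looks healthy, restart the daemon and mark the OSD back in;
# there is no need to remove it from the CRUSH map first
systemctl start ceph-osd@3
ceph osd in 3

# Watch recovery/backfill progress
ceph -s
ceph health detail
```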
_______________________________________________ ceph-users mailing list [email protected] http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
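[Editor's note: for reference, the ceph-objectstore-tool step mentioned at the top of the thread typically looks like the sketch below. The pg id 2.1f and the OSD paths are placeholders; the OSD holding the incomplete pg copy must be stopped before the tool is run against its data store:]

```shell
# Stop the OSD that holds the incomplete pg copy (osd.3 is a placeholder)
systemctl stop ceph-osd@3

# Mark the pg complete on that OSD's store (pg id 2.1f is a placeholder)
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-3 \
    --journal-path /var/lib/ceph/osd/ceph-3/journal \
    --pgid 2.1f --op mark-complete

# Restart the OSD afterwards so peering can proceed
systemctl start ceph-osd@3
```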
