This time, at the end of the recovery procedure you described, the status was
roughly: most pgs active+clean, 20 pgs incomplete.
After that, when trying to use the cluster, I got "request blocked more than"
warnings and no VM could start.
I know that something happened after the broken disk, probably a server
reboot. I am investigating.
But even if I find the origin of the problem, it will not help in finding a
solution now.
So I am spending my time repairing the pool only to save the production
data; I will throw away the rest.
Now, after marking all the pgs as complete with ceph-objectstore-tool (see
the command sketch after the list below), I see that:

1) ceph has put out three hdds (I suppose due to scrub, but that is only a
guess; I will check the logs) BAD
2) it is recovering the degraded and misplaced objects GOOD
3) VMs are not usable yet BAD
4) I see some pgs in state down+peering (I hope that is not BAD)
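
(For reference, the mark-complete step was done roughly like this, with the
OSD stopped first; the OSD id and pgid below are placeholders, adjust to your
paths:

    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> \
        --journal-path /var/lib/ceph/osd/ceph-<id>/journal \
        --pgid <pgid> --op mark-complete

then the OSD was started again.)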

Regarding 1): how can I put those three hdds back in the cluster? Should I
remove them from crush and start again?
Can I tell ceph that they are not bad?
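
To be concrete, what I was planning to try for those three hdds is just
restarting the daemons and marking them in again, something like this (only a
sketch, assuming the disks themselves are healthy and the OSDs are
systemd-managed, otherwise the equivalent init script; the ids are
placeholders):

    # on the host that owns the OSD
    systemctl start ceph-osd@<id>
    # from an admin node
    ceph osd in <id>
    ceph osd tree    # check that the OSD shows up/in again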
Mario

On Wed 29 Jun 2016 at 15:34, Lionel Bouton <[email protected]> wrote:

> Hi,
>
> > On 29/06/2016 12:00, Mario Giammarco wrote:
> > Now the problem is that ceph has put out two disks because scrub  has
> > failed (I think it is not a disk fault but due to mark-complete)
>
> There is something odd going on. I've only seen deep-scrub failing (ie
> detecting one inconsistency and marking the pg so), so I'm not sure what
> happens in the case of a "simple" scrub failure, but what should not
> happen is the whole OSD going down on a scrub or deep-scrub failure, which
> you seem to imply did happen.
> Do you have logs for these two failures giving a hint at what happened
> (probably /var/log/ceph/ceph-osd.<n>.log)? Any kernel log pointing to
> hardware failure(s) around the time these events happened?
>
> Another point: you said that you had one disk "broken". Usually ceph
> handles this case in the following manner:
> - the OSD detects the problem and commits suicide (unless it's configured
> to ignore IO errors, which is not the default),
> - your cluster is then in a degraded state with one OSD down/in,
> - after a timeout (several minutes), Ceph decides that the OSD won't
> come up again soon and marks it "out" (so one OSD down/out),
> - as the OSD is out, crush adapts pg placement based on the remaining
> available OSDs and brings all degraded pgs back to a clean state by
> creating missing replicas while moving pgs around. You see a lot of IO and
> many pgs in wait_backfill/backfilling states at this point,
> - when all is done the cluster is back to HEALTH_OK
>
> When your disk was broken and you waited 24 hours, how far along in this
> process was your cluster?
>
> Best regards,
>
> Lionel
>
