You're running 0.87-6.  There were various fixes for this problem in
Firefly.  Were any of these snapshots created on an early version of Firefly?

So far, every fix for this issue has required developer involvement.  I'd
see if you can talk to some devs on IRC, or post to the ceph-devel mailing
list.


My own experience is that I had to delete the affected PGs and force-create
them.  Hopefully there's a better answer now.
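For reference, a rough sketch of the last-resort procedure I mean (the PG id
and OSD id below are placeholders — substitute the stuck PG reported by
"ceph health detail" and your crashing OSD; note that force-creating a PG
discards whatever data it held):

```shell
# Placeholder ids: osd.12 is the crashing OSD, 2.5f is a stuck PG.
# Find the stuck PGs first:
ceph health detail
ceph pg dump_stuck

# Take the bad OSD out of the cluster so it stops being used:
ceph osd out 12

# Last resort: recreate the affected PG empty (DATA LOSS for that PG):
ceph pg force_create_pg 2.5f
```

Only do this if the devs confirm the PG data is unrecoverable.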



On Fri, Nov 7, 2014 at 8:10 PM, Chu Duc Minh <[email protected]> wrote:

> One of my OSDs has problems and can NOT be started. I tried to start it
> many times, but it always crashes a few minutes after starting.
> I can think of two possible reasons for the crash:
> 1. A read/write request hits this OSD and, due to a corrupted
> volume/snapshot/parent-image/..., it crashes.
> 2. The recovery process can NOT work properly due to a corrupted
> volume/snapshot/parent-image/...
>
> After many retries and checking the logs, I guess reason (2) is the main
> cause: if (1) were the main cause, other OSDs (containing the buggy
> volume/snapshot) would crash too.
>
> State of my ceph cluster (just few seconds before crash time):
>
>   111/57706299 objects degraded (0.001%)
>         14918 active+clean
>                    1 active+clean+scrubbing+deep
>                   52 active+recovery_wait+degraded
>                    2 active+recovering+degraded
>
>
> PS: i attach crash-dump log of that OSD in this email for your information.
>
> Thank you!
>
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
