Just to be clear, this is from a cluster that was healthy, had a disk
replaced, and hasn't returned to healthy?  It's not a new cluster that has
never been healthy, right?

Assuming it's an existing cluster, how many OSDs did you replace?  It
almost looks like you replaced multiple OSDs at the same time and lost
data because of it.
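If the only copies of those PGs lived on the disks that were swapped out
(for example, pools running with size 2 and both replicas on the replaced
OSDs), the cluster has no surviving copy to recover from and those PGs will
never go clean on their own.  A quick sketch of what to check, with
<pool-name> as a placeholder for your actual pool names (the pool ids 0, 1,
and 2 in your output look like the default data, metadata, and rbd pools):

    # How many replicas each pool keeps, and how many it needs to serve I/O
    ceph osd pool get <pool-name> size
    ceph osd pool get <pool-name> min_size

    # Overall cluster state, including any down/out OSDs
    ceph -s
    ceph health detail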

Can you give us the output of `ceph osd tree` and `ceph pg 2.33 query`?
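For reference, both can be run from any node that has an admin keyring; the
PG id in the second command comes straight from your health output:

    # Which OSDs exist, whether they are up/down and in/out,
    # and where they sit in the CRUSH tree
    ceph osd tree

    # Detailed peering state and recovery history for one stuck PG
    ceph pg 2.33 query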


On Wed, Nov 19, 2014 at 2:14 PM, JIten Shah <jshah2...@me.com> wrote:

> After rebuilding a few OSDs, I see that the PGs are stuck in degraded
> mode. Some are in the unclean state and others are stale. Somehow the
> MDS is also degraded. How do I recover the OSDs and the MDS back to a
> healthy state? I've read through the documentation and searched the web,
> but no luck so far.
>
> pg 2.33 is stuck unclean since forever, current state stale+active+degraded+remapped, last acting [3]
> pg 0.30 is stuck unclean since forever, current state stale+active+degraded+remapped, last acting [3]
> pg 1.31 is stuck unclean since forever, current state stale+active+degraded, last acting [2]
> pg 2.32 is stuck unclean for 597129.903922, current state stale+active+degraded, last acting [2]
> pg 0.2f is stuck unclean for 597129.903951, current state stale+active+degraded, last acting [2]
> pg 1.2e is stuck unclean since forever, current state stale+active+degraded+remapped, last acting [3]
> pg 2.2d is stuck unclean since forever, current state stale+active+degraded+remapped, last acting [2]
> pg 0.2e is stuck unclean since forever, current state stale+active+degraded+remapped, last acting [3]
> pg 1.2f is stuck unclean for 597129.904015, current state stale+active+degraded, last acting [2]
> pg 2.2c is stuck unclean since forever, current state stale+active+degraded+remapped, last acting [3]
> pg 0.2d is stuck stale for 422844.566858, current state stale+active+degraded, last acting [2]
> pg 1.2c is stuck stale for 422598.539483, current state stale+active+degraded+remapped, last acting [3]
> pg 2.2f is stuck stale for 422598.539488, current state stale+active+degraded+remapped, last acting [3]
> pg 0.2c is stuck stale for 422598.539487, current state stale+active+degraded+remapped, last acting [3]
> pg 1.2d is stuck stale for 422598.539492, current state stale+active+degraded+remapped, last acting [3]
> pg 2.2e is stuck stale for 422598.539496, current state stale+active+degraded+remapped, last acting [3]
> pg 0.2b is stuck stale for 422598.539491, current state stale+active+degraded+remapped, last acting [3]
> pg 1.2a is stuck stale for 422598.539496, current state stale+active+degraded+remapped, last acting [3]
> pg 2.29 is stuck stale for 422598.539504, current state stale+active+degraded+remapped, last acting [3]
> .
> .
> .
> 6 ops are blocked > 2097.15 sec
> 3 ops are blocked > 2097.15 sec on osd.0
> 2 ops are blocked > 2097.15 sec on osd.2
> 1 ops are blocked > 2097.15 sec on osd.4
> 3 osds have slow requests
> recovery 40/60 objects degraded (66.667%)
> mds cluster is degraded
> mds.Lab-cephmon001 at X.X.16.111:6800/3424727 rank 0 is replaying journal
>
> —Jiten