Yes, it was a healthy cluster, and I had to rebuild because the OSDs were 
accidentally created on the root disk. Out of 4 OSDs, I had to rebuild 3 of 
them.
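
For anyone hitting the same thing: one way to confirm which device is backing 
each OSD, assuming the default /var/lib/ceph layout and the ceph-disk tool 
from this release (paths may differ on your install), is something like:

# list the disks/partitions ceph-disk knows about on an OSD host
sudo ceph-disk list

# or check where each OSD data directory is actually mounted
df -h /var/lib/ceph/osd/ceph-*

Either one will show the OSD data sitting on the root filesystem when the 
provisioning went wrong.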


[jshah@Lab-cephmon001 ~]$ ceph osd tree
# id    weight  type name       up/down reweight
-1      0.5     root default
-2      0.09999         host Lab-cephosd005
4       0.09999                 osd.4   up      1       
-3      0.09999         host Lab-cephosd001
0       0.09999                 osd.0   up      1       
-4      0.09999         host Lab-cephosd002
1       0.09999                 osd.1   up      1       
-5      0.09999         host Lab-cephosd003
2       0.09999                 osd.2   up      1       
-6      0.09999         host Lab-cephosd004
3       0.09999                 osd.3   up      1       
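
All five OSDs report up in the tree, yet the pgs are stale, so I'm re-pulling 
the stuck pg lists as well (stock commands; the state filters are the same 
ones the health output uses):

ceph pg dump_stuck stale
ceph pg dump_stuck unclean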


[jshah@Lab-cephmon001 ~]$ ceph pg 2.33 query
Error ENOENT: i don't have pgid 2.33
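
The ENOENT suggests no OSD currently claims that pg. One standard way to check 
where it is supposed to map, before deciding what to do next, is:

# show the osdmap epoch and the up/acting OSD sets for the pg
ceph pg map 2.33

I haven't pasted that output here yet.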

—Jiten


On Nov 20, 2014, at 11:18 AM, Craig Lewis <cle...@centraldesktop.com> wrote:

> Just to be clear, this is from a cluster that was healthy, had a disk 
> replaced, and hasn't returned to healthy?  It's not a new cluster that has 
> never been healthy, right?
> 
> Assuming it's an existing cluster, how many OSDs did you replace?  It almost 
> looks like you replaced multiple OSDs at the same time, and lost data because 
> of it.
> 
> Can you give us the output of `ceph osd tree`, and `ceph pg 2.33 query`?
> 
> 
> On Wed, Nov 19, 2014 at 2:14 PM, JIten Shah <jshah2...@me.com> wrote:
> After rebuilding a few OSDs, I see that the pgs are stuck in degraded mode. 
> Some are unclean and others are stale. Somehow the MDS is also degraded. How 
> do I recover the OSDs and the MDS back to a healthy state? I've read through 
> the documentation and searched the web, but no luck so far.
> 
> pg 2.33 is stuck unclean since forever, current state 
> stale+active+degraded+remapped, last acting [3]
> pg 0.30 is stuck unclean since forever, current state 
> stale+active+degraded+remapped, last acting [3]
> pg 1.31 is stuck unclean since forever, current state stale+active+degraded, 
> last acting [2]
> pg 2.32 is stuck unclean for 597129.903922, current state 
> stale+active+degraded, last acting [2]
> pg 0.2f is stuck unclean for 597129.903951, current state 
> stale+active+degraded, last acting [2]
> pg 1.2e is stuck unclean since forever, current state 
> stale+active+degraded+remapped, last acting [3]
> pg 2.2d is stuck unclean since forever, current state 
> stale+active+degraded+remapped, last acting [2]
> pg 0.2e is stuck unclean since forever, current state 
> stale+active+degraded+remapped, last acting [3]
> pg 1.2f is stuck unclean for 597129.904015, current state 
> stale+active+degraded, last acting [2]
> pg 2.2c is stuck unclean since forever, current state 
> stale+active+degraded+remapped, last acting [3]
> pg 0.2d is stuck stale for 422844.566858, current state 
> stale+active+degraded, last acting [2]
> pg 1.2c is stuck stale for 422598.539483, current state 
> stale+active+degraded+remapped, last acting [3]
> pg 2.2f is stuck stale for 422598.539488, current state 
> stale+active+degraded+remapped, last acting [3]
> pg 0.2c is stuck stale for 422598.539487, current state 
> stale+active+degraded+remapped, last acting [3]
> pg 1.2d is stuck stale for 422598.539492, current state 
> stale+active+degraded+remapped, last acting [3]
> pg 2.2e is stuck stale for 422598.539496, current state 
> stale+active+degraded+remapped, last acting [3]
> pg 0.2b is stuck stale for 422598.539491, current state 
> stale+active+degraded+remapped, last acting [3]
> pg 1.2a is stuck stale for 422598.539496, current state 
> stale+active+degraded+remapped, last acting [3]
> pg 2.29 is stuck stale for 422598.539504, current state 
> stale+active+degraded+remapped, last acting [3]
> .
> .
> .
> 6 ops are blocked > 2097.15 sec
> 3 ops are blocked > 2097.15 sec on osd.0
> 2 ops are blocked > 2097.15 sec on osd.2
> 1 ops are blocked > 2097.15 sec on osd.4
> 3 osds have slow requests
> recovery 40/60 objects degraded (66.667%)
> mds cluster is degraded
> mds.Lab-cephmon001 at X.X.16.111:6800/3424727 rank 0 is replaying journal
> 
> —Jiten
> 

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
