Hi all :),

I need some help; I'm in a sad situation: I've physically lost 2 Ceph server nodes
(out of 5 nodes / 5 monitors initially). So 3 nodes are left: node1, node2, node3.
On my first remaining node (node1), I updated the CRUSH map to remove every OSD
that was running on those 2 lost servers:

ceph osd crush remove osd.<id> && ceph auth del osd.<id> && ceph osd rm osd.<id>   (for each lost OSD)
ceph osd crush remove <lost-host>   (for each of the two lost hosts)

So the CRUSH map seems to be OK now on node1.
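
To be concrete, this is roughly what I ran, spelled out as a loop; the OSD ids and
host names below are only placeholders for my real ones (which I took from
ceph osd tree):

    # remove every OSD that lived on the two dead servers
    for id in 22 23 24 25 26; do        # placeholder ids, not my real ones
        ceph osd crush remove osd.$id
        ceph auth del osd.$id
        ceph osd rm osd.$id
    done
    # then drop the now-empty host buckets from the CRUSH map
    ceph osd crush remove <lost-host-1>
    ceph osd crush remove <lost-host-2>
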
ceph osd tree on node1 shows every OSD running on node2 as "down 1", while the
OSDs on node1 and node3 are "up 1". Nevertheless, on node3 every "ceph *" command
just hangs, so I'm not sure the CRUSH map has actually been updated on node2 and
node3. I don't know how to bring the OSDs on node2 up again, and node2 itself
says it cannot connect to the cluster!
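
Is something like the following the right way to get them back? This is only my
guess (assuming the OSD daemons are still installed on node2 and it uses the
standard init scripts / systemd units); please correct me:

    # on node2: can it reach the monitors at all?
    ping -c 3 172.23.6.11
    ceph -s                            # for me this just hangs / cannot connect

    # health also reports clock skew on mon.node2, so resync the clock first
    ntpdate -u <my-ntp-server>         # placeholder NTP server

    # then restart the OSD daemons on node2
    service ceph restart osd.<id>      # sysvinit-style
    # systemctl restart ceph-osd@<id>  # or, on systemd hosts

    # and re-check from node1
    ceph osd tree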

ceph -s on node1 gives me this (so the monmap still has 5 monitors; I also have a question about that below the output):

    cluster 45d9195b-365e-491a-8853-34b46553db94
     health HEALTH_WARN 10016 pgs degraded; 10016 pgs stuck unclean; recovery 
181055/544038 objects degraded (33.280%); 11/33 in osds are down; noout flag(s) 
set; 2 mons down, quorum 0,1,2 node1,node2,node3; clock skew detected on 
mon.node2
     monmap e1: 5 mons at 
{node1=172.23.6.11:6789/0,node2=172.23.6.12:6789/0,node3=172.23.6.13:6789/0,node4=172.23.6.14:6789/0,node5=172.23.6.15:6789/0},
 election epoch 488, quorum 0,1,2 node1,node2,node3
     mdsmap e48: 1/1/1 up {0=node3=up:active}
     osdmap e3852: 33 osds: 22 up, 33 in
            flags noout
      pgmap v8189785: 10016 pgs, 9 pools, 705 GB data, 177 kobjects
            2122 GB used, 90051 GB / 92174 GB avail
            181055/544038 objects degraded (33.280%)
               10016 active+degraded
  client io 0 B/s rd, 233 kB/s wr, 22 op/s
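
Related question: since the two lost servers are gone for good, should I also
shrink the monmap down to the 3 surviving monitors? My guess (assuming the two
dead mons are node4 and node5, as the quorum above suggests) would be:

    ceph mon remove node4
    ceph mon remove node5
    ceph quorum_status                 # hoping for a 3-mon quorum: node1,node2,node3

Does that sound right, or is there a safer way to do it while the cluster is
degraded?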


Thanks for your help!!

