Hi David,

Apologies for the late response.
NodeB is mon+client, nodeC is client.

ceph health detail:

HEALTH_ERR 819 pgs are stuck inactive for more than 300 seconds; 883 pgs degraded; 64 pgs stale; 819 pgs stuck inactive; 1064 pgs stuck unclean; 883 pgs undersized; 22 requests are blocked > 32 sec; 3 osds have slow requests; recovery 2/8 objects degraded (25.000%); recovery 2/8 objects misplaced (25.000%); crush map has legacy tunables (require argonaut, min is firefly); crush map has straw_calc_version=0
pg 2.fc is stuck inactive since forever, current state undersized+degraded+peered, last acting [2]
pg 2.fd is stuck inactive since forever, current state undersized+degraded+peered, last acting [0]
pg 2.fe is stuck inactive since forever, current state undersized+degraded+peered, last acting [2]
pg 2.ff is stuck inactive since forever, current state undersized+degraded+peered, last acting [1]
pg 1.fb is stuck inactive for 493857.572982, current state undersized+degraded+peered, last acting [4]
pg 2.f8 is stuck inactive since forever, current state undersized+degraded+peered, last acting [3]
pg 1.fa is stuck inactive for 492185.443146, current state undersized+degraded+peered, last acting [0]
pg 2.f9 is stuck inactive since forever, current state undersized+degraded+peered, last acting [0]
pg 1.f9 is stuck inactive for 492185.452890, current state undersized+degraded+peered, last acting [2]
pg 2.fa is stuck inactive since forever, current state undersized+degraded+peered, last acting [3]
pg 1.f8 is stuck inactive for 492185.443324, current state undersized+degraded+peered, last acting [0]
pg 2.fb is stuck inactive since forever, current state undersized+degraded+peered, last acting [2]
[...]
pg 1.fb is undersized+degraded+peered, acting [4]
pg 2.ff is undersized+degraded+peered, acting [1]
pg 2.fe is undersized+degraded+peered, acting [2]
pg 2.fd is undersized+degraded+peered, acting [0]
pg 2.fc is undersized+degraded+peered, acting [2]
3 ops are blocked > 536871 sec on osd.4
15 ops are blocked > 268435 sec on osd.4
1 ops are blocked > 262.144 sec on osd.4
2 ops are blocked > 268435 sec on osd.3
1 ops are blocked > 268435 sec on osd.1
3 osds have slow requests
recovery 2/8 objects degraded (25.000%)
recovery 2/8 objects misplaced (25.000%)
crush map has legacy tunables (require argonaut, min is firefly); see http://ceph.com/docs/master/rados/operations/crush-map/#tunables
crush map has straw_calc_version=0; see http://ceph.com/docs/master/rados/operations/crush-map/#tunables

ceph osd stat:

cluster-admin@nodeB:~/.ssh/ceph-cluster$ cat ceph_osd_stat.txt
osdmap e80: 10 osds: 5 up, 5 in; 558 remapped pgs
       flags sortbitwise

ceph osd tree:

cluster-admin@nodeB:~/.ssh/ceph-cluster$ ceph osd tree
ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 9.08691 root default
-2 4.54346     host nodeB
 5 0.90869         osd.5    down        0          1.00000
 6 0.90869         osd.6    down        0          1.00000
 7 0.90869         osd.7    down        0          1.00000
 8 0.90869         osd.8    down        0          1.00000
 9 0.90869         osd.9    down        0          1.00000
-3 4.54346     host nodeC
 0 0.90869         osd.0      up  1.00000          1.00000
 1 0.90869         osd.1      up  1.00000          1.00000
 2 0.90869         osd.2      up  1.00000          1.00000
 3 0.90869         osd.3      up  1.00000          1.00000
 4 0.90869         osd.4      up  1.00000          1.00000

CrushMap:

# begin crush map

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
device 9 osd.9

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host nodeB {
	id -2		# do not change unnecessarily
	# weight 4.543
	alg straw
	hash 0	# rjenkins1
	item osd.5 weight 0.909
	item osd.6 weight 0.909
	item osd.7 weight 0.909
	item osd.8 weight 0.909
	item osd.9 weight 0.909
}
host nodeC {
	id -3		# do not change unnecessarily
	# weight 4.543
	alg straw
	hash 0	# rjenkins1
	item osd.0 weight 0.909
	item osd.1 weight 0.909
	item osd.2 weight 0.909
	item osd.3 weight 0.909
	item osd.4 weight 0.909
}
root default {
	id -1		# do not change unnecessarily
	# weight 9.087
	alg straw
	hash 0	# rjenkins1
	item nodeB weight 4.543
	item nodeC weight 4.543
}

# rules
rule replicated_ruleset {
	ruleset 0
	type replicated
	min_size 1
	max_size 10
	step take default
	step chooseleaf firstn 0 type host
	step emit
}

# end crush map

ceph.conf:

cluster-admin@nodeB:~/.ssh/ceph-cluster$ cat /etc/ceph/ceph.conf
[global]
fsid = a04e9846-6c54-48ee-b26f-d6949d8bacb4
mon_initial_members = nodeB
mon_host = <mon IP>
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public_network = X.X.X.0/24

On Sat, Jun 18, 2016 at 12:15 PM, David <[email protected]> wrote:
> Is this a test cluster that has never been healthy or a working cluster
> which has just gone unhealthy? Have you changed anything? Are all hosts,
> drives, network links working? More detail please. Any/all of the following
> would help:
>
> ceph health detail
> ceph osd stat
> ceph osd tree
> Your ceph.conf
> Your crushmap
>
> On 17 Jun 2016 14:14, "Ishmael Tsoaela" <[email protected]> wrote:
> >
> > Hi All,
> >
> > please assist to fix the error:
> >
> > 1 X admin
> > 2 X admin(hosting admin as well)
> >
> > 4 osd each node
>
> Please provide more detail, this suggests you should have 12 osd's but
> your osd map shows 10 osd's, 5 of which are down.
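On the down OSDs: all five OSDs on nodeB (osd.5 to osd.9) are down, while all of nodeC's are up, so the first thing to check is whether the daemons on nodeB are simply not running and will come back with a restart. A rough sketch of what I would run on nodeB; this assumes a systemd-based install (Jewel on Ubuntu 16.04), so the exact service commands may differ on an Upstart-based system:

```shell
# On nodeB: is the daemon for one of the down OSDs running at all?
sudo systemctl status ceph-osd@5
# On an Upstart-based install (e.g. Ubuntu 14.04) the equivalent would be:
#   sudo status ceph-osd id=5

# Try starting one down OSD and watch the cluster react
sudo systemctl start ceph-osd@5
ceph -w

# If the daemon refuses to start, the OSD log usually says why
# (bad disk, mount missing, authentication failure, etc.)
tail -n 50 /var/log/ceph/ceph-osd.5.log
```

If one OSD comes back cleanly, the same can be repeated for osd.6 through osd.9; if none start, the log output would be the next thing to post to the list.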
> >
> >
> >     cluster a04e9846-6c54-48ee-b26f-d6949d8bacb4
> >      health HEALTH_ERR
> >             819 pgs are stuck inactive for more than 300 seconds
> >             883 pgs degraded
> >             64 pgs stale
> >             819 pgs stuck inactive
> >             245 pgs stuck unclean
> >             883 pgs undersized
> >             17 requests are blocked > 32 sec
> >             recovery 2/8 objects degraded (25.000%)
> >             recovery 2/8 objects misplaced (25.000%)
> >             crush map has legacy tunables (require argonaut, min is firefly)
> >             crush map has straw_calc_version=0
> >      monmap e1: 1 mons at {nodeB=155.232.195.4:6789/0}
> >             election epoch 7, quorum 0 nodeB
> >      osdmap e80: 10 osds: 5 up, 5 in; 558 remapped pgs
> >             flags sortbitwise
> >       pgmap v480: 1064 pgs, 3 pools, 6454 bytes data, 4 objects
> >             25791 MB used, 4627 GB / 4652 GB avail
> >             2/8 objects degraded (25.000%)
> >             2/8 objects misplaced (25.000%)
> >                  819 undersized+degraded+peered
> >                  181 active
> >                   64 stale+active+undersized+degraded
> >
> >
> > _______________________________________________
> > ceph-users mailing list
> > [email protected]
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
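One more observation on the undersized+degraded+peered PGs above: each of them has only a single OSD in its acting set, and the CRUSH rule replicates across hosts (step chooseleaf firstn 0 type host). With every nodeB OSD down, only one host is available, so any pool with size >= 2 cannot place its replicas and the PGs stay inactive; getting nodeB's OSDs back up is the real fix. The tunables warnings are a separate, cosmetic-until-fixed issue. A hedged sketch of the commands I would use to confirm the replication settings and later clear the warnings (the pool name "rbd" is an assumption, and changing tunables will trigger data movement on a non-empty cluster):

```shell
# Confirm replication requirements; size 2+ cannot be satisfied by one host
# with a host-level chooseleaf rule
ceph osd pool get rbd size
ceph osd pool get rbd min_size

# Only once the cluster is otherwise healthy: clear the legacy-tunables
# and straw_calc_version warnings (expect some rebalancing afterwards)
ceph osd crush tunables firefly
ceph osd crush set-tunable straw_calc_version 1
```

See http://ceph.com/docs/master/rados/operations/crush-map/#tunables (the link from the health output) before changing tunables on anything holding real data.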
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
