Well, we figured it out :) This mailing list post fixed our problem: http://www.spinics.net/lists/ceph-users/msg24220.html
We had to mark the osds that were falsely reported as up as down, and then restart all of the osds (a rough sketch of the commands is below the quoted message). Thanks!

On Tue, Jan 5, 2016 at 6:43 PM, Mike Carlson <[email protected]> wrote:

> Hey ceph-users
>
> We upgraded from hammer to infernalis, stopped all osd's to change the
> user permissions from root to ceph, and all of our osd's are down (some say
> they are up, but the status says it is booting)
>
> ceph -s
>     cluster cabd1728-2eca-4e18-a581-b4885364e5a4
>      health HEALTH_WARN
>             4 pgs backfilling
>             2905 pgs degraded
>             844 pgs peering
>             1137 pgs stale
>             2905 pgs stuck degraded
>             2881 pgs stuck inactive
>             1137 pgs stuck stale
>             4192 pgs stuck unclean
>             2905 pgs stuck undersized
>             2905 pgs undersized
>             1 requests are blocked > 32 sec
>             recovery 23553081/71720803 objects degraded (32.840%)
>             recovery 5450050/71720803 objects misplaced (7.599%)
>             mds cluster is degraded
>             nodown flag(s) set
>      monmap e1: 4 mons at {lts-mon=10.5.68.236:6789/0,lts-osd1=10.5.68.229:6789/0,lts-osd2=10.5.68.230:6789/0,lts-osd3=10.5.68.203:6789/0}
>             election epoch 1162, quorum 0,1,2,3 lts-osd3,lts-osd1,lts-osd2,lts-mon
>      mdsmap e7102: 1/1/1 up {0=lts-osd1=up:replay}
>      osdmap e6858: 102 osds: 30 up, 30 in; 2473 remapped pgs
>             flags nodown
>       pgmap v6218348: 4192 pgs, 7 pools, 31604 GB data, 23331 kobjects
>             32968 GB used, 78757 GB / 109 TB avail
>             23553081/71720803 objects degraded (32.840%)
>             5450050/71720803 objects misplaced (7.599%)
>                 1430 undersized+degraded+peered
>                  442 remapped+peering
>                  439 active+undersized+degraded+remapped
>                  322 stale+active+remapped
>                  227 stale+active+undersized+degraded
>                  207 stale+undersized+degraded+peered
>                  183 activating+undersized+degraded
>                  165 peering
>                  159 active+undersized+degraded
>                  123 stale+peering
>                  119 activating+undersized+degraded+remapped
>                  114 stale+remapped+peering
>                  107 active+remapped
>                   59 stale+activating+undersized+degraded
>                   57 stale+active+undersized+degraded+remapped
>                   21 stale+activating+undersized+degraded+remapped
>                    6 activating+remapped
>                    6 stale+activating+remapped
>                    4 undersized+degraded+remapped+backfilling+peered
>                    1 stale+remapped
>                    1 remapped
>
> ceph osd tree
> ID  WEIGHT     TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
>  -1 371.27994 root default
>  -2 123.75998     host lts-osd1
>   0   3.64000         osd.0     down        0          1.00000
>   1   3.64000         osd.1     down        0          1.00000
>   2   3.64000         osd.2     down        0          1.00000
>   3   3.64000         osd.3     down        0          1.00000
>   4   3.64000         osd.4     down        0          1.00000
>   5   3.64000         osd.5     down        0          1.00000
>   6   3.64000         osd.6     down        0          1.00000
>   7   3.64000         osd.7     down        0          1.00000
>   8   3.64000         osd.8     down        0          1.00000
>   9   3.64000         osd.9     down        0          1.00000
>  10   3.64000         osd.10    down        0          1.00000
>  11   3.64000         osd.11    down        0          1.00000
>  12   3.64000         osd.12    down        0          1.00000
>  13   3.64000         osd.13    down        0          1.00000
>  14   3.64000         osd.14    down        0          1.00000
>  15   3.64000         osd.15    down        0          1.00000
>  16   3.64000         osd.16    down        0          1.00000
>  17   3.64000         osd.17    down        0          1.00000
>  18   3.64000         osd.18    down        0          1.00000
>  19   3.64000         osd.19    down        0          1.00000
>  20   3.64000         osd.20    down        0          1.00000
>  21   3.64000         osd.21    down        0          1.00000
>  22   3.64000         osd.22    down        0          1.00000
>  23   3.64000         osd.23    down        0          1.00000
>  24   3.64000         osd.24    down        0          1.00000
>  25   3.64000         osd.25    down        0          1.00000
>  26   3.64000         osd.26    down        0          1.00000
>  27   3.64000         osd.27    down        0          1.00000
>  28   3.64000         osd.28    down        0          1.00000
>  29   3.64000         osd.29    down        0          1.00000
>  30   3.64000         osd.30    down        0          1.00000
>  31   3.64000         osd.31    down        0          1.00000
>  32   3.64000         osd.32    down        0          1.00000
>  33   3.64000         osd.33    down        0          1.00000
>  -3 123.75998     host lts-osd2
>  34   3.64000         osd.34    down        0          1.00000
>  35   3.64000         osd.35    down        0          1.00000
>  36   3.64000         osd.36    down        0          1.00000
>  37   3.64000         osd.37    down        0          1.00000
>  38   3.64000         osd.38    down        0          1.00000
>  39   3.64000         osd.39    down        0          1.00000
>  40   3.64000         osd.40    down        0          1.00000
>  41   3.64000         osd.41    down        0          1.00000
>  42   3.64000         osd.42    down        0          1.00000
>  43   3.64000         osd.43    down        0          1.00000
>  44   3.64000         osd.44    down        0          1.00000
>  45   3.64000         osd.45    down        0          1.00000
>  46   3.64000         osd.46    down        0          1.00000
>  47   3.64000         osd.47    down        0          1.00000
>  48   3.64000         osd.48    down        0          1.00000
>  49   3.64000         osd.49    down        0          1.00000
>  50   3.64000         osd.50    down        0          1.00000
>  51   3.64000         osd.51    down        0          1.00000
>  52   3.64000         osd.52    down        0          1.00000
>  53   3.64000         osd.53    down        0          1.00000
>  54   3.64000         osd.54    down        0          1.00000
>  55   3.64000         osd.55    down        0          1.00000
>  56   3.64000         osd.56    down        0          1.00000
>  57   3.64000         osd.57    down        0          1.00000
>  58   3.64000         osd.58    down        0          1.00000
>  59   3.64000         osd.59    down        0          1.00000
>  60   3.64000         osd.60    down        0          1.00000
>  61   3.64000         osd.61    down        0          1.00000
>  62   3.64000         osd.62    down        0          1.00000
>  63   3.64000         osd.63    down        0          1.00000
>  64   3.64000         osd.64    down        0          1.00000
>  65   3.64000         osd.65    down        0          1.00000
>  66   3.64000         osd.66    down        0          1.00000
>  67   3.64000         osd.67    down        0          1.00000
>  -4 123.75998     host lts-osd3
>  68   3.64000         osd.68    down        0          1.00000
>  69   3.64000         osd.69    down        0          1.00000
>  70   3.64000         osd.70    down        0          1.00000
>  71   3.64000         osd.71    down        0          1.00000
>  72   3.64000         osd.72      up  1.00000          1.00000
>  73   3.64000         osd.73      up  1.00000          1.00000
>  74   3.64000         osd.74      up  1.00000          1.00000
>  75   3.64000         osd.75      up  1.00000          1.00000
>  76   3.64000         osd.76      up  1.00000          1.00000
>  77   3.64000         osd.77      up  1.00000          1.00000
>  78   3.64000         osd.78      up  1.00000          1.00000
>  79   3.64000         osd.79      up  1.00000          1.00000
>  80   3.64000         osd.80      up  1.00000          1.00000
>  81   3.64000         osd.81      up  1.00000          1.00000
>  82   3.64000         osd.82      up  1.00000          1.00000
>  83   3.64000         osd.83      up  1.00000          1.00000
>  84   3.64000         osd.84      up  1.00000          1.00000
>  85   3.64000         osd.85      up  1.00000          1.00000
>  86   3.64000         osd.86      up  1.00000          1.00000
>  87   3.64000         osd.87      up  1.00000          1.00000
>  88   3.64000         osd.88      up  1.00000          1.00000
>  89   3.64000         osd.89      up  1.00000          1.00000
>  90   3.64000         osd.90      up  1.00000          1.00000
>  91   3.64000         osd.91      up  1.00000          1.00000
>  92   3.64000         osd.92      up  1.00000          1.00000
>  93   3.64000         osd.93      up  1.00000          1.00000
>  94   3.64000         osd.94      up  1.00000          1.00000
>  95   3.64000         osd.95      up  1.00000          1.00000
>  96   3.64000         osd.96      up  1.00000          1.00000
>  97   3.64000         osd.97      up  1.00000          1.00000
>  98   3.64000         osd.98      up  1.00000          1.00000
>  99   3.64000         osd.99      up  1.00000          1.00000
> 100   3.64000         osd.100     up  1.00000          1.00000
> 101   3.64000         osd.101     up  1.00000          1.00000
>
> We have rebooted the cluster, all nodes are confirmed to have the
> infernalis release, but nothing we do will get a osd back up and in the
> cluster.
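For anyone who hits the same thing after an upgrade, the fix boiled down to something like the commands below. This is a rough sketch rather than a copy of our exact session: osd ids 72-101 are the ones that showed "up" but were stuck booting in our tree above, and the restart line assumes systemd-managed osds, so adjust both for your own cluster.

    # Tell the monitors that the osds stuck in "booting" (falsely shown as up)
    # are really down; repeat for every affected osd id.
    for id in $(seq 72 101); do
        ceph osd down "$id"
    done

    # Restart every osd daemon on each osd host (adjust if you are not
    # running systemd).
    sudo systemctl restart ceph-osd.target

    # Once the osds are genuinely up again, clear the nodown flag that was
    # set during the upgrade so the monitors can manage osd state normally.
    ceph osd unset nodown

As far as we can tell, with nodown set the monitors never clear the stale "up" entries, so the restarted daemons sit in "booting" forever; marking them down by hand lets them re-register and the cluster starts recovering.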
