Re: [ceph-users] New OSD missing from part of osd crush tree
On Thu, 10 Aug 2017, John Spray said:
> On Thu, Aug 10, 2017 at 4:31 PM, Sean Purdy wrote:
> > Luminous 12.1.1 rc
And 12.2.1 stable
> > We added a new disk and did:
> > That worked, created osd.18, OSD has data.
> >
> > However, mgr output at http://localhost:7000/servers showed
> > osd.18 under a blank hostname and not e.g. on the node we attached it to.
>
> Don't worry about this part.  It's a mgr bug that it sometimes fails
> to pick up the hostname for a service
> (http://tracker.ceph.com/issues/20887)
>
> John

Thanks.  This still happens in 12.2.1 (I notice the bug isn't closed), and the mgrs have been restarted.

It is consistently the same OSD that the mgr can't find a hostname for.  I'd have thought that if it were a race condition, different OSDs would show up detached each time.  Oh well, no biggie right now.

Sean
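One quick sanity check in this situation (a suggestion only, not something from the thread) is to ask the cluster itself which host the OSD reported at boot; the mgr's /servers page and the OSD's own metadata can disagree when this bug bites. For example, assuming osd.18 as above:

    $ ceph osd metadata 18 | grep hostname

If that prints the expected host, the cluster map has the right information and the blank entry on /servers is purely a mgr-side display problem.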
Re: [ceph-users] New OSD missing from part of osd crush tree
On Thu, Aug 10, 2017 at 4:31 PM, Sean Purdy wrote:
> Luminous 12.1.1 rc
>
> Our OSD osd.8 failed. So we removed that.
>
> We added a new disk and did:
>
> $ ceph-deploy osd create --dmcrypt --bluestore store02:/dev/sdd
>
> That worked, created osd.18, OSD has data.
>
> However, mgr output at http://localhost:7000/servers showed
> osd.18 under a blank hostname and not e.g. on the node we attached it to.

Don't worry about this part.  It's a mgr bug that it sometimes fails
to pick up the hostname for a service
(http://tracker.ceph.com/issues/20887)

John

> But it is working. "ceph osd tree" looks OK
>
> The problem I see is:
> When I do "ceph osd crush tree" I see the items list under the
> name:default~hdd tree:
>
>     device_class:hdd
>     name:store02~hdd
>     type:host
>
> but my new drive is missing under this name - there are 5 OSDs, not 6.
>
> *However*, if I look further down under the name:default tree
>
>     device_class:""
>     name:store02
>     type:host
>
> I see all the devices I am expecting, including osd.18
>
> Is this something to worry about? Or is there something that needs fixing?
> Health is warning for scrubbing reasons.
>
> Output of the related commands below.
>
> Thanks for any help,
>
> Sean Purdy
>
> $ sudo ceph osd tree
> ID CLASS WEIGHT   TYPE NAME        UP/DOWN REWEIGHT PRI-AFF
> -1       32.73651 root default
> -3       10.91217     host store01
>  0   hdd  1.81870         osd.0         up     1.0     1.0
>  5   hdd  1.81870         osd.5         up     1.0     1.0
>  6   hdd  1.81870         osd.6         up     1.0     1.0
>  9   hdd  1.81870         osd.9         up     1.0     1.0
> 12   hdd  1.81870         osd.12        up     1.0     1.0
> 15   hdd  1.81870         osd.15        up     1.0     1.0
> -5       10.91217     host store02
>  1   hdd  1.81870         osd.1         up     1.0     1.0
>  7   hdd  1.81870         osd.7         up     1.0     1.0
> 10   hdd  1.81870         osd.10        up     1.0     1.0
> 13   hdd  1.81870         osd.13        up     1.0     1.0
> 16   hdd  1.81870         osd.16        up     1.0     1.0
> 18   hdd  1.81870         osd.18        up     1.0     1.0
> -7       10.91217     host store03
>  2   hdd  1.81870         osd.2         up     1.0     1.0
>  3   hdd  1.81870         osd.3         up     1.0     1.0
>  4   hdd  1.81870         osd.4         up     1.0     1.0
> 11   hdd  1.81870         osd.11        up     1.0     1.0
> 14   hdd  1.81870         osd.14        up     1.0     1.0
> 17   hdd  1.81870         osd.17        up     1.0     1.0
>
> $ sudo ceph osd crush tree
> [
>     {
>         "id": -8,
>         "device_class": "hdd",
>         "name": "default~hdd",
>         "type": "root",
>         "type_id": 10,
>         "items": [
>             {
>                 "id": -2,
>                 "device_class": "hdd",
>                 "name": "store01~hdd",
>                 "type": "host",
>                 "type_id": 1,
>                 "items": [
>                     {
>                         "id": 0,
>                         "device_class": "hdd",
>                         "name": "osd.0",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 5,
>                         "device_class": "hdd",
>                         "name": "osd.5",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 6,
>                         "device_class": "hdd",
>                         "name": "osd.6",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 9,
>                         "device_class": "hdd",
>                         "name": "osd.9",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 12,
>                         "device_class": "hdd",
>                         "name": "osd.12",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 15,
>                         "device_class": "hdd",
>                         "name": "osd.15",
>                         "type": "osd",
>                         "type_id": 0,
>
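For background: the name:default~hdd entries are the per-device-class "shadow" trees that CRUSH maintains for each device class. If an OSD were missing from the hdd shadow tree because its class had never been set, one possible way to assign it by hand on Luminous would be something like:

    $ ceph osd crush class ls
    $ ceph osd crush set-device-class hdd osd.18

In this thread, though, "ceph osd tree" already lists osd.18 with CLASS hdd, so the class itself looks set and the discrepancy is in how the shadow tree is being reported rather than in the OSD's class assignment.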
Re: [ceph-users] New OSD missing from part of osd crush tree
Sage says a whole bunch of fixes for this have gone in since both then and 12.1.2.  We should be pushing out a final 12.1.3 today for people to test on; can you try that and report back once it's out?
-Greg

On Thu, Aug 10, 2017 at 8:32 AM Sean Purdy wrote:
> Luminous 12.1.1 rc
>
> Our OSD osd.8 failed. So we removed that.
>
> We added a new disk and did:
>
> $ ceph-deploy osd create --dmcrypt --bluestore store02:/dev/sdd
>
> That worked, created osd.18, OSD has data.
>
> However, mgr output at http://localhost:7000/servers showed
> osd.18 under a blank hostname and not e.g. on the node we attached it to.
> But it is working. "ceph osd tree" looks OK
>
> The problem I see is:
> When I do "ceph osd crush tree" I see the items list under the
> name:default~hdd tree:
>
>     device_class:hdd
>     name:store02~hdd
>     type:host
>
> but my new drive is missing under this name - there are 5 OSDs, not 6.
>
> *However*, if I look further down under the name:default tree
>
>     device_class:""
>     name:store02
>     type:host
>
> I see all the devices I am expecting, including osd.18
>
> Is this something to worry about? Or is there something that needs fixing?
> Health is warning for scrubbing reasons.
>
> Output of the related commands below.
>
> Thanks for any help,
>
> Sean Purdy
>
> $ sudo ceph osd tree
> ID CLASS WEIGHT   TYPE NAME        UP/DOWN REWEIGHT PRI-AFF
> -1       32.73651 root default
> -3       10.91217     host store01
>  0   hdd  1.81870         osd.0         up     1.0     1.0
>  5   hdd  1.81870         osd.5         up     1.0     1.0
>  6   hdd  1.81870         osd.6         up     1.0     1.0
>  9   hdd  1.81870         osd.9         up     1.0     1.0
> 12   hdd  1.81870         osd.12        up     1.0     1.0
> 15   hdd  1.81870         osd.15        up     1.0     1.0
> -5       10.91217     host store02
>  1   hdd  1.81870         osd.1         up     1.0     1.0
>  7   hdd  1.81870         osd.7         up     1.0     1.0
> 10   hdd  1.81870         osd.10        up     1.0     1.0
> 13   hdd  1.81870         osd.13        up     1.0     1.0
> 16   hdd  1.81870         osd.16        up     1.0     1.0
> 18   hdd  1.81870         osd.18        up     1.0     1.0
> -7       10.91217     host store03
>  2   hdd  1.81870         osd.2         up     1.0     1.0
>  3   hdd  1.81870         osd.3         up     1.0     1.0
>  4   hdd  1.81870         osd.4         up     1.0     1.0
> 11   hdd  1.81870         osd.11        up     1.0     1.0
> 14   hdd  1.81870         osd.14        up     1.0     1.0
> 17   hdd  1.81870         osd.17        up     1.0     1.0
>
> $ sudo ceph osd crush tree
> [
>     {
>         "id": -8,
>         "device_class": "hdd",
>         "name": "default~hdd",
>         "type": "root",
>         "type_id": 10,
>         "items": [
>             {
>                 "id": -2,
>                 "device_class": "hdd",
>                 "name": "store01~hdd",
>                 "type": "host",
>                 "type_id": 1,
>                 "items": [
>                     {
>                         "id": 0,
>                         "device_class": "hdd",
>                         "name": "osd.0",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 5,
>                         "device_class": "hdd",
>                         "name": "osd.5",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 6,
>                         "device_class": "hdd",
>                         "name": "osd.6",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 9,
>                         "device_class": "hdd",
>                         "name": "osd.9",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 12,
>                         "device_class": "hdd",
>                         "name": "osd.12",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 15,
>                         "device_class": "hdd",
>                         "name": "osd.15",
>                         "type": "osd",
>
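Once the new build is installed, a couple of quick follow-up checks (sketch only; "ceph versions" is available from Luminous onward) would confirm whether the fix has taken effect:

    $ ceph versions              # every mon/mgr/osd should report the new release
    $ sudo ceph osd crush tree   # osd.18 should now also appear under store02~hdd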