Re: [ceph-users] New OSD missing from part of osd crush tree

2017-09-29 Thread Sean Purdy
On Thu, 10 Aug 2017, John Spray said:
> On Thu, Aug 10, 2017 at 4:31 PM, Sean Purdy  wrote:
> > Luminous 12.1.1 rc

And 12.2.1 stable

> > We added a new disk and did:

> > That worked, created osd.18, OSD has data.
> >
> > However, mgr output at http://localhost:7000/servers showed
> > osd.18 under a blank hostname and not e.g. on the node we attached it to.
> 
> Don't worry about this part.  It's a mgr bug where it sometimes fails
> to pick up the hostname for a service
> (http://tracker.ceph.com/issues/20887)
> 
> John

Thanks.  This still happens in 12.2.1 (I notice the bug isn't closed).  The mgrs
have been restarted.  It is consistently the same OSD that the mgr can't find a
hostname for.  I'd have thought that if it were a race condition, different
OSDs would show up detached.
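
For what it's worth, the hostname the OSD itself reports is visible in its
metadata, which is presumably where the mgr is supposed to pick it up from:

$ sudo ceph osd metadata 18 | grep hostname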

Oh well, no biggie right now.


Sean


Re: [ceph-users] New OSD missing from part of osd crush tree

2017-08-10 Thread John Spray
On Thu, Aug 10, 2017 at 4:31 PM, Sean Purdy  wrote:
> Luminous 12.1.1 rc
>
>
> Our OSD osd.8 failed.  So we removed that.
>
>
> We added a new disk and did:
>
> $ ceph-deploy osd create  --dmcrypt --bluestore store02:/dev/sdd
>
> That worked, created osd.18, OSD has data.
>
> However, mgr output at http://localhost:7000/servers showed
> osd.18 under a blank hostname and not e.g. on the node we attached it to.

Don't worry about this part.  It's a mgr bug where it sometimes fails
to pick up the hostname for a service
(http://tracker.ceph.com/issues/20887)
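
Restarting the active mgr sometimes gets it to re-register the service and
pick up the hostname, though I wouldn't count on it.  Roughly (the unit id is
usually the short hostname of the mgr node, so adjust for your setup):

$ sudo systemctl restart ceph-mgr@$(hostname -s)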

John

> But it is working.  "ceph osd tree" looks OK.
>
>
> The problem I see is:
> When I do "ceph osd crush tree" I see the items list under the 
> name:default~hdd tree:
>
> device_class:hdd
> name:store02~hdd
> type:host
>
> but my new drive is missing under this name - there are 5 OSDs, not 6.
>
>
> *However*, if I look further down under the name:default tree
>
> device_class:""
> name:store02
> type:host
>
> I see all devices I am expecting, including osd.18
>
>
> Is this something to worry about?  Or is there something that needs fixing?
> Health is in a warning state for scrubbing reasons.
>
>
> Output of related commands below.
>
>
> Thanks for any help,
>
> Sean Purdy
>
>
> $ sudo ceph osd tree
> ID CLASS WEIGHT   TYPE NAME       UP/DOWN REWEIGHT PRI-AFF
> -1       32.73651 root default
> -3       10.91217     host store01
>  0   hdd  1.81870         osd.0        up      1.0     1.0
>  5   hdd  1.81870         osd.5        up      1.0     1.0
>  6   hdd  1.81870         osd.6        up      1.0     1.0
>  9   hdd  1.81870         osd.9        up      1.0     1.0
> 12   hdd  1.81870         osd.12       up      1.0     1.0
> 15   hdd  1.81870         osd.15       up      1.0     1.0
> -5       10.91217     host store02
>  1   hdd  1.81870         osd.1        up      1.0     1.0
>  7   hdd  1.81870         osd.7        up      1.0     1.0
> 10   hdd  1.81870         osd.10       up      1.0     1.0
> 13   hdd  1.81870         osd.13       up      1.0     1.0
> 16   hdd  1.81870         osd.16       up      1.0     1.0
> 18   hdd  1.81870         osd.18       up      1.0     1.0
> -7       10.91217     host store03
>  2   hdd  1.81870         osd.2        up      1.0     1.0
>  3   hdd  1.81870         osd.3        up      1.0     1.0
>  4   hdd  1.81870         osd.4        up      1.0     1.0
> 11   hdd  1.81870         osd.11       up      1.0     1.0
> 14   hdd  1.81870         osd.14       up      1.0     1.0
> 17   hdd  1.81870         osd.17       up      1.0     1.0
>
>
> $ sudo ceph osd crush tree
> [
>     {
>         "id": -8,
>         "device_class": "hdd",
>         "name": "default~hdd",
>         "type": "root",
>         "type_id": 10,
>         "items": [
>             {
>                 "id": -2,
>                 "device_class": "hdd",
>                 "name": "store01~hdd",
>                 "type": "host",
>                 "type_id": 1,
>                 "items": [
>                     {
>                         "id": 0,
>                         "device_class": "hdd",
>                         "name": "osd.0",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 5,
>                         "device_class": "hdd",
>                         "name": "osd.5",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 6,
>                         "device_class": "hdd",
>                         "name": "osd.6",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 9,
>                         "device_class": "hdd",
>                         "name": "osd.9",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 12,
>                         "device_class": "hdd",
>                         "name": "osd.12",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 15,
>                         "device_class": "hdd",
>                         "name": "osd.15",
>                         "type": "osd",
>                         "type_id": 0,
> 

Re: [ceph-users] New OSD missing from part of osd crush tree

2017-08-10 Thread Gregory Farnum
Sage says a whole bunch of fixes for this have gone in since both 12.1.1 and
12.1.2.  We should be pushing out a final 12.1.3 today for people to test on;
can you try that and report back once it's out?
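
In the meantime, if you want to poke at it before 12.1.3 lands: the default~hdd
shadow tree is built from device-class membership, so (purely a guess on my
part, untested) removing and re-setting the class on osd.18 might force the
shadow bucket to be rebuilt:

$ sudo ceph osd crush rm-device-class osd.18
$ sudo ceph osd crush set-device-class hdd osd.18

Be aware this may shuffle some data while the class is unset.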
-Greg

On Thu, Aug 10, 2017 at 8:32 AM Sean Purdy  wrote:

> Luminous 12.1.1 rc
>
>
> Our OSD osd.8 failed.  So we removed that.
>
>
> We added a new disk and did:
>
> $ ceph-deploy osd create  --dmcrypt --bluestore store02:/dev/sdd
>
> That worked, created osd.18, OSD has data.
>
> However, mgr output at http://localhost:7000/servers showed
> osd.18 under a blank hostname and not e.g. on the node we attached it to.
> But it is working.  "ceph osd tree" looks OK.
>
>
> The problem I see is:
> When I do "ceph osd crush tree" I see the items list under the
> name:default~hdd tree:
>
> device_class:hdd
> name:store02~hdd
> type:host
>
> but my new drive is missing under this name - there are 5 OSDs, not 6.
>
>
> *However*, if I look further down under the name:default tree
>
> device_class:""
> name:store02
> type:host
>
> I see all devices I am expecting, including osd.18
>
>
> Is this something to worry about?  Or is there something that needs fixing?
> Health is in a warning state for scrubbing reasons.
>
>
> Output of related commands below.
>
>
> Thanks for any help,
>
> Sean Purdy
>
>
> $ sudo ceph osd tree
> ID CLASS WEIGHT   TYPE NAME       UP/DOWN REWEIGHT PRI-AFF
> -1       32.73651 root default
> -3       10.91217     host store01
>  0   hdd  1.81870         osd.0        up      1.0     1.0
>  5   hdd  1.81870         osd.5        up      1.0     1.0
>  6   hdd  1.81870         osd.6        up      1.0     1.0
>  9   hdd  1.81870         osd.9        up      1.0     1.0
> 12   hdd  1.81870         osd.12       up      1.0     1.0
> 15   hdd  1.81870         osd.15       up      1.0     1.0
> -5       10.91217     host store02
>  1   hdd  1.81870         osd.1        up      1.0     1.0
>  7   hdd  1.81870         osd.7        up      1.0     1.0
> 10   hdd  1.81870         osd.10       up      1.0     1.0
> 13   hdd  1.81870         osd.13       up      1.0     1.0
> 16   hdd  1.81870         osd.16       up      1.0     1.0
> 18   hdd  1.81870         osd.18       up      1.0     1.0
> -7       10.91217     host store03
>  2   hdd  1.81870         osd.2        up      1.0     1.0
>  3   hdd  1.81870         osd.3        up      1.0     1.0
>  4   hdd  1.81870         osd.4        up      1.0     1.0
> 11   hdd  1.81870         osd.11       up      1.0     1.0
> 14   hdd  1.81870         osd.14       up      1.0     1.0
> 17   hdd  1.81870         osd.17       up      1.0     1.0
>
>
> $ sudo ceph osd crush tree
> [
>     {
>         "id": -8,
>         "device_class": "hdd",
>         "name": "default~hdd",
>         "type": "root",
>         "type_id": 10,
>         "items": [
>             {
>                 "id": -2,
>                 "device_class": "hdd",
>                 "name": "store01~hdd",
>                 "type": "host",
>                 "type_id": 1,
>                 "items": [
>                     {
>                         "id": 0,
>                         "device_class": "hdd",
>                         "name": "osd.0",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 5,
>                         "device_class": "hdd",
>                         "name": "osd.5",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 6,
>                         "device_class": "hdd",
>                         "name": "osd.6",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 9,
>                         "device_class": "hdd",
>                         "name": "osd.9",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 12,
>                         "device_class": "hdd",
>                         "name": "osd.12",
>                         "type": "osd",
>                         "type_id": 0,
>                         "crush_weight": 1.818695,
>                         "depth": 2
>                     },
>                     {
>                         "id": 15,
>                         "device_class": "hdd",
>                         "name": "osd.15",
>                         "type": "osd",
>