I find that I cannot re-add a disk to a Ceph cluster after the OSD on the disk
is removed. Ceph seems to know about the existence of these disks, but not
about their "host:dev" information:
```
# ceph device ls
DEVICE                                     HOST:DEV        DAEMONS  WEAR  LIFE EXPECTANCY
SAMSUNG_MZ7L37T6_S6KHNE0T100049                                     0%    <-- should be host01:sda, was osd.0
SAMSUNG_MZ7L37T6_S6KHNE0T100050            host01:sdb      osd.1    0%
SAMSUNG_MZ7L37T6_S6KHNE0T100052                                     0%    <-- should be host02:sda
SAMSUNG_MZ7L37T6_S6KHNE0T100053            host01:sde      osd.9    0%
SAMSUNG_MZ7L37T6_S6KHNE0T100061            host01:sdf      osd.11   0%
SAMSUNG_MZ7L37T6_S6KHNE0T100062                                     0%
SAMSUNG_MZ7L37T6_S6KHNE0T100063            host01:sdc      osd.5    0%
SAMSUNG_MZ7L37T6_S6KHNE0T100064            host01:sdg      osd.13   0%
SAMSUNG_MZ7L37T6_S6KHNE0T100065                                     0%
SAMSUNG_MZ7L37T6_S6KHNE0T100066            host01:sdd      osd.7    0%
SAMSUNG_MZ7L37T6_S6KHNE0T100067                                     0%
SAMSUNG_MZ7L37T6_S6KHNE0T100068                                     0%    <-- should be host02:sdb
SAMSUNG_MZ7L37T6_S6KHNE0T100069                                     0%
SAMSUNG_MZ7L37T6_S6KHNE0T100070                                     0%
SAMSUNG_MZ7L37T6_S6KHNE0T100071                                     0%
SAMSUNG_MZ7L37T6_S6KHNE0T100072            host01:sdh      osd.15   0%
SAMSUNG_MZQL27T6HBLA-00B7C_S6CVNG0T321600  host03:nvme4n1  osd.20   0%
... omitted ...
SAMSUNG_MZQL27T6HBLA-00B7C_S6CVNG0T321608  host03:nvme8n1  osd.22   0%
```
For disk "SAMSUNG_MZ7L37T6_S6KHNE0T100049", the "HOST:DEV" field is empty,
while I believe it should be "host01:sda", as I have confirmed by running
`smartctl -i /dev/sda" on host01.
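For reference, this is roughly the check I used to match the serial to the
device (the grep filter is only there for brevity):
```
# on host01: the device listed without a HOST:DEV entry reports this serial
smartctl -i /dev/sda | grep -i 'serial number'
# Serial Number:    S6KHNE0T100049
```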
I suspect this missing information is the reason that OSDs cannot be created on
these devices, either manually or automatically. I have tried:
1. `ceph orch daemon add osd host01:/dev/sda`, which prints "Created no osd(s) on
host host01; already created?"
2. `ceph orch apply osd --all-available-devices`, which adds nothing (see the
inventory check sketched after this list).
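For what it's worth, my understanding is that the orchestrator's own view of the
disks can be refreshed and inspected with something like the following, though I
am not sure it also updates the `ceph device ls` mapping shown above:
```
# ask cephadm to rescan the disks on host01 and show whether they are considered available
ceph orch device ls host01 --wide --refresh
```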
I arrived at this situation while testing whether draining a host works: I
drained host02, removed it, and added it back via:
```
ceph orch host drain host02
ceph orch host rm host02
ceph orch host add host02 <internal_ip> --labels _admin
```
I am running Ceph 17.2.6 on Ubuntu, deployed via cephadm. FYI, the relevant
orchestrator spec for the OSD services:
```
# ceph orch ls osd --export
service_type: osd
service_name: osd
unmanaged: true
spec:
  filter_logic: AND
  objectstore: bluestore
---
service_type: osd
service_id: all-available-devices
service_name: osd.all-available-devices
placement:
  host_pattern: '*'
spec:
  data_devices:
    all: true
  filter_logic: AND
  objectstore: bluestore
```
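In case it matters, my assumption is that re-applying this spec (so the
orchestrator re-evaluates it) would look roughly like the following, with the
file name being just an example; I have not tried this yet:
```
# export the current OSD service specs and feed them back to the orchestrator
ceph orch ls osd --export > osd-specs.yaml
ceph orch apply -i osd-specs.yaml
```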
Any thoughts on what may be wrong here? Is there a way I can tell Ceph "you are
wrong about the whereabouts of these disks; forget what you know and fetch the
disk information afresh"?
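The closest thing I have found is zapping the device through the orchestrator,
as below, but it is destructive and I am not sure it clears the stale
`ceph device ls` entries, so I have held off:
```
# wipe /dev/sda on host01 so the orchestrator treats it as available again (destroys data on the disk!)
ceph orch device zap host01 /dev/sda --force
```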
Any help much appreciated!