Hi,

the output in your last message is normal for an OSD that hasn't reported its status back yet.

I would check the ceph-volume.log and the cephadm.log, and maybe an OSD log as well if the daemon actually tried to boot. If this is a test cluster, did you properly wipe all disks before trying to deploy OSDs? For me, 'cephadm ceph-volume lvm zap --destroy /dev/sdx /dev/sdy /dev/sdz ...' (run locally on the host) has been working great for years now. You can zap them with the orchestrator as well, but only one disk at a time, so a for loop would be useful.
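For the orchestrator route, a loop along these lines should do it. The host name and device paths below are placeholders for illustration, and the leading 'echo' makes it a dry run that only prints the commands; remove it to actually zap:

```shell
# Dry-run sketch: zap each device via the orchestrator, one disk at a time.
# HOST and the device list are placeholders -- substitute your own inventory.
HOST=ceph-node-01
for dev in /dev/sdb /dev/sdc /dev/sdd; do
    # 'echo' prints the command instead of running it; drop it to execute.
    echo ceph orch device zap "$HOST" "$dev" --force
done
```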

There have been reports every now and then from users who tried to deploy many disks per host; I don't have a link available right now. And I haven't had the chance yet to deploy multiple hosts/many OSDs with 19.2.3, so there might be a regression in ceph-volume.

Regards,
Eugen

Quoting Gilles Mocellin <gilles.mocel...@nuagelibre.org>:

On 2025-08-18 11:47, Gilles Mocellin wrote:
On 2025-08-18 11:30, Gilles Mocellin wrote:
Hi Cephers,

I'm building a new Squid cluster with cephadm on Ubuntu 24.04.
After expanding my cluster in the Dashboard (adding my 7 hosts),
I chose the throughput_optimized profile, which creates a generic spec for hybrid HDD/SSD:

service_type: osd
service_id: throughput_optimized
service_name: osd.throughput_optimized
placement:
 host_pattern: '*'
spec:
 data_devices:
   rotational: 1
 db_devices:
   rotational: 0
 encrypted: true
 filter_logic: AND
 objectstore: bluestore

The cluster is for a lab environment; on each of the 7 nodes I have 17 SAS HDD 1.2 TB drives and 1 enterprise SAS SSD 400 GB drive. On my first try, only 28 OSDs were created (out of the 119); the others appeared as down but wouldn't start, and I didn't find any systemd units created on the hosts. However, the VGs and LVs were created: there are 17 LVs on the SSD for the WAL/DB of the 17 HDDs (yes, small: 29 GB each).

On my second try, it created 72 OSDs, but then it stopped again and never tried to continue or re-create the down OSDs.

I didn't manage to find them again, but I think I saw some OSD creation timeouts in the logs...

What can I do to get my missing OSDs created?


Some additional information :

ceph -s
 cluster:
   id:     3ebf83bf-7927-11f0-9f3a-246e96bd90a4
   health: HEALTH_OK

 services:
    mon: 5 daemons, quorum fidcl-lyo1-sto-sds-lab-01,fidcl-lyo1-sto-sds-lab-02,fidcl-lyo1-sto-sds-lab-03,fidcl-lyo1-sto-sds-lab-04,fidcl-lyo1-sto-sds-lab-05 (age 3d)
    mgr: fidcl-lyo1-sto-sds-lab-01.ovbjpb(active, since 3d), standbys: fidcl-lyo1-sto-sds-lab-02.nqdhpl, fidcl-lyo1-sto-sds-lab-03.cizytz
   osd: 119 osds: 72 up (since 3d), 89 in (since 41m)

 data:
   pools:   1 pools, 1 pgs
   objects: 2 objects, 769 KiB
   usage:   1.5 TiB used, 79 TiB / 80 TiB avail
   pgs:     1 active+clean


One of the "missing" (not fully created) OSDs is present, but not found:

ceph osd find 6
{
   "osd": 6,
   "addrs": {
       "addrvec": []
   },
   "osd_fsid": "f8745284-8026-4713-8329-cd57cb6842f7",
   "crush_location": {}
}

ceph osd info 6
osd.6 down out weight 0 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) autoout,exists,new f8745284-8026-4713-8329-cd57cb6842f7

The missing OSDs are not shown by the device list command:
ceph device ls | grep osd.6
is empty...


Also, the MGR reports this for all the not fully created OSDs:
Aug 18 10:09:41 fidcl-lyo1-sto-sds-lab-01 ceph-mgr[4186078]: mgr get_metadata_python Requested missing service osd.99
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

