Hi Gilles,

In my opinion, the timeouts you have been experiencing since your first tests are not normal at all and should be fixed before anything else. Because of its highly distributed architecture, Ceph is very sensitive to network problems, and I'm afraid you cannot expect it to work properly until you have a reliable network.

If I remember correctly, you said your cluster is deployed on a virtualized infrastructure. Are you sure there is nothing in the virtualization layer that prevents network connections from working properly?
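
As a quick sanity check, something like the following between two OSD hosts can already reveal MTU or packet-loss problems (use -s 1472 for a standard 1500 MTU, 8972 for jumbo frames; the hostname is just an example), and the cluster side should report related warnings:

ping -M do -s 1472 -c 10 osd-host-2
ceph health detail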

Best regards,

Michel
Sent from my mobile
On 21 August 2025 12:02:02, Gilles Mocellin <gilles.mocel...@nuagelibre.org> wrote:

Hi,

Having timeout problems with my cluster, I am trying to finish recreating
the OSDs that failed.

My configuration is hybrid, with 17 HDDs and 1 SSD per server.
My OSD spec is the standard one:

service_type: osd
service_id: throughput_optimized
service_name: osd.throughput_optimized
placement:
  host_pattern: '*'
spec:
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0
  objectstore: bluestore
  encrypted: true
  filter_logic: AND
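
For reference, I can re-check what this spec would match with something along these lines (the file name is just an example):

ceph orch ls osd --export > osd_spec.yml
ceph orch apply -i osd_spec.yml --dry-run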

When I remove an OSD with:
ceph orch osd rm ID --zap
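
For example, for OSD 82, and then checking progress with the status sub-command:

ceph orch osd rm 82 --zap
ceph orch osd rm status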

It is finally re-created, but without a DB device.
Here are the ceph-volume commands:

Zapping:
cephadm ['--image',
'quay.io/ceph/ceph@sha256:7c69e59beaeea61ca714e71cb84ff6d5e533db7f1fd84143dd9ba6649a5fd2ec',
'--timeout', '2395', 'ceph-volume', '--fsid',
'8ec7575a-7de5-11f0-a78a-246e96bd90a4', '--', 'lvm', 'zap', '--osd-id',
'82', '--destroy']

Recreate:
cephadm ['--env', 'CEPH_VOLUME_OSDSPEC_AFFINITY=throughput_optimized',
'--image',
'quay.io/ceph/ceph@sha256:7c69e59beaeea61ca714e71cb84ff6d5e533db7f1fd84143dd9ba6649a5fd2ec',
'--timeout', '2395', 'ceph-volume', '--fsid',
'8ec7575a-7de5-11f0-a78a-246e96bd90a4', '--config-json', '-', '--',
'lvm', 'batch', '--no-auto', '/dev/sde', '--dmcrypt', '--osd-ids', '82',
'--yes', '--no-systemd']

Shouldn't it have the --db-devices argument?
I can see that option when the OSD spec was first applied; it launches for all
drives on the host:

cephadm ['--env', 'CEPH_VOLUME_OSDSPEC_AFFINITY=throughput_optimized',
'--image',
'quay.io/ceph/ceph@sha256:7c69e59beaeea61ca714e71cb84ff6d5e533db7f1fd84143dd9ba6649a5fd2ec',
'--timeout', '2395', 'ceph-volume', '--fsid',
'8ec7575a-7de5-11f0-a78a-246e96bd90a4', '--config-json', '-', '--',
'lvm', 'batch', '--no-auto', '/dev/sda', '/dev/sdb', '/dev/sdc',
'/dev/sdd', '/dev/sde', '/dev/sdf', '/dev/sdg', '/dev/sdh', '/dev/sdi',
'/dev/sdj', '/dev/sdk', '/dev/sdl', '/dev/sdm', '/dev/sdn', '/dev/sdo',
'/dev/sdp', '/dev/sdq', '--db-devices', '/dev/sdr', '--dmcrypt',
'--yes', '--no-systemd']

Here, the --db-devices argument is present.
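
For comparison, I would have expected the single-OSD recreate to pass the DB device too, roughly like this (assuming /dev/sdr is the SSD on that host):

ceph-volume lvm batch --no-auto /dev/sde --db-devices /dev/sdr --dmcrypt --osd-ids 82 --yes --no-systemd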

Is this a bug, or am I doing something wrong?

I now have several OSDs without a DB device, and I want to recreate them
with one.
I also want every newly created OSD to get a DB device, or to fail if there
is no room left, but never to be created without a DB device when my OSD
spec says it should have one.
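
In the meantime, would the right workaround be to recreate them by hand with something like the following (host name is hypothetical), or would that conflict with the spec?

ceph orch daemon add osd my-host:data_devices=/dev/sde,db_devices=/dev/sdr
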
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
