Hi,
Thanks for all the feedback and suggestions. Summary of the summary:
after stopping the removal of the OSD that was waiting to be zapped
(because its disk is no longer available), the upgrade started
immediately and ran well. The cluster is now running 18.2.6! And as
Eugen said previously, I confirm that in 18.2.6 removed OSDs are no
longer considered stray daemons. I still have the feeling that Ceph
could give more useful information if:
- a cephadm message at INFO level (and visible with 'ceph orch upgrade
status') reported that the upgrade cannot proceed, with the reason
described above. This information could be given once, for example a
few minutes after entering the upgrade command if no daemon has been
upgraded yet (see the commands sketched after this list).
- a message at INFO level informed that the zap operation failed
(suggesting the use of DEBUG level for more information)
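For reference, a minimal set of standard cephadm/orchestrator commands
to inspect the current state from the CLI (the last one assumes
mgr/cephadm/log_to_cluster_level has been set to debug):
# ceph orch upgrade status
# ceph orch osd rm status
# ceph log last 100 debug cephadm
The first shows whether an upgrade is in progress and its target image,
the second lists OSDs still queued for removal/zapping, and the third
shows the recent cephadm log entries, where the zap failure currently
only appears at DEBUG level.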
About Anthony's last question: yes, the two OSDs were destroyed, as shown by:
# ceph osd tree|grep destroyed
253 hdd 16.37108 osd.253 destroyed 0 1.00000
381 hdd 16.37108 osd.381 destroyed 0 1.00000
@Eugen, regarding what I said about osd.381's device being picked up by
Ceph to replace the failed osd.381: I think it is the conjunction of
the osd.all-available-devices service placement not being set to
unmanaged (something we normally do, but as we added a few servers
recently we changed it and forgot to set it back to unmanaged) and the
fact that I zapped the device during the initial removal. Because of
this, the device appeared to be free for use... Maybe it should be
better documented that you should not zap a device intended for
definitive removal unless the osd.all-available-devices service
placement is set to unmanaged (see the sketch below)...
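As a minimal sketch of that precaution, assuming the default
osd.all-available-devices spec is the one in use:
# ceph orch apply osd --all-available-devices --unmanaged=true
# ceph orch ls osd
The first command stops cephadm from automatically creating OSDs on
free devices, and the second lets you verify that the service is now
reported as unmanaged. With that in place, a zapped device is no longer
picked up automatically and can be removed for good.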
Thanks again. Best regards,
Michel
On 30/04/2025 at 15:41, Eugen Block wrote:
Hm, I thought there was an excerpt from the osd tree, but apparently
not? Could you then please confirm that the OSDs are in fact marked as
destroyed in the osd tree?
Quoting Anthony D'Atri <anthony.da...@gmail.com>:
I'm not entirely sure what the orchestrator will do except for
clearing the pending state, and since the OSDs are already marked as
destroyed in the crush tree,
Do we know that they are? The thread shows some log messages but,
unless I'm missing it, no evidence that they were marked. When I ran
into a similar issue recently, they were not marked destroyed in the
CRUSH tree.
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io