Hi Ceph Users,

I am planning a major upgrade for our production cluster from Reef (18.2.7)
to Squid (19.2.3) and would like advice on stability and
potential risks.

Infrastructure Overview:

+ Deployment: Cephadm.
+ Cluster Size: 7 Nodes total.
+ Hardware/Virtualization: Each node is a Virtual Machine hosted on Proxmox.
+ OSD Layout: 7 OSDs total (1 OSD per node).
+ Other Daemons: 5 MONs, 7 MDSs, 3 MGRs. Services include RGW, CephFS, and
Block Devices.
+ Pool Type: Replicated.

The Concern: The cluster is currently stable on Reef 18.2.7. However, I
have been following recent discussions on the mailing list and tracker
regarding critical failures when upgrading to Squid, specifically
concerning OSD crashes and data corruption.
I am particularly worried about the issues reported in these threads, where
users experienced failures during or after the upgrade:
+ https://www.mail-archive.com/[email protected]/msg30399.html
+ https://www.mail-archive.com/[email protected]/msg31238.html
+ https://tracker.ceph.com/issues/70390

Given our deployment topology and the jump to a new major version (Squid),
I have a few questions:
+ Stability & Success Stories: Is Squid 19.2.3 considered safe with respect
to the OSD crash/corruption bugs described in the links above? Has anyone
in the community completed the upgrade from 18.2.7 to 19.2.3 without
issues? Confirmation of a clean upgrade path would be very reassuring.
+ Upgrade Path: Are there any known regressions or critical "gotchas" when
moving from 18.2.7 directly to 19.2.3 in a virtualized environment?

Any experiences or warnings from those running Squid in similar
environments would be greatly appreciated.

Thank you,
Van Tran
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
