The question then is why this happens only on the s3 cluster and not also on the non-s3 cluster, where write IO during recovery/backfilling is actually much higher than on the s3 cluster?
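For context, this is roughly how I check which OSDs are affected while backfill is running. Just a sketch; osd.12 stands in for one of the flagged OSD ids:

    # show which OSDs currently raise the BlueStore/BlueFS warnings (admin node)
    ceph health detail

    # BlueFS/BlueStore perf counters of a flagged OSD (run on the host carrying osd.12)
    ceph daemon osd.12 perf dump | grep -A 20 '"bluefs"'

    # operations currently stuck on that OSD (also via its admin socket)
    ceph daemon osd.12 dump_ops_in_flight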
On Tue, Feb 3, 2026 at 3:27 PM Eugen Block via ceph-users <[email protected]> wrote:

> I concur with Robert's statement, having HDDs only could explain what
> you're describing. Not sure where you get your number from (10 DB devices
> max per NVMe), but the docs [0] state to not have more than 15 OSDs per
> NVMe:
>
> > DB/WAL offload (optional)
> >   1x SSD partition per HDD OSD
> >   4-5x HDD OSDs per DB/WAL SATA SSD
> >   <= 15 HDD OSDs per DB/WAL NVMe SSD
>
> But you're correct about the SPOF: if one NVMe dies, all OSDs that have
> their DB/WAL on that NVMe die as well.
>
> [0]
> https://docs.ceph.com/en/latest/start/hardware-recommendations/#minimum-hardware-recommendations
>
> On Tue, Feb 3, 2026 at 3:00 PM, Rok Jaklič via ceph-users
> <[email protected]> wrote:
>
> > We have 28 OSDs per host and we can only have 2 NVMe per host (one being
> > used for the OS) ... and if I remember correctly a maximum of 10 OSDs per
> > NVMe is recommended, which is why we decided to go for HDD-only clusters
> > at the beginning.
> >
> > We have 2 clusters this way, one for HPC (no radosgw/s3) and the other
> > for "users" (radosgw/s3), running for over 4 years now ... works ok,
> > performance is ok, we just have this problem where we have to do a gentle
> > reweight of failed OSDs.
> >
> > Thanks for the info, we will consider NVMe ... although then there is a
> > SPOF for those OSDs which have their DB on NVMe?
> >
> > Rok
> >
> > On Tue, Feb 3, 2026 at 2:35 PM Robert Sander via ceph-users
> > <[email protected]> wrote:
> >
> > > On Feb 3, 2026 at 2:31 PM, Rok Jaklič wrote:
> > > > On Tue, Feb 3, 2026 at 2:26 PM Robert Sander via ceph-users
> > > > <[email protected]> wrote:
> > > >
> > > >> On Feb 3, 2026 at 2:18 PM, Rok Jaklič via ceph-users wrote:
> > > >>
> > > >> >     2 OSD(s) experiencing slow operations in BlueStore
> > > >> >     2 OSD(s) experiencing stalled read in db device of BlueFS
> > > >>
> > > >> Are your OSDs HDD only?
> > > >
> > > > Yes.
> > > >
> > > > It does not affect users much. Usually those messages appear when we
> > > > are reweighting and changing failed disks.
> > >
> > > These HDDs will be maxed out with the recovery work and cannot serve
> > > anything else any more.
> > >
> > > I have seen HDD-only clusters go into the "spiral of death" because
> > > the HDDs cannot answer fast enough, with OSDs randomly dropping out
> > > and making the whole system unstable.
> > >
> > > RocksDB is such a random-IO application that it is not suitable for
> > > HDDs. It should always be put on flash storage (SSD/NVMe).
> > >
> > > Regards
> > > --
> > > Robert Sander
> > > Linux Consultant
> > >
> > > Heinlein Consulting GmbH
> > > Schwedter Str. 8/9b, 10119 Berlin
> > >
> > > https://www.heinlein-support.de
> > >
> > > Tel: +49 30 405051 - 0
> > > Fax: +49 30 405051 - 19
> > >
> > > Amtsgericht Berlin-Charlottenburg - HRB 220009 B
> > > Geschäftsführer: Peer Heinlein - Sitz: Berlin
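As for the "gentle reweight" mentioned above, what we do is roughly the following. A sketch only: osd.42 and the weight steps are placeholders, the throttling values are examples rather than recommendations, and on newer releases the mClock scheduler may override these settings:

    # throttle recovery/backfill so the HDDs are not completely saturated
    ceph config set osd osd_max_backfills 1
    ceph config set osd osd_recovery_max_active 1

    # step the failed OSD's CRUSH weight down in small increments,
    # waiting for backfill to finish between steps
    ceph osd crush reweight osd.42 2.0
    # ... wait for backfill to settle ...
    ceph osd crush reweight osd.42 1.0
    # ... wait again ...
    ceph osd crush reweight osd.42 0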

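If we do go the NVMe route for DB/WAL, something like the following OSD spec is what I have in mind, keeping Eugen's SPOF caveat in mind. This is only a sketch and assumes cephadm-managed clusters; the service id and host pattern are made up:

    # sketch of a cephadm OSD spec placing RocksDB (DB/WAL) on non-rotational devices
    cat > osd-hdd-nvme-db.yaml <<'EOF'
    service_type: osd
    service_id: hdd_with_nvme_db
    placement:
      host_pattern: 'ceph-node-*'
    spec:
      data_devices:
        rotational: 1    # HDDs hold the data
      db_devices:
        rotational: 0    # NVMe/SSD holds the DB/WAL
    EOF

    # preview what cephadm would create before applying for real
    ceph orch apply -i osd-hdd-nvme-db.yaml --dry-run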