I concur with Robert: HDD-only OSDs could explain what you're describing. I'm not sure where your number comes from (max 10 OSDs per NVMe) — the docs [0] recommend no more than 15 HDD OSDs per DB/WAL NVMe (the relevant excerpt is quoted below, after a quick spec sketch).
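If you do add NVMe for DB/WAL, you can encode that ratio in the OSD service spec so the orchestrator never oversubscribes the DB device. A minimal sketch, assuming a cephadm-managed cluster; the file name, service_id, host pattern and DB size below are placeholders, not recommendations:

    # Hypothetical spec: HDDs become data devices, flash takes the DB/WAL.
    # cephadm carves DB slots out of each device matched by db_devices,
    # so keep the HDD:NVMe ratio at or below roughly 15:1 per the docs [0].
    cat > osd-spec.yml <<'EOF'
    service_type: osd
    service_id: hdd-osds-nvme-db      # placeholder name
    placement:
      host_pattern: '*'               # narrow this to your OSD hosts
    spec:
      data_devices:
        rotational: 1                 # spinning disks hold the data
      db_devices:
        rotational: 0                 # non-rotational devices take DB/WAL
      block_db_size: 60G              # placeholder; size it for your workload
    EOF
    ceph orch apply -i osd-spec.yml

For reference, the relevant excerpt from the docs: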
> DB/WAL offload (optional):
>   1x SSD partition per HDD OSD
>   4-5x HDD OSDs per DB/WAL SATA SSD
>   <= 15 HDD OSDs per DB/WAL NVMe SSD

But you're correct about the SPOF: if one NVMe dies, all OSDs that have their DB/WAL on that NVMe die with it.

[0] https://docs.ceph.com/en/latest/start/hardware-recommendations/#minimum-hardware-recommendations

On Tue, Feb 3, 2026 at 3:00 PM, Rok Jaklič via ceph-users <[email protected]> wrote:

> We have 28 OSDs per host and can only have 2 NVMe per host (one being
> used for the OS) ... and if I remember correctly, a maximum of 10 OSDs
> per NVMe is recommended; that's why we decided to go for HDD-only
> clusters at the beginning.
>
> We have 2 clusters this way, one for HPC (no radosgw/s3) and the other
> for "users" (radosgw/s3), running for over 4 years now ... they work OK,
> performance is OK, we just have this problem where we have to do a
> gentle reweight of failed OSDs.
>
> Thanks for the info, we will consider NVMe ... although then there is a
> SPOF for those OSDs which have their DB on NVMe?
>
> Rok
>
> On Tue, Feb 3, 2026 at 2:35 PM, Robert Sander via ceph-users
> <[email protected]> wrote:
>
> > On Feb 3, 2026 at 2:31 PM, Rok Jaklič wrote:
> > > On Tue, Feb 3, 2026 at 2:26 PM, Robert Sander via ceph-users
> > > <[email protected]> wrote:
> > >
> > >> On Feb 3, 2026 at 2:18 PM, Rok Jaklič via ceph-users wrote:
> > >>
> > >> > 2 OSD(s) experiencing slow operations in BlueStore
> > >> > 2 OSD(s) experiencing stalled read in db device of BlueFS
> > >>
> > >> Are your OSDs HDD only?
> > >
> > > Yes.
> > >
> > > It does not affect users much. Usually those messages appear when we
> > > are reweighting and changing failed disks.
> >
> > These HDDs will be maxed out with the recovery work and cannot serve
> > anything else any more.
> >
> > I have seen HDD-only clusters go into the "spiral of death" because
> > the HDDs cannot answer fast enough: OSDs randomly drop out, making
> > the whole system unstable.
> >
> > RocksDB is such a random-I/O application that it is not suitable for
> > HDDs. It should always be put on flash storage (SSD/NVMe).
> >
> > Regards
> > --
> > Robert Sander
> > Linux Consultant
> >
> > Heinlein Consulting GmbH
> > Schwedter Str. 8/9b, 10119 Berlin
> > https://www.heinlein-support.de
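PS, on the "gentle reweight" and the recovery pressure Robert describes: on HDD-only clusters it usually helps to throttle backfill before swapping disks, and to drain a failing OSD in steps. A hedged sketch with stock options; the values and osd.12 are illustrative, and on recent releases the mClock scheduler may ignore the backfill/recovery knobs unless you explicitly allow overrides:

    # limit per-OSD recovery concurrency while disks are being replaced
    ceph config set osd osd_max_backfills 1
    ceph config set osd osd_recovery_max_active 1

    # drain the failing OSD gradually instead of all at once
    ceph osd reweight osd.12 0.8
    # wait for backfill to settle (ceph -s), then keep stepping down
    ceph osd reweight osd.12 0.5
    ceph osd reweight osd.12 0.2
    ceph osd out osd.12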
