Did any of that testing involve a degraded cluster, backfilling, peering,
etc.? A healthy cluster running normally can use up to 4x less memory and
CPU than a cluster that is consistently peering and degraded.
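
If you only measure the happy path it is easy to undersize, so a rough
back-of-envelope sketch of planning for the degraded case may help (the 4x
multiplier is the figure above; the per-OSD baseline numbers are placeholders
to be replaced with measurements from your own cluster):

```python
# Rough provisioning sketch: size for recovery/peering load, not the steady state.
# The 4x factor is the multiplier mentioned above; the per-OSD baseline figures
# are placeholders -- substitute measurements from your own cluster.

HEALTHY_RAM_GB_PER_OSD = 2.5   # placeholder steady-state figure
HEALTHY_GHZ_PER_OSD = 0.5      # placeholder steady-state figure
DEGRADED_MULTIPLIER = 4        # "up to 4x" while peering/backfilling

def node_requirements(num_osds):
    """Estimate per-node RAM/CPU headroom needed to ride out sustained recovery."""
    return {
        "ram_gb": num_osds * HEALTHY_RAM_GB_PER_OSD * DEGRADED_MULTIPLIER,
        "ghz":    num_osds * HEALTHY_GHZ_PER_OSD * DEGRADED_MULTIPLIER,
    }

print(node_requirements(12))   # e.g. a 12-OSD node -> {'ram_gb': 120.0, 'ghz': 24.0}
```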

On Sat, Aug 12, 2017, 2:40 PM Nick Fisk <n...@fisk.me.uk> wrote:

> I was under the impression the memory requirements for Bluestore would be
> around 2-3GB per OSD regardless of capacity.
>
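As a quick comparison, a minimal sizing sketch contrasting that flat per-OSD
figure with the 1GB of RAM per TB rule discussed further down the thread (the
drive size and OSD count below are example inputs only, not recommendations):

```python
# Two RAM sizing rules mentioned in this thread:
#   (a) a flat ~2-3 GB per Bluestore OSD, regardless of capacity
#   (b) the classic 1 GB of RAM per TB of OSD storage

def ram_flat_per_osd(num_osds, gb_per_osd=3.0):
    return num_osds * gb_per_osd

def ram_per_tb(num_osds, tb_per_osd, gb_per_tb=1.0):
    return num_osds * tb_per_osd * gb_per_tb

osds, drive_tb = 12, 8                 # example node: 12 x 8 TB drives
print(ram_flat_per_osd(osds))          # 36.0 GB  (flat rule, upper end)
print(ram_per_tb(osds, drive_tb))      # 96.0 GB  (1 GB/TB rule)
# Whichever rule you size to, recovery/peering needs headroom on top.
```
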
> CPU wise, I would lean towards working out how much total GHz you require
> and then get whatever CPU you need to get there, but with a preference for
> GHz over cores. Yes, there will be a slight overhead to having more threads
> running on a lower number of cores, but I believe this is fairly minimal
> compared to the speed boost the single-threaded portion of the data path in
> each OSD gets from running on a faster core. Each PG takes a lock for each
> operation, so any other requests for the same PG queue up and are processed
> sequentially. The faster you can get through this stage, the better. I'm
> pretty sure that if you graphed PG activity on an average cluster, you would
> see a heavy skew towards a certain number of PGs being hit more often than
> others. I think Mark N has been experiencing the effects of the PG locking
> in recent tests.
>
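To make the GHz-over-cores point concrete, here is a toy calculation (the
per-op cycle cost is an arbitrary placeholder, not a measurement): since ops
against the same PG serialize behind its lock, the time to drain a burst on
one hot PG scales with single-core clock speed, and extra cores elsewhere do
not shorten that queue.

```python
# Toy model of PG-lock serialization: N ops queued against the SAME PG are
# processed one after another, so draining that queue depends on single-core
# speed, not on how many cores (or HT siblings) the node has.

CYCLES_PER_OP = 2_000_000           # arbitrary placeholder cost per op

def drain_time_ms(queued_ops, core_ghz):
    cycles = queued_ops * CYCLES_PER_OP
    return cycles / (core_ghz * 1e9) * 1000.0

print(drain_time_ms(32, 3.5))       # ~18 ms on a 3.5 GHz core
print(drain_time_ms(32, 2.1))       # ~30 ms on a 2.1 GHz core; more cores won't help
```
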
> Also, don't forget to make sure your CPUs are held at C-state C1 and their
> maximum frequency. This can sometimes give up to a 4x reduction in latency.
>
> Also, if you look at the number of threads running on an OSD node, it will
> be in the tens to hundreds of threads; each OSD process itself has several
> threads. So don't assume that 12 OSDs means you need a 12-core processor.
>
> I did some tests to measure CPU usage per IO, which you may find useful.
>
> http://www.sys-pro.co.uk/how-many-mhz-does-a-ceph-io-need/
>
> I can max out 12 x 7.2k disks on an E3-1240 CPU and it's only running at
> about 15-20%.
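
A rough back-calculation from those figures (the per-drive IOPS and the
E3-1240's ~3.3 GHz x 4 cores are assumptions for the sketch, not numbers from
the linked post):

```python
# Back-of-envelope from the figures above: 12 x 7.2k HDDs maxed out while an
# E3-1240 sits at ~15-20% utilisation. Drive IOPS and the CPU's clock/core
# count below are assumptions, not measurements.

HDD_IOPS = 75                        # rough figure for a 7.2k SATA drive
NUM_DISKS = 12
E3_1240_MHZ = 4 * 3300               # 4 cores @ ~3.3 GHz

total_iops = NUM_DISKS * HDD_IOPS    # ~900 IOPS
for util in (0.15, 0.20):
    mhz_used = E3_1240_MHZ * util
    print(f"~{mhz_used / total_iops:.1f} MHz per IO at {util:.0%} CPU")
# -> roughly 2-3 MHz per HDD IO by this estimate
```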
>
> I haven't done any proper Bluestore tests, but from some rough testing the
> CPU usage wasn't too dissimilar from Filestore.
>
> Depending on whether you are running HDDs or SSDs, and how many per node, I
> would possibly look at the single-socket E3s or E5s.
>
> Having said that, the recent AMD and Intel announcements also put some
> potentially interesting single-socket options for Ceph in the mix.
>
> Hope that helps.
>
> Nick
>
> > -----Original Message-----
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> > Of Stijn De Weirdt
> > Sent: 12 August 2017 14:41
> > To: David Turner <drakonst...@gmail.com>; ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] luminous/bluestore osd memory requirements
> >
> > hi david,
> >
> > sure i understand that. but how bad does it get when you oversubscribe
> > OSDs? if context switching itself is dominant, then using HT should
> > allow running twice as many OSDs on the same CPU (one OSD per HT
> > core); but if the issue is actual cpu cycles, HT won't help that much
> > either (1 OSD per HT core vs 2 OSDs per physical core).
> >
> > i guess the reason for this is that OSD processes have lots of threads?
> >
> > maybe i can run some tests on a ceph test cluster myself ;)
> >
> > stijn
> >
> >
> > On 08/12/2017 03:13 PM, David Turner wrote:
> > > The reason for an entire core per OSD is that it's trying to avoid
> > > context switching your CPU to death. If you have a quad-core
> > > processor with HT, I wouldn't recommend more than 8 OSDs on the box.
> > > I would probably go with 7 myself to keep one core available for
> > > system operations. This recommendation has nothing to do with GHz.
> > > Higher GHz per core will likely improve your cluster latency. Of
> > > course, if your use case says that you only need very minimal
> > > throughput, there is no need to hit or exceed the recommendation.
> > > The number-of-cores recommendation is not changing for bluestore. It
> > > might add a recommendation of how fast your processor should be, but
> > > making it based on how much GHz per TB is an invitation to
> > > context-switch to death.
> > >
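A minimal sketch of that ceiling (reserving one thread for the OS is the
suggestion above; counting hardware threads this way is only a rule of
thumb):

```python
# Rule of thumb from the advice above: at most one OSD per hardware thread,
# minus one kept free for system operations (quad-core + HT -> 8 max, 7 suggested).

def osd_ceiling(physical_cores, hyperthreading=True):
    threads = physical_cores * (2 if hyperthreading else 1)
    return threads, max(threads - 1, 0)       # (hard ceiling, suggested count)

print(osd_ceiling(4))                         # (8, 7) -- the quad-core-with-HT example
print(osd_ceiling(12, hyperthreading=False))  # (12, 11)
```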
> > > On Sat, Aug 12, 2017, 8:40 AM Stijn De Weirdt
> > > <stijn.dewei...@ugent.be>
> > > wrote:
> > >
> > >> hi all,
> > >>
> > >> thanks for all the feedback. it's clear we should stick to the
> > >> 1GB/TB for the memory.
> > >>
> > >> any (changes to) recommendation for the CPU? in particular, is it
> > >> still the rather vague "1 HT core per OSD" (or was it "1 1GHz HT
> > >> core per OSD"?)? it would be nice if we had some numbers like
> > >> required specint per TB and/or per Gb/s. also, any indication how
> > >> much more CPU EC uses (10%, 100%, ...)?
> > >>
> > >> i'm aware that this also depends on the use case, but i'll take
> > >> any pointers i can get. we will probably end up overprovisioning,
> > >> but it would be nice if we can avoid a whole CPU (32GB DIMMs are
> > >> cheap, so lots of RAM with a single socket is really possible these
> > >> days).
> > >>
> > >> stijn
> > >>
> > >> On 08/10/2017 05:30 PM, Gregory Farnum wrote:
> > >>> This has been discussed a lot in the performance meetings so I've
> > >>> added Mark to discuss. My naive recollection is that the
> > >>> per-terabyte recommendation will be more realistic than it was in
> > >>> the past (an effective increase in memory needs), but also that it
> > >>> will be under much better control than previously.
> > >>>
> > >>> On Thu, Aug 10, 2017 at 1:35 AM Stijn De Weirdt
> > >>> <stijn.dewei...@ugent.be
> > >>>
> > >>> wrote:
> > >>>
> > >>>> hi all,
> > >>>>
> > >>>> we are planning to purchase new OSD hardware, and we are wondering
> > >>>> if, for upcoming luminous with bluestore OSDs, anything wrt the
> > >>>> hardware recommendations from
> > >>>> http://docs.ceph.com/docs/master/start/hardware-recommendations/
> > >>>> will be different, especially the memory/CPU part. i understand
> > >>>> from colleagues that the async messenger makes a big difference
> > >>>> in memory usage (maybe also CPU load?); but we are also
> > >>>> interested in the "1GB of RAM per TB" recommendation/requirement.
> > >>>>
> > >>>> many thanks,
> > >>>>
> > >>>> stijn