Re: [ceph-users] luminous/bluetsore osd memory requirements
On 08/14/2017 02:42 PM, Nick Fisk wrote:
> On 14 August 2017 18:55, Ronny Aasen wrote:
>> Is there any way to tune or reduce the memory footprint? Perhaps by
>> sacrificing performance? Our Jewel cluster OSD servers are maxed out on
>> memory, and with the added memory requirements I fear we may not be
>> able to upgrade to luminous/bluestore.
>
> Check out this PR, it shows the settings that control the memory used
> for cache, and their defaults: https://github.com/ceph/ceph/pull/16157

Hey guys, sorry for the late reply. The gist of it is that bluestore uses memory in a couple of different ways:

1) various internal buffers and such
2) bluestore-specific cache (unencoded onodes, extents, etc.)
3) rocksdb block cache
   3a) encoded data from bluestore
   3b) bloom filters and table indexes
4) other rocksdb memory, etc.

Right now, when you set the bluestore cache size, it first favors the rocksdb block cache up to 512MB and then starts favoring the bluestore onode cache. Even without bloom filters, that seems to improve bluestore performance with small cache sizes. With bloom filters it is likely even more important to feed whatever you can to rocksdb's block cache, to keep the table indexes and bloom filters in memory as much as possible. It's unclear right now how quickly we should let the block cache grow as the number of objects increases. Prior to using bloom filters, favoring the onode cache appeared to be better.
Now we probably want to favor both the bloom filters and bluestore's onode cache. So the first order of business is to see how changing the bluestore cache size affects you. Bluestore's default behavior of favoring the rocksdb block cache (and specifically the bloom filters) first is probably still decent, but you may want to play around with it if you expect a lot of small objects and limited memory. For really low-memory scenarios you could also try reducing the rocksdb buffer sizes, but smaller buffers will give you higher write-amplification. This PR may help, though: https://github.com/ceph/rocksdb/pull/19

You might be able to lower memory further with smaller PG/OSD maps, but at some point you start hitting diminishing returns.

Mark

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
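[Editor's note: as a concrete illustration of the knobs discussed above, a minimal ceph.conf sketch follows. The option names come from the linked PR; the byte values are illustrative assumptions for a memory-constrained node, not recommendations. Verify the exact names and defaults on your release, e.g. with `ceph daemon osd.N config show`.]

```ini
[osd]
# Total cache budget per bluestore OSD, in bytes.
# Shrink this first on low-memory nodes (illustrative: 1 GiB).
bluestore_cache_size = 1073741824

# Cap on how much of that budget is given to the rocksdb block cache
# (which holds the bloom filters and table indexes) before the
# remainder goes to the bluestore onode cache (illustrative: 512 MiB).
bluestore_cache_kv_max = 536870912
```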
Re: [ceph-users] luminous/bluetsore osd memory requirements
> From: Ronny Aasen
> Sent: 14 August 2017 18:55
> Subject: Re: [ceph-users] luminous/bluetsore osd memory requirements
>
> Is there any way to tune or reduce the memory footprint? Perhaps by
> sacrificing performance? Our Jewel cluster OSD servers are maxed out on
> memory, and with the added memory requirements I fear we may not be
> able to upgrade to luminous/bluestore.

Check out this PR, it shows the settings that control the memory used for cache, and their defaults: https://github.com/ceph/ceph/pull/16157
Re: [ceph-users] luminous/bluetsore osd memory requirements
On 10.08.2017 17:30, Gregory Farnum wrote:
> This has been discussed a lot in the performance meetings so I've added
> Mark to discuss. My naive recollection is that the per-terabyte
> recommendation will be more realistic than it was in the past (an
> effective increase in memory needs), but also that it will be under
> much better control than previously.

Is there any way to tune or reduce the memory footprint? Perhaps by sacrificing performance? Our Jewel cluster OSD servers are maxed out on memory, and with the added memory requirements I fear we may not be able to upgrade to luminous/bluestore.

kind regards
Ronny Aasen
Re: [ceph-users] luminous/bluetsore osd memory requirements
Hi there,

can someone share her/his experiences regarding this question? Maybe differentiated according to the different available algorithms?

On Sat, 12 Aug 2017 14:40:05 +0200, Stijn De Weirdt wrote:
> also any indication how much more cpu EC uses (10%, 100%, ...)?

I would also be interested in the hardware recommendations for the newly introduced ceph-mgr daemon. The big search engines don't tell me anything about this yet.

Thanks in advance
Lars
Re: [ceph-users] luminous/bluetsore osd memory requirements
Hi David,

No serious testing, but I have had various disks fail, nodes go offline, etc. over the last 12 months, and I'm still only seeing 15-20% CPU max for user+system.

From: David Turner
Sent: 12 August 2017 21:20

> Did you do any of that testing to involve a degraded cluster,
> backfilling, peering, etc.? A healthy cluster running normally
> sometimes uses 4x less memory and CPU resources than a cluster that is
> consistently peering and degraded.
> [earlier messages quoted in full below in the thread]
Re: [ceph-users] luminous/bluetsore osd memory requirements
Did you do any of that testing to involve a degraded cluster, backfilling, peering, etc.? A healthy cluster running normally sometimes uses 4x less memory and CPU resources than a cluster that is consistently peering and degraded.

On Sat, Aug 12, 2017, 2:40 PM Nick Fisk <n...@fisk.me.uk> wrote:
> I was under the impression the memory requirements for Bluestore would
> be around 2-3GB per OSD regardless of capacity.
> [rest of Nick's message snipped; quoted in full below in the thread]
Re: [ceph-users] luminous/bluetsore osd memory requirements
I was under the impression the memory requirements for Bluestore would be around 2-3GB per OSD regardless of capacity.

CPU-wise, I would lean towards working out how much total GHz you require and then getting whatever CPU you need to get there, with a preference for GHz over cores. Yes, there will be a slight overhead to having more threads running on a lower number of cores, but I believe this is fairly minimal in comparison to the speed boost the single-threaded portion of the data path in each OSD gets from running on a faster core. Each PG takes a lock for each operation, so any other requests for the same PG will queue up and be processed sequentially. The faster you can process through this stage, the better. I'm pretty sure that if you graphed PG activity on an average cluster, you would see a high skew towards a certain number of PGs being hit more often than others. I think Mark N has been seeing the effects of this PG locking in recent tests.

Also, don't forget to make sure your CPUs are running at C-state C1 and max frequency. This can sometimes give up to a 4x reduction in latency.

Also, if you look at the number of threads running on an OSD node, it will be in the tens to hundreds; each OSD process itself has several threads. So don't think that 12 OSDs = a 12-core processor.

I did some tests to measure CPU usage per IO, which you may find useful: http://www.sys-pro.co.uk/how-many-mhz-does-a-ceph-io-need/

I can max out 12x 7.2k disks on an E3 1240 CPU and it's only running at about 15-20%. I haven't done any proper Bluestore tests, but from some rough testing the CPU usage wasn't too dissimilar from Filestore.

Depending on whether you are running HDDs or SSDs and how many per node, I would possibly look at the single-socket E3s or E5s. Although, saying that, the recent AMD and Intel announcements also have some potentially interesting single-socket Ceph options in the mix.

Hope that helps.

Nick

> From: Stijn De Weirdt
> Sent: 12 August 2017 14:41
>
> sure i understand that. but how bad does it get when you oversubscribe
> OSDs?
> [rest of quoted thread snipped; quoted in full below in the thread]
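[Editor's note: Nick's "work out total GHz first" approach can be sketched as a back-of-the-envelope calculation. The `mhz_per_io` cost below is a hypothetical placeholder, not a measured number; measure it on your own hardware along the lines of the linked article.]

```python
import math

def required_cpu_ghz(target_iops: int, mhz_per_io: float) -> float:
    """Total CPU frequency (GHz) needed to sustain target_iops,
    given a measured per-IO CPU cost in MHz."""
    return target_iops * mhz_per_io / 1000.0

def cores_needed(total_ghz: float, ghz_per_core: float) -> int:
    """Cores needed at a given per-core clock, rounded up."""
    return math.ceil(total_ghz / ghz_per_core)

# Example: 12 x 7.2k HDDs at roughly 100 IOPS each = 1200 IOPS,
# with an assumed (hypothetical) cost of 3 MHz per IO:
total = required_cpu_ghz(1200, 3.0)
print(total)                     # 3.6 GHz total
print(cores_needed(total, 3.5))  # 2 cores at 3.5 GHz
```

This is only the raw data-path cost; it ignores recovery, peering and scrub overhead, which (per David's point above) can multiply usage several times over.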
Re: [ceph-users] luminous/bluetsore osd memory requirements
hi david,

sure i understand that. but how bad does it get when you oversubscribe OSDs? if context switching itself is dominant, then using HT should allow running double the number of OSDs on the same CPU (one OSD per HT core); but if the issue is actual cpu cycles, HT won't help that much either (1 OSD per HT core vs 2 OSDs per physical core).

i guess the reason for this is that OSD processes have lots of threads?

maybe i can run some tests on a ceph test cluster myself ;)

stijn

On 08/12/2017 03:13 PM, David Turner wrote:
> The reason for an entire core per osd is that it's trying to avoid
> context switching your CPU to death.
> [rest snipped; quoted in full below in the thread]
Re: [ceph-users] luminous/bluetsore osd memory requirements
The reason for an entire core per osd is that it's trying to avoid context switching your CPU to death. If you have a quad-core processor with HT, I wouldn't recommend more than 8 osds on the box. I would probably go with 7 myself, to keep one core available for system operations. This recommendation has nothing to do with GHz. Higher GHz per core will likely improve your cluster latency. Of course, if your use case says that you only need very minimal through-put, there is no need to hit or exceed the recommendation. The number-of-cores recommendation is not changing for bluestore. It might add a recommendation of how fast your processor should be, but making it based on how much GHz per TB is an invitation to context-switch to death.

On Sat, Aug 12, 2017, 8:40 AM Stijn De Weirdt wrote:
> thanks for all the feedback. it's clear we should stick to the 1GB/TB
> for the memory.
> [rest snipped; quoted in full below in the thread]
Re: [ceph-users] luminous/bluetsore osd memory requirements
hi all,

thanks for all the feedback. it's clear we should stick to the 1GB/TB for the memory.

any (changes to) recommendation for the CPU? in particular, is it still the rather vague "1 HT core per OSD" (or was it "1 1GHz HT core per OSD")? it would be nice if we had some numbers like required specint per TB and/or per Gb/s. also any indication how much more cpu EC uses (10%, 100%, ...)?

i'm aware that this also depends on the use case, but i'll take any pointers i can get. we will probably end up overprovisioning, but it would be nice if we can avoid a whole cpu (32GB dimms are cheap, so lots of ram with a single socket is really possible these days).

stijn

On 08/10/2017 05:30 PM, Gregory Farnum wrote:
> This has been discussed a lot in the performance meetings so I've added
> Mark to discuss.
> [rest snipped; quoted in full below in the thread]
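[Editor's note: the two memory rules of thumb mentioned in this thread diverge noticeably for large drives. A quick sketch, treating both figures as the rough approximations they are, not official sizing guidance:]

```python
def filestore_ram_gb(osd_count: int, tb_per_osd: float) -> float:
    """Old rule of thumb: ~1 GB of RAM per TB of OSD capacity."""
    return osd_count * tb_per_osd * 1.0

def bluestore_ram_gb(osd_count: int, gb_per_osd: float = 3.0) -> float:
    """Per-OSD estimate from this thread: ~2-3 GB per OSD regardless
    of capacity (the upper end is assumed here)."""
    return osd_count * gb_per_osd

# 12 OSDs of 8 TB each:
print(filestore_ram_gb(12, 8))  # 96.0 GB by the per-TB rule
print(bluestore_ram_gb(12))     # 36.0 GB by the per-OSD estimate
```

Neither figure includes headroom for recovery/backfill, which (as noted earlier in the thread) can push usage well above a healthy-cluster baseline.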
Re: [ceph-users] luminous/bluetsore osd memory requirements
This has been discussed a lot in the performance meetings so I've added Mark to discuss. My naive recollection is that the per-terabyte recommendation will be more realistic than it was in the past (an effective increase in memory needs), but also that it will be under much better control than previously.

On Thu, Aug 10, 2017 at 1:35 AM Stijn De Weirdt wrote:
> we are planning to purchase new OSD hardware, and we are wondering if
> anything in the hardware recommendations will be different for upcoming
> luminous with bluestore OSDs.
> [rest snipped; quoted in full below in the thread]
Re: [ceph-users] luminous/bluetsore osd memory requirements
> On 10 August 2017 at 11:14, Marcus Haarmann
> <marcus.haarm...@midoco.de> wrote:
>
> we have done some testing with bluestore and found that the memory
> consumption of the osd processes depends not on the real amount of data
> stored but on the number of stored objects.

Yes, the number of objects and PGs will determine how much memory an OSD will use.

> So it depends on the usage: a cephfs stores each file as a single
> object, while rbd is configured to allocate larger objects.

Not true in this case. Both CephFS and RBD stripe over 4MB RADOS objects, so a 1024MB file in CephFS will result in 256 RADOS objects of 4MB in size. This is configurable using directory layouts, but 4MB is the default.

Wido
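[Editor's note: Wido's correction means object count, and hence the per-object OSD memory pressure Marcus observed, follows from data size at the default 4 MB stripe, regardless of how many files hold that data. A small sketch:]

```python
DEFAULT_OBJECT_SIZE_MB = 4  # default RADOS object size for RBD and CephFS

def rados_objects(data_mb: int,
                  object_size_mb: int = DEFAULT_OBJECT_SIZE_MB) -> int:
    """Number of RADOS objects data_mb megabytes of file/image data
    maps to. Ceiling division: a partial last stripe still costs one
    object."""
    return -(-data_mb // object_size_mb)

print(rados_objects(1024))  # 256, matching Wido's 1024 MB example
```

(Sparse RBD images allocate objects lazily, so this is an upper bound for an image that is not fully written.)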
Re: [ceph-users] luminous/bluetsore osd memory requirements
Hi,

we have done some testing with bluestore and found that the memory consumption of the osd processes depends not on the real amount of data stored but on the number of stored objects. This means that e.g. a block device of 100 GB which spreads over 100 objects has a different memory usage than storing 1000 smaller objects (the bluestore blocksize should be tuned for that kind of setup). (100 objects of size 4k to 100k had a memory consumption of ~4GB on the osd at standard block size, while the amount of data was only ~15GB.)

So it depends on the usage: a cephfs stores each file as a single object, while rbd is configured to allocate larger objects.

Marcus Haarmann

From: "Stijn De Weirdt" <stijn.dewei...@ugent.be>
To: "ceph-users" <ceph-users@lists.ceph.com>
Sent: Thursday, 10 August 2017 10:34:48
Subject: [ceph-users] luminous/bluetsore osd memory requirements

> [original question quoted in full below in the thread]
[ceph-users] luminous/bluetsore osd memory requirements
hi all,

we are planning to purchase new OSD hardware, and we are wondering whether, for upcoming luminous with bluestore OSDs, anything w.r.t. the hardware recommendations from http://docs.ceph.com/docs/master/start/hardware-recommendations/ will be different, especially the memory/cpu part. i understand from colleagues that the async messenger makes a big difference in memory usage (maybe also cpu load?); but we are also interested in the "1GB of RAM per TB" recommendation/requirement.

many thanks,

stijn