On Fri, Nov 15, 2019 at 4:04 PM Kristof Coucke <kristof.cou...@gmail.com> wrote:
>
> Hi Paul,
>
> Thank you for the answer.
> I hadn't thought of that approach... (using the NVMe for the metadata pool of RGW).
>
> Where do you get the limitation of 1.3 TB from?

13 OSDs/server * 10 servers * 30 GB/OSD usable DB space / 3 (replica)
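
Spelled out: 13 * 10 = 130 OSDs in total, 130 * 30 GB = 3,900 GB of raw DB
capacity, and dividing by the 3x replication of the metadata pool leaves
roughly 1.3 TB usable.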
> I don't get that one...
>
> Br,
>
> Kristof
>
> On Fri, Nov 15, 2019 at 3:26 PM Paul Emmerich <paul.emmer...@croit.io> wrote:
>>
>> On Fri, Nov 15, 2019 at 3:16 PM Kristof Coucke <kristof.cou...@gmail.com> wrote:
>> > We've configured a Ceph cluster with 10 nodes, each having 13 large disks (14 TB) and 2 NVMe disks (1.6 TB).
>> > The recommendations I've read in the online documentation state that the db block device should be around 4%~5% of the slow device. So the block.db should be somewhere between 600 GB and 700 GB as a best practice.
>>
>> That recommendation is unfortunately not based on any facts :(
>> How much you really need depends on your actual usage.
>>
>> > However... I was thinking of only reserving 200 GB per OSD as the fast device... which is 1/3 of the recommendation...
>>
>> For various weird internal reasons it'll only use ~30 GB in the steady state during operation before spilling over at the moment; 300 GB would be the next magical number
>> (search the mailing list for details).
>>
>> > Is it recommended to still use it as a block.db,
>>
>> yes
>>
>> > or is it recommended to only use it as a WAL device?
>>
>> no, there is no advantage to that if it's that large
>>
>> > Should I just split the NVMe in three and configure only 3 OSDs to use it? (This would mean that the performance would be degraded to the speed of the slowest device...)
>>
>> no
>>
>> > We'll only use the system through the RGW (no CephFS, nor block devices), and we'll store "a lot" of small files on it... (millions of files a day)
>>
>> The current setup gives you around ~1.3 TB of usable metadata space, which may or may not be enough; it really depends on how much "a lot" is and how small "small" is.
>>
>> It might be better to use the NVMe disks as dedicated OSDs and map all metadata pools onto them directly; that allows you to fully utilize the space for RGW metadata (but not Ceph metadata in the data pools) without running into weird db size restrictions.
>> There are advantages and disadvantages to both approaches.
>>
>> Paul
>>
>> > The reason I'm asking is that I've managed to break the test system (long story), causing OSDs to fail as they ran out of space... Expanding the disks (the block.db device as well as the main block device) failed with ceph-bluestore-tool...
>> >
>> > Thanks for your answer!
>> >
>> > Kristof
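
If you go the dedicated-OSD route, the mechanics are a CRUSH rule restricted
to the NVMe device class plus repointing the RGW metadata pools at it. A rough
sketch (assuming the stock default-zone pool names and that the NVMe OSDs come
up with device class "nvme"; check "ceph osd pool ls" and "ceph osd tree" for
your actual names):

  # CRUSH rule that places replicas on NVMe-class OSDs only, one per host
  ceph osd crush rule create-replicated nvme-only default host nvme

  # point the RGW metadata/index pools at that rule
  ceph osd pool set default.rgw.meta crush_rule nvme-only
  ceph osd pool set default.rgw.log crush_rule nvme-only
  ceph osd pool set default.rgw.buckets.index crush_rule nvme-only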