On Fri, Nov 15, 2019 at 4:04 PM Kristof Coucke <kristof.cou...@gmail.com> wrote:
>
> Hi Paul,
>
> Thank you for the answer.
> I hadn't thought of that approach... (using the NVMe for the metadata pool of RGW).
>
> Where do you get the limitation of 1.3 TB from?

13 OSDs/server * 10 servers * 30 GB/OSD usable DB space / 3 (replica)
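
Spelled out: 13 * 10 = 130 OSDs in total, 130 * 30 GB = 3,900 GB of raw DB
capacity, and dividing by the 3x replication of the metadata pool leaves
roughly 1.3 TB usable.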
> I don't get that one...
>
> Br,
>
> Kristof
>
> On Fri, Nov 15, 2019 at 3:26 PM Paul Emmerich <paul.emmer...@croit.io> wrote:
>>
>> On Fri, Nov 15, 2019 at 3:16 PM Kristof Coucke <kristof.cou...@gmail.com> wrote:
>> > We've configured a Ceph cluster with 10 nodes, each having 13 large disks (14 TB) and 2 NVMe disks (1.6 TB).
>> > The recommendations I've read in the online documentation state that the db block device should be around 4%~5% of the slow device. So the block.db should be somewhere between 600 GB and 700 GB as a best practice.
>>
>> That recommendation is unfortunately not based on any facts :(
>> How much you really need depends on your actual usage.
>>
>> > However... I was thinking of only reserving 200 GB per OSD as the fast device... which is 1/3 of the recommendation...
>>
>> For various weird internal reasons it'll only use ~30 GB in the steady state during operation before spilling over at the moment; 300 GB would be the next magical number
>> (search the mailing list for details).
>>
>> > Is it recommended to still use it as a block.db,
>>
>> yes
>>
>> > or is it recommended to only use it as a WAL device?
>>
>> no, there is no advantage to that if it's that large
>>
>> > Should I just split the NVMe in three and configure only 3 OSDs to use it? (This would mean that the performance would be degraded to the speed of the slowest device...)
>>
>> no
>>
>> > We'll only use the system through the RGW (no CephFS, nor block devices), and we'll store "a lot" of small files on it... (millions of files a day)
>>
>> The current setup gives you around ~1.3 TB of usable metadata space, which may or may not be enough; it really depends on how much "a lot" is and how small "small" is.
>>
>> It might be better to use the NVMe disks as dedicated OSDs and map all metadata pools onto them directly; that allows you to fully utilize the space for RGW metadata (but not Ceph metadata in the data pools) without running into weird db size restrictions.
>> There are advantages and disadvantages to both approaches.
>>
>> Paul
>>
>> > The reason I'm asking is that I've managed to break the test system (long story), causing OSDs to fail as they ran out of space... Expanding the disks (the block.db device as well as the main block device) failed with ceph-bluestore-tool...
>> >
>> > Thanks for your answer!
>> >
>> > Kristof
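
If you go the dedicated-OSD route, the mechanics are a CRUSH rule restricted
to the NVMe device class plus repointing the RGW metadata pools at it. A rough
sketch (assuming the stock default-zone pool names and that the NVMe OSDs come
up with device class "nvme"; check "ceph osd pool ls" and "ceph osd tree" for
your actual names):

  # CRUSH rule that places replicas on NVMe-class OSDs only, one per host
  ceph osd crush rule create-replicated nvme-only default host nvme

  # point the RGW metadata/index pools at that rule
  ceph osd pool set default.rgw.meta crush_rule nvme-only
  ceph osd pool set default.rgw.log crush_rule nvme-only
  ceph osd pool set default.rgw.buckets.index crush_rule nvme-only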