On 09/21/2017 03:19 AM, Maged Mokhtar wrote:
On 2017-09-21 10:01, Dietmar Rieder wrote:

Hi,

I'm in the same situation (NVMEs, SSDs, SAS HDDs). I asked the same
questions to myself.
For now I decided to use the NVMes as wal and db devices for the SAS
HDDs, and on the SSDs I colocate wal and db.

However, I'm still wondering whether I should change the default sizes
of wal and db, and if so, to what size.

Dietmar

On 09/21/2017 01:18 AM, Alejandro Comisario wrote:
But for example, on the same server I have three disk technologies to
deploy pools on: SSD, SAS and SATA.
The NVMes were bought with only the journals for SATA and SAS in mind,
since journals for the SSDs were colocated.

But now, in exactly the same scenario, should I put block.* for the SSD
pool on the NVMe as well? Is there that much of a gain compared to
colocating block.* on the same SSD?

best.

On Wed, Sep 20, 2017 at 6:36 PM, Nigel Williams
<nigel.willi...@tpac.org.au> wrote:

    On 21 September 2017 at 04:53, Maximiliano Venesio
    <mass...@nubeliu.com> wrote:

        Hi guys, I'm reading different documents about BlueStore, and
        none of them recommends using NVRAM to store the bluefs db;
        nevertheless, the official documentation says it is better to
        put block.db on the fastest device available.


    Likely not mentioned since no one yet has had the opportunity to
    test it.

        So how should I deploy using BlueStore with regard to where I
        should put block.wal and block.db?


    block.* would be best on your NVRAM device, like this:

    ceph-deploy osd create --bluestore c0osd-136:/dev/sda --block-wal
    /dev/nvme0n1 --block-db /dev/nvme0n1
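
    If the NVMe is shared by several OSDs, one sketch (the partition
    layout and sizes here are assumptions, not from the thread) is to
    pre-create a WAL and a DB partition per OSD and point each OSD at
    its own pair instead of at the whole device:

        # Hypothetical GPT layout on the NVMe: ~512 MB WAL and ~10 GB DB
        # for the first OSD backed by this device.
        parted --script /dev/nvme0n1 mklabel gpt
        parted --script /dev/nvme0n1 mkpart osd0-wal 1MiB 513MiB
        parted --script /dev/nvme0n1 mkpart osd0-db 513MiB 10753MiB

        # Same ceph-deploy invocation as above, but with per-OSD partitions.
        ceph-deploy osd create --bluestore c0osd-136:/dev/sda \
            --block-wal /dev/nvme0n1p1 --block-db /dev/nvme0n1p2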







--
*Alejandro Comisario*
*CTO | NUBELIU*
E-mail: alejan...@nubeliu.com
Cell: +54 9 11 3770 1857
www.nubeliu.com








My guess is that for the wal you are dealing with a two-step IO
operation, so if it is colocated on your SSDs your IOPS for small writes
will be halved. The trade-off is that if you add a small NVMe as wal for
4 or 5 (large) SSDs, you will double their IOPS for small IO sizes. This
is not the case for the db.
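
As a rough worked example (the numbers are made up for illustration): an
SSD good for ~20,000 small random writes per second effectively delivers
~10,000 client IOPS when each client write hits it twice (once for the
wal, once for the data/metadata), whereas with the wal on NVMe the SSD
sees only one write per client write.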

For wal size: 512 MB is recommended (the ceph-disk default).

For db size: a "few" GB; probably 10 GB is a good number. I guess we will
hear more in the future.
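
A minimal ceph.conf sketch with those sizes, assuming you let ceph-disk
create the partitions and only want to override its defaults (values are
in bytes):

    [global]
    bluestore_block_wal_size = 536870912      # 512 MB
    bluestore_block_db_size  = 10737418240    # 10 GB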

There's a pretty good chance that if you are writing out lots of small
RGW or RADOS objects you'll blow past 10 GB of metadata once rocksdb
space amplification is factored in. I can pretty routinely do it when
writing out millions of RADOS objects per OSD. BlueStore will then spill
metadata over to the block disk, and in this case (NVMe to SSD) that
might not be too bad a transition. If you have spare room, you might as
well give the DB partition whatever space is available on the device.

A harder question is how much fast storage to buy for the WAL/DB. It's
not straightforward: rocksdb can be tuned to favor reducing space, write,
or read amplification, but not all three at once. Right now we are likely
favoring reducing write amplification over space/read amplification, but
one could imagine that with a small amount of incredibly fast storage it
might be better to favor reducing space amplification.
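
One way to check whether an OSD has already spilled past its DB partition
(a sketch; the osd id is just an example) is to look at the bluefs
counters on the admin socket, where a non-zero "slow_used_bytes" means
rocksdb has started using the main block device:

    ceph daemon osd.0 perf dump | grep -E '"db_used_bytes"|"slow_used_bytes"'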

Mark


Maged Mokhtar





_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
