On 11/10/2017 12:21 PM, Maged Mokhtar wrote:
Hi Mark,

It will be interesting to know:

The impact of replication. I guess performance will decrease by a higher
factor than the replica count.

I assume you mean the 30K IOPS per OSD is what the client sees; if so,
the raw OSD disk itself will be doing more IOPS. Is this correct, and if
so, what is the factor (the lower, the better the efficiency)?

In those tests it's 1x replication with 1 OSD. You do lose more than 3X for 3X replication, but exactly how much is not easy to tell; it depends on the network, kernel, etc.
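
Back of the envelope (a lower bound, not a measurement): with 3x
replication every client write has to be committed on 3 OSDs, so 30K
client write IOPS means at least 3 x 30K = 90K raw write IOPS across the
cluster before counting WAL/metadata traffic, and the extra network hops
and kernel work usually push the effective factor somewhat above 3.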


Are you running 1 OSD per physical drive or multiple? Any recommendations?

In those tests it was 1 OSD per NVMe. You can do better if you put multiple OSDs on the same drive, both for filestore and bluestore.
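
One way to do that with bluestore, as a rough sketch (the device, VG and
LV names below are just examples; adjust to whatever deployment tooling
you use):

    # carve the NVMe into two logical volumes, one per OSD
    pvcreate /dev/nvme0n1
    vgcreate ceph-nvme0 /dev/nvme0n1
    lvcreate -l 50%VG -n osd0 ceph-nvme0
    lvcreate -l 100%FREE -n osd1 ceph-nvme0

    # one bluestore OSD per logical volume
    ceph-volume lvm create --bluestore --data ceph-nvme0/osd0
    ceph-volume lvm create --bluestore --data ceph-nvme0/osd1

Each extra OSD brings its own memory and CPU footprint, so there is a
point of diminishing returns.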

Mark


Cheers /Maged

On 2017-11-10 18:51, Mark Nelson wrote:

FWIW, on very fast drives you can achieve at least 1.4GB/s and 30K+
write IOPS per OSD (before replication).  It's quite possible to do
better, but those are recent numbers on a mostly default bluestore
configuration that I'm comfortable sharing.  It takes a lot of CPU,
but it's possible.

Mark

On 11/10/2017 10:35 AM, Robert Stanford wrote:

 Thank you for that excellent observation.  Are there any rumors, or has
anyone had experience with faster clusters on faster networks?  I know
how fast Ceph can get is an "it depends" question, of course, but I
wonder about the numbers people have seen.

On Fri, Nov 10, 2017 at 10:31 AM, Denes Dolhay <de...@denkesys.com> wrote:

    So you are using a 40 / 100 gbit connection all the way to your client?

    John's question is valid because 10 gbit = 1.25GB/s ... subtract
    some Ethernet, IP, TCP and protocol overhead, take into account some
    additional network factors, and you are about there...
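
    As a rough sketch (approximate numbers, not measured): 10 Gbit/s
    divided by 8 is 1.25 GB/s on the wire; Ethernet framing plus IP and
    TCP headers at a 1500-byte MTU cost roughly 5-6% of that, leaving
    around 1.18 GB/s of usable TCP payload; the RADOS messenger then adds
    its own overhead on top, so ending up near 1 GB/s at a single client
    is about what one 10 gbit link can deliver.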


    Denes


    On 11/10/2017 05:10 PM, Robert Stanford wrote:

     The bandwidth of the network is much higher than that.  The
    bandwidth I mentioned came from "rados bench" output, under the
    "Bandwidth (MB/sec)" row.  I see from comparing mine to others
    online that mine is pretty good (relatively).  But I'd like to get
    much more than that.

    Does "rados bench" show a near maximum of what a cluster can do?
    Or is it possible that I can tune it to get more bandwidth?
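
    For what it's worth, rados bench is quite sensitive to client-side
    parallelism, so pushing the number of in-flight ops up, and running
    it from more than one client at once, can show whether the cluster
    or the single client is the limit.  A sketch, with "testpool" as a
    placeholder pool name:

        # 60-second write test with 4 MB objects (the default) and 32
        # concurrent ops; keep the objects so a read pass can follow
        rados bench -p testpool 60 write -t 32 --no-cleanup

        # sequential read pass against the objects written above
        rados bench -p testpool 60 seq -t 32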

    On Fri, Nov 10, 2017 at 3:43 AM, John Spray <jsp...@redhat.com> wrote:

        On Fri, Nov 10, 2017 at 4:29 AM, Robert Stanford
        <rstanford8...@gmail.com> wrote:
        >
        >  In my cluster, rados bench shows about 1GB/s bandwidth.  I've
        > done some tuning:
        >
        > [osd]
        > osd op threads = 8
        > osd disk threads = 4
        > osd recovery max active = 7
        >
        >
        > I was hoping to get much better bandwidth.  My network can handle
        > it, and my disks are pretty fast as well.  Are there any major
        > tunables I can play with to increase what will be reported by
        > "rados bench"?  Am I pretty much stuck around the bandwidth it
        > reported?

        Are you sure your 1GB/s isn't just the NIC bandwidth limit of the
        client you're running rados bench from?
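
        A quick way to rule that out, assuming iperf3 is installed on
        both ends (the hostname below is just a placeholder), is to
        measure the raw TCP path from the bench client to an OSD host:

            # on one of the OSD hosts
            iperf3 -s

            # on the rados bench client: 4 parallel streams, 30 seconds
            iperf3 -c osd-host.example.com -P 4 -t 30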

        John

        >
        >  Thank you
        >
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
