On Mon, Dec 14, 2020 at 12:46 PM Philip Brown <pbr...@medata.com> wrote:
>
> Further experimentation with fio's -rw flag, setting to rw=read, and
> rw=randwrite, in addition to the original rw=randrw, indicates that it is
> tied to writes.
>
> Possibly some kind of buffer flush delay or cache sync delay when using rbd
> device, even though fio specified --direct=1 ?
It might be worthwhile testing with a more realistic io-depth instead of 256
in case you are hitting weird limits due to an untested corner case? Does the
performance still degrade with "--iodepth=16" or "--iodepth=32"?

>
> ----- Original Message -----
> From: "Philip Brown" <pbr...@medata.com>
> To: "dillaman" <dilla...@redhat.com>
> Cc: "ceph-users" <ceph-users@ceph.io>
> Sent: Monday, December 14, 2020 9:01:21 AM
> Subject: Re: [ceph-users] performance degredation every 30 seconds
>
> Aha.... Insightful question!
> running rados bench write to the same pool, does not exhibit any problems. It
> consistently shows around 480M/sec throughput, every second.
>
> So this would seem to be something to do with using rbd devices. Which we
> need to do.
>
> For what it's worth, I'm using Micron 5200 Pro SSDs on all nodes.
>
>
> ----- Original Message -----
> From: "Jason Dillaman" <jdill...@redhat.com>
> To: "Philip Brown" <pbr...@medata.com>
> Cc: "ceph-users" <ceph-users@ceph.io>
> Sent: Monday, December 14, 2020 8:33:09 AM
> Subject: Re: [ceph-users] performance degredation every 30 seconds
>
> On Mon, Dec 14, 2020 at 11:28 AM Philip Brown <pbr...@medata.com> wrote:
> >
> >
> > I have a new 3 node octopus cluster, set up on SSDs.
> >
> > I'm running fio to benchmark the setup, with
> >
> > fio --filename=/dev/rbd0 --direct=1 --rw=randrw --bs=4k --ioengine=libaio
> > --iodepth=256 --numjobs=1 --time_based --group_reporting
> > --name=iops-test-job --runtime=120 --eta-newline=1
> >
> >
> > However, I notice that, approximately every 30 seconds, performance tanks
> > for a bit.
> >
> > Any ideas on why, and better yet, how to get rid of the problem?
>
> Does the same issue appear when running a direct rados bench? What
> brand are your SSDs (i.e. are they data center grade)?
>

--
Jason
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
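
The iodepth comparison suggested in the reply could be scripted roughly as follows. This is only a sketch and was not posted in the thread: the `DEVICE` and `DRY_RUN` variables, the `fio_cmd` helper, and the per-depth job names are made up for illustration. The fio flags are copied from the original command, with `--rw=randwrite` swapped in because the stalls were reported to be tied to writes. By default the script only prints the commands. Running them for real writes to the raw device and destroys whatever is on it.

```shell
#!/bin/sh
# Sketch: repeat the thread's fio test at several queue depths.
# DEVICE and DRY_RUN are illustrative variables, not from the thread.
DEVICE="${DEVICE:-/dev/rbd0}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = run them

fio_cmd() {
    # $1 = iodepth. The other flags come from the original command,
    # except --rw=randwrite (assumption: isolate the write path).
    echo "fio --filename=$DEVICE --direct=1 --rw=randwrite --bs=4k" \
         "--ioengine=libaio --iodepth=$1 --numjobs=1 --time_based" \
         "--group_reporting --name=iops-test-iodepth-$1 --runtime=120" \
         "--eta-newline=1"
}

for depth in 16 32 256; do
    cmd=$(fio_cmd "$depth")
    echo "$cmd"
    if [ "$DRY_RUN" = "0" ]; then
        # WARNING: this writes to the raw device and destroys data on it.
        $cmd
    fi
done
```

If the 30-second dips disappear at 16 or 32 but return at 256, that would point toward a queue-depth corner case rather than the write path in general.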