Hi Somnath,
I think you've hit the nail on the head: setting librbd to not use TCP_NODELAY
shows the same behaviour as with krbd.
Mark, if you are still interested, here are the two latency reports:
Queue Depth = 1
slat (usec): min=24, max=210, avg=39.40, stdev=11.54
clat (usec): min=310, max=78268, avg=769.48, stdev=1764.41
lat (usec): min=341, max=78298, avg=808.88, stdev=1764.39
clat percentiles (usec):
| 1.00th=[ 462], 5.00th=[ 466], 10.00th=[ 474], 20.00th=[ 620],
| 30.00th=[ 620], 40.00th=[ 628], 50.00th=[ 636], 60.00th=[ 772],
| 70.00th=[ 772], 80.00th=[ 788], 90.00th=[ 924], 95.00th=[ 940],
| 99.00th=[ 1080], 99.50th=[ 1384], 99.90th=[33536], 99.95th=[45312],
| 99.99th=[63232]
bw (KB /s): min= 2000, max= 5880, per=100.00%, avg=4951.16, stdev=877.96
lat (usec) : 500=12.71%, 750=40.82%, 1000=45.41%
lat (msec) : 2=0.69%, 4=0.13%, 10=0.04%, 20=0.04%, 50=0.11%
lat (msec) : 100=0.05%
Queue Depth = 2
slat (usec): min=21, max=135, avg=38.72, stdev=13.18
clat (usec): min=346, max=77340, avg=6450.22, stdev=13390.20
lat (usec): min=377, max=77368, avg=6488.94, stdev=13389.56
clat percentiles (usec):
| 1.00th=[ 462], 5.00th=[ 470], 10.00th=[ 498], 20.00th=[ 612],
| 30.00th=[ 628], 40.00th=[ 652], 50.00th=[ 684], 60.00th=[ 772],
| 70.00th=[ 820], 80.00th=[ 996], 90.00th=[37120], 95.00th=[38656],
| 99.00th=[40192], 99.50th=[40704], 99.90th=[45312], 99.95th=[64768],
| 99.99th=[77312]
bw (KB /s): min= 931, max= 1611, per=99.42%, avg=1223.84, stdev=186.30
lat (usec) : 500=11.37%, 750=42.60%, 1000=26.11%
lat (msec) : 2=3.37%, 4=0.71%, 10=0.16%, 20=0.16%, 50=15.45%
lat (msec) : 100=0.06%
Note the cluster of latencies around 37-40 ms from the 90th percentile upwards
at queue depth 2; a ~40 ms stall is the classic signature of Nagle's algorithm
interacting with TCP delayed ACKs, which fits the TCP_NODELAY theory.
Many Thanks,
Nick
> -----Original Message-----
> From: ceph-users [mailto:[email protected]] On Behalf Of
> Somnath Roy
> Sent: 06 March 2015 16:02
> To: Alexandre DERUMIER; Nick Fisk
> Cc: ceph-users
> Subject: Re: [ceph-users] Strange krbd behaviour with queue depths
>
> Nick,
> I think this is because the krbd module you are using uses Nagle's algorithm,
> i.e. TCP_NODELAY = false by default.
> The latest krbd module should have TCP_NODELAY = true by default. You may
> want to try that, but I think it is only available in the latest kernel.
> librbd runs with TCP_NODELAY = true by default; you may want to try
> ms_tcp_nodelay = false to simulate the same behaviour with librbd.
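> For example, a minimal ceph.conf sketch on the client side (the section
> placement is my assumption; [global] should work, a [client] section too):
> 
> [global]
> ms_tcp_nodelay = false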
>
> Thanks & Regards
> Somnath
>
> -----Original Message-----
> From: ceph-users [mailto:[email protected]] On Behalf Of
> Alexandre DERUMIER
> Sent: Friday, March 06, 2015 3:59 AM
> To: Nick Fisk
> Cc: ceph-users
> Subject: Re: [ceph-users] Strange krbd behaviour with queue depths
>
> Hi, have you tried different I/O schedulers to compare?
>
>
> ----- Original Message -----
> From: "Nick Fisk" <[email protected]>
> To: "ceph-users" <[email protected]>
> Sent: Thursday, 5 March 2015 18:17:27
> Subject: [ceph-users] Strange krbd behaviour with queue depths
>
>
>
> I'm seeing strange queue depth behaviour with a kernel-mapped RBD; librbd
> does not show this problem.
> 
> The cluster comprises 4 nodes with 10GbE networking. I won't detail the
> OSDs, as the test sample is small enough to fit in page cache anyway.
>
>
>
> Running fio against a kernel-mapped RBD:
>
> fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test \
>     --filename=/dev/rbd/cache1/test2 --bs=4k --readwrite=randread \
>     --iodepth=1 --runtime=10 --size=1g
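> 
> The table below comes from repeating that command across queue depths; a
> sketch of the sweep (assuming bash; only --iodepth changes between runs):
> 
> for qd in 1 2 4 8 16 32 64 128; do
>   fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 \
>       --name=test --filename=/dev/rbd/cache1/test2 --bs=4k \
>       --readwrite=randread --iodepth=$qd --runtime=10 --size=1g
> done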
>
>
>
> Queue Depth | IOPS
> ------------+-------
>           1 |  2021
>           2 |   288
>           4 |   376
>           8 |   601
>          16 |  1272
>          32 |  2467
>          64 | 16901
>         128 | 44060
>
> Note how I initially get a very high number of IOPS at queue depth 1, but
> this drops dramatically as soon as I start increasing the queue depth. It's
> not until a depth of 32 that performance recovers to a similar level.
> Incidentally, when changing the read type to sequential instead of random,
> the oddity goes away.
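> 
> (The sequential case is, in effect, the same command with --readwrite=read
> in place of --readwrite=randread; the depth here is just an example:)
> 
> fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test \
>     --filename=/dev/rbd/cache1/test2 --bs=4k --readwrite=read \
>     --iodepth=2 --runtime=10 --size=1g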
>
>
>
> Running fio with the librbd engine (invocation sketched below) and the same
> test options, I get the following:
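> 
> For reference, a sketch of the librbd run via fio's rbd engine (the --pool
> and --rbdname values are inferred from the device path above; --clientname
> is an assumption):
> 
> fio --randrepeat=1 --ioengine=rbd --clientname=admin --pool=cache1 \
>     --rbdname=test2 --bs=4k --readwrite=randread --iodepth=1 \
>     --runtime=10 --size=1g --gtod_reduce=1 --name=test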
>
>
>
> Queue Depth | IOPS
> ------------+-------
>           1 |  1492
>           2 |  3232
>           4 |  7099
>           8 | 13875
>          16 | 18759
>          32 | 17998
>          64 | 18104
>         128 | 18589
>
>
>
> As you can see, performance scales up nicely, although the top end seems
> limited to around 18k IOPS. I don't know if this is due to kernel/userspace
> performance differences or a lower maximum queue depth limit in librbd.
>
>
>
> Both tests were run with a small sample size to force the OSD data into page
> cache and rule out any device latency.
>
>
>
> Does anyone know why kernel-mapped RBDs show this weird behaviour? I don't
> think it can be OSD/Ceph config related, as it only happens with krbd.
>
>
>
> Nick
>
>
>
>
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com